Reading a non-English character

Hi, I'm having trouble reading a non-English character from an HTML page.
I take a word from the HTML page and compare it with the same word written as a literal,
like this:
string.equals("BİTTİ")
but it returns false.
Is it possible to correct this?

Specify an encoding for your InputStreamReader, for example:
BufferedReader in = new BufferedReader(
            new InputStreamReader(new FileInputStream("infilename"), "8859_1"));
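
The charset name has to match the encoding the page was actually saved in. ISO-8859-1 ("8859_1") has no code for the Turkish dotted capital İ, so for a word like "BİTTİ" the page is more likely ISO-8859-9, windows-1254 or UTF-8. The sketch below is only an illustration under that assumption: it encodes the word as ISO-8859-9 in memory instead of reading a real file, and the class and method names are made up for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class CharsetReadDemo {
    // Decodes the bytes with the named charset and returns the first line.
    static String readFirstLine(byte[] data, String charsetName) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data),
                        Charset.forName(charsetName)))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "BİTTİ" written with Unicode escapes so the source file encoding does not matter.
        String expected = "B\u0130TT\u0130";
        // Stand-in for the page content: assumed here to be ISO-8859-9 bytes.
        byte[] page = expected.getBytes(Charset.forName("ISO-8859-9"));

        // Decoding with the matching charset gives a string that equals the literal.
        System.out.println(readFirstLine(page, "ISO-8859-9").equals(expected));
        // Decoding with ISO-8859-1 turns the İ byte into a different character.
        System.out.println(readFirstLine(page, "ISO-8859-1").equals(expected));
    }
}
```

If the page declares its charset in the HTTP Content-Type header or in a meta tag, that declared name is the one to pass to the reader.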

Similar Messages

  • Getting a request in a non English character

    Hi,
    In an attempt to solve the problem of receiving a request parameter in a non-English character set, I use this code, taken from O'Reilly's "Java Servlet Programming", first edition:
    import javax.servlet.*;
    import javax.servlet.http.*;
    import java.io.*;
    public class MyServlet extends HttpServlet {
        public void doGet(HttpServletRequest req, HttpServletResponse res)
                throws ServletException, IOException {
            try {
                // Set the encoding of the request and the response
                req.setCharacterEncoding("Cp1255"); // for Hebrew Windows
                res.setCharacterEncoding("Cp1255");
                res.setContentType("text/html; charset=windows-1255");
                String value = req.getParameter("param");
                // Now convert it from an array of bytes to an array of characters.
                // Here we bother to read only the first line.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(new StringBufferInputStream(value), "Cp1255"));
                String valueInUnicode = reader.readLine();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    This works fine; the only problem is that StringBufferInputStream is deprecated.
    Is there an alternative?
    Thanks in advance,
    Yair
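
    One possible non-deprecated equivalent, as a minimal sketch: StringBufferInputStream keeps only the low eight bits of each char, which for chars in the range 0-255 is the same thing String.getBytes produces with ISO-8859-1. The sketch assumes the container decoded the raw request bytes as ISO-8859-1 (so each char holds one raw byte) and that the JDK provides the windows-1255 charset. ParameterRedecoder and redecode are illustrative names, not part of the servlet API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ParameterRedecoder {
    // Recovers the raw bytes from a string that was decoded as ISO-8859-1
    // and decodes them again with the charset the client really used.
    static String redecode(String misdecoded, String actualCharset) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName(actualCharset));
    }

    public static void main(String[] args) {
        // "yair" in Hebrew, as Unicode escapes.
        String hebrew = "\u05d9\u05d0\u05d9\u05e8";
        // Simulate the container: windows-1255 bytes wrongly decoded as ISO-8859-1.
        String misdecoded = new String(
                hebrew.getBytes(Charset.forName("windows-1255")),
                StandardCharsets.ISO_8859_1);
        // Prints whether the re-decoded value matches the original Hebrew text.
        System.out.println(redecode(misdecoded, "windows-1255").equals(hebrew));
    }
}
```

    Unlike the reader-based version, this converts the whole value rather than only its first line.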

    Hi again,
    To get to the root of things, here are a test servlet and a test HTTP client that demonstrate the request both with and without the above patch.
    The servlet:
    import javax.servlet.*;
    import javax.servlet.http.*;
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.io.StringBufferInputStream;
    public class Hebrew2test extends HttpServlet {
        public void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            request.setCharacterEncoding("Cp1255");
            response.setCharacterEncoding("Cp1255");
            response.setContentType("text/html; charset=windows-1255");
            PrintWriter out = response.getWriter();
            String name = request.getParameter("name");
            // Print without any patch
            out.println(name);
            // Patch 1: uses the deprecated StringBufferInputStream
            out.println("patch 1:");
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(new StringBufferInputStream(name), "cp1255"));
            String patch_name = reader.readLine();
            out.println(patch_name);
            // Patch 2: does not work
            out.println("patch 2:");
            String valueInUnicode = new String(name.getBytes("Cp1255"), "UTF8");
            out.println(valueInUnicode);
        }
    }
    And now the test client:
    import java.io.*;
    import java.net.*;
    public class HttpClient_cp1255 {
        private static void printUsage() {
            System.out.println("usage: java HttpClient_cp1255 host port");
        }
        public static void main(String[] args) {
            if (args.length < 2) {
                printUsage();
                return;
            }
            // Host is the first parameter, port is the second
            String host = args[0];
            int port;
            try {
                port = Integer.parseInt(args[1]);
            } catch (NumberFormatException e) {
                printUsage();
                return;
            }
            try {
                // Open a socket to the server
                Socket s = new Socket(host, port);
                // Start a thread to send the request to the server
                new Request_(s).start();
                // Now print everything we receive from the socket
                BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream(), "cp1255"));
                String line;
                File f = new File("in.txt");
                FileWriter out = new FileWriter(f);
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                    out.write(line);
                }
                out.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    class Request_ extends Thread {
        Socket s;
        public Request_(Socket s) {
            this.s = s;
            setPriority(MIN_PRIORITY); // socket reads should have a higher priority
            // Wish I could use a select() !
            setDaemon(true); // let the app die even when this thread is running
        }
        public void run() {
            try {
                OutputStreamWriter server = new OutputStreamWriter(s.getOutputStream(), "cp1255");
                // String query = "GET /userprofiles/hebrew2test?name=yair"; // "yair" in English
                String query = "GET /userprofiles/hebrew2test?name=\u05d9\u05d0\u05d9\u05e8"; // "yair" in Hebrew, as Unicode escapes
                System.out.println("Connected... your HTTP request is sent");
                System.out.println("------------------------------------------");
                server.write(query);
                server.write("\r\n"); // HTTP lines end with \r\n
                server.flush();
                System.out.println(server.getEncoding());
                // Also write the same query to a file so its bytes can be inspected
                server = new OutputStreamWriter(new FileOutputStream("out.txt"), "cp1255");
                server.write(query);
                server.flush();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

  • Does querybuilder support non-english character?

    I want to run a query with non-English (Chinese) characters using QueryBuilder.
    I tried it with http://localhost:4502/libs/cq/search/content/querydebug.html, but it does not work.
    Below is my query string:
    property=contenttext
    property.value=&#20320;&#22909;&#21966;
    I have converted the Chinese characters (你好嗎) to Unicode.
    Can anyone help me?

    That's a bug in the debugger UI, but it's easy to fix:
    In CRXDE Lite, overlay /libs/cq/search/components/querydebug/querydebug.jsp by copying it to /apps/cq/search/components/querydebug/querydebug.jsp.
    Open /apps/cq/search/components/querydebug/querydebug.jsp.
    Find the line "props.load(new ByteArrayInputStream(queryParam.getBytes("ISO-8859-1")));"
    and replace it with "props.load(new StringReader(queryParam));"
    It will be fixed in 5.6.1.
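
    Why the one-line change helps can be sketched in plain Java, outside of CQ. String.getBytes with ISO-8859-1 replaces every character that Latin-1 cannot represent with '?', so the Chinese value is already lost before Properties sees it, whereas Properties.load(Reader) receives the characters unchanged. The class and method names below are illustrative only; the two load calls mirror the lines quoted above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class QueryParamLoading {
    // Mirrors the original line: the string goes through ISO-8859-1 bytes first.
    static Properties loadViaLatin1Bytes(String queryParam) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(
                    queryParam.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Mirrors the replacement line: the characters are read directly.
    static Properties loadViaReader(String queryParam) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(queryParam));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The query from the question, with the Chinese text as Unicode escapes.
        String query = "property=contenttext\nproperty.value=\u4f60\u597d\u55ce";
        System.out.println(loadViaLatin1Bytes(query).getProperty("property.value")); // ???
        System.out.println(loadViaReader(query).getProperty("property.value").length()); // 3
    }
}
```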

  • How to validate for non-english character on a single line text field

    In a "Single Line Text" field we would like to allow users to enter alphanumeric values only. We should show an error when the user enters non-English values like
    carácter
    Vijayaragavan, MCTS

    Hi,
    According to your post, my understanding is that you want to validate a single line text field against non-English characters.
    I recommend using jQuery to attach regular expression validation. Please refer to:
    Using #jQuery to attach regular expression validation to a #SharePoint list form field
    In addition, for custom validations you can create your own field types. Refer to the linked article on creating a custom field type.
    More information:
    SharePoint Custom Field - Regex Validator
    Thanks,
    Linda Li                
    Forum Support

  • Error when import file with non-english character

    Hi,
    I have image files with non-English (Unicode) characters in their names, for example ABC<X>.png, where <X> is a non-English character such as a Japanese or Chinese one.
    Whenever I try to import such a file into After Effects (right click -> import -> file), I always get this error:
    Finding file/dir info for the file "C:\...\ABC?.png" -- file not found (-43) (3::30)
    Can't import file "ABC?.png": unsupported filetype or extension. (0::1)
    My PC runs Windows XP Professional 2002 SP2, English.
    How can I solve this problem?
    Thanks

    Adjust your system language settings. Proper file name conventions require a consistent Unicode environment, so install the respective foreign language support files or switch the language system-wide. Mixing different zones/ code ranges is always a bad idea. If your system is not in Japanese, AE will always misinterpret the characters and refuse to import. If that's not feasible, simply rename the files.
    Mylenium

  • Non-english character display as square box

    Hi all,
    I'm not sure whether this question should be asked here or on the JRE board, so I'm asking here as well.
    I have been trying an open-source application called Alliancep2p (it can be obtained from www.alliancep2p.com) using JRE 1.6 on an English Windows XP Pro machine.
    The problem:
    All Chinese input is displayed as square boxes. It looks like the program gets the correct characters, but everything is displayed as square boxes.
    It looks like a font issue, though I'm not sure. Is there any way to change the default fonts, or otherwise get the characters displayed correctly?
    Note: I have East Asian fonts installed, and the Java config panel can display Chinese and other non-English characters correctly.
    I tried the same application under GNU/Linux (the locale is UTF-8), and Chinese input and display work without any problem. Does that mean it is not a problem with the application?
    The original question in the JRE board:
    http://forum.java.sun.com/thread.jspa?threadID=5265369&tstart=0
    Thanks for all the input.

    I'm not really sure if it's a problem of the application or not. But the fact that it works perfectly under Linux makes me think maybe it's not the problem of the program, and actually their developers said that unicode is being used all over the program and seems like they're not CJK users also.
    I'm not a java guru so I can't really tell from the source if there's anything wrong.

  • Linux or JVM: cannot display non english character

    Hi,
    I am trying to implement a GUI that supports both Turkish and English. The user can switch between them on the fly.
    import java.awt.event.*;
    import java.util.*;
    import javax.swing.*;
    public class SampleGUI {
        JButton trTranslate = new JButton(); /* Button to translate into Turkish */
        /* Label whose text will be translated */
        JLabel label = new JLabel("Text to Be Translated!");
        SampleGUI() {
            trTranslate.addActionListener(new ActionListener() {
                public void actionPerformed(ActionEvent e) {
                    String language = "tr";
                    String country = "TR";
                    Locale currentLocale;
                    ResourceBundle messages;
                    currentLocale = new Locale(language, country);
                    messages = ResourceBundle.getBundle("TranslateMessages", currentLocale);
                    /* Get the Turkish match of "TextToTranslate" from the properties file */
                    label.setText(messages.getString("TextToTranslate"));
                }
            });
        }
    }
    Finally, my problem is that my application does not display non-English characters such as "ş" and "ğ" in the GUI after the translation is triggered. However, if I do not use the ResourceBundle and instead assign the Turkish text directly to the label (i.e. label.setText("şşşşş")), the GUI displays the Turkish characters successfully. What may be the problem? Which encoding does not conform?
    PS: I am using Red Hat Linux 8.0 and j2sdk1.4.1. The current locale is "tr_TR.UTF-8". In /etc/sysconfig/keyboard, keyTable = "trq". There seems to be no problem there, as I can input and output
    Turkish characters; the OS supports this. The JVM also gets the current encoding from the OS. It seems as if the problem is that the properties file is read in an inappropriate encoding.
    Thanks for your time and effort,
    hELin
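
    The suspicion about the properties file can be checked in isolation. On JDKs of that era, .properties files are decoded as ISO-8859-1 when loaded from a stream, so a file saved as UTF-8 comes out garbled, while the same text written with \uXXXX escapes (which the JDK's native2ascii tool could generate) loads correctly. The sketch below is only an illustration: it uses in-memory bytes instead of a real TranslateMessages_tr_TR.properties file, and the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class BundleEncodingDemo {
    // Loads properties from raw file bytes; the InputStream overload of
    // Properties.load always decodes the bytes as ISO-8859-1.
    static String lookup(byte[] propertiesFileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(propertiesFileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String turkish = "\u015f\u011f"; // the letters s-cedilla and g-breve
        // A properties file saved as UTF-8: decoded as ISO-8859-1, the value is garbled.
        byte[] utf8File = ("TextToTranslate=" + turkish).getBytes(StandardCharsets.UTF_8);
        // The same file with backslash-u escapes, which are plain ASCII.
        byte[] escapedFile = "TextToTranslate=\\u015f\\u011f".getBytes(StandardCharsets.US_ASCII);

        System.out.println(lookup(utf8File, "TextToTranslate").equals(turkish));    // false
        System.out.println(lookup(escapedFile, "TextToTranslate").equals(turkish)); // true
    }
}
```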

    I would suspect it would work in vim only if vim supported the UTF8 character set. I have no idea if it does.
    Here is one blurb I found on google:
    USING UNICODE IN THE GUI
    The nice thing about Unicode is that other encodings can be converted to it
    and back without losing information. When you make Vim use Unicode
    internally, you will be able to edit files in any encoding.
    Unfortunately, the number of systems supporting Unicode is still limited.
    Thus it's unlikely that your language uses it. You need to tell Vim you want
    to use Unicode, and how to handle interfacing with the rest of the system.
    Let's start with the GUI version of Vim, which is able to display Unicode
    characters. This should work:
         :set encoding=utf-8
         :set guifont=-misc-fixed-medium-r-normal--18-120-100-100-c-90-iso10646-1
    The 'encoding' option tells Vim the encoding of the characters that you use.
    This applies to the text in buffers (files you are editing), registers, Vim
    script files, etc. You can regard 'encoding' as the setting for the internals
    of Vim.
    This example assumes you have this font on your system. The name in the
    example is for X-Windows. This font is in a package that is used to enhance
    xterm with Unicode support. If you don't have this font, you might find it
    here:
         http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz

  • Spool non english character names to a file

    Hi There,
    We have a table which has around a million rows. We just need to select two columns (one of which is a name field with English and non-English names) and spool the data to a file. The problem is that in SQL Developer, if I choose the csv or dsv option, the non-English names, such as those with Chinese characters, show up as question marks. Is there a better way or format to do this? I tried xls, but although the export completes successfully, the Excel file is empty when I open it. I was able to do it for around 200000 rows in Excel.
    Any suggestions?
    Thanks,
    Sun
    Edited by: ryansun on Jun 23, 2012 4:24 AM

    When dealing with non-ASCII characters, two different issues can exist:
    1) data storage - an incorrect byte value is stored
    2) data presentation - an incorrect character is displayed.
    Can the utility used to view the CSV file actually display the non-ASCII values properly?
    Can you inspect the CSV file using a hexadecimal editor? What do you see inside the file?
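
    What to look for in the hex view can be illustrated with a small Java sketch (the class and method names are made up, and the sample text is just two Chinese characters). If the export really wrote question marks, the file contains the byte 3F for each lost character, which is a storage problem; if it contains multi-byte sequences such as UTF-8, the data is intact and only the viewer is showing it wrongly.

```java
import java.nio.charset.StandardCharsets;

class HexInspect {
    // Formats bytes as space-separated two-digit hexadecimal values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "\u4f60\u597d"; // two Chinese characters
        // Intact data written as UTF-8: three bytes per character.
        System.out.println(toHex(name.getBytes(StandardCharsets.UTF_8)));    // E4 BD A0 E5 A5 BD
        // Data damaged on output: each character became a literal '?'.
        System.out.println(toHex(name.getBytes(StandardCharsets.US_ASCII))); // 3F 3F
    }
}
```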
                                                                                                                                                                                                                  

  • How does one install non-English character sets for use with the "find" function in Acrobat Pro 11?

    I have pdf files in European languages and want to be able to enter non-English characters in the "find" function. How does one install other character sets for use with Acrobat Pro XI?

    Have you tried applying the update by going to Help>Updates within Photoshop Lightroom?  The update should be using the same licensing?  Did you perhaps customize the installation location?  Finally which operating system are you using?

  • Csv upload -- suggestion needed with non-English character in csv file

    Hi All,
    I have a process which uploads a CSV file into a table. It works with normal English characters, but when the CSV file contains non-English characters it doesn't populate the correct columns.
    My CSV file content is:
    First Name | Middle Name | Last Name
    José | # | Reema
    Sam | # | Peter
    The output comes out like this (the last name is blank):
    First Name | Middle Name | Last Name
    Jos鬣 | Reema | (blank)
    Sam | # | Peter
    http://apex.oracle.com/pls/otn/f?p=53121:1
    workspace- gil_dev
    user- apex
    password- apex12
    Thanks for your help.
    Manish

    Manish,
    PROCEDURE csv_to_array (
          -- Utility to take a CSV string, parse it into a PL/SQL table
          -- Note that it takes care of some elements optionally enclosed
          -- by double-quotes.
          p_csv_string   IN       VARCHAR2,
          p_array        OUT      wwv_flow_global.vc_arr2,
          p_separator    IN       VARCHAR2 := ';'
       )
       IS
          l_start_separator   PLS_INTEGER    := 0;
          l_stop_separator    PLS_INTEGER    := 0;
          l_length            PLS_INTEGER    := 0;
          l_idx               BINARY_INTEGER := 0;
          l_quote_enclosed    BOOLEAN        := FALSE;
          l_offset            PLS_INTEGER    := 1;
       BEGIN
          l_length := NVL (LENGTH (p_csv_string), 0);
          IF (l_length <= 0)
          THEN
             RETURN;
          END IF;
          LOOP
             l_idx := l_idx + 1;
             l_quote_enclosed := FALSE;
             IF SUBSTR (p_csv_string, l_start_separator + 1, 1) = '"'
             THEN
                l_quote_enclosed := TRUE;
                l_offset := 2;
                l_stop_separator :=
                       INSTR (p_csv_string, '"', l_start_separator + l_offset, 1);
             ELSE
                l_offset := 1;
                l_stop_separator :=
                   INSTR (p_csv_string,
                          p_separator,
                          l_start_separator + l_offset,
                          1
                         );
             END IF;
             IF l_stop_separator = 0
             THEN
                l_stop_separator := l_length + 1;
             END IF;
             p_array (l_idx) :=
                (SUBSTR (p_csv_string,
                         l_start_separator + l_offset,
                         (l_stop_separator - l_start_separator - l_offset
                         )
                        )
                );
             EXIT WHEN l_stop_separator >= l_length;
             IF l_quote_enclosed
             THEN
                l_stop_separator := l_stop_separator + 1;
             END IF;
             l_start_separator := l_stop_separator;
          END LOOP;
       END csv_to_array;
    and
    PROCEDURE get_records (p_clob IN CLOB, p_records OUT varchar2_t)
       IS
          l_record_separator   VARCHAR2 (2) := CHR (13) || CHR (10);
          l_last               INTEGER;
          l_current            INTEGER;
       BEGIN
          -- If HTMLDB has generated the file,
          -- it will be a Unix text file. If user has manually created the file, it
          -- will have DOS newlines.
          -- If the file has a DOS newline (cr+lf), use that
          -- If the file does not have a DOS newline, use a Unix newline (lf)
          IF (NVL (DBMS_LOB.INSTR (p_clob, l_record_separator, 1, 1), 0) = 0)
          THEN
             l_record_separator := CHR (10);
          END IF;
          l_last := 1;
          LOOP
             l_current := DBMS_LOB.INSTR (p_clob, l_record_separator, l_last, 1);
             EXIT WHEN (NVL (l_current, 0) = 0);
             p_records (p_records.COUNT + 1) :=
                REPLACE (DBMS_LOB.SUBSTR (p_clob, l_current - l_last, l_last),
             l_last := l_current + LENGTH (l_record_separator);
          END LOOP;
       END get_records;
    Denes Kubicek
    http://deneskubicek.blogspot.com/
    http://www.opal-consulting.de/training
    http://htmldb.oracle.com/pls/otn/f?p=31517:1
    -------------------------------------------------------------------

  • Identify Non English Character in a String

    All,
    We have a requirement to identify non-English characters in data keyed in by the user and to return an error message saying that only valid English, numeric and some special characters are allowed.
    For example, if the user enters data like "This is a Test data", the return value should be true. If the user enters something like "My Native Language is inglés", it should return false. Similarly, any Chinese, Russian or Japanese character entries should also return false.
    How can we achieve this?
    Thanks,
    Nagarajan.

    Hi Nagarajan,
    You could use Unicode character blocks or simply craft a regular expression that contains all the characters you need. The latter is easy to understand and gives you full control over which characters you want to allow. E.g. I assume you might want something like this:
    if (!"This is a proper input string".matches("[\\s\\w\\p{Punct}]+")) {
      // Issue error message and re-get input string
    }
    The String method matches() takes a regular expression as its input parameter. If you haven't dealt with regular expressions before, check out the Java API help for class java.util.regex.Pattern. Here's a short breakdown of the pattern I used:
    1. The square brackets [] enclose a list of allowed characters; here you can explicitly list all allowed characters.
    2. You can specify ranges like a-z, list individual characters like ;:| or use predefined character classes (\s for any whitespace character, \w for the letters a-z and A-Z, the underscore and the digits 0-9, and the POSIX class \p{Punct} for punctuation symbols). For a complete list, check the Java API help on java.util.regex.Pattern.
    3. The + at the end indicates that the listed characters can occur once or more.
    There are other ways to achieve what you want, but I think this might be an easy way to start with.
    Cheers, harald
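
    The pattern above can be wrapped in a small helper and tried against the inputs from the question. This is a minimal sketch with made-up names (EnglishOnlyValidator, isAllowed). It relies on the default behaviour of java.util.regex, where \w, \s and \p{Punct} match ASCII characters only unless the UNICODE_CHARACTER_CLASS flag is set.

```java
import java.util.regex.Pattern;

class EnglishOnlyValidator {
    // Whitespace, ASCII letters/digits/underscore and ASCII punctuation, one or more times.
    private static final Pattern ALLOWED = Pattern.compile("[\\s\\w\\p{Punct}]+");

    // Returns true only if every character of the input is in the allowed set.
    static boolean isAllowed(String input) {
        return input != null && ALLOWED.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("This is a Test data"));                // true
        // "inglés" contains an accented e (written as an escape), so it is rejected.
        System.out.println(isAllowed("My Native Language is ingl\u00e9s")); // false
        // Two Chinese characters are rejected as well.
        System.out.println(isAllowed("\u4f60\u597d"));                      // false
    }
}
```

    Compiling the pattern once in a static field avoids recompiling it on every call, which String.matches() would otherwise do.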

  • Problem with Vcard and non-English character

    The vCard feature is what I would like to use, but I have quite a few contacts with non-English (Korean) names.
    I know the iPod can display Korean, but when I create a vCard with Korean characters and copy the vCard file into the /contacts folder, I can see the person's name as the filename (in Windows Explorer), but I can ONLY see the first character of it when I display the contacts on the iPod.
    Does anyone have tips or tricks for displaying the whole filename in the iPod contacts?
    Thanks.
      Windows XP Pro  

    Because I use the string nota in a JSP page and print it into a textarea, the text appears with no newlines. For example:
    <textarea name="nota" rows="4" cols="60"><%= nota %></textarea>
    The text in the textarea is:
    first linesecond linethird line
    but I want the text displayed in the textarea to be the same as the text in the CDATA section:
    first line
    second line
    third line

  • Read a non english word from text file

    While reading Thai characters from a text file that was sent by QAD (a different application),
    we read 60 characters using the substr() function.
    If the data is an English word, it reads correctly, with 60 characters.
    But if it is in Thai characters, it returns more than 60 characters.
    In Oracle, all the NLS character set settings have already been made.
    Can anyone help with this issue?
    Thanks in advance

    Maybe you should use SUBSTRC, SUBSTR2 or SUBSTR4 depending on the character set of your database. See http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96540/functions119a.htm#87068
    Message was edited by:
    Pierre Forstmann
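
    One common cause of this kind of mismatch (an assumption here, since the post does not show the data) is that the same text has different lengths depending on whether it is counted in characters or in bytes. The Java sketch below only illustrates the units; it is not Oracle code, and the class and method names are made up. The Thai sample is the six-code-point word "sawatdi" written with Unicode escapes.

```java
import java.nio.charset.StandardCharsets;

class LengthUnits {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points in the text.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String english = "Hello";
        String thai = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35";
        // ASCII text: one byte per character.
        System.out.println(codePoints(english) + " code points, " + utf8Bytes(english) + " bytes"); // 5 code points, 5 bytes
        // Thai text: three UTF-8 bytes per code point.
        System.out.println(codePoints(thai) + " code points, " + utf8Bytes(thai) + " bytes");       // 6 code points, 18 bytes
    }
}
```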

  • N91 problem sending SMS with non English character...

    Hello I own a N91 4GB version. I have upgraded to version 2.20.008. I checked recently and have not found a newer version available for download.
    When composing an SMS a counter appears on the upper part of the screen, which shows how many characters remain and how many SMS(s) will be sent.
    When I use my native language, i.e. Greek, the counter starts as usual with 160 characters. When I type the first character it drops to 69 characters and then it works correctly decreasing the counter by 1. The result is that the phone sends more than 1 SMS even if there are no more than 160 characters.
    The problem does not appear if I use English characters.
    Is there a way to fix this? Do other users have the same problem?

    Hello alsanico,
    I'm from Greece too.
    I haven't seen N91's exact menu but i suppose it has similarities with my N95. They both run S60.
    Settings-
    General-
    Personalisation-
    Language-
    Writing Language-Ellinika
    If you have already made these settings then go to:
    Messaging-
    Options (left selection key)-
    Settings-
    Text Message-
    Character encoding-
    Reduced support
    Hope this helps...

  • Using xsql:insert-request to insert non-English character data from HTML form

    I tried to insert a text written in French and pasted into an HTML form into my database following the recipe from the XSQL release notes.
    The table:
    create table content_object (
    id number(9) constraint pk_content_object primary key,
    author number(9),
    title varchar2(256),
    abstract varchar2(1024),
    object_type_id number(9),
    content_meat clob,
    XML file the form is posted to:
    <?xml version="1.0"?>
    <xsql:insert-request connection="demo"
    xmlns:xsql="urn:oracle-xsql"
    table="content_object"
    transform="article_form_to_content_object.xsl"/>
    article_form_to_content_object.xsl:
    <?xml version = '1.0'?>
    <ROWSET xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xsl:version="1.0">
    <xsl:for-each select="request/parameters">
    <ROW>
    <TITLE><xsl:value-of select="title_field"/></TITLE>
    <ABSTRACT><xsl:value-of select="abstract_field"/></ABSTRACT>
    <CONTENT_MEAT><xsl:value-of select="content_field"/></CONTENT_MEAT>
    </ROW>
    </xsl:for-each>
    </ROWSET>
    The result was:
    <ROW num="2">
    <ID>6</ID>
    <TITLE>Le go</TITLE>
    <ABSTRACT>Vin bouchonn</ABSTRACT>
    <CONTENT_MEAT>Des conditions de conservation au tant d</CONTENT_MEAT>
    </ROW>
    All strings were cropped at the point where the first accented character appeared.
    Any idea why this is happening? With English texts it works just fine.
    Maciej

    Another user hit this problem this week. I debugged it and posted the reply in the discussion thread:
    http://technet.oracle.com:89/ubb/Forum11/HTML/002799.html
    It's a JDK bug in character set conversion that has a simple workaround, just indicate the ISO-8859-1 encoding on your XSQL page. See the end of this other thread for details.

Maybe you are looking for

  • OBI 11G sUnquotedTableName.empty() error while creating a report

    hello guru's We have a problem when trying to build a report in OBI 11G. We have migrated an RPD from 10 to 11 and one 1 installation DEV all is OK, on TEST we get all kind of problems. One of the main issues is when trying to build a report and clic

  • Qosmio G20: How to get higher screen resolution on external 22" monitor

    I have a Qosmio G20 and today I bought a Samsung 226BW external screen. The Samsung has a max/optimal resolution of 1680x1050px. The Toshiba manual sais the laptop supports an external screen resolution up to 2048x1536 ... I have now tried all kinds

  • "Something is wrong on our end"

    Cloud keeps saying there is something wrong on their end and to try again later. Nothing else. No other explanation or anything. I waited a day, and it still says this. I have a deadline to meet tomorrow and have NO TIME to waste on shoddy software d

  • DisplayPort to ViewSonic VG2030wm Issue

    I just received my MacBook Brick and after connecting it to my external desk monitor(ViewSonic VG2030wm) via a DVI cord with the DisplayPort DVI adapter I see nothing. Nothing happens when I click detect displays. Have restarted the computer, unplugg

  • Problem in upgrading Standard RPRTEF00

    As per my requirement copied standard report RPRTEF00 and added "Trip status approval"  & "Trip status settlement" to selection screen. when trying to output "To Be Settled" XYZ Trip details its giving blank output, as the program is trying to import