Converting \u00e0 to appropriate character.

Dear Java folks,
I'm trying to import some Java source code produced on one system into another.
I have a zip file which contains some .java source files in which non-ASCII characters have been converted to their equivalent Unicode escape sequences, for example:
String s1 = "\u00e0" ;
The original code looked like this:
String s1 = "à" ; <- that's an 'a' with a grave accent over it.
I'm reading the source code from a zip file using an InputStreamReader.
Is there any way to convert the "\u00e0" and other Unicode characters back into the correct characters as I read the zip entry?
FYI, I'm running on Windows XP and the original code is part of a Lotus Notes database. In Notes the characters look fine, but when the code is 'exported' they get converted to \u00e0 by Notes and there's not much I can do about that. Both the export and the import are running on the same computer.
TIA, Keith

> If not, my next question is this: are there any
> classes that will convert a Unicode escaped character
> of the form \uxxxx to the appropriate Modified UTF-8
> 2, 3 or 4 byte character required by Java?

I have no idea what those last two lines mean. Which
part of Java is it that you think requires UTF-8
encoding? Certainly the compiler doesn't.

To OP and to DrClap:
Indeed the compiler doesn't require UTF-8.
The compiler is perfectly happy accepting
either UTF-8 or \uxxxx.
However, the OP had lots of files (in UTF8 presumably) in Lotus.
When exported, Lotus turns the UTF8 into \uxxxx, which is very
hard for the OP to read on the screen.
So the OP wants to write a function that translates \uxxxx back into UTF8.
For example: whenever he sees \u05D0,
he would like to write out 2 bytes: 0xD7 and 0x90
(which is the UTF-8 encoding of the code point 0x05D0)
This way, when he loads the file in a text editor,
he would see the international character (rather than an obscure code like \u05D0).
To OP:
It's really simple. The conversion table is in the following link:
http://en.wikipedia.org/wiki/UTF8
There are resources on that page on the encoding.
But here is slightly more detailed pseudocode:
public void write(int n) {
  byte b = (byte) (n & 0xFF);
  output.write(b);
  // This output object must be a Stream, not a Writer.
  // Otherwise, Java will interpret the number as a character and mess it up.
}
Whenever you see \uxxxx:
   First, parse xxxx as a hexadecimal number and store it in an integer variable n.
   if (n <= 0x7F) {
       write(n);
   } else if (n <= 0x7FF) {
       write(0xC0 | (n >> 6));
       write(0x80 | (n & 0x3F));
   } else if (n <= 0xFFFF) {
       write(0xE0 | (n >> 12));
       write(0x80 | ((n >> 6) & 0x3F));
       write(0x80 | (n & 0x3F));
   } else {
       write(0xF0 | (n >> 18));
       write(0x80 | ((n >> 12) & 0x3F));
       write(0x80 | ((n >> 6) & 0x3F));
       write(0x80 | (n & 0x3F));
   }
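
As an alternative to encoding the bytes by hand, the escapes can be turned back into Java chars and the UTF-8 encoding left to the standard library. The following is only a sketch under assumptions: the class and method names are made up for illustration, the text is processed one string (for example one line read from the zip entry) at a time, and an escaped backslash followed by u is not treated specially, which real Java source would require.

```java
import java.nio.charset.StandardCharsets;

class UnicodeUnescape {

    /** Replaces each escape (backslash, 'u', four hex digits) with the char it names. */
    static String unescape(String in) {
        StringBuilder out = new StringBuilder(in.length());
        int i = 0;
        while (i < in.length()) {
            if (isEscapeAt(in, i)) {
                // Parse the four hex digits and append the resulting UTF-16 code unit.
                out.append((char) Integer.parseInt(in.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(in.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** True if a complete escape with four hex digits starts at index i. */
    static boolean isEscapeAt(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String line = "String s1 = \"\\u00e0\" ;";
        String decoded = unescape(line);
        // Encoding to bytes is where UTF-8 comes in; the String itself is UTF-16.
        byte[] utf8 = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(decoded.length() + " chars, " + utf8.length + " bytes");
    }
}
```

Because the decoded text is an ordinary String, two consecutive escapes that form a surrogate pair end up as one 4-byte UTF-8 sequence when the String is encoded, so the caller does not need a separate 4-byte branch.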

Similar Messages

  • How to convert bits into a character

    The code below, which I got from this forum, converts a String into bits. I could successfully use it in a bit-stuffing method on the client side. On the server side I need to convert these bits back into characters.
    class StringBits {
        public static void main(String[] args) {
            String str = "This is a string";
            char ch[] = str.toCharArray();
            for (int i = 0; i < ch.length; i++)
                System.out.println(Integer.toBinaryString(ch[i]));
        }
    }
    In this code each character is converted into its binary form. Now my problem is how to do the reverse. I don't even know the return type of the method Integer.toBinaryString(char ch).
    Please, can anyone let me know which method to use to convert a given binary code into characters?
    I.e. if I have 0111100, how do I convert these bits into a character again?
    Thanks in advance,
    Deepika

    1. Can't you do this by sending whole bytes instead of bits? It would make everything a lot easier.
    2. The solution you have probably is not going to work as is, since Integer.toBinaryString does not pad with 0's. If you have to have the bits, you should do something like:
    String s = ...
    byte bytes[] = s.getBytes();
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < bytes.length; i++) {
      int bit = 1;
      while (bit < 256) {
         // Append the bit selected by the mask, lowest bit first.
         sb.append((bytes[i] & bit) != 0 ? '1' : '0');
         bit *= 2;
      }
    }
    This is of course little-endian, so be aware of that when translating on the other side.
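
A minimal sketch of the full round trip, assuming the text is sent as UTF-8 bytes and each byte is written as exactly eight bits, most significant bit first (so not the little-endian order mentioned above). The class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

class BitStrings {

    /** Encodes text as a string of '0'/'1' characters, eight per UTF-8 byte. */
    static String toBits(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            String bits = Integer.toBinaryString(b & 0xFF);
            // toBinaryString drops leading zeros, so pad back to eight digits.
            for (int pad = bits.length(); pad < 8; pad++) {
                sb.append('0');
            }
            sb.append(bits);
        }
        return sb.toString();
    }

    /** Decodes a bit string produced by toBits back into text. */
    static String fromBits(String bits) {
        byte[] bytes = new byte[bits.length() / 8];
        for (int i = 0; i < bytes.length; i++) {
            // Parse each group of eight digits as a base-2 number.
            bytes[i] = (byte) Integer.parseInt(bits.substring(i * 8, i * 8 + 8), 2);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String bits = toBits("This is a string");
        System.out.println(bits);
        System.out.println(fromBits(bits));
    }
}
```

Integer.toBinaryString returns a String, which is why Integer.parseInt with radix 2 is the natural inverse once the padding is restored.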

  • Convert/Interpret XML to Character/ABAP

    Hi,
    I am consuming a web service via the client-proxy method. For this I've created the client proxy and called one of the methods of the class in the program.
    After passing the input value (country code) to the web service, I receive the result (country name) in a variable (string format), but the format of the returned data is XML.
    example:
    <NewDataSet> <Table> <countrycode>in</countrycode> <name>India</name> </Table> <Table> <countrycode>in</countrycode> <name>India</name> </Table> </NewDataSet>
    How can I get the value of the <name> element in the output?
    Is there a way to convert this XML into character format and read the value of <name>, or to parse every field of the output into an internal table? What would be the approach and solution for this?
    /Mike

    Try using the code below for xml parsing...
    TYPE-POOLS: ixml.
    TYPE-POOLS : abap.
    FIELD-SYMBOLS: <dyn_table> TYPE STANDARD TABLE,
                   <dyn_table1> TYPE STANDARD TABLE,
                   <dyn_wa>,
                   <dyn_fieldvalue>,
                   <dyn_wa1>,
                   <dyn_field>,
                   <dyn_field1>,
                   <fs_1> TYPE table,
                   <fs_2> TYPE ANY,
                   <fs_3> TYPE ANY,
                   <fs_5> TYPE ANY.
    FIELD-SYMBOLS: <fs_fields>.
    DATA: dy_table TYPE REF TO data,
          dy_line  TYPE REF TO data,
          dy_datatype TYPE REF TO data,
          dy_table1 TYPE REF TO data,
          dy_line1  TYPE REF TO data,
          new_line  TYPE REF TO data,
          xfc TYPE lvc_s_fcat,
          ifc TYPE lvc_t_fcat.
    TYPES: BEGIN OF t_xml_line,
            data(256) TYPE x,
          END OF t_xml_line.
    TYPES: BEGIN OF gs_elem_value,
             element(30) TYPE c,
             value(30) TYPE c,
             recordid TYPE i,
           END OF gs_elem_value.
    DATA: gi_elem_value TYPE TABLE OF gs_elem_value ,
          gw_elem_value TYPE gs_elem_value.
    DATA: l_ixml            TYPE REF TO if_ixml,
          l_streamfactory   TYPE REF TO if_ixml_stream_factory,
          l_parser          TYPE REF TO if_ixml_parser,
          l_istream         TYPE REF TO if_ixml_istream,
          l_document        TYPE REF TO if_ixml_document,
          l_node            TYPE REF TO if_ixml_node,
          l_xmldata         TYPE string.
    DATA: l_elem            TYPE REF TO if_ixml_element,
          l_root_node       TYPE REF TO if_ixml_node,
          l_next_node       TYPE REF TO if_ixml_node,
          l_name            TYPE string,
          l_iterator        TYPE REF TO if_ixml_node_iterator.
    DATA: l_xml_table       TYPE TABLE OF t_xml_line,
          l_xml_line        TYPE t_xml_line,
          l_xml_table_size  TYPE i.
    DATA: l_filename        TYPE string.
    DATA :  gv_projectdetails TYPE string .
    DATA : xref TYPE REF TO cx_dynamic_check .
      PERFORM get_complete_path USING p_path2 p_file2 CHANGING gv_complete_path .
      " Creating the main iXML factory
      l_ixml = cl_ixml=>create( ).
      " Creating a stream factory
      l_streamfactory = l_ixml->create_stream_factory( ).
      PERFORM get_xml_table CHANGING l_xml_table_size l_xml_table.
      " Wrap the table containing the file into a stream
      l_istream = l_streamfactory->create_istream_itable( table = l_xml_table
                                                      size  = l_xml_table_size ).
      " Creating a document
      l_document = l_ixml->create_document( ).
      " Create a parser
      l_parser = l_ixml->create_parser( stream_factory = l_streamfactory
                                        istream        = l_istream
                                        document       = l_document ).
      " Validate a document
      IF pa_val EQ 'X'.
        l_parser->set_validating( mode = if_ixml_parser=>co_validate ).
      ENDIF.
      " Parse the stream
      IF l_parser->parse( ) NE 0.
        IF l_parser->num_errors( ) NE 0.
          DATA: parseerror TYPE REF TO if_ixml_parse_error,
                str        TYPE string,
                i          TYPE i,
                count      TYPE i,
                index      TYPE i.
          count = l_parser->num_errors( ).
          WRITE: count, ' parse errors have occured:'.
          index = 0.
          WHILE index < count.
            parseerror = l_parser->get_error( index = index ).
            i = parseerror->get_line( ).
            WRITE: 'line: ', i.
            i = parseerror->get_column( ).
            WRITE: 'column: ', i.
            str = parseerror->get_reason( ).
            WRITE: str.
            index = index + 1.
          ENDWHILE.
          SKIP 2.
          WRITE : 'The input xml ' , p_file , '  is invalid and does not conform to the inset DTD. '.
          EXIT.
        ENDIF.
      " Process the document if there are no errors
      ELSEIF l_parser->is_dom_generating( ) EQ 'X'.
        PERFORM process_dom USING l_document.
      ENDIF.
    *&      Form  get_xml_table
    FORM get_xml_table CHANGING l_xml_table_size TYPE i
                                l_xml_table      TYPE STANDARD TABLE.
      " Local variable declaration
      DATA: l_len      TYPE i,
            l_len2     TYPE i,
            l_tab      TYPE tsfixml,
            l_content  TYPE string,
            l_str1     TYPE string,
            c_conv     TYPE REF TO cl_abap_conv_in_ce,
            l_itab     TYPE TABLE OF string.
      l_filename = p_file.
      " Upload a file from the client's workstation
      CALL METHOD cl_gui_frontend_services=>gui_upload
        EXPORTING
          filename   = l_filename
          filetype   = 'BIN'
        IMPORTING
          filelength = l_xml_table_size
        CHANGING
          data_tab   = l_xml_table
        EXCEPTIONS
          OTHERS     = 19.
      IF sy-subrc <> 0.
        MESSAGE ID sy-msgid TYPE sy-msgty NUMBER sy-msgno
                   WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
      ENDIF.
    ENDFORM.                    "get_xml_table
    *&      Form  process_dom
    FORM process_dom USING document TYPE REF TO if_ixml_document.
      DATA: node      TYPE REF TO if_ixml_node,
            iterator  TYPE REF TO if_ixml_node_iterator,
            nodemap   TYPE REF TO if_ixml_named_node_map,
            attr      TYPE REF TO if_ixml_node,
            name      TYPE string,
            prefix    TYPE string,
            value     TYPE string,
            indent    TYPE i,
            count     TYPE i,
            index     TYPE i.
      node ?= document.
      CHECK NOT node IS INITIAL.
      ULINE.
      IF node IS INITIAL. EXIT. ENDIF.
      " create a node iterator
      iterator  = node->create_iterator( ).
      " get current node
      node = iterator->get_next( ).
      " loop over all nodes
      WHILE NOT node IS INITIAL.
        indent = node->get_height( ) * 2.
        indent = indent + 20.
        CASE node->get_type( ).
          WHEN if_ixml_node=>co_node_element.
            " element node
            name    = node->get_name( ).
            nodemap = node->get_attributes( ).
            gw_elem_value-element = name.
            IF NOT nodemap IS INITIAL.
              " attributes
              count = nodemap->get_length( ).
              DO count TIMES.
                index  = sy-index - 1.
                attr   = nodemap->get_item( index ).
                name   = attr->get_name( ).
                prefix = attr->get_namespace_prefix( ).
                value  = attr->get_value( ).
              ENDDO.
            ENDIF.
          WHEN if_ixml_node=>co_node_text OR
               if_ixml_node=>co_node_cdata_section.
            " text node
            value  = node->get_value( ).
            TRANSLATE value TO UPPER CASE.
            gw_elem_value-value = value.
            IF gw_elem_value-element = 'table_name'.
              gv_id = gv_id + 1.
            ENDIF.
            gw_elem_value-recordid = gv_id.
            APPEND gw_elem_value TO gi_elem_value.
            CLEAR gw_elem_value.
        ENDCASE.
        " advance to next node
        node = iterator->get_next( ).
      ENDWHILE.
    ENDFORM.                    "process_dom

  • IMP-00069: Could not convert to environment national character set's handle

    While importing database objects from dmp we are getting the following Error
    C:\>imp chem/chem@chemdb full=y file='E:\eiproject\expdat.dmp' log=y;
    Import: Release 8.1.5.0.0 - Production on Thu Sep 13 10:28:54 2001
    (c) Copyright 1999 Oracle Corporation. All rights reserved.
    Connected to: Oracle8i Enterprise Edition Release 8.1.5.0.0 - Production
    With the Partitioning and Java options
    PL/SQL Release 8.1.5.0.0 - Production
    Export file created by EXPORT:V08.01.07 via conventional path
    import done in WE8ISO8859P1 character set and WE8ISO8859P1 NCHAR character set
    IMP-00069: Could not convert to environment national character set's handle
    IMP-00000: Import terminated unsuccessfully

    Hi James,
    IMP-69 can occur if you try to use an EARLIER version of IMPORT against an export (.dmp) file produced by a LATER version of EXPORT.
    How about trying this:
    Use the 8.1.5 EXPORT utility from Win2K to connect to your Solaris 8.1.7 database; then use the 8.1.5 IMPORT utility to import the file into the 8.1.5 W2K database.
    Nat

  • Prevent XI from converting &#163; to pound character

    The requirement is as follows: the input XML file contains &#163;, and by default the output XML file has that group of characters converted to the £ sign.
    How do I retain &#163; in the output XML file without writing an adapter module?
    Points will be awarded.
    Edited by: gvs kiran on Nov 7, 2008 6:46 AM

    I do not see a chance. It seems that all &#xxx; references are replaced by their character equivalents.
    You have to replace the £ again, using a Java mapping or an adapter module, after the graphical mapping.
    Regards
    Stefan

  • How 2 convert numeric value to character value?

    Hi friends,
    I want to convert numeric value to the character value.
    Is there any FM available?
    Points rewarded soon.
    Regards
    Ronn

    REPORT ZSPELL.
    TABLES SPELL.
    DATA : T_SPELL LIKE SPELL OCCURS 0 WITH HEADER LINE.
    DATA : PAMOUNT LIKE SPELL-NUMBER  VALUE '1234510'.
    SY-TITLE = 'SPELLING NUMBER'.
    PERFORM SPELL_AMOUNT USING PAMOUNT 'USD'.
    WRITE: 'NUMBERS', T_SPELL-WORD, 'DECIMALS ', T_SPELL-DECWORD.
    FORM SPELL_AMOUNT USING PWRBTR PWAERS.
      CALL FUNCTION 'SPELL_AMOUNT'
           EXPORTING
                AMOUNT    = PAMOUNT
                CURRENCY  = PWAERS
                FILLER    = SPACE
                LANGUAGE  = 'E'
           IMPORTING
                IN_WORDS  = T_SPELL
           EXCEPTIONS
                NOT_FOUND = 1
                TOO_LARGE = 2
                OTHERS    = 3.
    ENDFORM.                               " SPELL_AMOUNT

  • Converting Japanese two Byte Character...

    Hi,
    I am doing a Scenario outbound from R/3.
    I am triggering the message via proxy using japanese language and sending to XI.
    In XI, we are getting the Mapping Error.
    Some records in the message contains single byte characters and some records having double byte characters.
    For single byte characters, XI is able to generate the target structure in the Mapping. But it is not able to convert the double byte characters to the target structure.
    Can any one help me to resolve this issue....
    Thanks in advance...
    Regards,
    Vasu

    Hi,
    Japanese data are Shift JIS encoded ? Maybe changing the encoding (or encoding declaration) could help ?
    Chris

  • Is it possible to convert � to a hex character?

    I'm looking for a character that represents the hex value of �
    I've been looking through the internet but haven't found any encouraging leads or clues. I'm thinking it may not even be possible.
    Any help will be appreciated.
    Thanks in advance

    At http://www.unicode.org you should be able to download tables. Alternatively you could write the character to a file and read it in as bytes and display the bytes in hex.
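
The second suggestion (look at the character's bytes in hex) can be sketched without a temporary file. This is only an illustration: the character in the question did not survive, so U+00E0 is used as a stand-in, UTF-8 is an assumed encoding, and the class and method names are invented.

```java
import java.nio.charset.StandardCharsets;

class CharHex {

    /** Returns the Unicode code point of the first character, e.g. "U+00E0". */
    static String codePointHex(String s) {
        return String.format("U+%04X", s.codePointAt(0));
    }

    /** Returns the UTF-8 bytes of the string as space-separated hex pairs. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes do not print as FFFFFFxx.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u00e0"; // stand-in character, an assumption
        System.out.println(codePointHex(s));
        System.out.println(utf8Hex(s));
    }
}
```

The code point and the encoded bytes are different hex values for anything outside ASCII, so it is worth deciding which of the two is actually wanted.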

  • How to convert ascii value into character and vice versa

    Hello the java world people,
    I want to convert each character in my array into its corresponding ASCII value and vice versa. How can I do that?

    The term "ASCII" is often used very loosely.
    Java char values are Unicode, and the ASCII codes are identical to the Unicode characters in the range 0 .. 127. Unicode values 128 and above don't have corresponding ASCII values, though 128-255 correspond to ISO-8859-1, which is one of the encodings often called "extended ASCII".
    As shown above, you can convert between chars and their corresponding int values simply with a cast, but you should be aware that the more exotic characters will not give you sensible values.
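
A minimal sketch of the cast in both directions, applied to a whole array. The class and method names are invented for illustration, and the isAscii check simply encodes the 0 .. 127 range described above.

```java
import java.util.Arrays;

class AsciiCodes {

    /** Converts each char to its numeric (Unicode, and for 0..127 also ASCII) value. */
    static int[] toCodes(char[] chars) {
        int[] codes = new int[chars.length];
        for (int i = 0; i < chars.length; i++) {
            codes[i] = chars[i]; // char widens to int implicitly
        }
        return codes;
    }

    /** Converts numeric values back to chars; an explicit narrowing cast is required. */
    static char[] toChars(int[] codes) {
        char[] chars = new char[codes.length];
        for (int i = 0; i < codes.length; i++) {
            chars[i] = (char) codes[i];
        }
        return chars;
    }

    /** True only for values that really are ASCII. */
    static boolean isAscii(char c) {
        return c <= 127;
    }

    public static void main(String[] args) {
        int[] codes = toCodes("Hi!".toCharArray());
        System.out.println(Arrays.toString(codes));
        System.out.println(new String(toChars(codes)));
    }
}
```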

  • Converting local character styles (overrides) to proper styles

    Hi all,
    I've been handed a document that has no styles defined at all. All styling is done with local character styles ("overrides").
    This tutorial http://www.lynda.com/InDesign-tutorials/160-Convert-local-formatting-character-styles/85324/192449-4.html shows how to do a preflight that detects all of the overrides, which is great, but you still have to fix them manually.
    The fastest way I've found so far is to do a search for a particular font-size/font-type, and then apply a proper character style to each match, until the overrides are all gone. But maybe there's a faster way? E.g. is it possible to "select all matching text" (as you can do in Google Docs), that would select all text matching that font-size/type, and then with one click apply a style to all selected text?
    Or maybe there's another way to convert all text matching e.g. font-size/font-type into a character style?
    Thanks!
    Bjoern

    The auto-char styles script tries to be semi-intelligent: it ignores the underlying paragraph style completely. Instead, for each paragraph, it checks which is the longest continuous run of formatting in that paragraph (textStyleRange). That formatting is considered the "underlying formatting of the paragraph", and any other formatting in that paragraph is considered an override, and has an appropriate character style created (if necessary) and applied to it.
    The only exception to that is "italics", which is always considered an override, even if there is more italic than regular text in the paragraph (something that can happen in a bibliography entry, for instance).
    So, for instance, with the formatting you give as an example ("No style, + Arial Bold + size: 24 + Leading 28"), where everything but the final paragraph return has that applied, Auto-Char Styles will not apply a character style to most of the text, and apply an appropriate char style to the final enter only.
    That way, "all" that remains is to go through the document creating and applying paragraph styles as necessary, knowing that all local overrides have been styled with an appropriate char style.

  • How to convert character streams to byte streams?

    Hi,
    I know InputStreamReader can convert byte streams to character streams. But how do I convert character streams back to byte streams? Is there a Java class for that?
    Thanks in advance.

    When do you have to do this? There's probably another way. If you just start out using only InputStreams you shouldn't have that problem.
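
For cases where the conversion really is needed, java.io.OutputStreamWriter is the counterpart of InputStreamReader: it encodes chars into bytes using a given charset. A minimal sketch, assuming the characters come from some Reader and the bytes are collected in memory; the class and method names are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsToBytes {

    /** Reads all chars from the reader and returns them encoded in the given charset. */
    static byte[] encode(Reader reader, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the writer flushes any chars still buffered in the encoder.
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] out = encode(new StringReader("caf\u00e9"), StandardCharsets.UTF_8);
        System.out.println(out.length + " bytes");
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding, which is a common source of mismatches between reader and writer.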

  • Approach to converting database character set from Western European to Unicode

    Hi All,
    EBS:12.2.4 upgraded
    O/S: Red Hat Linux
    I am looking for the below information. If anyone could help provide would be great!
    INFORMATION NEEDED: Approach to converting database character set from Western European to Unicode for source systems with large data exceptions
    DETAIL: We are looking to convert Oracle EBS database character set from Western European to Unicode to support Kanji characters. Our scan results show
    both “lossy (110K approx.)” and “truncation (26K approx.)” exceptions in the database which needs to be fixed before the database is converted to Unicode.
    Oracle Support has suggested to fix all open and closed transactions in the source Production instance using forms and scripts.
    We're looking for information or creative approaches from anyone who has performed similar exercises without having to manipulate data in the source instance.
    Any help in this regard would be greatly appreciated!
    Thanks for your time!
    Regards,

    There are two aspects here:
    1. Why do you have such large number of lossy characters? Is this data coming from some very old eBS release, i.e. from before the times of the Java applet interface to Oracle Forms?  Have you analyzed the nature of this lossy data?
    2. There is no easy way around truncation issues as you cannot modify eBS metadata (make columns wider). You must shorten or remove the data manually through the documented eBS interfaces. eBS does not support direct manipulation of data in the database due to complex consistency rules enforced by the application itself (e.g. forms).
    Thanks,
    Sergiusz

  • Import error converting character set

    Hi there, the first time i ever post something, so i hope i give enough information so my question can be answered.
    I have an oracle 8.1.6 database on Windows NT4. On the same machine i have installed Designer 6.
    Everything is working just fine i just can't import a dump file made from repository 6. It's not a full dump, so i first created a whole new repository.
    Now, when i want to import the dump file in command prompt, using the imp.exe (present in C:\<oracle_home>\bin), the next error occurs:
    IMP-00069: Could not convert to environment national character set's handle
    IMP-00000: Import terminated unsuccessfully.
    As the import is due to start, it gives the following message:
    import done in WE8ISO8859P1 character set and WE8ISO8859P1 NCHAR character set.
    Then the error occurs.
    Now i wonder if this can be solved. I cannot see in the dump file what character set was used there. I did not make the export myself. Is there a way that i can convert the character set in the dumpfile or something? Or should i solve this problem by making changes in the database?
    Thx for your help, i am trying to solve this problem myself for over a week now and i'm getting tired of it.

    Hi friend,
    I have faced the same problem, and I found that it is due only to a version mismatch: you are attempting to import data from a higher version into a lower one.
    Just check that at your end as well.
    Bye
    Tehzeeb Ahmed

  • Convert chinese character to unicode

    Hello, I have a problem.
    How can I find the Unicode value of a Chinese character in a file?
    For example, the character '称' in a file a.txt.
    How can I read a.txt and then get the Unicode value of that character?
    I really need help.

    > I want to know what the algorithm and the code
    > of that tool are.
    > Thanks

    You might consider downloading the open source project and looking directly at the code for that tool. I imagine that you could create something similar in...oh, maybe 25 lines of code.
    retrieve the command line args
    open the specified file using the charset encoding specified
    for (all characters in the file) {
      if character is > 0xFF convert to \uXXXX
      output character to new file
    }
    John O'Conner
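
The outline above might look roughly like this in Java. It is a sketch under assumptions, not the code of the tool being discussed: the class and method names are invented, the whole file is read into memory, and the 0xFF threshold is taken from the outline (use 0x7F instead if the output must be pure ASCII).

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

class UnicodeEscape {

    /** Replaces every char above 0xFF with a backslash-u escape of four hex digits. */
    static String escape(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c > 0xFF) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Decodes the file with the given charset, then escapes its contents. */
    static String escapeFile(Path file, Charset charset) throws IOException {
        return escape(new String(Files.readAllBytes(file), charset));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: first argument is the file, second its encoding name.
        if (args.length == 2) {
            System.out.println(escapeFile(Path.of(args[0]), Charset.forName(args[1])));
        }
    }
}
```

The charset passed in must match how the file was actually saved; otherwise the decoded characters, and therefore the printed code values, will be wrong. Characters outside the Basic Multilingual Plane come out as two escapes, one per surrogate.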

  • Convert Character to ASCII Number

    Hello.
    I would greatly appreciate it if someone could tell me how to convert a number or character to its ASCII decimal representation. For example, if I have the character 9, I would like to save its ASCII code 57 (hex 39) instead.
    Thank YOU!!

    By the way, getNumericValue doesn't return the unicode value of the character but the numeric value - for instance it returns the int 9 for the character '9' and the int 15 for the character 'F' (since 15 is F in hex).
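
The difference between the character code and the numeric value can be seen side by side in a short sketch. The class and helper names are invented for illustration; Character.getNumericValue is the standard library method mentioned above.

```java
class DigitCodes {

    /** The character's code (its ASCII value for chars in 0..127). */
    static int codeOf(char c) {
        return c; // char widens to int
    }

    /** The char for a single decimal digit 0..9. */
    static char digitChar(int digit) {
        if (digit < 0 || digit > 9) {
            throw new IllegalArgumentException("not a single digit: " + digit);
        }
        return (char) ('0' + digit);
    }

    public static void main(String[] args) {
        char nine = '9';
        System.out.println(codeOf(nine));                     // character code
        System.out.println(Integer.toHexString(codeOf(nine))); // same code in hex
        System.out.println(Character.getNumericValue(nine));  // numeric value
        System.out.println(codeOf(digitChar(9)));             // code for the number 9
    }
}
```

If the starting point is the number 9 rather than the character '9', it first has to be turned into a character, which is what the digitChar helper does.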
