Converting non-English characters to their unicode representation

I have a series of files/templates, each containing text in a locale-specific language such as Chinese, Japanese or German. How do I get their Unicode representations so I can send them as HTML-formatted email?
I can already send the English template as HTML-formatted email without a problem. I was able to find a sample of the Unicode representation of Japanese and send that as a test. But how do I take the templates that I have and convert their contents into Unicode?
Thanks in advance.
Please disregard. I figured it out.
chehrehk

You need to know what character encoding was used for the template text.
For example, you could have Japanese text encoded using UTF-8 or
encoded using ISO-2022-JP and the same Japanese characters would
be represented as a different sequence of bytes. Without knowing which
charset was used, you won't be able to convert the byte sequence back
into Unicode characters (e.g., to store in a Java String).
If you do know which charset was used, java.io.Reader will convert the
byte stream into Unicode characters.
If the charset information is not available, there are heuristics that you
can use to try to guess the correct charset, but by their nature they're
going to be wrong sometimes.
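
As a minimal sketch of the advice above, assuming the template's charset is already known: the bytes are decoded through an InputStreamReader, and the resulting String is then written out with numeric character references so it can go into an HTML email body. The class name TemplateReader and the helper toHtmlEntities are hypothetical names made up for this illustration, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TemplateReader {
    // Decodes the whole stream with the given charset into a Java String.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Hypothetical helper: keeps ASCII as-is and writes every other
    // code point as a decimal numeric character reference (&#NNNN;).
    static String toHtmlEntities(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                sb.append((char) cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a template file: two Japanese characters stored as UTF-8 bytes.
        byte[] fileBytes = "\u65E5\u672C".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8);
        System.out.println(toHtmlEntities(text));
    }
}
```

For a real template, the ByteArrayInputStream would be replaced by a stream opened on the file, and the charset by whatever the template was actually saved in. HTML metacharacters such as & and < are not escaped by this sketch.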

Similar Messages

  • Problem in converting Spool to PDF file, having non-English characters

    Hi All,
            I have a problem converting a spool to PDF format.
    Scenario: I have a spool which has non-English characters. I am using the CONVERT_ABAPSPOOLJOB_2_PDF FM to perform the conversion, but my output has junk values (i.e. #) for non-English characters. Any pointers to solve this issue will be appreciated.
    I even tried with report RSTXPDFT4; it also gives me the same junk characters.
    Regards,
    Navin.

  • Can username, password have unicode(non english) characters

    Does Oracle allow a username, password to have non english characters.

    I found the answer to my own question. In essence it looks like the answer is yes for the user name and a little confusing for the password. The password has to be in a single-byte character set; it is not clear whether 7-bit or 8-bit. 8-bit, from my understanding, is ASCII plus some Western European characters.
    Following is from Oracle 10G Database SQL reference
    user
    Specify the name of the user to be created. This name can contain only characters from your database character set and must follow the rules described in the section "Schema Object Naming Rules". Oracle recommends that the user name contain at least one single-byte character regardless of whether the database character set also contains multibyte characters.
    Note:
    Oracle recommends that user names and passwords be encoded in ASCII or EBCDIC characters only, depending on your platform.
    BY password
    The BY password clause lets you create a local user and indicates that the user must specify password to log on to the database. Passwords can contain only single-byte characters from your database character set regardless of whether the character set also contains multibyte characters.
    Passwords must follow the rules described in the section "Schema Object Naming Rules", unless you are using the Oracle Database password complexity verification routine. That routine requires a more complex combination of characters than the normal naming rules permit.

  • Encrypting non-English characters

    Hi,
    I have this application which has to do the following
    Scenario (i)
    - Read ENCODED string SE from Network Source NS1 (Native,Non-JAVA)
    - Decode SE to SD using the same charset as NS1
    - Apply some transformation to SD to get SD2
    - Encode SD2 to get SE2 using the same charset as Network Source NS2
    (Native, Non-JAVA)
    - Send SE2 to NS2
    - NS2 gets what it expects without any problems :))
    Scenario (ii)
    - Read ENCODED string SE from Network Source NS1 (Native,Non-JAVA)
    - Decode SE to SD using the same charset as NS1
    - Apply some transformation to SD to get SD2
    - Get the bytes from SD2 as BSD2
    - Encode BSD2 to get BSE2 using the same charset as Network Source NS2
    (Native, Non-JAVA)
    - Encrypt BSE2 to BSE2_Enc
    - Send BSE2_Enc to NS2 (Native,Non-JAVA)
    - NS2 does not get what it expects :((
    (It receives English text OK but it gets ???? for non-English)
    The charset being used is windows-1256 (at NS1,NS2 and my application)
    Encryption is being done using BouncyCastle TwoFish w/ 256 bit keys
    Reading/Writing from/to network is being done over SocketChannel
    Get the bytes from SD2 as BSD2 =>
    byte[] BSD2 = SD2.getBytes();
    It seems the non-English characters are getting lost when I call SD2.getBytes()
    and they get encrypted as 'lost-non-English characters' ;)
    And when they get decrypted at NS2, they are displayed as 'lost-non-English
    characters' :)) i.e. ??????? .. and so on
    Is there a way I can encrypt non-English plain text without losing information ?
    (without having to implement a TwoFish engine in my application itself)

    1) Bytes are not characters. Characters are Unicode
    and have a byte representation defined by an encoding
    scheme. It is usually wrong to use the default
    encoding given by String.getBytes(). One should really
    use String.getBytes(encoding), e.g.
    "fred".getBytes("UTF-8");
    Alright ... got that :)) Thanks buddy

    2) Not having access to your code makes it difficult
    but make sure you are not converting encrypted bytes
    to a String using new String(encrypted bytes);
    No .. I am not doing that.

    3) Again, not having access to your code makes it
    difficult, but when you display your Strings make
    sure that you use a Font that has representations for
    all the Unicode characters you wish to display. It is
    normal for any character that does not have a valid
    glyph in a given font to display as a box.
    That infrastructure exists and is working fine ... as I mentioned,
    this is working OK when plain text is being used.
    The problem was with using getBytes() rather than getBytes("windows-1256").
    It's working now ... thanks a lot .. again. I wonder how that never occurred to me.
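
The fix described above can be sketched as follows. This assumes the JRE ships the windows-1256 charset (full JDK builds normally do); the class name ExplicitCharset and the sample Arabic string are made up for the illustration. US-ASCII stands in for "a default encoding that cannot represent the text", which is what produces the ???? seen at NS2.

```java
import java.nio.charset.Charset;

class ExplicitCharset {
    // Encodes text to bytes with an explicitly named charset.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // Decodes bytes back to text with an explicitly named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627";

        // Encoding with the charset both endpoints agreed on keeps every character.
        byte[] good = encode(arabic, "windows-1256");
        System.out.println(decode(good, "windows-1256").equals(arabic));

        // A charset that cannot represent the text silently substitutes '?'.
        byte[] lossy = encode(arabic, "US-ASCII");
        System.out.println(decode(lossy, "US-ASCII"));
    }
}
```

The byte array from the explicit encode call is what should be handed to the cipher; the ciphertext itself should stay a byte array and never pass through a String.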

  • URLEneQuery encoding is failing for some non english characters

    While creating a URLEneQuery we are getting error com.endeca.navigation.InternalException: No support for 8-bit urls.
    This error happens when the query string has some non english characters. (eg: Á).

    UrlENEQuery is designed around processing URL data, and URLs are not permitted non-ASCII characters in their production. To represent non-ASCII characters they must be %-encoded in URLs according to their byte(s) representation in a particular character-encoding, and you should prefer UTF-8 for URLs. So your LATIN CAPITAL LETTER A WITH ACUTE (U+00C1) should appear as %C3%81 in your URL, then UrlENEQuery should be able to process that character.
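
A small sketch of the %-encoding step, using the standard java.net.URLEncoder with UTF-8. The class and method names are hypothetical, and nothing here calls the Endeca API itself; the encoded value would simply be placed in the query string before it is handed to UrlENEQuery.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PercentEncode {
    // Percent-encodes a query-string value using its UTF-8 bytes.
    static String encodeQueryValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present in every Java runtime.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // U+00C1 LATIN CAPITAL LETTER A WITH ACUTE
        System.out.println(encodeQueryValue("\u00C1")); // prints %C3%81
    }
}
```

Note that URLEncoder produces form-style encoding, so a space becomes "+" rather than "%20"; only individual parameter values should be passed through it, not a whole URL.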

  • Non-English characters

    Hello, I have read several times that since Java uses Unicode, it solves the problems of non-English characters automatically or something like that.
    But my app is not working as expected. Would someone help please?
    I have a client/server combo written in Java. The server can send messages in English or Japanese. The Japanese messages are hard-coded as String literals in the server source code. On the client side, they are displayed on a JEditorPane. But the Japanese characters are all garbled. The OS on the server side and client side are, of course, different.
    My supposition, which is obviously wrong as it is not working, is that since both ends of communication are Java app, I need not worry about any encoding conversions for String literals.
    Can someone suggest what is wrong here?

    How is the required encoding/decoding supposed to be done?
    When I didn't worry about non-English characters, I did the following, which WORKED.
    // SENDER side
    Socket socket ;
    PrintWriter     out = new PrintWriter(socket.getOutputStream(),true);
    String outMessage = "my message";
    out.println(outMessage);
    // RECEIVER side
    Socket socket ;
    BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
    String inMessage =  in.readLine();
    When non-English characters are involved, I did the following, which DID NOT WORK. Please someone correct me.
    // SENDER side
    Socket socket ;
    PrintWriter     out = new PrintWriter(socket.getOutputStream(),true);
    String outMessage = "my message";
    String utfString = new String(outMessage.getBytes(),"UTF-8");
    out.println(utfString);
    // RECEIVER side
    Socket socket ;
    InputStreamReader ins = new InputStreamReader(clientSocket.getInputStream(),"UTF-8");
    BufferedReader in = new BufferedReader(ins);
    String inMessage =  in.readLine();
    The received message is still garbled.
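
One possible correction, offered as a sketch rather than a confirmed fix for this poster's setup: new String(outMessage.getBytes(), "UTF-8") re-decodes platform-default bytes as UTF-8 and can corrupt the text, and the plain PrintWriter on the socket still encodes with the sender's default charset. Wrapping both streams with an explicit UTF-8 writer and reader avoids both problems. The class Utf8Lines and its methods are hypothetical names, and in-memory streams stand in for the socket streams so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8Lines {
    // Sender side: characters are encoded as UTF-8 on their way to the stream.
    static PrintWriter utf8Writer(OutputStream out) {
        return new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8), true);
    }

    // Receiver side: bytes are decoded as UTF-8 on their way back to characters.
    static BufferedReader utf8Reader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Sends one line through an in-memory "wire" and reads it back.
    static String roundTrip(String message) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (PrintWriter out = utf8Writer(wire)) {
            out.println(message); // no getBytes()/new String() juggling needed
        }
        try (BufferedReader in = utf8Reader(new ByteArrayInputStream(wire.toByteArray()))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String japanese = "\u3053\u3093\u306B\u3061\u306F";
        System.out.println(roundTrip(japanese).equals(japanese));
    }
}
```

With real sockets, utf8Writer would wrap socket.getOutputStream() and utf8Reader would wrap socket.getInputStream(). If the text is still garbled after that, the hard-coded literals may have been compiled with the wrong source encoding, which is a separate issue from the wire encoding.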

  • Non english characters in DN cannot be retrieved

    We are using Netscape Directory Server 4, protocol V3. We have a problem related to non-English characters appearing in the RDN.
    We publish entries to LDAP using values from the database. For example, we have published an entry to LDAP which, based on the DB values, should have a DN like: ou=Liege BELGIUM ... LGG1a, <other components of DN>. However, when we call the Netscape search API (searching against the uid attribute, which does not have non-English characters), the search returns the entry, but when we then call the getDN() method on the returned LDAP entry, it only returns Li instead of the complete DN value.
    It seems the entry is corrupted in LDAP. I wanted to delete the corrupted entry and create a new one to test. I tried many ways, but none of them worked. I think it is because the DN is corrupted, so there is no key value to identify the LDAP entry for any operation (modify, delete).
    Your help and insights are much appreciated.
    Thanks.
    Han Shen

    LDAP uses the UTF8 encoding. You must store data in the directory using the UTF8 encoding. This includes DN values. This also means that if you want to be able to view the values in your native character set and font, you must use an application that can convert the UTF8 LDAP data back to the native character encoding. The directory console by default should work for LATIN-1 (ISO 8859) languages if the LOCALE is set correctly.

  • Only VBA does not recognize non-English characters

    Hello guys,
    I have a new laptop with Windows 8.1 bought in the USA and I'm having difficulties with Excel VBA (Office 365 University 64-bit, bought in the Czech Republic, Central Europe). VBA does not recognize non-English characters (particularly "ř" and "ů"), which causes problems when running some code that I wrote earlier on my previous laptop (Windows 7, bought in the Czech Republic, with the same Office).
    The problem with non-English characters has occurred only in VBA so far; otherwise I can use these characters normally in Excel cells, Word... I tried installing both the English and Czech versions of Office with no change. I also installed the Czech proofing tools and set everything to Czech in Office. The location and language preferences in Windows are also set to Czech. And it is not a font problem. I also noticed that when I tried to look up these characters using Ctrl+F, it changes the original ř to r after a search, and again this is only an issue in VBA.
    Thank you very much for any help.
    Tom

    Hi Tom,
    VBA for Excel can only recognize character codes from 0 to 255; if you use other special characters like "ř" or "ů", it returns 63 (?) to you. To use this kind of character, you have to use the ChrW function to convert a decimal code to the character.
    http://msdn.microsoft.com/en-us/library/ee177465.aspx
    For example, the hex and decimal codes for these two characters are as below:
      Hex   Dec
    ř 0159  345
    ů 016F  367
    So to get these two characters in VBA, you could code as below:
    ChrW(&H159) or ChrW(345)
    ChrW(&H16F) or ChrW(367)
    You can get the hex code of a character by searching in the system Character Map (in the Win8.1 Start view, search "character map"), then convert the hex code to decimal yourself.
    Range("A1").Value = ChrW(&H159) & ChrW(&H16F)
    Range("A1").Value = ChrW(345) & ChrW(367)

  • VBA does not recognize non-English characters

    Hello guys,
    I have a new laptop with Windows 8.1 bought in the USA and I'm having difficulties with Excel VBA (Office 365 University 64-bit, bought in the Czech Republic, Central Europe). VBA does not recognize non-English characters (particularly "ř" and "ů"), which causes problems when running some code that I wrote earlier on my previous laptop (Windows 7, bought in the Czech Republic, with the same Office).
    The problem with non-English characters has occurred only in VBA so far; otherwise I can use these characters normally in Excel cells, Word... I tried installing both the English and Czech versions of Office with no change. I also installed the Czech proofing tools and set everything to Czech in Office. The location and language preferences in Windows are also set to Czech. And it is not a font problem. I also noticed that when I tried to look up these characters using Ctrl+F, it changes the original ř to r after a search, and again this is only an issue in VBA.
    Thank you very much for any help.
    Tom

    Hi Tom,
    I would suggest you post the question in the forum of
    Excel for Developers as your query is directly related to VBA. Here we mainly support Office client side issues:
    https://social.msdn.microsoft.com/Forums/office/en-US/home?forum=exceldev
    The reason why we recommend posting appropriately is you will get the most qualified pool of respondents, and other partners who read the forums regularly can either share their knowledge or learn from your interaction with us. Thank you for your understanding.
    Regards,
    Ethan Hua
    TechNet Community Support

  • Why Rpad does not work in non-English characters...????

    Hi,
    I have the classic emp table in user scott....
    When I insert another row containing non-English characters, the rpad function does not work......
    SQL> select ename , rpad(ename,12,'.') from emp;
    ENAME                          RPAD(ENAME,12,'.')
    SMITH                          SMITH.......
    ALLEN                          ALLEN.......
    WARD                           WARD........
    JONES                          JONES.......
    MARTIN                         MARTIN......
    BLAKE                          BLAKE.......
    CLARK                          CLARK.......
    KING                           KING........
    TURNER                         TURNER......
    JAMES                          JAMES.......
    FORD                           FORD........
    MILLER                         MILLER......
    SCOTT                          SCOTT.......
    ADAMS                          ADAMS.......
    ΠΑΝΑΓΙΩΤΟΥ                     ΠΑΝΑΓΙ
    When I convert it to English characters then it works....
    PANAGIOTOU                     PANAGIOTOU..
    How can i make it to work....????
    I use 10g v.2
    Many thanks...
    Sim

    Hi ,
    SQL> select length('ΠΑΝΑΓΙΩΤΟΥ') from dual;
    LENGTH('ΠΑΝΑΓΙΩΤΟΥ')
                      10
    SQL> select vsize('ΠΑΝΑΓΙΩΤΟΥ') from dual;
    VSIZE('ΠΑΝΑΓΙΩΤΟΥ')
                     20
    When I issue the command
    SQL> select ename , rpad(ename,25,'.') from emp;
    ENAME                          RPAD(ENAME,25,'.')
    SMITH                          SMITH....................
    ALLEN                          ALLEN....................
    WARD                           WARD.....................
    JONES                          JONES....................
    MARTIN                         MARTIN...................
    BLAKE                          BLAKE....................
    CLARK                          CLARK....................
    KING                           KING.....................
    TURNER                         TURNER...................
    JAMES                          JAMES....................
    FORD                           FORD.....................
    MILLER                         MILLER...................
    SCOTT                          SCOTT....................
    ADAMS                          ADAMS....................
    ΠΑΝΑΓΙΩΤΟΥ                     ΠΑΝΑΓΙΩΤΟΥ.....
    It worked.... setting as 25 characters for padding.....
    Thanks....for the useful remark
    Sim
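
The LENGTH versus VSIZE difference in the reply above (10 characters, 20 bytes) can be reproduced outside the database. This is only an illustration, assuming the stored bytes are UTF-8; the class CharsVsBytes is a made-up name and no Oracle code is involved.

```java
import java.nio.charset.StandardCharsets;

class CharsVsBytes {
    // Number of Unicode characters (code points), comparable to LENGTH().
    static int charCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes when stored as UTF-8, comparable to VSIZE() under the
    // assumption that the database character set is UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The Greek name from the thread, written with escapes so the
        // source file encoding does not matter.
        String greek = "\u03A0\u0391\u039D\u0391\u0393\u0399\u03A9\u03A4\u039F\u03A5";
        System.out.println(charCount(greek) + " characters, " + utf8ByteCount(greek) + " bytes");
        System.out.println(charCount("PANAGIOTOU") + " characters, " + utf8ByteCount("PANAGIOTOU") + " bytes");
    }
}
```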

  • Non English characters conversion issue in LSMW BAPI Inbound IDOCs

    Hi Experts,
    We have some fields in the customer master LSMW data load program which can
    contain non-English characters. We are facing issues in the LSMW BAPI
    method with the conversion of non-English characters. The LSMW read and
    conversion steps show the non-English characters properly, without any
    issue. While creating inbound IDocs, most of the non-English characters are
    replaced with '#', and this is causing issues in creating customer master data in
    the system. In our scenario the customer data has non-English characters in
    the first name, last name and address details. Is there any specific setting
    that needs to be done on our side? Please suggest how to resolve this issue.
    Thanks
    Rajesh Yadla

    If your language requires Unicode, then you need to change the encoding option in SAP to Unicode: on the initial screen, choose Customize local layout (ALT F12), options 118  --> Encoding ....

  • [AS] Problem with non English characters in file path

    I wrote a script that exports a pdf file from ID, rasterizes it in PS, applies an action, saves it as another pdf file, and finally creates a Mail message, and attaches the file to it (the last part is written in AppleScript).
    The problem is that it doesn't work when the path to this file contains non English characters.
    This works:
    make new attachment with properties {file name:"/Volumes/Macintosh HD/BackUp Tetard/Test.pdf"}
    but this doesn't:
    make new attachment with properties {file name:"/Volumes/Macintosh HD/BackUp Têtard /Test.pdf"}
    I remember vaguely that I read somewhere that AppleScript can work with Unicode — in other words with such characters — starting from some version, don't remember which exactly, but it seems to me — Leopard.
    I am on Mac OS X 10.4.11 right now. Will updating solve this problem? Does anybody know any solution to this problem: a scripting addition, some hidden setting, etc.
    I made a little test: used a Russian character — ё and it works, but when I use — ê (Dutch) it doesn't. May it have something to do with the Region setting in International panel?
    Thanks in advance,
    Kasyan

    Kasyan, as of Leopard AppleScript treats all text as Unicode; before that you can specify 'as Unicode text'. Try a test with these.
    -- Leopard
    set x to POSIX path of (path to desktop)
    -- Pre Leopard
    set x to POSIX path of (path to desktop as Unicode text)
    -- Leopard
    set x to POSIX path of (choose file without invisibles)
    -- Pre Leopard
    set x to POSIX path of ((choose file without invisibles) as Unicode text)

  • Handling Non-English characters in the payload

    Hi Gurus,
    We are currently facing an issue in XI with Non-English characters. When we are trying to process Non-English characters in the payload , they are getting converted in to Junk characters at the receiver end. This scenario inbound to SAP ( File to Idoc scenario ) . Is there a way that Non-English characters can be handled in XI.
    Regards,
    Nick

    Hi Nick,
    I know of some problems with showing certain Chinese characters when
    displaying XML messages. If that is the case, check these notes:
    #1135671 - ITS HTML viewer: incorrect display of xml documents
    #1072127 - ITS HTML Control: xml file rendering problem
    With regards,
    Caio Cagnani

  • Wrong sorting of non-English characters (iTunes 11.2)

    Since version 11.2 of iTunes, there is a problem with sorting of names with non-English characters (such as ž, š, č, ř or other in Czech). It seems that iTunes completely omits these characters when sorting. As an example, I have Čajkovskij (which is the Czech spelling of Tchaikovsky) among artists in my library and it is currently sorted between Adele and Alan Parsons Project, i.e. it seems that the character Č has been completely omitted while sorting and thus the artist is sorted as Ajkovskij.
    All my ID3 tags are of version 2.3 and in Unicode, thus this should not be any encoding problem. I even tried to change the ID3 tags to version 2.4, i.e. UTF-8, and it did not help.
    I think that there must be many more people experiencing this problem. Not everyone has only English or ASCII names in the library and I am surprised that I could not find anything on this bug anywhere on the net yet.
    Thanks for any help on this.

    Looks like a bug in the latest build. See this recent thread.
    You could report the problem via iTunes Feedback or sign up for a free Apple Developer Connection account and make use of Apple Bug Reporter.
    tt2

  • Entry of non-English characters into the db

    Hi
    We are facing a problem inserting non-English characters into the database. For example, we have a company name field which can accept German characters. This field has been defined as varchar2 of size 50 in the db. When we enter 49 English characters and then one German character, the database throws the error that the inserted value is too large for the column. Is it that the German character is taken as equivalent to two English characters? Or is there any database-level setting that can be done for this? For the time being we have identified certain critical fields and have doubled the size of their fields in the db. But I guess there has to be another solution to this....
    Please help.

    Indeed, your German character is using two bytes to store itself. Consult the Oracle JDBC Developer's Guide.
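
To make the two-byte point concrete, here is a minimal sketch of a client-side check that a value fits a column whose limit is counted in bytes. It assumes the database stores text as UTF-8; the class ByteLimit and the method fitsInBytes are hypothetical names for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteLimit {
    // Returns true if the encoded form of the value fits in maxBytes.
    static boolean fitsInBytes(String value, int maxBytes, Charset charset) {
        return value.getBytes(charset).length <= maxBytes;
    }

    public static void main(String[] args) {
        // 49 ASCII letters followed by one German character (u with umlaut, U+00FC).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 49; i++) {
            sb.append('a');
        }
        String name = sb.append('\u00FC').toString();

        // 50 characters, but 51 bytes in UTF-8, so a 50-byte column rejects it.
        System.out.println(fitsInBytes(name, 50, StandardCharsets.UTF_8));
        // In a single-byte charset such as ISO-8859-1 the same value is 50 bytes.
        System.out.println(fitsInBytes(name, 50, StandardCharsets.ISO_8859_1));
    }
}
```

On the database side, Oracle also lets a column length be declared in characters rather than bytes, for example VARCHAR2(50 CHAR), which may be a cleaner alternative to doubling column sizes.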
