Non-English Characters (Encoding)

Using the XMP Toolkit, I'm having problems reading and writing non-English characters. For example, keywords that should read "casa campesina, cultivos agrícolas, zona cafetera, café, plátano" come back as "casa campesina, cultivos agrÃcolas, zona cafetera, café, plátano". I have the same problem with other languages, such as Norwegian.
Does anyone else have this type of problem? Or perhaps a suggestion as to what I might be doing to cause it?
Best regards,
Glenn Rogers
Developer of DBGallery: Photo DATAbase System

Hi Glenn,
If you write non-ASCII characters using our toolkit, you have to make sure your strings are encoded in UTF-8.
If you see this while reading, the data in the file might not be valid UTF-8. If it is in a local encoding (for example, Mac local encoding), our library will try to convert it based on the OS you are running it on. So if the EXIF data in the file is in Mac local encoding and you are using the toolkit on Windows, that could cause the wrong characters you are seeing.
To avoid this, please always use UTF-8 encoded strings.
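A minimal Java sketch of how this kind of corruption typically arises, assuming the keyword was stored as UTF-8 and then decoded as ISO-8859-1. It uses only the Java standard library, not the XMP Toolkit API, and the class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Decode UTF-8 bytes with the wrong charset, as a misconfigured reader might.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Decode the same bytes with the charset they were written in.
    static String decodeCorrectly(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String keyword = "agr\u00EDcolas"; // "agricolas" with i-acute (U+00ED)
        System.out.println(misdecode(keyword));       // corrupted form
        System.out.println(decodeCorrectly(keyword)); // original keyword
    }
}
```

In the corrupted form, the two UTF-8 bytes of the accented letter (0xC3 0xAD) become "Ã" followed by a soft hyphen (U+00AD). Most displays do not render the soft hyphen, which would explain a result that looks like the "agrÃcolas" in the question.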
Regards,
Samy
XMP Team

Similar Messages

  • UrlENEQuery encoding is failing for some non-English characters

    While creating a UrlENEQuery, we get the error com.endeca.navigation.InternalException: No support for 8-bit urls.
    This error happens when the query string contains non-English characters (e.g. Á).

    UrlENEQuery is designed around processing URL data, and URLs are not permitted to contain non-ASCII characters. To represent non-ASCII characters in a URL, they must be %-encoded according to their byte representation in a particular character encoding, and you should prefer UTF-8 for URLs. So your LATIN CAPITAL LETTER A WITH ACUTE (U+00C1) should appear as %C3%81 in your URL; UrlENEQuery should then be able to process that character.
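As a hedged illustration of the %-encoding described above, the sketch below uses the standard java.net.URLEncoder (the Charset overload needs Java 10 or later). The class name is illustrative. Note that URLEncoder applies HTML form encoding, so a space becomes "+"; whether that suits a given Endeca query string is an assumption to verify.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlEncodeDemo {
    // Percent-encode a query value using its UTF-8 byte representation.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00C1 LATIN CAPITAL LETTER A WITH ACUTE
        System.out.println(encode("\u00C1")); // prints %C3%81
    }
}
```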

  • My Firefox cannot display non-English characters, even though I have tried every language encoding I have!

    I am a big fan of Japanese songs and websites, so I was very disappointed when I saw that Firefox could not handle any non-English characters. I have tried every encoding I can, but none work and I just see boxes with numbers and letters inside. I have only just got this older laptop for my birthday - my old laptop which ran Windows Vista and had Firefox 4 had no trouble at all. Please help me!

    Hello muoshui, please enter about:config into the Firefox location bar (confirm the info message in case it shows up) and search for the preference named network.http.accept-encoding. Right-click that entry and reset it to the default value.
    If this does not resolve the issue, please also go through the steps offered at "Websites look wrong or appear differently than they should".

  • Encoding non english characters with utf 8 on jsp (Critical!!)

    I am inserting Hebrew characters from a JSP into an Oracle DB, and everything is fine up to this point. But when I try to retrieve the information from the database, the characters are not displayed properly (I get some garbage characters). I am sure that the data stored in the database is correct, but I am not sure why there is a problem displaying the data in the JSP.
    I came across a thread on TSS
    http://www.theserverside.com/discussions/thread.tss?thread_id=28944
    and followed the suggestions given there, like having
    <%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>
    <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
    and also this
    <%
    //Some JDBC and sql statement query UTF-8 data and then ...
    String str = rs.getString("utf8_data");
    str = new String(str.getBytes("ISO-8859-1"),"UTF-8");
    %>
    <%= str %>
    Now the data being displayed is partly correct; that is, some characters still come out as squares.
    Any ideas will be of great help.

    I also suspect the database character set is behind this issue. But what I don't understand is why only certain Hebrew characters are stored properly while others are corrupted.
    Also, can anyone let me know how I can view the non-English characters in the database directly, as TOAD is not able to display them?
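One possible explanation for the partly correct output, not confirmed in this thread: the getBytes("ISO-8859-1") round trip only restores the text if the UTF-8 bytes were originally misread as ISO-8859-1. If some layer misread them as windows-1252 instead (an assumption here), bytes in the range 0x80 to 0x9F no longer map back, and the Hebrew letters from alef (U+05D0) through final nun (U+05DF) have a second UTF-8 byte in exactly that range. A minimal sketch using only the Java standard library, with illustrative names:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Simulate UTF-8 bytes being misread with wrongCharset, then "repaired"
    // with the getBytes("ISO-8859-1") trick from the question.
    static String misreadThenRepair(String original, Charset wrongCharset) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, wrongCharset);
        byte[] recovered = misread.getBytes(StandardCharsets.ISO_8859_1);
        return new String(recovered, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String hebrew = "\u05E9\u05DC\u05D5\u05DD"; // shin, lamed, vav, final mem
        Charset cp1252 = Charset.forName("windows-1252");
        // The repair works when the misreading really was ISO-8859-1 ...
        System.out.println(hebrew.equals(misreadThenRepair(hebrew, StandardCharsets.ISO_8859_1))); // true
        // ... but loses letters when it was windows-1252.
        System.out.println(hebrew.equals(misreadThenRepair(hebrew, cp1252))); // false
    }
}
```

If this is the cause, the more robust fix is to make every layer (JDBC driver, database character set, page encoding) agree on the encoding, rather than repairing strings after the fact.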

  • Non-English characters

    Hello, I have read several times that since Java uses Unicode, it solves the problems of non-English characters automatically or something like that.
    But my app is not working as expected. Would someone help please?
    I have a client/server combo written in Java. The server can send messages in English or Japanese. The Japanese messages are hard-coded as String literals in the server source code. On the client side, they are displayed in a JEditorPane. But the Japanese characters are all garbled. The OSes on the server side and the client side are, of course, different.
    My supposition, which is obviously wrong since it is not working, is that because both ends of the communication are Java apps, I need not worry about any encoding conversions for String literals.
    Can someone suggest what is wrong here?

    How is the required encoding/decoding supposed to be done?
    When I didn't worry about non-English characters, I did the following, which WORKED.
    // SENDER side
    Socket socket;
    PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
    String outMessage = "my message";
    out.println(outMessage);
    // RECEIVER side
    Socket socket;
    BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
    String inMessage = in.readLine();
    When non-English characters are involved, I did the following, which DID NOT WORK. Could someone please correct me?
    // SENDER side
    Socket socket;
    PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
    String outMessage = "my message";
    String utfString = new String(outMessage.getBytes(), "UTF-8");
    out.println(utfString);
    // RECEIVER side
    Socket socket;
    InputStreamReader ins = new InputStreamReader(clientSocket.getInputStream(), "UTF-8");
    BufferedReader in = new BufferedReader(ins);
    String inMessage = in.readLine();
    The received message is still garbled.
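A hedged sketch of one way to do this: a PrintWriter built directly on an OutputStream uses the platform default charset, and new String(outMessage.getBytes(), "UTF-8") only re-decodes default-charset bytes, so neither makes the sender write UTF-8. Wrapping both streams with an explicit charset avoids depending on either machine's default. The example below substitutes in-memory byte streams for the sockets so it can run on its own; the class and method names are illustrative.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8WireDemo {
    // SENDER side: the OutputStreamWriter turns characters into UTF-8 bytes.
    // With a real socket, pass socket.getOutputStream() instead of the buffer.
    static byte[] send(String message) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(wire, StandardCharsets.UTF_8), true);
        out.println(message);
        out.flush();
        return wire.toByteArray();
    }

    // RECEIVER side: the InputStreamReader turns UTF-8 bytes back into characters.
    static String receive(byte[] wireBytes) {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(wireBytes), StandardCharsets.UTF_8))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String japanese = "\u3053\u3093\u306B\u3061\u306F"; // "konnichiwa" in hiragana
        System.out.println(japanese.equals(receive(send(japanese)))); // true
    }
}
```

Because the messages are hard-coded literals, it is also worth checking that the source file is compiled with the encoding it was saved in (for example with javac -encoding), and that the font used by the JEditorPane has Japanese glyphs. Either could produce garbled output even when the wire encoding is correct.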

  • Non english characters in DN cannot be retrieved

    We are using Netscape Directory Server 4, protocol V3. We have a problem related to non-English characters appearing in the RDN.
    We publish entries to LDAP using values from a database. For example, we have published an entry to LDAP which, based on the DB values, should have a DN like: ou=Liege BELGIUM ... LGG1a, <other components of DN>. However, when we call the Netscape search API (searching against the uid attribute, which does not contain non-English characters), the search returns the entry, but when we then call the getDN() method on the returned LDAP entry, it returns only Li instead of the complete DN value.
    It seems the entry is corrupted in LDAP. I wanted to delete the corrupted entry and re-create a new one to test. I tried many ways, but none of them worked. I think this is because the DN is corrupted, so there is no key value to identify the LDAP entry for any operation (modify, delete).
    Your help and insights are much appreciated.
    Thanks.
    Han Shen

    LDAP uses the UTF-8 encoding. You must store data in the directory using UTF-8. This includes DN values. This also means that if you want to be able to view the values in your native character set and font, you must use an application that can convert the UTF-8 LDAP data back to the native character encoding. The directory console should work by default for Latin-1 (ISO 8859-1) languages if the locale is set correctly.
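A small Java sketch of how one might check that a value is valid UTF-8 before publishing it to the directory. It uses only the standard java.nio.charset API, not the Netscape LDAP SDK. The helper name and the sample value "Liège" are hypothetical and are not taken from the poster's data.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String value = "Li\u00E8ge"; // hypothetical RDN value with e-grave
        System.out.println(isValidUtf8(value.getBytes(StandardCharsets.UTF_8)));      // true
        System.out.println(isValidUtf8(value.getBytes(StandardCharsets.ISO_8859_1))); // false
    }
}
```

A Latin-1 encoded accented letter is a single byte above 0x7F, which is not a complete UTF-8 sequence. A server or client expecting UTF-8 may reject or truncate the value at that byte.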

  • Non-English characters in URL for rwservlet

    I'm having a problem when I try to use non-english characters in a URL request to generate a report.
    This works fine:
    http://...rwservlet?report=r1.jsp&m1=Fred
    But if I try Fréd (e with an acute accent), the report does not return any data, even though the SQL by itself would find data.
    I tried UTF-8 encoding
    http://...rwservlet?report=r1.jsp&m1=Fr%C3%A9d
    8859-1 encoding
    http://...rwservlet?report=r1.jsp&m1=Fr%E9d
    Or just spelling it out (not sure what that gets encoded as):
    http://...rwservlet?report=r1.jsp&m1=Fréd
    But nothing works. Any ideas?
    Thanks, Andreas

    Suggestions
    1) Try with NLS_LANG set to SWEDISH_SWEDEN.WE8DEC
    2) Make a paramform and enter the value via the paramform (unencoded). This is just for testing purposes.
    3) Change the machine locale to Swedish and try again
    4) Which Reports version is this?
    Please see
    BUG 2713695 - NLS CHARACTERS FOR PARAMETERS CHANGE TO QUESTION MARKS WHEN PASSED ON URL BAR
    Get in touch with Support to see if this is the issue and, if "yes", get a one-off patch.
    [ All Docs for all versions ]
    http://otn.oracle.com/documentation/reports.html
    [ Publishing reports to web - 10G ]
    http://download.oracle.com/docs/html/B10314_01/toc.htm (html)
    http://download.oracle.com/docs/pdf/B10314_01.pdf (pdf)
    [ Building reports - 10G ]
    http://download.oracle.com/docs/pdf/B10602_01.pdf (pdf)
    http://download.oracle.com/docs/html/B10602_01/toc.htm (html)
    [ Forms Reports Integration whitepaper 9i ]
    http://otn.oracle.com/products/forms/pdf/frm9isrw9i.pdf
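The two encoded forms tried in the question only yield "Fréd" if the server decodes them with the matching charset. The sketch below shows that dependency with the standard java.net.URLDecoder (the Charset overload needs Java 10 or later). It does not model how rwservlet itself decodes parameters, which this thread does not establish, and the class name is illustrative.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UrlDecodeDemo {
    // Decode a percent-encoded parameter value with the given charset.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String expected = "Fr\u00E9d";
        // Matching charsets recover the original value.
        System.out.println(expected.equals(decode("Fr%C3%A9d", StandardCharsets.UTF_8)));   // true
        System.out.println(expected.equals(decode("Fr%E9d", StandardCharsets.ISO_8859_1))); // true
        // Mismatched charsets do not, so a lookup on the decoded value would find nothing.
        System.out.println(expected.equals(decode("Fr%C3%A9d", StandardCharsets.ISO_8859_1))); // false
        System.out.println(expected.equals(decode("Fr%E9d", StandardCharsets.UTF_8)));         // false
    }
}
```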

  • Non English characters conversion issue in LSMW BAPI Inbound IDOCs

    Hi Experts,
    We have some fields in our customer master LSMW data load program that can contain non-English characters. We are facing issues with the conversion of non-English characters in the LSMW BAPI method. The LSMW read and conversion steps show the non-English characters properly, without any issue. While creating inbound IDOCs, however, most of the non-English characters are replaced with '#', and this is causing issues when creating customer master data in the system. In our scenario, the customer data has non-English characters in the first name, last name and address details. Does any specific setting need to be made on our side? Please suggest how to resolve this issue.
    Thanks
    Rajesh Yadla

    If your language needs Unicode, then you need to change the encoding option in SAP to Unicode: on the initial screen, Customize Local Layout (Alt+F12), options 118 --> Encoding ....

  • Wrong sorting of non-English characters (iTunes 11.2)

    Since version 11.2 of iTunes, there is a problem with sorting of names with non-English characters (such as ž, š, č, ř or other in Czech). It seems that iTunes completely omits these characters when sorting. As an example, I have Čajkovskij (which is the Czech spelling of Tchaikovsky) among artists in my library and it is currently sorted between Adele and Alan Parsons Project, i.e. it seems that the character Č has been completely omitted while sorting and thus the artist is sorted as Ajkovskij.
    All my ID3 tags are of version 2.3 and in Unicode, thus this should not be any encoding problem. I even tried to change the ID3 tags to version 2.4, i.e. UTF-8, and it did not help.
    I think that there must be many more people experiencing this problem. Not everyone has only English or ASCII names in the library and I am surprised that I could not find anything on this bug anywhere on the net yet.
    Thanks for any help on this.

    Looks like a bug in the latest build. See this recent thread.
    You could report the problem via iTunes Feedback or sign up for a free Apple Developer Connection account and make use of Apple Bug Reporter.
    tt2

  • Encrypting non-English characters

    Hi,
    I have this application which has to do the following
    Scenario (i)
    - Read ENCODED string SE from Network Source NS1 (Native,Non-JAVA)
    - Decode SE to SD using the same charset as NS1
    - Apply some transformation to SD to get SD2
    - Encode SD2 to get SE2 using the same charset as Network Source NS2
    (Native, Non-JAVA)
    - Send SE2 to NS2
    - NS2 gets what it expects without any problems :))
    Scenario (ii)
    - Read ENCODED string SE from Network Source NS1 (Native,Non-JAVA)
    - Decode SE to SD using the same charset as NS1
    - Apply some transformation to SD to get SD2
    - Get the bytes from SD2 as BSD2
    - Encode BSD2 to get BSE2 using the same charset as Network Source NS2
    (Native, Non-JAVA)
    - Encrypt BSE2 to BSE2_Enc
    - Send BSE2_Enc to NS2 (Native,Non-JAVA)
    - NS2 does not get what it expects :((
    (It receives English text OK, but it gets ???? for non-English text)
    The charset being used is windows-1256 (at NS1, NS2 and my application)
    Encryption is being done using BouncyCastle TwoFish w/ 256 bit keys
    Reading/Writing from/to network is being done over SocketChannel
    Get the bytes from SD2 as BSD2 => byte [] BSD2 = SD2.getBytes()
    It seems the non-English characters are getting lost when I call SD2.getBytes()
    and they get encrypted as 'lost-non-English characters' ;)
    And when they get decrypted at NS2, they are displayed as 'lost-non-English
    characters' :)) i.e. ??????? .. and so on
    Is there a way I can encrypt non-English plain text without losing information?
    (without having to implement a TwoFish engine in my application itself)

    1) Bytes are not characters. Characters are UNICODE
    and have a byte representation defined by an encoding
    scheme. It is usually wrong to use the default
    encoding given by String.getBytes(). One should really
    use String.getBytes(encoding), e.g.
    "fred".getBytes("UTF-8");
    Awwllright ... got that :)) Thanks buddy
    2) Not having access to your code makes it difficult,
    but make sure you are not converting encrypted bytes
    to a String using new String(encrypted bytes);
    No .. I am not doing that.
    3) Again, not having access to your code makes it
    difficult, but when you display your Strings make
    sure that you use a Font that has representations for
    all the UNICODE characters you wish to display. It is
    normal for any character that does not have a valid
    glyph in a given font to display as a box.
    That infrastructure exists and is working fine ... as I mentioned,
    this is working OK when plain text is being used.
    The problem was with using getBytes() rather than getBytes("windows-1256")
    It's working now ... thanks a lot .. again. I wonder how that never occurred to me.
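A minimal sketch of the fix the thread arrives at: encode the text with an explicit charset before encrypting, and decode with the same charset after decrypting. The thread uses BouncyCastle Twofish. This example swaps in AES from the standard javax.crypto API purely so it runs without extra libraries, and generates a throwaway key and IV. The class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class EncryptTextDemo {
    // Encode text with wireCharset, encrypt, decrypt, and decode again.
    static String roundTrip(String text, Charset wireCharset) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);

            // Characters become bytes here; any loss happens at this step, not in the cipher.
            byte[] plainBytes = text.getBytes(wireCharset);

            Cipher encryptor = Cipher.getInstance("AES/CBC/PKCS5Padding");
            encryptor.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] encrypted = encryptor.doFinal(plainBytes);

            Cipher decryptor = Cipher.getInstance("AES/CBC/PKCS5Padding");
            decryptor.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] decrypted = decryptor.doFinal(encrypted);

            return new String(decrypted, wireCharset);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627"; // five Arabic letters
        // A charset that can represent the text survives encryption intact.
        System.out.println(arabic.equals(roundTrip(arabic, StandardCharsets.UTF_8))); // true
        // A charset that cannot represent it has already turned each letter into '?'.
        System.out.println(roundTrip(arabic, StandardCharsets.US_ASCII)); // ?????
    }
}
```

For the scenario in the thread, the charset argument would be Charset.forName("windows-1256"), assuming the Java runtime in use provides that charset.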

  • Loading Non-English Characters using VBA and BAPI

    Hi Experts,
    I am trying to load Non-English characters (Chinese, Korean, Japanese, etc.) into a SAP Table using BAPI and VBA. I have set the connection language and codepage values but when I run the tool, the non-English characters display as ????? or #####. Do you know how to fix this issue?
    Thanks!

    If your language needs Unicode, then you need to change the encoding option in SAP to Unicode: on the initial screen, Customize Local Layout (Alt+F12), options 118 --> Encoding ....

  • Flex, xml, and non-English characters

    Hello! I have a Flex web app with an AdvancedDataGrid, and I use an HTTPService component to load some data into the grid. The .xml file contains non-English characters in attributes (Russian in my case), like this:
    <?xml version="1.0" encoding="utf-8" ?>
    <Autoparts>
        <autopart DESCRIPTION="Барабан" />
    </Autoparts>
    When I run the app, the AdvancedDataGrid displays it like "Ñ&#129;ПÐ". How can I fix it? I tried changing encoding="utf-8" to some other charsets, but unsuccessfully. Thank you.

    Try changing the XML structure to use CDATA instead of having the Russian text as an attribute, and see if that makes any difference.
    What I mean is, use something like this:
    <?xml version="1.0" encoding="utf-8" ?>
    <Autoparts>
        <autopart>
            <description><![CDATA[Барабан]]></description>
        </autopart>
    </Autoparts>
    instead of the current XML.

  • Can username, password have unicode(non english) characters

    Does Oracle allow a username or password to contain non-English characters?

    I found the answer to my own question. In essence, it looks like the answer is yes for the username and a little confusing for the password. The password has to be in a single-byte character set, but it is not clear whether that means 7-bit or 8-bit. As I understand it, 8-bit is ASCII plus some Western European characters.
    Following is from Oracle 10G Database SQL reference
    user
    Specify the name of the user to be created. This name can contain only characters from your database character set and must follow the rules described in the section "Schema Object Naming Rules". Oracle recommends that the user name contain at least one single-byte character regardless of whether the database character set also contains multibyte characters.
    Note:
    Oracle recommends that user names and passwords be encoded in ASCII or EBCDIC characters only, depending on your platform.
    BY password
    The BY password clause lets you creates a local user and indicates that the user must specify password to log on to the database. Passwords can contain only single-byte characters from your database character set regardless of whether the character set also contains multibyte characters.
    Passwords must follow the rules described in the section "Schema Object Naming Rules", unless you are using the Oracle Database password complexity verification routine. That routine requires a more complex combination of characters than the normal naming rules permit.

  • Replacing any non english Characters

    How can I replace any non-English characters? I have a lot of characters that look like a block.
    --John

    Probably the easiest approach to code would be to convert the string to a byte array and back again using the ASCII character encoding. That should give you ? for any non-ASCII characters.
    Something like:
    String newString = new String(oldString.getBytes("ASCII"), "ASCII");
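A self-contained variant of the one-liner above, as a sketch: StandardCharsets.US_ASCII avoids the checked UnsupportedEncodingException that the String-named overloads declare. The second method, which drops non-ASCII characters with a regular expression instead of replacing them, is an added alternative and not part of the original answer. The class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;

public class AsciiOnly {
    // Replace every non-ASCII character with '?' via an encode/decode round trip.
    static String replaceNonAscii(String s) {
        return new String(s.getBytes(StandardCharsets.US_ASCII), StandardCharsets.US_ASCII);
    }

    // Alternative: remove non-ASCII characters entirely.
    static String stripNonAscii(String s) {
        return s.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        String input = "caf\u00E9"; // "cafe" with e-acute
        System.out.println(replaceNonAscii(input)); // caf?
        System.out.println(stripNonAscii(input));   // caf
    }
}
```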

  • Non English characters in BIP email

    Hi, my report contains Japanese characters. When I view the output in HTML format, it is displayed properly. But when I click the Send button, enter email parameters (to, cc, bcc, subject, etc.) and send it, the Japanese characters are not displayed properly in the mail I receive. The same problem occurs for Spanish and Portuguese text, and in general for all non-English characters. I am using Oracle Business Intelligence Publisher Release 10.1.3.4. If someone has faced a similar issue, kindly help. Thanks in advance.

