International characters getting mangled

Hi all,
All of this character set stuff is new to me. I have the following bit of code which, when it encounters a non-English character, substitutes a question mark for it. For example, the Spanish �.
The data comes from an XML file which, when opened, properly displays the �.
InputStreamReader ir = new InputStreamReader(new FileInputStream(strContents), "utf8");
BufferedReader br = new BufferedReader(ir);
String curLine = "";
while ((curLine = br.readLine()) != null) {
    System.out.println(curLine);
}
Whether I write curLine to a file, insert it into a DB as a CLOB, or just dump it to the console, non-English characters become ?. I tried using charsets other than utf8, such as iso8859_1, and got other characters instead, like ^Z or a box.
Any help appreciated
JW
p.s: (Sorry for the cross post, I decided this was a better/busier venue than the ILN8 forum after I posted there...)

utf8 is ok. I tried iso8859_1, given that the XML docs have that declared as the charset, but had no further luck. Others had the same effect. Will go away and read the tutorial pointed out on the other post. Will likely be back.
Thanks
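
One way to narrow down where the character is lost is to decode the same bytes once with the charset the file really uses and once with a mismatched one. The following is only a minimal sketch, not the original poster's program: the class name `CharsetReadDemo`, the helper `firstLine`, and the sample word are made up for illustration, and an in-memory byte array stands in for a file that is assumed to be ISO-8859-1.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original post.
class CharsetReadDemo {

    // Decodes the first line of the raw bytes using the named charset.
    static String firstLine(byte[] data, String charsetName) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charsetName))) {
            String line = br.readLine();
            return line == null ? "" : line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // The word n-i-n~-o as it would be stored in an ISO-8859-1 file.
        byte[] latin1Bytes = "ni\u00f1o".getBytes(StandardCharsets.ISO_8859_1);

        String right = firstLine(latin1Bytes, "ISO-8859-1"); // decoder matches the bytes
        String wrong = firstLine(latin1Bytes, "UTF-8");      // decoder does not match

        // Print through a stream with an explicit encoding, so the platform
        // default is not what decides how the character leaves the JVM.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(right);
        out.println(wrong.contains("\uFFFD") ? "decoder inserted U+FFFD" : wrong);
    }
}
```

If the correctly decoded string still shows up as ? on the console, the loss is probably happening on output (console or default encoding) rather than while reading.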

Similar Messages

  • Problem with international characters showing up as junk

    Hi All,
    Little question.
    I've made an XML data template which executes a query to fetch person names from the E-Business Suite tables.
    However, there are international characters in the names which show up incorrectly. When I execute the query in the database, everything shows up correctly, but when the query is executed via XML Publisher, the produced XML contains junk characters.
    This happens with, for example, o-umlaut characters.
    The database character set is: WE8ISO8859P1
    Version of XML Publisher: 5.6.3
    Patrick

    This turned out to be an extra property which was set in the data template:
    property scalable_mode with value "on"
    This caused the special characters to be mangled.
    Patrick

  • Parsing International Characters

    Hi folks,
    I am trying to parse an XML document which has international characters like "�" (an accented e used in French), but my parser crashes when it tries to parse a document containing these characters:
    System.out.println("******************* 1");
    DocumentBuilderFactory lFactory = DocumentBuilderFactory.newInstance();
    System.out.println("******************* 2");
    DocumentBuilder lDB = lFactory.newDocumentBuilder();
    System.out.println("******************* 3");
    lDoc = lDB.parse(new FileInputStream(pFileName));
    System.out.println("******************* 4");
    The exception occurs after the 3rd println. Here is what I get:
    [17/May/2005 08:50:14:640] info: The Exception Stack Trace is : The element type "FirstName" must be terminated by the matching end-tag "</FirstName>".: org.xml.sax.SAXParseException: The element type "FirstName" must be terminated by the matching end-tag "</FirstName>".
         at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1213)
         at org.apache.xerces.framework.XMLDocumentScanner.reportFatalXMLError(XMLDocumentScanner.java:579)
         at org.apache.xerces.framework.XMLDocumentScanner.abortMarkup(XMLDocumentScanner.java:628)
         at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1136)
         at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
         at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1098)
         at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:195)
         at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
         at com.exult.andy.mbcommon.utilities.AncestorXMLUtil.fileToDoc(AncestorXMLUtil.java:328)
         at com.exult.andy.importadapter.base.AncestorDeployImportAdapter.main(AncestorDeployImportAdapter.java:62)
    The element is indeed correctly terminated.
    I appreciate any help. Thanks in advance.
    -r

    Then you don't have a well-formed XML document. If it doesn't declare its encoding in its prolog <?xml version="1.0" ?> then it should be encoded in UTF-8 (or, less likely, some variant of UTF-16) and it's probably encoded in ISO-8859-1 or something like that. If that's the case then fix the prolog to declare the encoding: <?xml version="1.0" encoding="ISO-8859-1" ?> or encode the document in UTF-8.
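
The effect of the encoding declaration can be tried out with a small sketch like the one below. It is only an illustration under stated assumptions: the class `XmlPrologEncodingDemo`, the helper `rootText`, and the sample name are invented here, and the document is built in memory as ISO-8859-1 bytes rather than read from the poster's file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class illustrating the reply above.
class XmlPrologEncodingDemo {

    // Parses the bytes as XML and returns the root element's text content.
    static String rootText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String body = "<FirstName>Ren\u00e9</FirstName>";

        // The prolog declares ISO-8859-1 and the bytes really are ISO-8859-1.
        byte[] declared = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(declared).length()); // 4 characters: R, e, n, e-acute

        // Same bytes without the declaration: the parser assumes UTF-8 and
        // is expected to reject the lone 0xE9 byte.
        byte[] undeclared = ("<?xml version=\"1.0\"?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);
        try {
            rootText(undeclared);
            System.out.println("parsed without error");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getCause());
        }
    }
}
```

The exact error for the undeclared case depends on the parser version; the old Xerces in the stack trace above reported it as a missing end-tag, presumably because the bytes after the accented character were consumed as part of a malformed UTF-8 sequence.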

  • SAPSCRIPT: Printing international characters on ZEBRA; How to do?

    Hi,
    I use NiceLabel software to design barcode forms. I upload the design to an SO10 SAPscript text and print it on the Zebra printer. I used device type ASCIIPRI. The SAP system is Unicode.
    Now I need to print Chinese pallet labels and I get unexpected problems. I found a lot of information but no solution. Is it possible to print international characters from SAPscript on Zebra?
    I got the information from Zebra's White Paper: Solution for Printing International Characters. There it says:
    "Unicode UTF-8 is embedded within Zebra printers."
    "SAP Forms can be universal. Labels and forms ... do not need to be modified or recreated to print in different languages."
    "SAP-developed UTF-8 device type and code page support for SAPscript users"
    "Label design software that can generate ZPL with support for Unicode ZPL commands"
    Do you know which device type I have to use? I think I need a UTF-8 device type. Do you know how to go on?
    Please help. Thanks
    Frank

    Hi Frank,
    as far as I know, it might be possible when using SMARTFORMS instead of SAPscript!
    In that case, it depends on the device type and the printer type, of course.
    Have a look at SAP Note 750002 SmartForms: Support für Zebra Etikettendrucker (ZPL2).
    Cheers
    Klaus

  • Firefox Chrome Showing ???? instead of International Characters

    Hi,
    I have a flash movie at this site
    http://preview.tinyurl.com/5289rz
    When you click on the rendered thumbnail (whose link
    contains international characters), it takes you to a URL with
    those same characters. IE 7 displays the international characters
    and takes you to the correct page. In Firefox, Chrome and Opera
    the international characters are displayed correctly in the Flash
    movie, but the URL turns those characters into question marks (????)
    and the page shows a 404 Not Found error.
    Firefox
    http://www.site.com/video/??????_??????_???????_???????_???????_???_????????

    Does that mean we should ignore those millions of users
    around the world? That would be plain foolish. I am here to get this
    problem solved, not to be told to just ignore users who use other
    browsers.

  • Problem with the International characters à, è, ì, ò, ù

    Dear Experts,
    My requirement is to send data files from SAP to Hyperion (a data warehouse tool) via the application server. A few fields (for example material description, name, or address) contain international characters like à, è, ì, ò, ù (this is an example), so I need to send their equivalent characters (i.e. a, e, i, o, u) to Hyperion. That is, when I create a file on the application server, the file should contain the characters a, e, i, o, u.
    I used ENCODING NON-UNICODE, UTF-8, and DEFAULT, but to no avail.
    Please assist me.
    Thanks,
    Dharmendra Gali

    If you just have the couple of characters mentioned, use Jürgen's suggestion. Otherwise I'd recommend usage of SAP function module SCP_REPLACE_STRANGE_CHARS, which is much more comprehensive. Note that depending on your invocation though you might get multiple characters for some, e.g. ä to ae. To some degree you can control this, see my comments in Re: Removing diacritical (special & accented) characters in SAP.
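
For readers who need the same kind of replacement outside SAP, the idea can be sketched in Java with `java.text.Normalizer`. This is not the SAP function module mentioned above and does not reproduce its mappings; the class and method names are made up for illustration. Note that it maps ä to a, not ae, and that letters without a decomposition (for example ø or ß) pass through unchanged.

```java
import java.text.Normalizer;

// Hypothetical helper; a Java analogue, not SCP_REPLACE_STRANGE_CHARS.
class DiacriticStripper {

    // Decomposes accented letters (NFD) and then drops the combining marks.
    static String stripDiacritics(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The five grave-accented vowels from the question, written as escapes.
        System.out.println(stripDiacritics("\u00e0\u00e8\u00ec\u00f2\u00f9")); // aeiou
        System.out.println(stripDiacritics("J\u00fcrgen"));                    // Jurgen
    }
}
```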
    Cheers, harald

  • Displaying International Characters

    Some users have been concerned about the fact that Buzzword
    does not display some international characters - ranging from Greek
    to Russian. This is accentuated by the fact that we have Buzzword
    users in well over 100 countries.
    The problem occurs when users attempt to insert some
    international characters - say, the Greek letter omega - and
    Buzzword instead displays a dot on the screen. Here's what's going
    on, for anyone interested:
    Like virtually all modern software, Buzzword adheres to the
    Unicode standard, where characters are defined with 16 bits,
    resulting in a total of over 65,000 possible characters.
    However, unlike most desktop software, Buzzword must use
    something called "embedded fonts". This means that we can't read
    fonts off a user's computer, but instead we have to download fonts
    from our server.
    This is where our challenge begins. A font family contains
    characters - called "glyphs" when drawn on the screen - for some
    portion of the 65,000 possible characters defined by Unicode. Each
    available character is downloaded as a small program containing
    instructions on how to draw the glyph. The instructions are
    relatively small, but each takes time to download - you can see
    evidence of this in our "loading fonts" progress bar.
    For Buzzword to load relatively quickly, we need to limit the
    number of characters downloaded with each of our seven font
    families. Most people use far fewer than 65,000 characters, so for
    our first phase of deployment, we identified a couple hundred
    characters to download for each font family. Because our initial
    market focus was North America, we chose characters from Latin-1,
    the Western European character set.
    The result: when a user attempts to enter the Greek letter
    omega, Buzzword recognizes the Unicode character but does not have
    the downloaded instructions to display the glyph on the screen. The
    little dot that is displayed instead is an indication that the
    requested glyph has not been downloaded with the font set. If the
    user were to export the document to be read by a desktop program,
    the glyph would probably be displayed using the computer's fonts.
    Longer term, we'll handle this differently by downloading
    fonts dynamically, based on the document's contents and a user's
    settings. In the meantime, we apologize to everyone who uses
    characters outside the Western European set. We will work to get
    you a solution as soon as we possibly can.

    quote:
    Like virtually all modern software, Buzzword adheres to the
    Unicode standard, where characters are defined with 16 bits,
    resulting in a total of over 65,000 possible characters.
    Actually, Unicode (the standard) does not care about the
    number of bits.
    It has enough space to encode more than one million
    characters, and the current version (Unicode 5.1) already encodes
    more than 100,000 characters (
    http://www.unicode.org/versions/Unicode5.1.0/)
    quote:
    Buzzword must use something called "embedded fonts".
    Nothing prevents Flash/Flex from using fonts "html style".
    In fact, Buzzword can add a "Generic sans-serif" font as an
    option (font-family: Verdana, Arial, Helvetica, sans-serif;) with
    zero effort.
    The document will not look the same on all computers, but
    this might be better than the current bullets.
    So this is not a "must".
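
The point about Unicode not being limited to 16 bits can be seen from Java, whose `char` is a 16-bit unit while a Unicode character (code point) may need two of them. A minimal sketch, with a made-up class name and arbitrarily chosen sample characters:

```java
// Hypothetical demo class; the sample characters are arbitrary.
class CodePointDemo {

    // Number of Unicode code points, as opposed to 16-bit char units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String omega = "\u03a9";                                // GREEK CAPITAL LETTER OMEGA, fits in 16 bits
        String gothic = new String(Character.toChars(0x10330)); // GOTHIC LETTER AHSA, above U+FFFF

        System.out.println(omega.length() + " " + codePoints(omega));      // 1 1
        System.out.println(gothic.length() + " " + codePoints(gothic));    // 2 1
        System.out.println(Integer.toHexString(Character.MAX_CODE_POINT)); // 10ffff
    }
}
```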

  • International Characters with Netmail

    Hi all,
    I'm using Sun One portal server 6. I have set the platform charset to Iso-8859-7 so that every portal page displays greek characters correctly.
    My only problem is with NetMail. When I get mail I can see greek characters correctly with NetMail Lite. However when I'm sending email using NetMail Lite if I write Greek characters they turn to question marks when I read the email with any client (Netmail Lite, Outlook express etc).
    Any ideas
    Thanks
    -George

  • International characters in IOS filenames

    Hello,
    My movie compiles fine but when I add this file name to the package:
    ahí1.wav
    I get an applicationverificationfailed message when I try to send it to the device from Flash CC.
    If I rename it to:
    ahi1.wav
    it works without failure.
    I've got hundreds of files that have international characters. Do I have to rename them all, or is there a special flag, switch, or trick I can use to get around this?

    Stefan,
    It's good news that you are not having this problem, as it means that perhaps I won't shortly either. If we can characterize the differences between our setups, maybe I can have the same result as you do.
    I've just run the obvious case - I've created a file using TextEdit with a German name out on the volume from the Mac, stopped TextEdit, and successfully retrieved it. So it doesn't look like a filesystem mounting issue. I wonder what is so weird about these files. There must be something odd in the header, because it is definitely at the file info level that it is going off the rails. While the name is the obvious differentiator, maybe something else is odd as well.
    One thing I could try is to zip one of the directories affected on the Windows side and then try unzipping it into place there, then boot over to the Mac side and see if things have improved. If that doesn't resolve the problem, I could try unzipping it into place on the Mac side, but first I'll boot over to the Windows side and make sure it can read the file I just created in TextEdit from that end.
    By the way, the KB article you referenced was about shares and about problems with punctuation mounting Mac shares on Windows, so I don't think it pertains. In any case, I'm mounting a FAT volume, not a share, so the drivers would be completely different.
    Anyway, thank you for your help. Now that I'm no longer chasing phantoms, I can attend to the real problem.
    Thanks,
    Ralph

  • International characters from a DB

    I'm having problems using a MySQL server with Swedish characters. I'm using latin1 as the character set. Everything works through the mysql client but not through JDBC. The plain English characters come out fine, but the Swedish characters do not: they just appear as %'s and other signs.
    Does anyone use a database with international characters? Do you do anything special to make it work?
    Thank you!

    I'm using MySQL, with latin1 as the character set (that's the default, I just left it that way). I don't have any problems with non-ASCII characters such as 'é'. But then I put them into the DB using Java and I took them out using Java. I haven't tried the MySQL text-based client but I doubt that it would work correctly with those characters, since my computer is running Windows, and the DOS command line uses a non-standard character set.
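
If the driver is MySQL Connector/J, one thing worth checking is its `characterEncoding` connection property, which tells the driver which encoding to use for string data. The sketch below only builds the properties and does not open a connection; the class name, the credentials, the JDBC URL in the comment, and the choice of `ISO8859_1` as the Java-side name for latin1 are all assumptions to be checked against the driver documentation for the version in use.

```java
import java.util.Properties;

// Hypothetical configuration helper; values are placeholders.
class MysqlCharsetConfig {

    // Builds connection properties; user and password are placeholders.
    static Properties connectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Connector/J property naming the encoding for string data
        // (assumed value for a latin1 database).
        props.setProperty("characterEncoding", "ISO8859_1");
        return props;
    }

    public static void main(String[] args) {
        Properties props = connectionProperties("demo", "secret");
        // With the driver on the classpath, these would be passed to
        // java.sql.DriverManager.getConnection("jdbc:mysql://localhost/test", props).
        System.out.println(props.getProperty("characterEncoding"));
    }
}
```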

  • International characters appear with ? instead of ó or á or é

    After upgrading from 10.4.7 to 10.4.8, my email in Yahoo appears with ? instead of the international characters.
    What can I do?
    I get this
    y mis correr�as con alguien que m�s all�
    instead of
    y mis correrías con alguien que más allá
    Thanks a lot.
    Jm

    Sorry, realized too late. Now that I know how bad it is, I promise not to do it again. Sorry for another useless note for this apology. Go forth and find other evildoers...

  • International Characters Turn to ?'s

    I'm using Kodo 2.5.4 with MySQL. Any international characters (such as
    "__") in MySQL get translated to a "?". Thanks for any clues on how to
    prevent this -
    Sam

    Sam-
    There are 3 possible places where something might be going wrong:
    1. the database might not be storing the international characters
    correctly
    2. the JDBC driver might not be handling the international characters
    properly
    3. Kodo might be doing something wrong with the international characters
    Since Kodo just uses the JDBC driver's String handling, I think #1 and #2
    are more likely.
    Some quick searching on the internet reveals that MySQL did not support
    unicode until version 4.1:
    http://www.mysql.com/doc/en/Charset-Unicode.html
    You might want to try upgrading to see if your problems are magically
    solved.
    Otherwise, you are going to be limited by the capabilities of the JDBC
    driver, and Kodo doesn't do any special handling for unicode characters
    (beyond what is provided by the Java language itself). One solution
    would be to perform your own encoding into the String field, and then
    perform the decoding when you retrieve the field.
    In an earlier article, Sam wrote:
    I forgot to mention that I'm also using MySQL version 3.23.58.
    I'm using Kodo 2.5.4 with MySQL. Any international characters (such as
    "__") in MySQL get translated to a "?". Thanks for any clues on how to
    prevent this -
    Sam
    Marc Prud'hommeaux
    SolarMetric Inc. http://www.solarmetric.com
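
The last suggestion in the reply, doing your own encoding before the value goes into the String field and decoding it after retrieval, could look roughly like this. The reply does not name a scheme; Base64 over UTF-8 bytes is used here purely as an example, and the class and method names are made up. The stored value contains only ASCII, so a database or driver that cannot handle non-ASCII text has nothing to mangle, at the cost of the column no longer being readable or searchable as text in SQL.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; Base64 is an assumed choice of encoding.
class AsciiSafeField {

    // Turns arbitrary text into an ASCII-only string for storage.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Restores the original text from the stored ASCII-only string.
    static String decode(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Sm\u00f6rg\u00e5sbord";
        String stored = encode(original);
        System.out.println(stored);                           // ASCII-only form
        System.out.println(decode(stored).equals(original));  // true
    }
}
```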

  • International characters in unix

    Does anyone have any idea how to get international characters to work on Unix with Java 1.3.1? All I get when I try to print one of the Scandinavian letters (������ dunno if they show right here) is a ? character. What do I need to change to get them to work? Some setting in Unix or Java perhaps?
    -teka

    Well, the chars came out wrong, as I expected. Anyway, I managed to solve this on my own, so I'll just post the solution here in case someone comes this way with a similar problem.
    The problem was that the environment variable LANG needed to be set for the shell. You can list the available locales you can set it to with the "locale -a" command in your Unix shell. In my case it was LANG=fi.ISO8859-15
    This was with JRE 1.3.1. With JRE 1.2.2 in the same shell it worked fine without the variable, and I think with 1.3.0 too.
    -teka
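
To check what the JVM actually picked up from the environment, and to see why an unsupported character turns into ?, something like the following sketch can help. The class and method names are made up, and `Charset.defaultCharset()` needs a newer Java than the 1.3.1 discussed above. On newer JVMs the default charset may be UTF-8 regardless of LANG, so the first line of output will vary.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; output of the first line depends on the environment.
class DefaultCharsetDemo {

    // Encodes text to bytes in the given charset and decodes them again.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // What the JVM uses when no charset is given explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());

        String letters = "\u00e4\u00f6\u00e5"; // a-umlaut, o-umlaut, a-ring

        // A charset without these letters replaces each one with '?'.
        System.out.println(roundTrip(letters, StandardCharsets.US_ASCII).equals("???"));       // true
        // A charset that contains them preserves the text.
        System.out.println(roundTrip(letters, StandardCharsets.ISO_8859_1).equals(letters));   // true
    }
}
```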

  • International characters in iPhone

    Does anyone know if you can type international characters with the iPhone keyboard? I send email to clients in Spanish and Portuguese and I find myself missing how easy it was to type accent marks on my previous phone, a BlackBerry.

    Does anyone know if you can type international characters with the iPhone keyboard?
    Yes.
    http://m10lmac.blogspot.com/2007/09/iphone-input-keyboard-gets-accented.html

  • Messenger and international characters

    I posted this earlier this month on the blackberryforums.com and crackberry.com web sites but didn't really get an answer. Hopefully it will have a better outcome here.
    Hi, we switched from the 7250 to the 8330 model (what an upgrade!). I have a question though regarding the 83xx messenger: I can't get international characters to work (by holding the key and moving the trackball). It works in other applications (like while sending an email or in the address book) but not in messenger. We did not have that problem with the 7250 model.
    Is this a known issue/limitation? My firmware version is 4.3.0.124 and my carrier is Telus.
    Thanks.

    Hello,
    I just joined this community, and of course I started my search for this same issue.
    After spending more than 4 hours with tech support from my carrier, then going to the carrier's store and wasting 2 more hours, then getting back on the phone with the carrier's tech support (this time with a Spanish-speaking customer service agent), this issue is still unresolved. I think, and so does everyone else who tried to help, that this is a glitch in BlackBerry's phones. I have a Curve 8330 and a Storm, and neither allows you to type international characters in SMS ....GRRRRRRRR! Yes, it works in everything else except there.
    I tried the new Tour at the store and it does allow international characters in SMS messages, but of course they wanted me to dig into my pocket to pay for the difference.
    I think we should all come together and write something to BlackBerry and let them know that we need this fixed. It's aggravating having to compensate through apologies and typing extra characters just to be able to communicate in a different language with these phones.
    I think that these smart phones are not so smart after all....LOL
