International characters getting mangled

Hi all,
All of this character set stuff is new to me. I have the following bit of code which, when it encounters a non-English character, substitutes a question mark. For example, the Spanish �.
The data comes from an XML file which, when opened, properly displays the �.
// Decode the file's bytes as UTF-8 while reading (strContents holds the file path)
InputStreamReader ir = new InputStreamReader(new FileInputStream(strContents), "utf8");
BufferedReader br = new BufferedReader(ir);
String curLine = "";
while ((curLine = br.readLine()) != null) {
    System.out.println(curLine);
}
Whether I write curLine to a file, insert it into a DB as a CLOB or just dump it to the console, non-English characters become ?. I tried using charsets other than utf8, such as iso8859_1, and get other characters instead, like ^Z or a box.
Any help appreciated
JW
P.S. (Sorry for the cross post, I decided this was a better/busier venue than the ILN8 forum after I posted there...)

utf8 is ok. I tried iso8859_1, given that the XML docs have that declared as the charset, but had no further luck. Others had the same effect. Will go away and read the tutorial pointed out on the other post. Will likely be back.
Thanks
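
For anyone landing here with the same symptom, here is a minimal sketch of what goes wrong when bytes are decoded with a charset other than the one they were written in. It assumes the XML really is stored as ISO-8859-1, as its declaration says; the sample word, the class name CharsetRoundTrip and the helper readFirstLine are made up for illustration. Note that the console has its own encoding too, so the sketch prints through a PrintStream with an explicit charset instead of relying on the default.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetRoundTrip {
    // Decode the first line of raw bytes using the given charset.
    static String readFirstLine(byte[] raw, Charset cs) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(raw), cs))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: "manana" with n-tilde, stored the way an
        // ISO-8859-1 file would store it (n-tilde is the single byte 0xF1).
        byte[] latin1Bytes = "ma\u00F1ana".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the charset the bytes were written in recovers the text.
        String right = readFirstLine(latin1Bytes, StandardCharsets.ISO_8859_1);
        // Decoding the same bytes as UTF-8 hits an invalid sequence, which the
        // reader replaces with U+FFFD (often rendered as '?' or a box).
        String wrong = readFirstLine(latin1Bytes, StandardCharsets.UTF_8);

        // Encode the output explicitly; the terminal must also expect UTF-8.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(right);
        out.println(wrong);
    }
}
```

If the characters still look wrong after the input charset matches the file, the next suspect is the output side (console, file writer or JDBC driver), each of which does its own encoding step.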

Similar Messages

Problem with international characters showing up as junk

Hi All,
Little question.
I've made an XML data template which executes a query to fetch person names from the E-Business Suite tables.
However, there are international characters in the names which are showing up incorrectly. When executing the query in the database everything shows up correctly. But when the query is executed via XML Publisher the produced XML contains junk characters.
This happens with, for example, o-umlaut characters.
The database characterset is: WE8ISO8859P1
Version of XML publisher: 5.6.3
Patrick

This turned out to be an extra property which was set in the data template:
property scalable_mode with value "on"
This caused the special characters to be mangled.
Patrick

Parsing International Characters

Hi folks,
I am trying to parse an XML document which has international characters like "�" (an accented e used in French). But my parser crashes trying to parse a document containing these characters:
System.out.println("******************* 1");
DocumentBuilderFactory lFactory = DocumentBuilderFactory.newInstance();
System.out.println("******************* 2");
DocumentBuilder lDB = lFactory.newDocumentBuilder();
System.out.println("******************* 3");
lDoc = lDB.parse(new FileInputStream(pFileName));
System.out.println("******************* 4");
The exception occurs after the 3rd println. Here is what I get:
[17/May/2005 08:50:14:640] info: The Exception Stack Trace is : The element type "FirstName" must be terminated by the matching end-tag "</FirstName>".: org.xml.sax.SAXParseException: The element type "FirstName" must be terminated by the matching end-tag "</FirstName>".
     at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1213)
     at org.apache.xerces.framework.XMLDocumentScanner.reportFatalXMLError(XMLDocumentScanner.java:579)
     at org.apache.xerces.framework.XMLDocumentScanner.abortMarkup(XMLDocumentScanner.java:628)
     at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1136)
     at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:381)
     at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:1098)
     at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:195)
     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:76)
     at com.exult.andy.mbcommon.utilities.AncestorXMLUtil.fileToDoc(AncestorXMLUtil.java:328)
     at com.exult.andy.importadapter.base.AncestorDeployImportAdapter.main(AncestorDeployImportAdapter.java:62)
The element is indeed correctly terminated.
I appreciate any help. Thanks in advance.
-r

Then you don't have a well-formed XML document. If it doesn't declare its encoding in its prolog <?xml version="1.0" ?> then it should be encoded in UTF-8 (or, less likely, some variant of UTF-16) and it's probably encoded in ISO-8859-1 or something like that. If that's the case then fix the prolog to declare the encoding: <?xml version="1.0" encoding="ISO-8859-1" ?> or encode the document in UTF-8.
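
A minimal sketch of the first fix (declaring the encoding in the prolog), using the standard JAXP parser. The document content, the class name XmlEncodingDemo and the helper firstNameOf are made up for illustration; the point is only that the parser is handed raw bytes and picks the decoder from the declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XmlEncodingDemo {
    // Parse raw bytes and return the text of the root element.
    // The parser chooses the decoder from the encoding declared in the prolog.
    static String firstNameOf(byte[] xmlBytes) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document: the prolog declares ISO-8859-1 and the bytes
        // really are ISO-8859-1 (e-acute is the single byte 0xE9).
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<FirstName>Ren\u00E9</FirstName>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        // Compare in code rather than printing the name, so the result does
        // not depend on the console's own encoding.
        System.out.println(firstNameOf(bytes).equals("Ren\u00E9"));
    }
}
```

Passing an InputStream (not a Reader or a String) matters here: it leaves the byte-to-character decoding to the parser, which is the only component that reads the prolog.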

SAPSCRIPT: Printing international characters on ZEBRA; How to do?

Hi,
I use NiceLabel software to design barcode forms. I upload the design to an SO10 SAPscript text and print it on the Zebra printer. I used device type ASCIIPRI. The SAP system is Unicode.
Now I need to print Chinese pallet labels and I get unexpected problems. I found a lot of information but no solution. Is it possible to print international characters from SAPscript on Zebra?
I got the information from Zebra's White Paper: Solution for Printing International Characters. There it says:
"Unicode UTF-8 is embedded within Zebra printers."
"SAP Forms can be universal. Labels and forms ... do not need to be modified or recreated to print in different languages."
"SAP-developed UTF-8 device type and code page support for SAPscript users"
"Label design software that can generate ZPL with support for Unicode ZPL commands"
Do you know which device type I have to use? I think I need a UTF-8 device type. Do you know how to go on?
Please help. Thanks
Frank

Hi Frank,
as far as I know, it might be possible when using SMARTFORMS instead of SAPscript!
In that case, it depends on the device type and the printer type, of course.
Have a look at SAP Note 750002 SmartForms: Support für Zebra Etikettendrucker (ZPL2).
Cheers
Klaus

Firefox Chrome Showing ???? instead of International Characters

Hi,
I have a flash movie at this site
http://preview.tinyurl.com/5289rz
When you click on the rendered thumbnail (with a link
containing international characters), it takes you to a URL with
those same characters. IE 7 displays the international characters
and takes you to the correct page. In Firefox, Chrome and Opera
the international characters are displayed correctly in the Flash
movie, but the URL turns those characters into question marks (????)
and the page shows a 404 Not Found error.
Firefox
http://www.site.com/video/??????_??????_???????_???????_???????_???_????????

Does that mean we should ignore those millions of users
around the world? That would be plain foolish. I am here to get this
problem solved, not to be told to just ignore users who use other
browsers.

Problem with the International characters à, è, ì, ò, ù

Dear Experts,
My requirement is to send data files from SAP to Hyperion (a data warehouse tool) via the application server. A few fields (such as material description, name or address) contain international characters like à, è, ì, ò, ù (this is just an example). I need to send their plain equivalents (i.e. a, e, i, o, u) to Hyperion; that is, when I create the file on the application server, it should contain the characters a, e, i, o, u.
I used ENCODING NON-UNICODE, UTF-8 and DEFAULT, but to no avail.
Please assist me.
Thanks,
Dharmendra Gali

If you just have the couple of characters mentioned, use Jürgen's suggestion. Otherwise I'd recommend usage of SAP function module SCP_REPLACE_STRANGE_CHARS, which is much more comprehensive. Note that depending on your invocation though you might get multiple characters for some, e.g. ä to ae. To some degree you can control this, see my comments in Re: Removing diacritical (special & accented) characters in SAP.
Cheers, harald
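
For readers doing the same thing outside ABAP, here is a rough Java analogue of the idea (this is not SCP_REPLACE_STRANGE_CHARS, just a sketch of the general technique). It decomposes each accented letter into base letter plus combining mark and then drops the marks. The class name StripDiacritics and the method strip are made up for illustration.

```java
import java.text.Normalizer;

final class StripDiacritics {
    // Split accented letters into base letter + combining mark (NFD),
    // then remove every combining mark (Unicode category M).
    static String strip(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The five characters from the question: a, e, i, o, u with grave accent.
        System.out.println(strip("\u00E0\u00E8\u00EC\u00F2\u00F9")); // prints "aeiou"
    }
}
```

Two caveats with this approach: it maps ä to a (never to ae), and letters that have no decomposition, such as ø or ß, pass through unchanged, so a lookup table is still needed for those.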

Displaying International Characters

Some users have been concerned about the fact that Buzzword
does not display some international characters - ranging from Greek
to Russian. This is accentuated by the fact that we have Buzzword
users in well over 100 countries.
The problem occurs when users attempt to insert some
international characters - say, the Greek letter omega - and
Buzzword instead displays a dot on the screen. Here's what's going
on, for anyone interested:
Like virtually all modern software, Buzzword adheres to the
Unicode standard, where characters are defined with 16 bits,
resulting in a total of over 65,000 possible characters.
However, unlike most desktop software, Buzzword must use
something called "embedded fonts". This means that we can't read
fonts off a user's computer, but instead we have to download fonts
from our server.
This is where our challenge begins. A font family contains
characters - called "glyphs" when drawn on the screen - for some
portion of the 65,000 possible characters defined by Unicode. Each
available character is downloaded as a small program containing
instructions on how to draw the glyph. The instructions are
relatively small, but each takes time to download - you can see
evidence of this in our "loading fonts" progress bar.
For Buzzword to load relatively quickly, we need to limit the
number of characters downloaded with each of our seven font
families. Most people use far fewer than 65,000 characters, so for
our first phase of deployment, we identified a couple hundred
characters to download for each font family. Because our initial
market focus was North America, we chose characters from Latin-1,
the Western European character set.
The result: when a user attempts to enter the Greek letter
omega, Buzzword recognizes the Unicode character but does not have
the downloaded instructions to display the glyph on the screen. The
little dot that is displayed instead is an indication that the
requested glyph has not been downloaded with the font set. If the
user were to export the document to be read by a desktop program,
the glyph would probably be displayed using the computer's fonts.
Longer term, we'll handle this differently by downloading
fonts dynamically, based on the document's contents and a user's
settings. In the meantime, we apologize to everyone who uses
characters outside the Western European set. We will work to get
you a solution as soon as we possibly can.

quote:
Like virtually all modern software, Buzzword adheres to the
Unicode standard, where characters are defined with 16 bits,
resulting in a total of over 65,000 possible characters.
Actually, Unicode (the standard) does not care about the
number of bits.
It has enough space to encode more than one million
characters, and the current version (Unicode 5.1) already encodes
more than 100,000 characters (
http://www.unicode.org/versions/Unicode5.1.0/)
quote:
Buzzword must use something called "embedded fonts".
Nothing prevents Flash/Flex from using fonts "html style".
In fact, Buzzword can add a "Generic sans-serif" font as an
option (font-family: Verdana, Arial, Helvetica, sans-serif;) with
zero effort.
The document will not look the same on all computers, but
this might be better than the current bullets.
So this is not a "must".
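
To illustrate the point about bits: a small Java sketch showing that a Unicode character is a code point, and that the number of 16-bit units needed to store it depends on the encoding, not on the standard. The class name CodePointDemo and the helper codeUnits are made up for illustration.

```java
final class CodePointDemo {
    // Number of 16-bit UTF-16 code units needed to store one code point.
    static int codeUnits(int codePoint) {
        return Character.toChars(codePoint).length;
    }

    public static void main(String[] args) {
        // U+03A9 GREEK CAPITAL LETTER OMEGA fits in a single 16-bit unit.
        System.out.println(codeUnits(0x03A9));   // prints 1
        // U+10330 GOTHIC LETTER AHSA lies above U+FFFF and needs a surrogate pair.
        System.out.println(codeUnits(0x10330));  // prints 2

        // A String's length counts code units, not characters.
        String gothic = new String(Character.toChars(0x10330));
        System.out.println(gothic.length());                           // prints 2
        System.out.println(gothic.codePointCount(0, gothic.length())); // prints 1

        // Highest code point the standard allows: 0x10FFFF.
        System.out.println(Character.MAX_CODE_POINT); // prints 1114111
    }
}
```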

International Characters with Netmail

Hi all,
I'm using Sun ONE Portal Server 6. I have set the platform charset to ISO-8859-7 so that every portal page displays Greek characters correctly.
My only problem is with NetMail. When I get mail I can see Greek characters correctly with NetMail Lite. However, when I send email using NetMail Lite, any Greek characters I write turn into question marks when the email is read with any client (NetMail Lite, Outlook Express etc.).
Any ideas?
Thanks
-George


International characters in iOS filenames

Hello,
My movie compiles fine but when I add this file name to the package:
ahí1.wav
I get an applicationverificationfailed message when I try to send it to the device from Flash CC.
If I rename it to:
ahi1.wav
it works without failure.
I've got hundreds of files that have international characters. Do I have to rename them all, or is there a special flag, switch, or trick I can use to get around this?

Stefan,
It's good news that you are not having this problem, as it means that perhaps I won't shortly either. If we can characterize the differences between our setups, maybe I can have the same result as you do.
I've just run the obvious case - I've created a file using TextEdit with a German name out on the volume from the Mac, stopped TextEdit, and successfully retrieved it. So it doesn't look like a filesystem mounting issue. I wonder what is so weird about these files. There must be something odd in the header, because it is definitely at the file info level that it is going off the rails. While the name is the obvious differentiator, maybe something else is odd as well.
One thing I could try is to zip one of the directories affected on the Windows side and then try unzipping it into place there, then boot over to the Mac side and see if things have improved. If that doesn't resolve the problem, I could try unzipping it into place on the Mac side, but first I'll boot over to the Windows side and make sure it can read the file I just created in TextEdit from that end.
By the way, the KB article you referenced was about shares and about problems with punctuation mounting Mac shares on Windows, so I don't think it pertains. In any case, I'm mounting a FAT volume, not a share, so the drivers would be completely different.
Anyway, thank you for your help. Now that I'm no longer chasing phantoms, I can attend to the real problem.
Thanks,
Ralph

International characters from a DB

I'm having problems using a MySQL server with Swedish characters. I'm using latin1 as the character set. Everything works through the mysql client but not through JDBC. The plain English characters come out fine, but the Swedish characters do not; they just appear as %'s and other signs.
Does anyone use a database with international characters? Do you do anything special to make it work?
Thank you!

I'm using MySQL, with latin1 as the character set (that's the default, I just left it that way). I don't have any problems with non-ASCII characters such as 'é'. But then I put them into the DB using Java and I took them out using Java. I haven't tried the MySQL text-based client but I doubt that it would work correctly with those characters, since my computer is running Windows, and the DOS command line uses a non-standard character set.

International characters appear with ? instead of ó or á or é

After upgrading from 10.4.7 to 10.4.8, my email in Yahoo appears with ? instead of the international characters.
What can I do?
I get this
y mis correr�as con alguien que m�s all�
instead of
y mis correrías con alguien que más allá
Thanks a lot.
Jm

Sorry, realized too late. Now that I know how bad it is, I promise not to do it again. Sorry for another useless note for this apology. Go forth and find other evildoers...

International Characters Turn to ?'s

I'm using Kodo 2.5.4 with MySQL. Any international characters (such as
"__") in MySQL get translated to a "?". Thanks for any clues on how to
prevent this -
Sam

Sam-
There are 3 possible places where something might be going wrong:
1. the database might not be storing the international characters
correctly
2. the JDBC driver might not be handling the international characters
properly
3. Kodo might be doing something wrong with the international characters
Since Kodo just uses the JDBC driver's String handling, I think #1 and #2
are more likely.
Some quick searching on the internet reveals that MySQL did not support
unicode until version 4.1:
http://www.mysql.com/doc/en/Charset-Unicode.html
You might want to try upgrading to see if your problems are magically
solved.
Otherwise, you are going to be limited by the capabilities of the JDBC
driver, and Kodo doesn't do any special handling for unicode characters
(beyond what is provided by the Java language itself). One solution
would be to perform your own encoding into the String field, and then
perform the decoding when you retrieve the field.
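
One possible way to "perform your own encoding", as a sketch only: store the UTF-8 bytes of the text as Base64, which is plain ASCII and so survives a database or driver that cannot handle anything else. The class name AsciiSafeField and its two methods are made up for illustration and are not part of Kodo or the JDBC driver.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class AsciiSafeField {
    // Turn arbitrary text into an ASCII-only string before it is persisted.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Reverse the encoding after the field is read back.
    static String decode(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00C5sa \u00D6berg"; // hypothetical sample value
        String stored = encode(original);         // what would go into the String field
        System.out.println(stored);
        System.out.println(decode(stored).equals(original)); // prints true
    }
}
```

The price of this workaround is that the stored value can no longer be searched or sorted meaningfully in SQL, so it is only worth it when upgrading the database is not an option.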
In article <bn0bdj$eec$[email protected]>, Sam wrote:
I forgot to mention that I'm also using MySQL version 3.23.58.
I'm using Kodo 2.5.4 with MySQL. Any international characters (such as
"__") in MySQL get translated to a "?". Thanks for any clues on how to
prevent this -
Sam
Marc Prud'hommeaux [email protected]
SolarMetric Inc. http://www.solarmetric.com

International characters in unix

Does anyone have any idea how to get international characters to work on Unix with Java 1.3.1? All I get when I try to print one of the Scandinavian letters (�� dunno if they show right here) is a ? character. What is it that I need to change to get them to work? Some setting in Unix or Java perhaps?
-teka

Well, the chars came out wrong, as I expected. Anyway, I managed to solve this on my own, so I'll just post the solution here in case someone comes this way with a similar problem.
The problem was that the environment variable LANG needed to be set for the shell. You can list the locales available to set it to with the "locale -a" command in your Unix shell. In my case it was LANG=fi.ISO8859-15
This was with JRE 1.3.1; with JRE 1.2.2 in the same shell it worked fine without the variable. I think with 1.3.0 too.
-teka
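
A small sketch of why the locale matters: when the charset used for output cannot represent a character, Java substitutes '?', which is the symptom described above. The class name DefaultCharsetDemo and the helper encode are made up for illustration, and US-ASCII is only a stand-in for whatever default an unset LANG produces on a given system.

```java
import java.nio.charset.Charset;

final class DefaultCharsetDemo {
    // Encode text with a named charset, the same step System.out performs
    // with the platform default charset.
    static byte[] encode(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Shows what the JVM picked up from the environment.
        System.out.println("default charset: " + Charset.defaultCharset());

        // a-umlaut cannot be represented in US-ASCII, so it is replaced by '?'.
        byte[] ascii = encode("\u00E4", "US-ASCII");
        System.out.println((char) ascii[0]);              // prints ?

        // In ISO-8859-15 the same letter is the single byte 0xE4.
        byte[] latin9 = encode("\u00E4", "ISO-8859-15");
        System.out.printf("0x%02X%n", latin9[0] & 0xFF);  // prints 0xE4
    }
}
```

On recent JDKs (18 and later) the default charset is UTF-8 regardless of LANG, so the first line of output is the quickest way to check what a particular JVM is actually using.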

International characters in iPhone

Does anyone know if you can type international characters with the iPhone keyboard? I send email to clients in Spanish and Portuguese and I find myself missing how easy it was to type accent marks on my previous phone, a BlackBerry.

Does anyone know if you can type international characters with the iPhone keyboard?
Yes.
http://m10lmac.blogspot.com/2007/09/iphone-input-keyboard-gets-accented.html

Messenger and international characters

I posted this earlier this month on the blackberryforums.com and crackberry.com web sites but didn't really get an answer. Hopefully it will have a better outcome here.
Hi, we switched from the 7250 to the 8330 model (what an upgrade!). I have a question though regarding the 83xx messenger: I can't get international characters to work (by holding the key and moving the trackball). It works in other applications (like while sending an email or in the address book) but not in messenger. We did not have that problem with the 7250 model.
Is this a known issue/limitation? My firmware version is 4.3.0.124 and my carrier is Telus.
Thanks.

Hello,
I just joined this community, and of course I started my search for this same issue.
After spending more than 4 hours with tech support from my carrier, then going to the carrier's store and wasting 2 more hours, then going back to the phone with the carrier's tech support (this time with a Spanish-speaking customer service agent), this issue is still unresolved. I think, as does everyone else who tried to help, that this is a GLITCH in BlackBerry's phones. I have a Curve 8330 and a Storm, and neither allows you to type international characters in SMS ....GRRRRRRRR! Yes, it works in everything else except there.
I tried the new Tour at the store and it does allow international characters in SMS messages, but of course they wanted me to dig into my pocket to pay the difference.
I think we should all come together and write something to Blackberry and let them know that we need this fixed, it's aggravating having to compensate through apologies and typing extra characters just to be able to communicate in a different language with these phones.
I think that these Smart Phones, are not so smart after all....LOL
