Servlet encoding for non UTF-8 characters

Hi, whenever I run the following code in a standalone application it works,
but when I try it in a servlet the output shows invalid ('?') characters.
Standalone:
String temp="Buôn Ma Thuột";
System.out.println("Test Buôn Ma Thuột:"+temp);
Output:
Test Buôn Ma Thuột:Buôn Ma Thuột
Web application:
String temp="Buôn Ma Thuột";
System.out.println("Test Buôn Ma Thuột:"+temp);
Output:
Test Buôn Ma Thu?t:Buôn Ma Thu?t
I changed the content type in two ways:
a. UTF-8
b. ISO-8859-1
Even so, I don't get the required output.
Kindly help me.

Is the device you are printing on capable of displaying those characters? And why are you calling System.out.println() in a web app?
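The '?' usually comes from a charset mismatch somewhere between the .java source, the compiler, and the output stream, not from the servlet API itself. System.out in a web app writes to the container's console/log using the JVM's default charset (file.encoding), so a UTF-8 string literal can easily be mangled there; also make sure the source file is compiled with a matching encoding (javac -encoding UTF-8). For the HTTP response itself, declare the charset before obtaining the writer. A minimal sketch of the response side, assuming the javax.servlet API (the class name is made up):

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class EncodingDemoServlet extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Must be set BEFORE getWriter(); changing it afterwards has no effect.
        response.setContentType("text/html; charset=UTF-8");
        PrintWriter out = response.getWriter();
        out.println("Test Buôn Ma Thuột");
    }
}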

Similar Messages

  • Problem figuring out the encoding for filenames with special characters

    I'm not sure if this is the right forum, but this does seem like an OS issue.
    I brought in a lot of mp3 and m3u files from a Windows machine to my new Mac. Some of the mp3 files have accented characters in their names, and these names appear in the m3u files. But if I add the m3u file to iTunes, it fails to recognize these names and so I lose all the mp3's with special characters in their names.
I tried to fix this by grabbing the file's name in Python, but that didn't work either!
    Here's an example: the file's name is "Voilà l'été.mp3"
The m3u file says "Voil\xe0 l'\xe9t\xe9.mp3" -- this doesn't work.
From os.listdir(), I get "Voila\xcc\x80 l'e\xcc\x81te\xcc\x81.mp3", but sticking it in an m3u file doesn't work either. (Note that here the characters are encoded as unaccented letter + two-byte code for the accent.)
    When I try these strings from python, e.g. doing os.stat(), they both work; but iTunes doesn't understand any of them!
    I'd appreciate any hints on how to enter these names in the m3u file so that iTunes can read it. Thanks!

I know nothing about "m3u" files and how iTunes interprets the file names in them, but if it is not a relative/absolute path problem, then how about just putting the raw file names (not the ones with backslash escapes) in the m3u file? For example, just put
    Voilà l'été.mp3
    in m3u?
As for Unicode encoding, the HFS+ file system uses the "decomposed form" for accented characters. This means, as you write, à is hex "61 cc 80" in UTF-8, i.e., "a + COMBINING GRAVE ACCENT". The pre-composed form is hex "c3 a0". But my experience is that in most cases both pre-composed and decomposed forms work at the user level (not at the lowest file system level).
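    If it helps to convert between the two forms programmatically, java.text.Normalizer (Java 6+) does exactly this conversion; a minimal sketch:

    import java.text.Normalizer;

    public class NormalizeDemo {
        public static void main(String[] args) {
            // "a" + COMBINING GRAVE ACCENT etc. -- the decomposed (NFD) form HFS+ uses
            String decomposed = "Voila\u0300 l'e\u0301te\u0301.mp3";
            // NFC re-composes it into the pre-composed form ("\u00E0" etc.)
            String precomposed = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
            System.out.println(precomposed);                    // Voilà l'été.mp3
            System.out.println(decomposed.equals(precomposed)); // false: different code points
        }
    }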

  • Parsing xml in clob field remove non utf-8 characters

Hi all,
I have an issue where a stored procedure runs during a nightly process and parses XML contained in a CLOB field. Some records contain non-UTF-8 characters (users paste characters from Word), so the parse fails when performing an XQuery select against the CLOB field as XMLType. I was wondering if anyone knew of a handy way to handle such a situation? I have looked around a bit and not found anything that seemed tailored to my situation, and I would like something a bit more generic than doing a replace on the individual characters that I have found causing issues... thx in advance! -- jp

    Hi,
    Like BluShadow I'm curious to see a test case...
Depending on the way it was created, the encoding declared in the XML prolog doesn't necessarily reflect the actual encoding of the content.
    What's your database character set, and version?
    Please also post the error message you get (LPX-00200 probably?).
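    Until a test case narrows it down, one generic pre-parse cleanup is to strip everything outside the XML 1.0 character range before handing the CLOB to the parser. A minimal sketch in Java (the class and method names are made up; the same range test could be ported to a PL/SQL function using REGEXP_REPLACE):

    public class XmlSanitizer {
        /** Removes code points that are illegal in XML 1.0 documents. */
        public static String stripInvalidXmlChars(String in) {
            StringBuilder sb = new StringBuilder(in.length());
            for (int i = 0; i < in.length(); ) {
                int cp = in.codePointAt(i);
                boolean valid = cp == 0x9 || cp == 0xA || cp == 0xD
                        || (cp >= 0x20 && cp <= 0xD7FF)
                        || (cp >= 0xE000 && cp <= 0xFFFD)
                        || (cp >= 0x10000 && cp <= 0x10FFFF);
                if (valid) sb.appendCodePoint(cp);
                i += Character.charCount(cp);
            }
            return sb.toString();
        }
    }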

  • IDOMServices Parse giving error for non-alpha numeric characters in content

    Hi All,
Using Adobe InDesign CS4 SDK 557, I want to create an IIDXMLDOMDocument * from XML stored in a PMString variable.
I used the following code to parse the XML:
    InterfacePtr<IK2ServiceRegistry> servReg(GetExecutionContextSession(), UseDefaultIID());
    InterfacePtr<IK2ServiceProvider> provider(servReg->QueryServiceProviderByClassID (kDOMParserService, kDOMParserServiceBoss )); 
    InterfacePtr<IDOMServices> domService( provider, UseDefaultIID() ); 
    if(!domService)
        break;
    std::string stdString = myXMLString.GetPlatformString();
    const char * buff = stdString.c_str();
InterfacePtr<IPMStream> pmStream(StreamUtil::CreatePointerStreamRead((char *)buff, stdString.size()));
    IIDXMLDOMDocument * parsedDom = domService->Parse( pmStream );
-  Now the problem is that when myXMLString contains special characters like 0x27 or 0x14, IDOMServices::Parse fails.
-  I tried replacing these characters with "&#x27;" and "&#x14;", but IDOMServices::Parse still fails. (Note that control characters such as 0x14 are not legal in XML 1.0 at all, not even as numeric character references, so escaping them cannot help; they have to be removed before parsing.)
I also tried ISAXServices::ParseStream, but it gives an error for the same characters.
I also tried setting ISAXParserOptions::SetAbortAfterWarning(kFalse), but the result did not change.
    Please let me know if I am missing something.
    Thanks,
    Jitendra Kumar Singh

    Hi Nitika Saini,
Please let me know the patch level of your BI 7 system.
I am also facing a problem with transformations; I don't see any transfer routines in my system for 0IC_C03 - 2LIS_03_BX, 2LIS_03_BF, 2LIS_03_UM.
Here, my BI 7 patch level is BI Content 8 and BW patch 16.
    Thanks,
    Chandra

  • Validation for Non-AlphaNumeric characters

    Hi All,
I want to do validation for non-alphanumeric characters.
When saving a record, Name should only allow alphanumeric characters (letters and numbers only), no special characters.
How to do this?
Please help
    Thanks,
    Sk

    SK
In the EOImpl file, in the setter method of Name, you can write the logic below:
import java.util.regex.*;

public void setLastName(String value)
{
    // Reject the value if it contains anything other than letters and digits
    Pattern p = Pattern.compile("[^a-zA-Z0-9]");
    Matcher m = p.matcher(value);
    if (m.find())
    {
        throw new OAException("Special Characters Not allowed in Name", OAException.ERROR);
    }
    setAttributeInternal(LASTNAME, value);
}
Hope it helps!!!!
    Thanks
    AJ

  • Removing non UTF-8 character set from xml in OSB

    Hi,
We have an OSB service where we receive a lot of special characters like (~, – ) in the data between XML tags. As a result these messages fail in our downstream EDI systems (though they are processed successfully in our OSB). How do I remove these non-UTF-8 characters when processing an XML message in OSB?
If I set the Request-Encoding to UTF-8 on the proxy service that is receiving these messages, would these messages be rejected?
    Thanks,
    Aditya

    Hi,
No silver bullet here... I think you will need a Java callout in order to clean up the special characters from your message...
    Cheers,
    Vlad
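    For example, the callout could be a static method along these lines, packaged in a jar and invoked from the proxy's message flow. A minimal sketch (the class and method names are made up, and the range to keep depends on what the EDI system accepts):

    public class MessageCleaner {
        /** Replaces everything outside printable ASCII (plus tab/CR/LF) with a space. */
        public static String toAscii(String payload) {
            return payload.replaceAll("[^\\x20-\\x7E\\t\\r\\n]", " ");
        }
    }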

  • How to change UTF-8 encoding for XML parser (PL/SQL) ?

    Hello,
    I'm trying to parse xml file stored in CLOB.
    p := xmlparser.newParser;
    xmlparser.parseCLOB(p, CLOB_xmlBody);
The standard PL/SQL parser encoding is UTF-8, but my XML CLOB contains ISO-8859-2 characters.
    Can you advise me, please, how to change encoding for parser?
    Any help would be appreciated.
    null

Do your documents contain an XML declaration like this at the top?
    <?xml version="1.0" encoding="ISO-8859-2"?>
If not, then they need to. The XML 1.0 specification says that if an XML declaration is not present, the processor must default to assuming it's in UTF-8 encoding.

  • Why did TB set the char encoding for a reply to charset=UTF-16LE ?

    I got a message from Google AdWords in HTML format and wrote a reply. When I sent it I got a timeout trying to send it. I use AVG antivirus to scan outgoing messages using a "local server" at address 127.0.0.1.
    Note that this was my second reply to such a message from AdWords. The only thing I could see that was different was a longer subject line (ending with three periods or dots) and a longer HTML message to which I was replying.
    I tried many things to fix the problem and I cannot remember them, sorry.
    Finally, I got success after truncating the message and the long subject line. The reply was sent instantly, as usual, instead of timing out.
But it wasn't really a success. When I looked at the sent message, the characters in my reply (only) were Chinese. Looking at the raw (source) message, I see that the charset was set as follows: Content-Type: text/plain; charset=UTF-16LE; format=flowed.
    This seems like a strange charset; nowhere in my settings do I specify anything other than Western (ISO-8859-1).
    I finally was able to send the message successfully (I think) by using Options > Character Encoding > Western (ISO-8859-1), which seems to force the message to be sent using this standard encoding instead of Little Endian.
    What caused this problem to happen? Is there a TB overflow bug for long subject lines?
    I realize that TB is an old and unsupported product, but it seems to be the only good email client to use with Windows 8, so I'm just hoping someone knows something about this.

    Originally posted by: warren.tang.nospam.com
    Warren Tang wrote:
    > Warren Tang wrote:
    >> Warren Tang wrote:
    >>> Hi everybody,
    >>>
    >>> I've been trying to set the default encoding of new files as UTF-8.
    >>> Here are the two settings I've set:
    >>>
    >>> 1. Windows > Preferences > General > Content Types, set UTF-8 as the
    >>> default encoding for all content types.
    >>> 2. Windows > Preferences > General > Workspaces, set "Text file
    >>> encoding" to "Other : UTF-8".
    >>>
    >>> However when I create a new text file, the encoding is always
>>> ANSI/ISO-8859-1. What did I miss? Thanks.
    >>>
    >>> Regards,
    >>> Warren
    >>
    >> I've also tried
    >> Project Properties > Resource > Text file encoding = UTF-8
    >> However it doesn't work either.
    >>
    >> The only thing that works is changing the file's encoding property,
    >> but I don't want to change it every time I create a new file.
    >>
    >> Is it a bug?
    >
    > It turns out that there are other places I need to set up for HTML and
    > CSS files:
    >
    > Windows > Preferences > Web > CSS Files > Encoding = UTF-8
    > Windows > Preferences > Web > HTML Files > Encoding = UTF-8
    I'm getting mad... The file (on the disk) is still not encoded in UTF-8
    but ANSI.

  • Any recommendations for an editor to help find invalid UTF-8 characters?

    I'm generating some XML programmatically. The content has some old invalid characters which are not encoded as UTF-8. For example, in one case there is a long dash character that I can clearly see is ISO-8859-1 in a browser, because if I force the browser to view in ISO-8859-1 I can see the character, but if I view in UTF-8 the characters look like the black-diamond-with-a-question-mark.
BBEdit will warn me that the file contains invalid UTF-8. But it doesn't show me where those characters are.
    If I try to load the XML in a browser, like Chrome, it will tell me where the first instance is of an invalid character, but not where all of them are. So I was able to locate the one you see in the screenshot and go in and manually fix that one entry.. But in BBEdit it again just shows a default invalid character symbol.
    What I'd like to be able to do are two things:
    (1) Find all invalid characters so I can then go and fix them all at once without repeated "find the first invalid character" fails when loading the XML in a browser.
(2) Know what the characters are (rather than generically seeing a bad-character symbol) so I can programmatically strip them out or substitute them when generating the XML. So I need to know the character values (e.g. the hex values) so I can use the replace() method in my server-side JavaScript to get rid of them.
    Anybody know a good editor I can use for these purposes?
    Thanks,
    Doug

Well, now BBEdit doesn't complain anymore about invalid UTF-8 characters. I've gotten rid of all of them as far as BBEdit is concerned. But Chrome and other browsers still report a few. I'm trying to quash them now, but because the browsers only report them one by one I have to generate the XML multiple times to track them down, which takes one hour per run.
    I think there are only a few left. One at line 180,000. The next at like 450,000. There are only about 600,000 lines in the file, so I think I'll be done soon. Still... it would be nice to have a Mac tool that would locate all the invalid characters the browsers are choking on so I could fix them in one sweep. It would save hours.
    Does anybody know of such a tool for the Mac?
    Thanks,
    Doug
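    In case a ready-made Mac tool doesn't turn up, a few lines of Java can do the whole sweep in one pass: decode the file as strict UTF-8 and print the byte offset and value of every malformed sequence. A minimal sketch (the class name is made up); each reported offset can then feed the replace() pass mentioned above:

    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.CharsetDecoder;
    import java.nio.charset.CoderResult;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class Utf8Scanner {
        public static void main(String[] args) throws Exception {
            byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
            ByteBuffer in = ByteBuffer.wrap(bytes);
            CharBuffer out = CharBuffer.allocate(8192);
            CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT);
            while (in.hasRemaining()) {
                CoderResult result = decoder.decode(in, out, true);
                if (result.isError()) {
                    // position() is left at the first offending byte
                    System.out.printf("invalid sequence at byte offset %d: 0x%02X%n",
                            in.position(), bytes[in.position()] & 0xFF);
                    in.position(in.position() + result.length()); // skip it, keep scanning
                }
                out.clear(); // we only care about the errors, not the decoded text
            }
        }
    }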

  • How to store UTF-8 characters in an iso-8859-1 encoded oracle database?

How can we store UTF-8 characters in an ISO-8859-1-encoded Oracle database? We can NOT change the database encoding but need to store e.g. Polish or Russian characters besides other European languages.
Is there any stable solution with good performance?
    We use Oracle 8.1.6 with iso-8859-1 encoding, Bea WebLogic 7.0, JDK 1.3.1 and the following thin driver: "Oracle JDBC Driver version - 9.0.2.0.0".

    There are a couple of unsupported options, but I wouldn't consider using them on a production database running other critical applications. I would also strongly discourage their use unless you understand in detail how Oracle National Language Support (NLS) works, otherwise you could end up with corrupt data or worse.
In a sense, you've been asked to do the impossible. The existing database character set does not support encoding the data you've been asked to store.
    Can you create a new database with an appropriate database character set and deploy your application there? That's probably the easiest solution.
If that isn't an option, and you really need to store data in this database, you could use one of the binary data types (RAW and BLOB), but that would mean it would be exceptionally difficult for applications other than yours to extract the data. You would have to ensure that the data was always encoded in the same character set, otherwise you wouldn't be able to properly decode it later. This would also add a lot of complexity to your application, since you couldn't send or receive string data from the database.
    Unfortunately, I suspect you will have to choose from a list of bad options.
    Justin
    Distributed Database Consulting, Inc.
    http://www.ddbcinc.com/askDDBC
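    If you do end up on the RAW/BLOB route, the one discipline that keeps the data recoverable is encoding and decoding with a single fixed charset on the application side. A minimal sketch (the table and column names are made up; the String-based charset name keeps it compatible with the era's JDK 1.3):

    import java.io.UnsupportedEncodingException;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class RawTextDao {
        private static final String CHARSET = "UTF-8"; // same charset both directions, always

        public void save(Connection con, long id, String text)
                throws SQLException, UnsupportedEncodingException {
            PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO i18n_text (id, txt_raw) VALUES (?, ?)");
            ps.setLong(1, id);
            ps.setBytes(2, text.getBytes(CHARSET)); // encode before the DB ever sees it
            ps.executeUpdate();
            ps.close();
        }

        public String load(Connection con, long id)
                throws SQLException, UnsupportedEncodingException {
            PreparedStatement ps = con.prepareStatement(
                    "SELECT txt_raw FROM i18n_text WHERE id = ?");
            ps.setLong(1, id);
            ResultSet rs = ps.executeQuery();
            String text = rs.next() ? new String(rs.getBytes(1), CHARSET) : null;
            rs.close();
            ps.close();
            return text;
        }
    }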

  • SetMnemonic for non-english characters

Does anybody know how to set a JButton's mnemonic for non-English characters?
My mnemonic is loaded from a resource bundle, and in the documentation setMnemonic(char) is limited to English, and it is written that the user should call setMnemonic(int) instead.
So what value should this int contain in order to display the non-English char loaded from the resource bundle?
Thanks in advance,
Hanoch

    It seems that this is an issue that has popped up in various forums before, here's one example from last year:
    http://forum.java.sun.com/thread.jspa?forumID=16&threadID=490722
    This entry has some suggestions for handling mnemonics in resource bundles, and they would take care of translated mnemonics - as long as the translated values are restricted to the values contained in the VK_XXX keycodes.
    And since those values are basically the English (ASCII) character set + a bunch of function keys, it doesn't solve the original problem - how to specify mnemonics that are not part of the English character set. The more I look at this I don't really understand the reason for making setMnemonic (char mnemonic) obsolete and making setMnemonic (int mnemonic) the default. If anything this has made the method more difficult to use.
    I also don't understand the statement in the API about setMnemonic (char mnemonic):
    "This method is only designed to handle character values which fall between 'a' and 'z' or 'A' and 'Z'."
If the type is "char", why would the character values be restricted to values between 'a' and 'z' or 'A' and 'Z'? I understand the need for the value to be restricted to one keystroke (eliminating the possibility of using ideographic characters), but why make it impossible to use all the Latin-1 and Latin-2 characters, for instance? (and is that in fact the case?) It is established practice on other platforms to be able to use such accented characters, for instance.
    And if changes were made, why not enable the simple way of specifying a mnemonic that other platforms have implemented, by adding an '&' in front of the character?
    Sorry if this disintegrated into a rant - didn't mean to... :-) I'm sure there must be good reasons for the changes, would love to understand them.
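    For what it's worth, Java 7 later added a clean way to get the int value the original question asks about: KeyEvent.getExtendedKeyCodeForChar maps an arbitrary Unicode character to an extended key code that setMnemonic(int) accepts. A minimal sketch (the label and the character are made up):

    import java.awt.event.KeyEvent;
    import javax.swing.JButton;

    public class MnemonicDemo {
        public static JButton makeButton() {
            JButton button = new JButton("Öffnen");
            // Maps the Unicode character to an extended key code for setMnemonic(int)
            button.setMnemonic(KeyEvent.getExtendedKeyCodeForChar('Ö'));
            return button;
        }
    }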

  • Question marks in PDF for non-english characters.

I'm getting a report from APEX 3.0.1 (Default Report Layout) with BI Publisher 10.1.3.3.1 Base.
In Adobe Reader 7.0.8 I see question marks instead of non-English (Cyrillic) characters.
How do I tune BI Publisher?

After installing BI Publisher 10.1.3.3.1 Base (standalone, OC4J):
    Directory of F:\bip\jdk\lib\fonts
    13/10/2007 21:16 15 196 128R00.TTF
    13/10/2007 21:16 18 473 348 ALBANWTJ.ttf
    13/10/2007 21:16 18 777 132 ALBANWTK.ttf
    13/10/2007 21:16 18 676 084 ALBANWTS.ttf
    13/10/2007 21:16 18 788 600 ALBANWTT.ttf
    13/10/2007 21:16 276 384 ALBANYWT.ttf
    13/10/2007 21:16 12 860 B39R00.TTF
    13/10/2007 21:16 18 800 MICR____.TTF
    13/10/2007 21:16 6 580 UPCR00.TTF
    Directory of F:\bip\jdk\jre\lib\fonts
    01/08/2006 19:25 75 144 LucidaBrightDemiBold.ttf
    01/08/2006 19:25 75 124 LucidaBrightDemiItalic.ttf
    01/08/2006 19:25 80 856 LucidaBrightItalic.ttf
    01/08/2006 19:25 344 908 LucidaBrightRegular.ttf
    01/08/2006 19:25 317 896 LucidaSansDemiBold.ttf
    01/08/2006 19:25 698 236 LucidaSansRegular.ttf
    01/08/2006 19:25 234 068 LucidaTypewriterBold.ttf
    01/08/2006 19:25 242 700 LucidaTypewriterRegular.ttf
    Directory of F:\bip\jre\1.4.2\lib\fonts
    24/03/2004 19:12 75 144 LucidaBrightDemiBold.ttf
    24/03/2004 19:12 75 124 LucidaBrightDemiItalic.ttf
    24/03/2004 19:12 80 856 LucidaBrightItalic.ttf
    24/03/2004 19:12 344 908 LucidaBrightRegular.ttf
    24/03/2004 19:12 317 896 LucidaSansDemiBold.ttf
    24/03/2004 19:12 698 236 LucidaSansRegular.ttf
    24/03/2004 19:12 234 068 LucidaTypewriterBold.ttf
    24/03/2004 19:12 242 700 LucidaTypewriterRegular.ttf
    What is wrong?
    In Adobe Reader's Document Properties -> Fonts
    +Helvetica:
    Type: Type1
    Encoding: Ansi
    Actual Font: ArialMT
    Actual Font Type: TrueType
I feel BIP uses the wrong encoding . . .

  • PDF generation for Non English Characters from ADF

    Hi
We are using the piece of code below to generate a PDF from an ADF managed bean. It works fine. However, for non-English characters (e.g. Japanese, Vietnamese, Arabic) it puts '???' in the output.
I found this blog post:
https://blogs.oracle.com/BIDeveloper/entry/non-english_characters_appears
However, we are not using the BI Publisher product itself; we are using its APIs.
Can anyone tell me where we need to set up fonts, within ADF or WebLogic or the server?
The input parameters are:
a) XML data
b) an InputStream, i.e. the RTF template
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import oracle.apps.xdo.XDOException;
import oracle.apps.xdo.template.FOProcessor;
import oracle.apps.xdo.template.RTFProcessor;

public static byte[] genPdfRep(String pOutFileType, byte[] pXmlOut, InputStream pTemplate)
{
    byte[] dataBytes = null;
    try {
        // Process the RTF template to convert it to XSL-FO format
        RTFProcessor rtfp = new RTFProcessor(pTemplate);
        ByteArrayOutputStream xslOutStream = new ByteArrayOutputStream();
        rtfp.setOutput(xslOutStream);
        rtfp.process();
        // Use the XSL template and the data from the VO to generate the report,
        // and return the report's bytes
        ByteArrayInputStream xslInStream = new ByteArrayInputStream(xslOutStream.toByteArray());
        FOProcessor processor = new FOProcessor();
        ByteArrayInputStream dataStream = new ByteArrayInputStream(pXmlOut);
        processor.setData(dataStream);
        processor.setTemplate(xslInStream);
        ByteArrayOutputStream pdfOutStream = new ByteArrayOutputStream();
        processor.setOutput(pdfOutStream);
        byte outFileTypeByte = FOProcessor.FORMAT_PDF;
        processor.setOutputFormat(outFileTypeByte); // or FOProcessor.FORMAT_HTML
        processor.generate();
        dataBytes = pdfOutStream.toByteArray();
    } catch (XDOException e) {
        e.printStackTrace();
    }
    return dataBytes;
}
    Appreciate your help.
    Thanks,
    Abhijit

Fonts are defined in the template you use to generate the PDF. Your application adds the data, and both are processed by the FO processor. Now there are two possible causes of the '???':
1. the data you send to the template contains the '???' already;
2. the template can't digest the data (the special characters) and puts '???' in the PDF.
Before going on you have to find out which one is your problem. If it's the 2nd, you'd better ask in a FOP forum, as you have to solve it by changing the template.
Timo

  • Word Replacements for Non- English Characters

    Hi
Does anyone have an idea on implementing word replacements for non-English characters in TCA DQM 11i?
We are trying to identify, capture, and cleanse common accented characters like à, â, ê.
However, the default language for replacement is American English, so even if we add these to the existing lists they will not take effect.
Is creating a new word replacement list for every language the solution? Any patch recommendations?
Thanks in advance


  • SERVLET-put_nonserial error for non-Distributable WebApp

    iPlanet App Server 6.0 SP3 Test Drive
I am trying to deploy a web app that is NOT "distributable." The deployment descriptor (web.xml file) does not have a "distributable" element. Furthermore, I set the mode to Local in the iAS Admin Tool, after I found that it had assumed Distributed!
    This should work, according to the Servlet Spec v2.2, right?
    In spite of the above, the kjs log shows:
    [22/Jan/2002 23:34:05:3] error: SERVLET-put_nonserial: Putting non-serializable object when using NAS session
    This suggests that the NAS session logic does not pay any attention to the Distributable attribute of the servlet. Since the Admin Tool didn't pay attention to it, I'm not sure I should be surprised. Nevertheless, the problem remains that I have an application that I should be able to deploy, but which is not functioning properly! I look forward to any helpful suggestions.

If you have to deal with non-serializable objects, add this to your ias-web.xml file:
    <session-info>
    <impl>lite</impl>
    </session-info>
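    Alternatively, if you can change the objects you put in the session, making them serializable removes the error without switching to the lite session implementation. A minimal sketch (the class is made up):

    import java.io.Serializable;

    // Anything stored in an HttpSession that the container may serialize
    // (distributed or persistent sessions) must implement Serializable.
    public class CartItem implements Serializable {
        private String sku;
        private int quantity;

        public CartItem(String sku, int quantity) {
            this.sku = sku;
            this.quantity = quantity;
        }
    }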
