Latin Charset Problems: UTF-8 and ISO-8859-1

Hello all,
I'm having problems displaying Latin characters in my application, such as ~, ^, ´, ` and so on.
Using JInitiator it works fine, but I'm using Java 1.5.0_04; it's a client requirement.
I'm using iAS 10gR2 and Forms 10gR2.
Is there any configuration property I can change so that these characters display correctly?
Can anyone help me, please?
Thanks in advance
João Antunes

Hello Francois,
Thanks for replying.
It only happens when I'm typing; all the values queried from the DB are returned OK.
My NLS_LANG=AMERICAN_AMERICA.WE8ISO8859P1.
Thanks in advance
João Antunes

Similar Messages

  • Annoying Problem with UTF-8 and ISO-8859-1

    Hi,
    I've been looking for a solution to this problem for many hours now; maybe someone can help me:
    1.) We need to send our mails with the ISO-8859-1 charset, because otherwise Windows users get the text in the message twice: once as plain text, and once more, formatted, after a question mark. So I changed the NSPreferredMailCharset in com.apple.mail.plist to ISO-8859-1:
    defaults write com.apple.mail NSPreferredMailCharset "ISO-8859-1"
    2.) So far so good. It works until I add an attachment to a message. Adding an attachment forces the mail to be sent as Unicode (UTF-8) again. I could change the encoding manually, but that's not a way we can work in our company.
    My question is: is there any way to force Mail to encode as ISO-8859-1? It can't be that we have to change the encoding for every message.
    Thanks a lot
    florian
    PS: I'm not sure if this is important: we use OS X in German.

    I was thinking that since he is from Austria & references a company, there is a very strong possibility that the character "€" (the Euro currency symbol, Unicode 20AC, UTF-8 E2 82 AC) would frequently appear in messages.
    Even if he sets a preference for ISO-8859-1 as the default with Terminal, or manually changes messages to ISO-8859-1, it would not be possible to include this symbol in such messages, since there is no "€" in ISO-8859-1.
    Similar problems would occur with other symbols sometimes used in business (for example "™"), in engineering ("Ω"), in mathematics ("∑"), or even with some general punctuation marks such as the dagger ("†").
    Other possible problems are the use of other currency symbols the Euro replaced (the franc's "₣" or the lira's "₤") or others still in use (the Israeli new sheqel's "₪" or the rupee's "₨"). Ligatures in an international environment would really complicate things as well, as the Wikipedia article about the œthel ("Œ") illustrates.
    Note that in none of these cases would the presence or absence of an attachment matter -- ISO-8859-1 simply isn't up to the task.
    I suspect that in some cases, if it is possible, setting the default to Windows-1252 (Windows Latin 1 in Mail's list?) would help, since it does include at least the Euro & dagger. I haven't played around with this much, but I do note that in a new message window containing "€" in the body, if I set the text encoding to Windows Latin 1, Automatic, or UTF-8, Mail doesn't complain, but if I set it to ISO Latin 1, I get an error saying the message can't be saved & an "Invalid Text Encoding" alert if I try to send it.
    As for how messages are received at the other end, Windows apps (not just Outlook) are notorious for continuing to use non-Unicode APIs even after the OS itself has long since moved to Unicode as its internal standard. Some of them employ bass-ackwards fixes like deciding ISO-8859-1 declarations are supposed to be Windows-1252 ones. Worse, Windows itself sometimes seems to interpret a few Windows-1252 code positions as their ISO-8859-1 control equivalents!
    All this makes life that much more complicated for people trying to avoid problems like the above.
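    The gap is easy to demonstrate in Java. A minimal sketch (the class name is invented here, and java.nio.charset.StandardCharsets requires Java 7+):
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;
    public class EuroCheck {
        public static void main(String[] args) {
            // ISO-8859-1 has no code point for the Euro sign (U+20AC)...
            System.out.println(StandardCharsets.ISO_8859_1.newEncoder().canEncode('\u20AC')); // false
            // ...while windows-1252 maps it to the single byte 0x80
            System.out.println(Charset.forName("windows-1252").newEncoder().canEncode('\u20AC')); // true
        }
    }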

  • British Pound Sterling with UTF-8 and ISO-8859-15

    Please excuse my long-windedness ... I'm simply trying to answer all possible questions up front and give as much information as possible. I've searched through tons of forums and all over various sites and references and have not been able to come up with a concrete solution to this. I'd appreciate any help anyone has.
    I'm having some trouble with character sets and international currencies.
    Our server was recently upgraded from Red Hat 7.3 to Red Hat 8.0. I understand that the default system encoding thus changed from ISO-8859-15 to UTF-8. I have verified this by executing the following:
    public class WhichEncoding {
        public static void main(String[] args) {
            // Print the JVM's default file encoding
            String p = System.getProperty("file.encoding");
            System.out.println(p);
        }
    }
    I have two machines, one which represents the old system (7.3) and one representing the new (8.0), which I will call machine73 and machine80 respectively.
    [machine73:~]# java WhichEncoding
    ISO-8859-15
    [machine80:~]# java WhichEncoding
    UTF-8
    I have also verified that the JVM is using the correct default character set by executing the following:
    import java.io.ByteArrayOutputStream;
    import java.io.OutputStreamWriter;
    public class WhichCharset {
        public static void main(String[] args) {
            // A writer created without an explicit charset reports the platform default
            String foo = new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding();
            System.out.println(foo);
        }
    }
    which yields:
    [machine73:~]# java WhichCharset
    ISO-8859-15
    [machine80:~]# java WhichCharset
    UTF8
    Here comes the problem. I have the following piece of code:
    import java.text.NumberFormat;
    import java.util.Locale;
    public class TestPoundSterling {
        public static void main(String[] args) {
            // Format a value as UK currency; the symbol is the pound sign (U+00A3)
            NumberFormat nf = NumberFormat.getCurrencyInstance(new Locale("en", "GB"));
            System.out.println(nf.format(1.23));
        }
    }
    When I compile and execute this, I see mixed results. On machine73, I see what I would expect to see: the British pound sterling sign followed by 1.23. To be sure, I wrote the output to a file which I viewed in a hex editor, and observed [A3 31 2E 32 33 0A], which seems to be correct.
    However, when I execute it on machine80, I see a capital A with a circumflex (caret) preceding the British pound sterling sign and the 1.23. The hex editor shows [C2 A3 31 2E 32 33 0A].
    I looked up these hexadecimal values:
    Extended ASCII
    0xC2 = "T symbol"
    0xA3 = lowercase "u" with grave
    ISO-8859-1
    0xC2 = Capital "A" with circumflex (caret)
    0xA3 = British Pound Sterling
    Unicode Latin-1
    0x00C2 = Capital "A" with circumflex (caret)
    0x00A3 = British Pound Sterling
    (This explains why, when I remove /bin/unicode_start and reboot, I see a "T symbol" and "u" with a grave in place of what I saw before ... probably an irrelevant sidenote).
    I found a possible answer on http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8 under the Examples section. Apparently, the conversion between Unicode and UTF-8 acts differently based on the original Unicode value. Since the pound sterling sign falls between U-00000080 and U-000007FF (using the chart on the mentioned site), the conversion would be (as far as I can tell):
    U-000000A3 = 11000010 10100011 = 0xC2 0xA3
    This appears to be where the extra 0xC2 pops up.
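    A few lines of Java confirm the arithmetic (an illustrative sketch, not from the original post; '£' is U+00A3):
    import java.nio.charset.StandardCharsets;
    public class PoundBytes {
        public static void main(String[] args) {
            // UTF-8 encodes U+00A3 as two bytes: C2 A3
            for (byte b : "\u00A3".getBytes(StandardCharsets.UTF_8)) System.out.printf("%02X ", b);
            System.out.println();
            // ISO-8859-1 encodes it as the single byte A3
            for (byte b : "\u00A3".getBytes(StandardCharsets.ISO_8859_1)) System.out.printf("%02X ", b);
            System.out.println();
        }
    }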
    Finally, to the whole point of this: how can I fix this so that things work on machine80 as they did on machine73? All I want to see at the command line is the pound sterling sign. Getting the 0xC2 preceding the pound sterling causes some parts of my applications to fail.
    Here's some additional information that might be of use:
    [machine73:~]# cat /etc/sysconfig/i18n
    LANG="en_US.iso885915"
    SUPPORTED="en_US.iso885915:en_US:en"
    SYSFONT="lat0-sun16"
    SYSFONTACM="iso15"
    [machine73:~]# echo $LANG
    en_US.iso885915
    [machine80:~]# cat /etc/sysconfig/i18n
    LANG="en_US.UTF-8"
    SUPPORTED="en_US.UTF-8:en_US:en"
    SYSFONT="latarcyrheb-sun16"
    [machine80:~]# echo $LANG
    en_US.UTF-8
    Any help is very, very much appreciated. Thanks.

    you didn't look hard enough, this is a FAQ...
    There are three options:
    1) Change the system encoding by setting the LANG or LC_CTYPE environment variables. Assuming you use bash:
    bash$ export LC_CTYPE=en_GB.iso88591
    You can check the available locales with locale -a ... pipe it to grep en_GB to filter out the non-British-English locales.
    -OR-
    2) Change the Java default encoding from the command line with -Dfile.encoding. Run with:
    $ java -Dfile.encoding=ISO-8859-1 yourclass
    -OR-
    3) Set the encoding from within the program with OutputStreamWriter, or use a PrintStream that has the encoding set:
    PrintStream out = new PrintStream(new FileOutputStream(FileDescriptor.out), true, "ISO-8859-1");
    System.setOut(out);
    See also the internationalization tutorial & the javadoc of the related classes....
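    For completeness, option 3 as a runnable program (a sketch; the class name is invented, and the output can be checked with the same hex-editor method as above):
    import java.io.FileDescriptor;
    import java.io.FileOutputStream;
    import java.io.PrintStream;
    import java.text.NumberFormat;
    import java.util.Locale;
    public class PoundSterlingFixed {
        public static void main(String[] args) throws Exception {
            // Route System.out through an ISO-8859-1 encoder, regardless of file.encoding
            PrintStream out = new PrintStream(
                    new FileOutputStream(FileDescriptor.out), true, "ISO-8859-1");
            System.setOut(out);
            NumberFormat nf = NumberFormat.getCurrencyInstance(new Locale("en", "GB"));
            System.out.println(nf.format(1.23)); // now writes A3 31 2E 32 33, no leading C2
        }
    }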

  • Changing charset from UTF-8 to ISO-8859-1

    Hi,
    I am deploying my web service on WebLogic Server 9.2.2.
    The result is returned using encoding "UTF-8". How can I alter this so that the result will be returned using encoding "ISO-8859-1"? Could someone help, please?
    Thanks in advance
    Mike

    What's the source of the "generated content"? I think you'd be better off investigating how to get that source to generate UTF-8 content.
    Changing the meta tag won't give you what you want, because it will cause all the accented characters (and anything else outside ASCII) that are entered directly in Muse to be handled incorrectly by the browser.

  • Decode UTF-8 to ISO-8859-1

    I am using the Google Maps API; it returns UTF-8,
    so for some countries the characters are wrong.
    My server is ISO-8859-1.
    So, how can I convert the result from UTF-8 to ISO-8859-1?
    I tried:
    <cfprocessingdirective pageEncoding="UTF-8">
    <cfcontent type="text/html; charset=UTF-8">
    <cfset setEncoding("URL", "UTF-8")>
    <cfset setEncoding("FORM", "UTF-8")>
    and
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    No change, always the wrong characters.
    Thanks for any help with this mess.
    Pierre.

    I am using CF v7.
    My server and CF are ISO-8859-1.
    But I am using the Google Maps API, which returns country names in UTF-8.
    I tried the Google Maps directives, adding the parameter eo=ISO-8859-1 in the Google key <script>;
    same result.
    So I have to convert to ISO-8859-1, at least in that page, for the country names.
    (I have a specific problem with Thailand: Google Maps returns "Tha๏lande",
    and in the database it writes "tha&#3663;lande" (an Access database).)
    In answer to injun [576871]:
    Difficult to insert that code, because I am already inside JavaScript code.
    So will it not accept <cfscript> again inside? I will try.
    Thanks for any help.
    Pierre.
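    When a UTF-8 string has been mis-decoded as ISO-8859-1, the usual repair in Java is to reverse the wrong decode (ColdFusion would need its own equivalent); a minimal sketch with invented names:
    import java.nio.charset.StandardCharsets;
    public class FixMojibake {
        public static void main(String[] args) {
            String broken = "Tha\u00C3\u00AFlande"; // "Thaïlande" whose UTF-8 bytes were read as ISO-8859-1
            // Re-encode with the wrong charset, then decode with the right one
            String fixed = new String(broken.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
            System.out.println(fixed); // Thaïlande
        }
    }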

  • Mail Sender - Encoding (I need to change from UTF-8 to ISO-8859-1)

    Hi,
    I'm getting data from email (in MS Exchange) using the Mail Sender Adapter.
    The e-mails contain characters such as ç (ccedil), ã (atilde), õ (otilde) and others. XI cannot read these characters because the encoding of the XML is UTF-8.
    How do I change the encoding in XI from UTF-8 to ISO-8859-1?
    Thank you!

    Unfortunately most mail servers do not apply the code page to the content type of a mail.
    In this case you have to set the content type with help of the MessageTransformBean:
    Transform.ContentType      text/plain;charset="ISO-8859-1"
    Regards
    Stefan

  • XML parse throws SAXParseException. Encoding is UTF-8 instead of ISO-8859-1?

    Hi All,
    I have some Korean characters in my XML. When I try to parse it, I get a SAXParseException.
    <?xml version="1.0" encoding="UTF-8"?> --- throws the exception
    <?xml version="1.0" encoding="ISO-8859-1"?> --- no exception, successfully parsed
    I'm not sure why UTF-8 is failing and ISO is passing, but I always get the XML in UTF-8 format. Does anyone know the reason?
    I would also like to know the differences between UTF-8 and ISO; I can't find any good article/document on this.
    Thanks,
    J.Kathir

    When SAX throws an exception with the encoding set to UTF-8, the XML contains something that is not valid UTF-8 (i.e. your source file is not encoded as UTF-8). Also: whenever you ask about an exception you should definitely post the entire exception, including message and stack trace.
    If it doesn't throw an exception when it is set to ISO-8859-1, that does not mean this is the correct choice. ISO-8859-1 is defined for every value from 0 up to 255, so any byte stream is valid in that encoding (though not necessarily meaningful).
    You absolutely have to find out which encoding the file really is, before you can parse it. If it should contain Korean characters then it is definitely not ISO-8859-1 (or any other encoding from the ISO-8859 family), as those only support latin, cyrillic and similar scripts.
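    If the file's real encoding can be determined (say it turns out to be EUC-KR, an assumption purely for illustration), it can be forced on the parser instead of trusting the prolog; a minimal Java sketch with an invented file name:
    import java.io.FileInputStream;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.InputSource;
    import org.xml.sax.helpers.DefaultHandler;
    public class ParseWithKnownEncoding {
        public static void main(String[] args) throws Exception {
            InputSource src = new InputSource(new FileInputStream("korean.xml")); // hypothetical file
            src.setEncoding("EUC-KR"); // overrides the encoding declared in the XML prolog
            SAXParserFactory.newInstance().newSAXParser().parse(src, new DefaultHandler());
            System.out.println("parsed OK");
        }
    }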

  • XML Encoding Issue - Format UTF-16 to ISO-8859-1

    Dear Groupmates,
    I have data in my internal table which I am converting to XML using a custom transformation.
    The data goes to a third party. The third-party system requires the data in ISO-8859-1 format, but SAP generates it in UTF-16. I have been able to change the format of the file from UTF-16 to ISO-8859-1, but after the conversion I am getting invalid tag information in the form of entities like &lt; and &gt; in my file.
    Here is the code I have used to set the encoding to ISO-8859-1:
    DATA: xmlout TYPE xstring.
    DATA: ixml          TYPE REF TO if_ixml,
          streamfactory TYPE REF TO if_ixml_stream_factory,
          encoding      TYPE REF TO if_ixml_encoding,
          ixml_ostream  TYPE REF TO if_ixml_ostream.
    ixml = cl_ixml=>create( ).
    streamfactory = ixml->create_stream_factory( ).
    ixml_ostream = streamfactory->create_ostream_xstring( xmlout ).
    " Create an ISO-8859-1 encoding object and attach it to the output stream
    encoding = ixml->create_encoding(
        character_set = 'ISO-8859-1' byte_order = 0 ).
    ixml_ostream->set_encoding( encoding = encoding ).
    Sample Output :-
    <?xml version="1.0" encoding="iso-8859-1"?>
    <AMS_DOC_XML_EXPORT_FILE><AMS_DOCUMENT AUTO_DOC_NUM="FALSE" DOC_CAT="CA" DOC_CD="CA" DOC_DEPT_CD="045" DOC_ID="XR10281060830400001" DOC_IMPORT_MODE="OE" DOC_TYP="CH" DOC_UNIT_CD ="NULL" DOC_VERS_NO="01">
    <CH_DOC_HDR AMSDataObject="Y">
    <DOC_CAT Attribute="Y">&lt;![CDATA[CA]]&gt;</DOC_CAT>
    <DOC_TYP Attribute="Y">&lt;![CDATA[CH]]&gt;</DOC_TYP>
    Please let me know if anyone has an idea how I can get rid of the invalid tag information.
    Thanks !
    With Regards,
    Darshan Mulmule

    Darshan,
    Did you get an answer to this question? We have the same requirement: to create an XML file in ISO-8859-1 format with Attribute set to "Y" and CDATA used for the data.
    Can you please let me know, if you still remember, how you achieved it?
    Satyen...

  • HTTP-Receiver: Code page conversion error from UTF-8 to ISO-8859-1

    Hello experts,
    In one of our interfaces we use the payload manipulation of the HTTP receiver channel to change the payload code page from UTF-8 to ISO-8859-1, and from time to time we face the following error:
    "Code page conversion error UTF-8 from system code page to code page ISO-8859-1"
    I'm quite sure that this error occurs because of non-ISO-8859-1 characters in the processed message. And here comes my question:
    Is it possible to change the error behaviour of the code page converter so that the error is ignored?
    Perhaps the converter could replace the offending character with e.g. "#"?
    Thank you in advance.
    Best regards,
    Thomas
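    Whether the XI converter can be configured this way is a separate question, but the kind of lossy fallback being asked about does exist in plain Java's java.nio API; a sketch (class name and sample text invented):
    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.CharsetEncoder;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;
    public class LossyLatin1 {
        public static void main(String[] args) throws Exception {
            CharsetEncoder enc = StandardCharsets.ISO_8859_1.newEncoder()
                    .onUnmappableCharacter(CodingErrorAction.REPLACE)
                    .replaceWith(new byte[] { '#' });
            // The Euro sign has no ISO-8859-1 mapping, so it comes out as '#'
            ByteBuffer bytes = enc.encode(CharBuffer.wrap("Preis: 5 \u20AC netto"));
            System.out.println(new String(bytes.array(), 0, bytes.limit(), StandardCharsets.ISO_8859_1));
        }
    }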

    Hello.
    I'm not 100% sure if this will help, but it's good reading material on the subject (:
    [How to Work with Character Encodings in Process Integration (NW7.0)|http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42]
    The part about the XSLT / Java mapping might come in handy in your situation;
    you can check for problematic characters in the code.
    Good luck,
    Imanuel Rahamim.

  • Abap Proxy Convert UTF-8 to ISO-8859-1

    Dear,
    I have the following scenario:
    ABAP Proxy -> PI -> WebService.
    I need to change the encoding from UTF-8 to ISO-8859-1 when SAP ECC sends data to PI.
    How do I do this?
    I have an XSLT program that performs this conversion, UTF-8 to ISO-8859-1, between PI and the WebService.
    Regards,
    Sérgio Salomã

    To understand more about character encoding in SAP Process Integration, you can refer to the following link:
    http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42?quicklink=index&overridelayout=true

  • Convert utf-8 to iso-8859-1

    Hello,
    sorry for my very bad English.
    I use an XMLHttpRequest to query a database and show the result in a div.
    The string is UTF-8 encoded by my JavaScript function and, of course, no results are found in the database.
    How can I convert the string to ISO-8859-1 before querying the database?
    Thanks if you have an idea
    JiBé (France)

    PaulH **AdobeCommunityExpert** wrote:
    > Jibé wrote:
    >> PaulH **AdobeCommunityExpert** wrote:
    >> I work with an MS SQL Server database encoded in iso-8859-1
    >
    > data stored in plain text, char, varchar datatypes (i.e. not "N")?
    The datatype of "titre" is varchar(250) and "contenu" is text.
    > using the ODBC or JDBC (it would be listed as ms sql server in the db drivers list) driver?
    I think it's the JDBC driver (the case on my test computer).
    >
    >> The code:
    >
    > you're not following good i18n practices. while my preference is for unicode ("just use unicode" has been my motto for years), if you're really only ever going to use french & never need the euro symbol then i guess iso-8859-1 (latin-1) is fine.
    Here is a part of the content of my Application.cfm:
    <cfprocessingdirective pageencoding="iso-8859-1">
    <cfcontent type="text/html; charset=iso-8859-1">
    <cfset setEncoding("URL", "iso-8859-1")>
    <cfset setEncoding("Form", "iso-8859-1")>
    >
    > if you think you might need other languages, including the euro symbol, then you should consider unicode. change your text columns to "N" type (nText, nChar, nVarChar) & swap the latin-1 encodings in the tags above to utf-8.
    I'm going to test that....
    JiBé

  • Mail adapter module: UTF-8 to ISO-8859-1 conversion

    Hi!
    I have a problem with a mail attachment which is generated by an adapter module for the mail adapter. The content type is set to "Application/EDIFACT; charset=iso-8859-1" when I add the attachment, but the mail adapter ignores the charset setting.
    Therefore German umlauts like ü are displayed in a wrong way: ü
    When I set the content, I transform it to ISO-8859-1: attachment.setContent(edifactString.getBytes("ISO-8859-1"), "ISO-8859-1");
    When I test the result of edifactString.getBytes("ISO-8859-1"), I get the string in the right character encoding, but the mail adapter seems to "fix" the encoding.
    I also tried to use the MessageTransformBean, but it didn't work.
    Anyone knows how to solve this issue?
    Best regards,
    Daniel

    Hi all!
    I found a solution for this problem: at first I used the TextPayload object for the attachment to be added. It seems that the TextPayload object has some bugs handling different encodings (it handles only Unicode and deletes the charset=... setting from the content type).
    When using the Payload object for the attachment (which handles binary data), there is no conversion to Unicode, so I get my attachment as desired (but still without the charset setting).
    Best regards,
    Daniel

  • Changing character encoding in PeopleSoft XML Publisher from UTF-8 to ISO-8859-1

    I am using XML Publisher to generate a report in PDF format. My problem is that a user has entered a comment which is not supported by UTF-8, but with ISO-8859-1 it works fine.
    I tried to change the encoding in PeopleCode, the XMLDoc file, the schema and the XLIFF file, but the old formatting still exists; should I change it somewhere else?
    The following is the error I get when trying to generate the PDF: "Error generating report output: (235,2309) Error occurred during the process of generating the output file from template file, XML data file, and translation XLIFF file." The parser is not able to cope with the UTF-8 encoding.

    I had the same issue. I created the XML through a rowset, used the Substitute string function, and it's working.
    Sample:
    &inXMLDoc = CreateXmlDoc("");
    &ret = &inXMLDoc.CopyRowset(&rsHdr);
    &sXMLString = &inXMLDoc.GenFormattedXmlString();
    &sXMLString = Substitute(&sXMLString, "<?xml version=""1.0""?>", "<?xml version=""1.0"" encoding=""ISO-8859-1""?>");
    hope this helps!
    GN.

  • Reverting from UTF-8 to ISO-8859-1

    Hi,
    I have a database installed in UTF-8. It's a new installation, and the guides I had didn't mention any character-set restrictions for the teams that were migrating.
    The problem is that some teams are moving some of their projects to the new server and can't insert, for example, the word "não" into a VARCHAR2(3).
    My question is: can I change the whole database to ISO-8859-1 instead of UTF-8 in order to have words like "não" inserted correctly? If so, is it a simple ALTER DATABASE or a more complicated operation?
    Another question: is there any possibility of leaving the database as is and making it work without expanding the fields' length restriction?
    Alx

    You can't change a database character set from UTF-8 to ISO-8859-1. You can only move from one character set to a strict superset, which doesn't apply here. The supported way to change the character set here would be to create a new database with the ISO-8859-1 character set, export the existing data, and import it into the new system. That assumes, of course, that all the existing characters have an ISO-8859-1 representation (characters like the Euro symbol or Microsoft's curly quotes do not).
    By default, a VARCHAR2(3) allocates 3 bytes of space for data. That gets complicated when you use a multi-byte character set like UTF-8 where a character like 'ã' requires 2 bytes of storage. You can define the columns as VARCHAR2(3 CHAR) to allocate 3 characters of storage regardless of the character set. You can also set the parameter NLS_LENGTH_SEMANTICS to CHAR to make the default when you create a table that character rather than byte length semantics are set. Personally, if I'm creating a UTF8 database, I'd want to set NLS_LENGTH_SEMANTICS to CHAR.
    Justin
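    The byte arithmetic is easy to verify in Java (a quick sketch; the class name is invented):
    import java.nio.charset.StandardCharsets;
    public class NaoBytes {
        public static void main(String[] args) {
            String s = "n\u00E3o"; // "não"
            System.out.println(s.length());                                     // 3 characters
            System.out.println(s.getBytes(StandardCharsets.UTF_8).length);      // 4 bytes: overflows VARCHAR2(3)
            System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length); // 3 bytes
        }
    }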

  • Change encoding from utf-8 to iso-8859-1 in JMS receiver!

    Hi.
    I have some problems regarding encoding.
    The simple setup: dummy datatype as input, XSLT mapping, and standard XI output (to JMS).
    Is there any way to tell the JMS adapter to deliver the message in ISO-8859-1 and not UTF-8?
    Regards Peter

    > Hi Henrique.
    >
    > This sounds like an idea. Can you guide me to some
    > documentation, that describes adding mapping in the
    > jms adapter module?
    >
    > Regards Peter
    To use modules in JMS adapter: http://help.sap.com/saphelp_nw2004s/helpdata/en/0f/80243b4a66ae0ce10000000a11402f/frameset.htm
    Now, you add the MessageTransformBean module to use the XSLT mapping. Check the end of this blog to learn how to use XSLT mapping with MessageTransformBean: /people/michal.krawczyk2/blog/2005/11/01/xi-xml-node-into-a-string-with-graphical-mapping
    Regards,
    Henrique.
