Encoding UTF-8; ISO 8859-1

Hello,
It's a big problem for me, but I think it is a simple one for you.
I get a String, which I want to write into a file as ISO-8859-1.
I don't know which encoding it was created with.
I know a String in Java is always in Unicode; is this Unicode always the same, no matter which encoding the original byte stream used?
How can I find out the original encoding?
Thank you very much for your answer.

As you said, Strings are Unicode in Java. Period. You cannot know the encoding of the byte stream that was used to create a String, or the one that will be used to store it.
Your problem lies here: "I get a String". Where is it coming from? Was it created correctly (i.e. was the right encoding specified when the String was created)?
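To make the point above concrete, here is a minimal sketch of what can be checked once you already hold a String: not its original encoding, but whether it can be written as ISO-8859-1 without loss. The class and method names (`Latin1Check`, `fitsLatin1`, `toLatin1`) are hypothetical, chosen for illustration only.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Latin1Check {
    // True if every character of the String has an ISO-8859-1 representation.
    public static boolean fitsLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    // Encodes the String as ISO-8859-1 bytes; unmappable characters become '?'.
    public static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String ok = "Gr\u00fc\u00dfe";   // German text, representable in ISO-8859-1
        String bad = "\u20ac 5";         // the euro sign is not part of ISO-8859-1
        System.out.println(fitsLatin1(ok));
        System.out.println(fitsLatin1(bad));
        System.out.println(toLatin1(ok).length);
    }
}
```

If `fitsLatin1` returns false, the String contains characters that ISO-8859-1 cannot hold, which often hints that it was decoded correctly but simply needs a wider target encoding.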

Similar Messages

Encoding autodetection chooses ISO-8859-5 instead of UTF-8

A certain site produces HTML in UTF-8 without meta tags. Firefox detects the encoding as ISO-8859-5, an absolutely dead and useless 'standard'. How can I exclude it from autodetection and make Firefox use proper Unicode?

It's a strange guess, if it's a guess. Are you sure the server isn't sending a header specifying that encoding? To view the content-type header, you can use an add-on (or if it's not a secure page, an external proxy):
* Live HTTP Headers extension: https://addons.mozilla.org/en-us/firefox/addon/live-http-headers/
* Firebug extension: https://addons.mozilla.org/en-US/firefox/addon/firebug/
* Fiddler debugging proxy: http://www.fiddler2.com/fiddler2/

Encoding XML in ISO-8859-1 from a unicode system

Hello
I want to generate XML with ISO-8859-1 encoding. I'm on a Unicode platform.
I've written the following program:
It works well with the line 'encoding UTF-16'.
With the line 'encoding ISO ...', I get special characters in the string xml_string.
NB: The program works correctly on a non-Unicode platform.
Can you help me?
Thank you
REPORT .
DATA : xml_string TYPE string.
DATA : BEGIN OF l_id,
         numero(10),
         systeme   TYPE gsval,
         date      TYPE d,
         heure     TYPE uzeit,
         type(7),
         nb_nid TYPE i,
       END OF l_id.
DATA: ixml            TYPE REF TO if_ixml,
      streamfactory   TYPE REF TO if_ixml_stream_factory,
      encoding        TYPE REF TO if_ixml_encoding,
      ixml_ostream    TYPE REF TO if_ixml_ostream.
START-OF-SELECTION.
l_id-date    = sy-datum.
l_id-heure   = sy-uzeit.
l_id-type    = 'BATCH'.
ixml = cl_ixml=>create( ).
streamfactory = ixml->create_stream_factory( ).
ixml_ostream = streamfactory->create_ostream_cstring( xml_string ).
encoding = ixml->create_encoding( character_set = 'ISO-8859-1' byte_order = 0 ).
encoding = ixml->create_encoding( character_set = 'UTF-16' byte_order = 0 ).
ixml_ostream->set_encoding( encoding = encoding ).
CALL TRANSFORMATION ztest_xml
        SOURCE id   = l_id
        RESULT XML ixml_ostream.
BREAK-POINT.

Forum rules say: no mail (we must share the solution)
I didn't understand what was exactly his issue, and what he exactly meant by "then to convert with the good encoding".
His first sentence means that he used the following program (using Xstring instead of string):
REPORT .
DATA : xml_xstring TYPE xstring.
DATA : BEGIN OF l_id,
numero(10),
systeme TYPE gsval,
date TYPE d,
heure TYPE uzeit,
type(7),
nb_nid TYPE i,
END OF l_id.
DATA: ixml TYPE REF TO if_ixml,
streamfactory TYPE REF TO if_ixml_stream_factory,
encoding TYPE REF TO if_ixml_encoding,
ixml_ostream TYPE REF TO if_ixml_ostream.
START-OF-SELECTION.
l_id-date = sy-datum.
l_id-heure = sy-uzeit.
l_id-type = 'BATCH'.
ixml = cl_ixml=>create( ).
streamfactory = ixml->create_stream_factory( ).
ixml_ostream = streamfactory->create_ostream_xstring( xml_xstring ).
encoding = ixml->create_encoding( character_set = 'ISO-8859-1' byte_order = 0 ).
ixml_ostream->set_encoding( encoding = encoding ).
CALL TRANSFORMATION id
SOURCE id = l_id
RESULT XML ixml_ostream.
* in debug here, you'll see that xml_xstring contains
* XML result in ISO-8859-1 encoding
BREAK-POINT.

Character set conversion UTF-8 -- ISO-8859-1 generates question mark (?)

I'm trying to convert an XML-file in UTF-8 format to another file with character set ISO-8859-1.
My problem is that the ISO-8859-1 file generates a question mark (?) and puts it as a prefix in the file.
?<?xml version="1.0" encoding="UTF-8"?>
<ns0:messagetype xmlns:ns0="urn:olof">
<underkat>testvärde</underkat>
</ns0:messagetype>
Is there a way to do the conversion without getting the question mark?
My code looks as follows:
import java.io.*;

public class ConvertEncoding {
     public static void main(String[] args) {
          String from = "UTF-8", to = "ISO-8859-1";
          String infile = "C:\\temp\\infile.xml", outfile = "C:\\temp\\outfile.xml";
          try {
               convert(infile, outfile, from, to);
          } catch (Exception e) {
               System.out.println(e.getMessage());
               System.exit(1);
          }
     }

     private static void convert(String infile, String outfile,
                                        String from, String to)
                         throws IOException, UnsupportedEncodingException {
          //Set up byte streams
          InputStream in = null;
          OutputStream out = null;
          if(infile != null) {
               in = new FileInputStream(infile);
          }
          if(outfile != null) {
               out = new FileOutputStream(outfile);
          }
          //Set up character streams
          Reader r = new BufferedReader(new InputStreamReader(in, from));
          Writer w = new BufferedWriter(new OutputStreamWriter(out, to));
          /* Copy characters from input to output.
           * The InputStreamReader converts from the input encoding to Unicode,
           * and the OutputStreamWriter converts from Unicode to the output encoding.
           * Characters that cannot be represented in
           * the output encoding are output as '?'. */
          char[] buffer = new char[4096];
          int len;
          while((len = r.read(buffer))!= -1) { //Read a block of input
               w.write(buffer, 0, len);
          }
          r.close();
          w.flush();
          w.close();
     }
}

Yes the next character is the '<'
The file that I read from is generated by an integration platform. I send a plain file to it (supposedly in UTF-8 encoding) and it returns another file (in between I call my java class that converts the characterset from UTF-8 to ISO-8859-1). The file that I get back contains the '��' if the conversion doesn't work and '?' if the conversion worked.
My solution so far is to skip the first "junk character" when reading from the input stream. Something like:
private static final char UTF_BOM = '\uFEFF'; // Unicode byte order mark; comes out as '?' in ISO-8859-1

String from = "UTF-8", to = "ISO-8859-1";
if (from != null && from.toLowerCase().startsWith("utf-")) { // Are we reading a UTF-encoded file?
     /* Read the first character of the UTF-encoded file.
      * If it is the BOM, leave it consumed so it is skipped;
      * otherwise rewind so the character is read normally. */
     try {
          r.mark(1); // Allow one char to be read before reset()
          int i = r.read();
          if (i != UTF_BOM) {
               r.reset(); // Not a BOM: go back to the start position
          }
     } catch (IOException e) {
          System.out.println(e.getMessage());
     }
}
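The same idea can be sketched on in-memory bytes, which makes it easier to test. This assumes the input really is UTF-8 and that a leading U+FEFF is a byte order mark rather than content; the class and method names (`BomStrip`, `utf8ToLatin1`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class BomStrip {
    // Decodes UTF-8 bytes, drops a leading byte order mark if present,
    // and re-encodes the text as ISO-8859-1.
    public static byte[] utf8ToLatin1(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1); // the BOM has no ISO-8859-1 form and would become '?'
        }
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 BOM; C3 A4 is U+00E4 in UTF-8
        byte[] input = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', (byte) 0xC3, (byte) 0xA4};
        byte[] output = utf8ToLatin1(input);
        for (byte b : output) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

Note that this only converts the bytes. If the file is XML, the encoding named in its declaration still has to be changed to match.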

Conversion ISO-8859-7- UTF-8 and UTF-8 - ISO-8859-7

Hi, I have written this function to do a charset conversion
from ISO-8859-7 to UTF-8 and vice versa:
void ChangeChersetEncoding(String EncodingType)
{
    String GrammarText;
    try
    {
        GrammarText = Editor.getText();
        byte[] b = GrammarText.getBytes(LastEncoding);
        String strTemp = new String(b, EncodingType);
        Editor.setText(strTemp);
        LastEncoding = EncodingType;
    }
    catch (UnsupportedEncodingException e)
    {
        JOptionPane.showMessageDialog(this, "Error: " + e.getMessage(), "Error", JOptionPane.ERROR_MESSAGE);
    }
}
The steps followed are:
1) I initialize Editor (a JEditorPane) with an InputStreamReader, which by default uses the "CP1252" (Windows Latin-1) charset encoding.
2) When I call the function the first time with EncodingType = "ISO-8859-7" and LastEncoding = "CP1252" (Windows Latin-1), Editor shows Greek characters, as I expected.
3) When I call the function the second time with EncodingType = "UTF-8" and LastEncoding = "ISO-8859-7", Editor shows unknown characters ('�'), as I expected.
4) The problem is that when I call the function the third time with EncodingType = "ISO-8859-7" and LastEncoding = "UTF-8", Editor doesn't show the original Greek text, which I did not expect.
Thank you all.

b = GrammarText.getBytes(LastEncoding);
String strTemp = new String(b,EncodingType);
Here you take a String (which is in Unicode) and convert it to bytes, using "LastEncoding". Next you take those bytes and convert them back to a String, assuming that they were encoded using "EncodingType". But they weren't, so at best this will do nothing and at worst it will produce garbage. It certainly won't do anything useful.
As I said all Java strings are in Unicode. If you want to convert something from one encoding to another encoding, you can only convert an array of bytes to a String using the first encoding, then convert that back to bytes using the second encoding. Converting a String to a String just makes no sense.
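The bytes-to-String-to-bytes route described above can be sketched as follows. The helper name `transcode` is hypothetical, and the example uses ISO-8859-1 and UTF-8 because every Java platform is required to support them; ISO-8859-7 works the same way where the JVM provides it.

```java
import java.nio.charset.Charset;

public class Transcode {
    // Decodes the bytes with the encoding they were really written in,
    // then encodes the resulting String with the target encoding.
    public static byte[] transcode(byte[] input, String fromName, String toName) {
        String text = new String(input, Charset.forName(fromName));
        return text.getBytes(Charset.forName(toName));
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9};                       // U+00E9 in ISO-8859-1
        byte[] utf8 = transcode(latin1, "ISO-8859-1", "UTF-8");
        for (byte b : utf8) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

The key design point is that the source encoding is a property of the byte array, so it has to be known (or kept alongside the bytes) before any String is created.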

SPA504G SPA514G Default Character Encoding stay in ISO-8859-1

Hi,
I have configured the phone like this:
<Dictionary_Server_Script ua="na">serv=http://{{ provisioning.server }}/telecom/language/;d0=English;x0=spa50x_30x_en_v754.xml;d1=French;x1=spa50x_30x_fr_v754.xml;</Dictionary_Server_Script>
<Language_Selection ua="na">French</Language_Selection>
<Default_Character_Encoding ua="na">UTF-8</Default_Character_Encoding>
<Locale ua="na">fr-FR</Locale>
The dictionary and the provisioning profile are encoded in UTF-8.
But when the phone starts after provisioning, Default_Character_Encoding is set to ISO-8859-1
and the line labels are misprinted:
Ligne 1
Ligne 2
Olivier
FranÃ§oise
instead of
Ligne 1
Ligne 2
Olivier
Françoise
Any idea ?

I got an answer from the developer, pasted here.
I think the default encoding is set back to ISO-8859-1 after the customer downloads the dictionary.
Here is the reason: after 7.5.3, the SPA 50x parses the trkLocaleName in the dictionary; for French it sets the phone's default encoding to ISO-8859-1, since that is suitable for French.
French
=================================
1. If the customer wants to use UTF-8 after the XML download, modify the trkLocaleName in the French dictionary XML as follows:
croatian
It is a workaround, but it's strange that a French user would use UTF-8. Thanks.
2. Another way is for the user to manually set the default encoding value to UTF-8 after the XML download.
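The misprinted label in the question is what UTF-8 bytes look like when they are read as ISO-8859-1. A minimal sketch that reproduces the effect and reverses it (class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;

public class Mojibake {
    // Simulates a device that receives UTF-8 bytes but decodes them as ISO-8859-1.
    public static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the original bytes, then decode them as UTF-8.
    public static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Fran\u00e7oise";
        String garbled = garble(original);
        System.out.println(garbled.length());           // one character longer than the original
        System.out.println(repair(garbled).equals(original));
    }
}
```

The repair step works here only because ISO-8859-1 maps every byte value to a character, so no information is lost in the wrong decode.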

Any Other "standard encoding character Like ISO 8859-1 "

Hi all,
Can anybody suggest a standard character encoding like
ISO-8859-1?
It is giving me a problem:
it is not replacing a space with %20.
Thanks

What I am doing here is first getting the default URL and then appending a parameter like msgtxt="BAL SO000IN" to the URL.
To encode this space I am using
EncodingUtil.formUrlEncode(nameValuePairs,Constants.ISO_8859_1);
But HTTP does not seem to support this type of character encoding format,
like ISO 8859_1.
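Form URL encoding, as a rule, writes a space as "+" rather than "%20", whatever charset is chosen. A minimal sketch using the JDK's `java.net.URLEncoder` instead of the `EncodingUtil` call above (the helper name `encodeParam` is hypothetical) shows one way to get "%20":

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlParam {
    // Form-encodes a value with ISO-8859-1, then rewrites '+' (space) as "%20".
    // A literal '+' in the input is already escaped as "%2B", so the replace is safe.
    public static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "ISO-8859-1").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is a required charset on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("msgtxt=" + encodeParam("BAL SO000IN"));
    }
}
```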

Error(3): Invalid encoding specified, expecting ISO-8859-1, got windows-125

What should I do about this? I don't know why it isn't working even when I change the charset to ISO-8859-1 in my code file. What's the solution? Please help.

I was missing the pageEncoding parameter in my include line, but even after adding it I get this error message:
"Error: recursive include directive"
I suspect some conflict between my code file and web.xml. The contents of web.xml are:
<?xml version = '1.0' encoding = 'ISO-8859-1'?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd" version="2.4" xmlns="http://java.sun.com/xml/ns/j2ee">
<description>Empty web.xml file for Web Application</description>
<session-config>
<session-timeout>35</session-timeout>
</session-config>
<mime-mapping>
<extension>html</extension>
<mime-type>text/html</mime-type>
</mime-mapping>
<mime-mapping>
<extension>txt</extension>
<mime-type>text/plain</mime-type>
</mime-mapping>
<jsp-config>
     <jsp-property-group>
     <url-pattern>*.jsp</url-pattern>
     <include-prelude>/header2.jsp</include-prelude>
     <include-coda>/footer2.jsp</include-coda>
     </jsp-property-group>
</jsp-config>
</web-app>

Xml Parse throws SaxParseException.Encoding is UTF-8 insteadof ISO-8859-1 ?

Hi All,
I have some Korean characters in my XML. When I try to parse the XML, I get a SAXParseException.
<?xml version="1.0" encoding="UTF-8"?> --- Throwing Exception
<?xml version="1.0" encoding="ISO-8859-1"?> --- No Exception, successfully parsed
I'm not sure why UTF-8 fails and ISO passes, but I always receive the XML declared as UTF-8. Does anyone know the reason?
I would also like to know the differences between UTF-8 and ISO; I can't find any good article or document on this.
Thanks,
J.Kathir

When SAX throws an exception when the encoding is set to UTF-8, then the XML contains something that is not a valid UTF-8 code (i.e. your source file is not encoded using UTF-8). Also: whenever you ask about an exception you should definitely post the entire exception, including message and stack trace.
If it doesn't throw an exception when it is set to ISO-8859-1, that does not mean this is the correct choice. ISO-8859-1 is defined from 0 up to 255, so any byte stream is valid in that encoding (though not necessarily meaningful).
You absolutely have to find out which encoding the file really is, before you can parse it. If it should contain Korean characters then it is definitely not ISO-8859-1 (or any other encoding from the ISO-8859 family), as those only support latin, cyrillic and similar scripts.
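The difference described above can be checked with a strict decoder. This is a minimal sketch; the class and method names are hypothetical, and the two sample bytes stand in for a character from a legacy Korean encoding such as EUC-KR.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecode {
    // Returns true only if the bytes form a valid sequence in the given charset.
    public static boolean isValid(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] legacy = {(byte) 0xC7, (byte) 0xD1}; // not a valid UTF-8 sequence
        System.out.println(isValid(legacy, StandardCharsets.UTF_8));
        System.out.println(isValid(legacy, StandardCharsets.ISO_8859_1));
    }
}
```

A `true` result for ISO-8859-1 says nothing about correctness: the decoder accepts the bytes but yields two Latin characters, not the Korean one.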

Xml payload encoding from utf to iso

Hi Experts,
Could you please let me know how I can convert the XML payload from UTF-8 to ISO-8859-1?
It's a bit urgent; any help is appreciated.
Thanks & Regards,
Ranganath.

Hi Ranganath,
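Here is a Java mapping for PI 7.1 and above which changes the encoding named in the XML declaration from utf-8 to ISO-8859-1. Note that it only rewrites the declaration; the payload bytes themselves are copied unchanged.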
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import com.sap.aii.mapping.api.AbstractTransformation;
import com.sap.aii.mapping.api.StreamTransformationException;
import com.sap.aii.mapping.api.TransformationInput;
import com.sap.aii.mapping.api.TransformationOutput;
public class addAttributeToTag2 extends AbstractTransformation {
     /**
      * Copies the payload, replacing "utf-8"/"UTF-8" in the XML declaration
      * with "ISO-8859-1".
      */
     public void execute(InputStream in, OutputStream out)
               throws StreamTransformationException {
          try {
               int c;
               int count = 0;   // number of '?' characters seen so far
               String s = "";   // collects the declaration up to its second '?'
               while (true) {
                    c = in.read();
                    if (c < 0) {
                         break;
                    }
                    if (count <= 2 && (char) c == '?') {
                         count++;
                    }
                    if (count <= 2) {
                         s = s + (char) c;
                         if (count == 2) {
                              // Second '?' reached: rewrite the declared encoding and flush
                              s = s.replaceAll("utf-8", "ISO-8859-1");
                              s = s.replaceAll("UTF-8", "ISO-8859-1");
                              count = 3;
                              out.write(s.getBytes());
                         }
                         continue;
                    }
                    out.write(c);
                    //System.out.print((char)c);
               }
               in.close();
               out.close();
          } catch (Exception e) {
               e.printStackTrace();
          }
     }

     public void setParameter(Map arg0) {
     }

     public static void main(String[] args) {
          try {
               addAttributeToTag2 genFormat = new addAttributeToTag2();
               FileInputStream in = new FileInputStream("C:\\Apps\\my folder\\sdn\\copy.xml");
               FileOutputStream out = new FileOutputStream("C:\\Apps\\my folder\\sdn\\copy1.xml");
               genFormat.execute(in, out);
          } catch (Exception e) {
               e.printStackTrace();
          }
     }

     public void transform(TransformationInput arg0, TransformationOutput arg1)
               throws StreamTransformationException {
          this.execute(arg0.getInputPayload().getInputStream(), arg1.getOutputPayload().getOutputStream());
     }
}
If you are working in PI 7.0, you need the following code:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import com.sap.aii.mapping.api.StreamTransformation;
import com.sap.aii.mapping.api.StreamTransformationException;
public class addAttributeToTag2 implements StreamTransformation {
     /**
      * Copies the payload, replacing "utf-8"/"UTF-8" in the XML declaration
      * with "ISO-8859-1".
      */
     public void execute(InputStream in, OutputStream out)
               throws StreamTransformationException {
          try {
               int c;
               int count = 0;   // number of '?' characters seen so far
               String s = "";   // collects the declaration up to its second '?'
               while (true) {
                    c = in.read();
                    if (c < 0) {
                         break;
                    }
                    if (count <= 2 && (char) c == '?') {
                         count++;
                    }
                    if (count <= 2) {
                         s = s + (char) c;
                         if (count == 2) {
                              // Second '?' reached: rewrite the declared encoding and flush
                              s = s.replaceAll("utf-8", "ISO-8859-1");
                              s = s.replaceAll("UTF-8", "ISO-8859-1");
                              count = 3;
                              out.write(s.getBytes());
                         }
                         continue;
                    }
                    out.write(c);
                    //System.out.print((char)c);
               }
               in.close();
               out.close();
          } catch (Exception e) {
               e.printStackTrace();
          }
     }

     public void setParameter(Map arg0) {
     }

     public static void main(String[] args) {
          try {
               addAttributeToTag2 genFormat = new addAttributeToTag2();
               FileInputStream in = new FileInputStream("C:\\Apps\\my folder\\sdn\\copy.xml");
               FileOutputStream out = new FileOutputStream("C:\\Apps\\my folder\\sdn\\copy1.xml");
               genFormat.execute(in, out);
          } catch (Exception e) {
               e.printStackTrace();
          }
     }
}
However, as Krish has pointed out, the file adapter has an option to set the encoding type; you can try that option first.
regards
Anupam

File adapter ISO-8859-1 encoding problems in XI 3.0

We are using the XI 3.0 file adapter and are experiencing some XML encoding troubles.
An SAP R/3 system is delivering an outbound IDoc. XI picks up the IDoc and converts it to an externally defined .xml file. The .xml file is sent to a connected FTP server. At the remote FTP server the file generates an error, as it is expected to arrive in ISO-8859-1 encoding. The Transfer Mode is set to Binary, File Type to Text, and Encoding to ISO-8859-1.
The .xml file is encoded correctly in ISO-8859-1, but the problem is that the XML encoding declaration has the wrong value 'UTF-8'.
Does anybody know of a workaround to change the encoding declaration to ISO-8859-1 in the message mapping program?

An example of the XSL code might be as follows:
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method='xml' encoding='ISO-8859-1' />
<xsl:template match="/">
<xsl:copy-of select="*" />
</xsl:template>
</xsl:stylesheet>

How to set the Xml Encoding ISO-8859-1 to Transformer or DOMSource

I have a xml string and it already contains an xml declaration with encoding="ISO-8859-1". (In my real product, since some of the element/attribute value contains a Spanish character, I need to use this encoding instead of UTF-8.) Also, in my program, I need to add more attributes or manipulate the xml string dynamically, so I create a DOM Document object for that. And, then, I use Transformer to convert this Document to a stream.
My problem is twofold. First, once the document is converted through the Transformer, the XML encoding changes to the default UTF-8. Second, I wanted to check whether the DOM Document created from the XML string keeps the XML encoding of ISO-8859-1, so I called Document.getXmlEncoding(), but it throws a runtime error: unknown method.
Is there any way I can maintain the original Xml Encoding of ISO-8859-1 when I use either the DOMSource or Transformer? I am using JDK1.5.0-12.
Following is my sample program you can use.
I would appreciate any help, because so far, I cannot find any answer to this using the JDK documentation at all.
Thanks,
Jin Kim
import java.io.*;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Attr;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
public class XmlEncodingTest
{
    StringBuffer xmlStrBuf = new StringBuffer();
    TransformerFactory tFactory = null;
    Transformer transformer = null;
    Document document = null;

    public void performTest()
    {
        xmlStrBuf.append("<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n")
                 .append("<TESTXML>\n")
                 .append("<ELEM ATT1=\"Yes\" />\n")
                 .append("</TESTXML>\n");
        // the encoding is set to iso-8859-1 in the xml declaration.
        System.out.println("initial xml = \n" + xmlStrBuf.toString());
        try
        {
            //Test1: Use the transformer to output the xmlStrBuf.
            // This shows the xml encoding result from the transformer which will change to UTF-8
            tFactory = TransformerFactory.newInstance();
            transformer = tFactory.newTransformer();
            StreamSource ss = new StreamSource( new StringBufferInputStream( xmlStrBuf.toString()));
            System.out.println("Test1 result = ");
            transformer.transform( ss, new StreamResult(System.out));
            //Test2: Create a DOM document object for xmlStrBuf and manipulate it by adding an attribute ATT2="No"
            DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = dfactory.newDocumentBuilder();
            document = builder.parse( new StringBufferInputStream( xmlStrBuf.toString()));
            // skip adding attribute since it does not affect the test result
            // Use a Transformer to output the DOM document. the encoding becomes UTF-8
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult(System.out);
            System.out.println("\n\nTest2 result = ");
            transformer.transform(source, result);
        }
        catch (Exception e)
        {
            System.out.println("<performTest> Exception caught. " + e.toString());
        }
    }

    public static void main( String arg[])
    {
        XmlEncodingTest xmlTest = new XmlEncodingTest();
        xmlTest.performTest();
    }
}

Thanks DrClap for your answer. With your information, I rewrote the sample program as follows, and it now works as I intended! About UTF-8 and Spanish characters, I think you are right. It looks like many factors can be involved, though: for example, the character set actually used to create an XML document and the encoding declared in the XML both matter. The special character I had trouble with was \u00F3 (ó), and I found that the SAX parser, and even the DocumentBuilder parser, does not like this character when the encoding is set to "UTF-8" in the XML document. My sample program below may not be a perfect example, but if you replace ISO-8859-1 with UTF-8 and do not set the encoding property on the transformer, you may notice that the special character in my example is broken in Test1 and Test2. In my sample, I decided to use ByteArrayInputStream instead of StringBufferInputStream because the documentation says StringBufferInputStream may have a problem converting characters into bytes.
Thanks again for your help!
Jin Kim
import java.io.*;
import java.util.*;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Attr;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
/**
 * XML encoding test for Transformer
 */
public class XmlEncodingTest2
{
    StringBuffer xmlStrBuf = new StringBuffer();
    TransformerFactory tFactory = null;
    Document document = null;

    public void performTest()
    {
        xmlStrBuf.append("<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>\n")
                 .append("<TESTXML>\n")
                 .append("<ELEM ATT1=\"Resolución\">\n")
                 .append("Special charatered attribute test")
                 .append("\n</ELEM>")
                 .append("\n</TESTXML>\n");
        // the encoding is set to iso-8859-1 in the xml declaration.
        System.out.println("**** Initial xml = \n" + xmlStrBuf.toString());
        try
        {
            //TransformerFactoryImpl transformerFactory = new TransformerFactoryImpl();
            //Test1: Use the transformer to output the xmlStrBuf.
            tFactory = TransformerFactory.newInstance();
            Transformer transformer = tFactory.newTransformer();
            byte xmlbytes[] = xmlStrBuf.toString().getBytes("ISO-8859-1");
            StreamSource streamSource = new StreamSource( new ByteArrayInputStream( xmlbytes ));
            ByteArrayOutputStream xmlBaos = new ByteArrayOutputStream();
            Properties transProperties = transformer.getOutputProperties();
            transProperties.list( System.out); // prints out current transformer properties
            System.out.println("**** setting the transformer's encoding property to ISO-8859-1.");
            transformer.setOutputProperty("encoding", "ISO-8859-1");
            transformer.transform( streamSource, new StreamResult( xmlBaos));
            System.out.println("**** Test1 result = ");
            System.out.println(xmlBaos.toString("ISO-8859-1"));
            //Test2: Create a DOM document object for xmlStrBuf to add a new attribute ATT2="No"
            DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = dfactory.newDocumentBuilder();
            document = builder.parse( new ByteArrayInputStream( xmlbytes));
            // skip adding attribute since it does not affect the test result
            // Use a Transformer to output the DOM document.
            DOMSource source = new DOMSource(document);
            xmlBaos.reset();
            transformer.transform( source, new StreamResult( xmlBaos));
            System.out.println("\n\n****Test2 result = ");
            System.out.println(xmlBaos.toString("ISO-8859-1"));
            //xmlBaos.flush();
            //xmlBaos.close();
        }
        catch (Exception e)
        {
            System.out.println("<performTest> Exception caught. " + e.toString());
        }
        finally
        {
            // nothing to release: ByteArrayOutputStream does not need closing
        }
    }

    public static void main( String arg[])
    {
        XmlEncodingTest2 xmlTest = new XmlEncodingTest2();
        xmlTest.performTest();
    }
}

Problems with reading XML files with ISO-8859-1 encoding

Hi!
I am trying to read an RSS file. The script below works with XML files in UTF-8 encoding but not with ISO-8859-1. How can I fix it so it works with both?
Here's the code:
import java.io.File;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import java.net.*;
/**
 * @author gustav
 */
public class RSSDocument {
    /** Creates a new instance of RSSDocument */
    public RSSDocument(String inurl) {
        String url = new String(inurl);
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(url);
            NodeList nodes = doc.getElementsByTagName("item");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                NodeList title = element.getElementsByTagName("title");
                Element line = (Element) title.item(0);
                System.out.println("Title: " + getCharacterDataFromElement(line));
                NodeList des = element.getElementsByTagName("description");
                line = (Element) des.item(0);
                System.out.println("Des: " + getCharacterDataFromElement(line));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public String getCharacterDataFromElement(Element e) {
        Node child = e.getFirstChild();
        if (child instanceof CharacterData) {
            CharacterData cd = (CharacterData) child;
            return cd.getData();
        }
        return "?";
    }
}
And here's the error message:
org.xml.sax.SAXParseException: Teckenkonverteringsfel: "Malformed UTF-8 char -- is an XML encoding declaration missing?" (radnumret kan vara för lågt).
    at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
    at org.apache.crimson.parser.InputEntity.fillbuf(InputEntity.java:1072)
    at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(InputEntity.java:914)
    at org.apache.crimson.parser.Parser2.maybeXmlDecl(Parser2.java:1183)
    at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:653)
    at org.apache.crimson.parser.Parser2.parse(Parser2.java:337)
    at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
    at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
    at getrss.RSSDocument.<init>(RSSDocument.java:25)
    at getrss.Main.main(Main.java:25)

I read files from the web, but there is an XML tag with the encoding attribute in the RSS file.
If you are quite sure that you have an encoding attribute set to ISO-8859-1, then I expect that your RSS file contains a non-ISO-8859-1 character, though I thought all byte values -128 to 127 were valid ISO-8859-1 characters!
Many years ago I had a problem with an XML file with invalid characters. I wrote a simple filter (using FilterInputStream) that made sure that all the bytes it processed were ASCII. My problem turned out to be characters with value zero, which the Microsoft XML parser failed to process. It put the parser in an infinite loop!
In the filter, as each byte is read you could write out the Hex value. That way you should be able to find the offending character(s).
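A filter of the kind described above might look like the following sketch. It is not the original poster's filter: the class name `SuspectByteLogger`, the `scan` helper, and the choice of which bytes count as suspect (control characters other than tab, LF and CR, plus everything above 0x7E) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class SuspectByteLogger extends FilterInputStream {
    private long offset = 0;
    private final List<String> findings = new ArrayList<>();

    public SuspectByteLogger(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            inspect(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) {
            inspect(buf[off + i] & 0xFF);
        }
        return n;
    }

    // Records the offset and hex value of every byte outside printable ASCII.
    private void inspect(int b) {
        boolean control = b < 0x20 && b != '\t' && b != '\n' && b != '\r';
        if (control || b > 0x7E) {
            findings.add(String.format("offset %d: 0x%02X", offset, b));
        }
        offset++;
    }

    public List<String> getFindings() {
        return findings;
    }

    // Convenience helper: runs the filter over an in-memory byte array.
    public static List<String> scan(byte[] data) {
        SuspectByteLogger logger = new SuspectByteLogger(new ByteArrayInputStream(data));
        try {
            while (logger.read() != -1) {
                // reading drives the inspection
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return logger.getFindings();
    }

    public static void main(String[] args) {
        byte[] sample = {'<', 0, 'a', (byte) 0xF6};
        for (String finding : scan(sample)) {
            System.out.println(finding);
        }
    }
}
```

In real use the filter would wrap the stream handed to the parser, for example `builder.parse(new SuspectByteLogger(url.openStream()))`, and the findings would be inspected after the parse fails.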

Codepage Conversionerror UTF-8 from System-Codepage to Codepage iso-8859-1

Hello,
On SAP PI 7.1 we have the problem that we can't process an IDoc to Plain HTTP.
The channel throws "Codepage Conversionerror UTF-8 from System-Codepage to Codepage iso-8859-1".
The IDoc is 25 MB. Does anybody have an idea how we can find out what is wrong with the IDoc?
Thanks in advance.

In Java, strings are always Unicode (UTF-16 internally). It's the byte arrays that are encoded. So use the following code:
String iso, utf, temp = "\u00e4\u00f6"; // sample non-ASCII text; the literal in the original post was garbled
try {
    byte[] b8859 = temp.getBytes("ISO-8859-1");
    byte[] butf8 = temp.getBytes("UTF-8");
    iso = new String(b8859, "ISO-8859-1");
    utf = new String(butf8, "UTF-8");
    System.out.println("ISO-8859-1:" + iso);
    System.out.println("UTF-8:" + utf);
    System.out.println("UTF to ISO-8859-1:" + new String(utf.getBytes("iso8859_1"), "ISO-8859-1"));
    System.out.println(utf);
    System.out.println(iso);
} catch (Exception e) {
    e.printStackTrace();
}
Also keep in mind that the DOS window does not support international characters, so write the output to a file.

Encode a Text to ISO-8859-2 Format

I have a text such as "one example" and I want to encode it into ISO-8859-2 format. Is there a built-in package or some other way to do that? Can anybody suggest an approach with sample code?
Thanks,
MaheshM

You can use CONVERT to convert text from one character set to another:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions027.htm#SQLRF00620
e.g.
select convert('<your_text>', 'EE8ISO8859P2', 'AL32UTF8') from dual
to convert UTF-8 text to Latin-2 (the destination character set comes first, then the source character set). The text must of course actually be encoded in the character set you pass as the source.
cheers
