Reading a website in ISO-8859-1

Hello
I am trying to read a website using the ISO-8859-1 charset.
I have searched a bit and found several suggested ways to do this. This is the one I think I want, because it seems to be the simplest:
byte[] iso88591Data = theString.getBytes("ISO-8859-1");
But I don't understand the "flow" of the charsets:
1. When I read an HTML page that contains a &#<code>; reference, is my string in UTF-8 or in ISO-8859-1?
2. When getBytes is called, is the specified charset the one I want to convert to, or the one the string is already in?
To understand this problem, I wrote a separate class in which I tried the following code.
import java.io.UnsupportedEncodingException;
public class charsetConversion {
     public static void main(String[] args) {
          String in = args[0];
          // Encoded with the platform default charset
          byte bytes[] = in.getBytes();
          try {
               // Encoded explicitly as ISO-8859-1
               byte bytesISO[] = in.getBytes("ISO-8859-1");
               String out1 = new String(bytes, "ISO-8859-1");
               String out2 = new String(bytesISO, "ISO-8859-1");
               String out3 = new String(bytes);
               String out4 = new String(bytesISO);
               System.out.println(out1);
               System.out.println(out2);
               System.out.println(out3);
               System.out.println(out4);
          } catch (UnsupportedEncodingException e) {
               e.printStackTrace();
          }
     }
}
I run it with the input Poole&#x27 ;s and always get Poole&#x27 ;s back. (The real input has no space between 7 and ; but if I write it without the space here, the forum always shows ' instead.)
Don't really know what else to do.

So here is an example of the code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.UnsupportedEncodingException;
public class charsetConversion {
     public static void main(String[] args) {
          FileReader fr = null;
          BufferedReader br = null;
          String in = null;
          try {
               // FileReader decodes the file with the platform default charset
               fr = new FileReader(args[0]);
               br = new BufferedReader(fr);
               in = br.readLine();
          } catch (Exception e) {
               System.out.println(e.getMessage());
               System.exit(0);
          }
          byte bytes[] = in.getBytes();
          try {
               byte bytesISO[] = in.getBytes("ISO-8859-1");
               String out1 = new String(bytes, "ISO-8859-1");
               String out2 = new String(bytesISO, "ISO-8859-1");
               String out3 = new String(bytes);
               String out4 = new String(bytesISO);
               System.out.println(out1);
               System.out.println(out2);
               System.out.println(out3);
               System.out.println(out4);
          } catch (UnsupportedEncodingException e) {
               e.printStackTrace();
          }
     }
}
I pass the file name as the argument and use this version instead. Inside the file I have
Poole&#x27 ;s   (without the space)
Without attaching a file, it's hard to use the HTML.
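
For what it's worth, here is a minimal sketch of the charset "flow" asked about above. It assumes only the standard library; the class and method names (CharsetFlow, toLatin1, fromLatin1) are made up for illustration. A Java String has no charset of its own (it is UTF-16 internally), so the charset passed to getBytes is the one you convert to, and the charset passed to the String constructor is the one the bytes are already in. A character reference such as &#x27; consists only of ASCII characters, so it passes through any of these conversions unchanged, which would explain the output in the test above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetFlow {
    // Encode: String (UTF-16 internally) -> bytes in the target charset.
    static byte[] toLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Decode: bytes that are known to be ISO-8859-1 -> String.
    static String fromLatin1(byte[] b) {
        return new String(b, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"
        byte[] latin1 = toLatin1(text);                      // é becomes the single byte 0xE9
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8); // é becomes the two bytes 0xC3 0xA9
        System.out.println(Arrays.toString(latin1));
        System.out.println(Arrays.toString(utf8));

        // Decoding with the charset the bytes were encoded in restores the text.
        System.out.println(fromLatin1(latin1).equals(text));

        // The reference is plain ASCII, so the round trip leaves it untouched.
        String entity = "Poole&#x27;s";
        System.out.println(fromLatin1(toLatin1(entity)).equals(entity));
    }
}
```

Turning the reference itself into an apostrophe is an HTML-unescaping step, separate from any charset conversion.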

Similar Messages

  • Problems with reading XML files with ISO-8859-1 encoding

    Hi!
    I am trying to read an RSS file. The code below works with XML files in UTF-8 encoding but not with ISO-8859-1. How can I fix it so that it works with both?
    Here's the code:
    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.*;
    import java.net.*;
    /**
     * @author gustav
     */
    public class RSSDocument {
        /** Creates a new instance of RSSDocument */
        public RSSDocument(String inurl) {
            String url = new String(inurl);
            try{
                DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
                Document doc = builder.parse(url);
                NodeList nodes = doc.getElementsByTagName("item");
                // Print the title and description of every <item>
                for (int i = 0; i < nodes.getLength(); i++) {
                    Element element = (Element) nodes.item(i);
                    NodeList title = element.getElementsByTagName("title");
                    Element line = (Element) title.item(0);
                    System.out.println("Title: " + getCharacterDataFromElement(line));
                    NodeList des = element.getElementsByTagName("description");
                    line = (Element) des.item(0);
                    System.out.println("Des: " + getCharacterDataFromElement(line));
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        public String getCharacterDataFromElement(Element e) {
            Node child = e.getFirstChild();
            if (child instanceof CharacterData) {
                CharacterData cd = (CharacterData) child;
                return cd.getData();
            }
            return "?";
        }
    }
    And here's the error message:
    org.xml.sax.SAXParseException: Teckenkonverteringsfel: "Malformed UTF-8 char -- is an XML encoding declaration missing?" (radnumret kan vara för lågt).
        at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
        at org.apache.crimson.parser.InputEntity.fillbuf(InputEntity.java:1072)
        at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(InputEntity.java:914)
        at org.apache.crimson.parser.Parser2.maybeXmlDecl(Parser2.java:1183)
        at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java:653)
        at org.apache.crimson.parser.Parser2.parse(Parser2.java:337)
        at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java:448)
        at org.apache.crimson.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:185)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
        at getrss.RSSDocument.<init>(RSSDocument.java:25)
        at getrss.Main.main(Main.java:25)

    I read the files from the web, but there is an XML declaration with the encoding attribute in the RSS file.

    If you are quite sure that the encoding attribute is set to ISO-8859-1, then I expect that your RSS file contains a non-ISO-8859-1 character, though I thought all byte values from -128 to 127 were valid ISO-8859-1 characters!
    Many years ago I had a problem with an XML file with invalid characters. I wrote a simple filter (using FilterInputStream) that made sure that all the bytes it processed were ASCII. My problem turned out to be characters with value zero, which the Microsoft XML parser failed to process. It put the parser in an infinite loop!
    In the filter, you could write out the hex value of each byte as it is read. That way you should be able to find the offending character(s).
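
A rough sketch of the kind of filter described above, in case it helps. It is not the original filter from that reply; the class name NonAsciiReportingStream and the helper scan are made up for illustration. It passes the data through unchanged and records the offset and hex value of every byte that is zero or outside the 7-bit ASCII range.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class NonAsciiReportingStream extends FilterInputStream {
    private long offset = 0;
    private final List<String> findings = new ArrayList<>();

    public NonAsciiReportingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            record(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) {
            record(buf[off + i] & 0xFF); // treat the byte as unsigned
        }
        return n;
    }

    // Remember every NUL byte and every byte above 0x7F, with its position.
    private void record(int b) {
        if (b == 0 || b > 0x7F) {
            findings.add("offset " + offset + ": 0x" + String.format("%02X", b));
        }
        offset++;
    }

    public List<String> findings() {
        return findings;
    }

    // Convenience helper for trying the filter on an in-memory byte array.
    static List<String> scan(byte[] data) {
        NonAsciiReportingStream s = new NonAsciiReportingStream(new ByteArrayInputStream(data));
        try {
            byte[] buf = new byte[256];
            while (s.read(buf) != -1) {
                // keep reading; the filter records as a side effect
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return s.findings();
    }

    public static void main(String[] args) {
        System.out.println(scan(new byte[] {'c', 'a', 'f', (byte) 0xE9, 0}));
    }
}
```

To use it with a parser, wrap the real input stream in the filter, hand the wrapped stream to the parser, and inspect findings() after the parse fails.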

  • Website not displaying correctly. Firefox is changing the character set to Western (ISO-8859-1) automatically.

    Normally I have set Firefox (or it's set by default) to Character Set Unicode (UTF-8) and everything displays perfectly. I've never had a problem before.
    Now, however, whenever I upload my own website, for some bizarre reason the Character Set on that particular tab (and only that tab) is changed over to Western (ISO-8859-1), and then a few characters within my site do not display correctly, namely apostrophes and hyphens.
    It definitely isn't my software (Serif WebPlus X4) because the page displays correctly in every other browser. Plus it displays correctly in Firefox if I change the Character set back to Unicode.
    PS The site is a work in progress

    That happens because the server sends a content type (text/html; charset=ISO-8859-1) via the HTTP response headers, and in that case that content type prevails. The page is saved with a UTF-8 byte order mark, which is what you see as ï»¿ in this case.
    *http://web-sniffer.net/?url=http%3A%2F%2Fwww.valuevisionglasses.co.uk&http=1.1&gzip=yes&type=HEAD&uak=0
    *http://httpd.apache.org/docs/current/mod/mod_mime.html#AddType
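
A small sketch of the effect described in that reply, assuming a page whose bytes start with a UTF-8 byte order mark and contain a curly apostrophe. The class and method names are made up for illustration. Decoding the same bytes with the charset announced by the server (ISO-8859-1) instead of the charset the file was saved in (UTF-8) turns the three BOM bytes into three visible characters and splits the apostrophe into three characters as well.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomAsLatin1 {
    // The UTF-8 encoding of U+FEFF, the byte order mark.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Simulate a file saved as "UTF-8 with BOM".
    static byte[] withBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, out, UTF8_BOM.length, body.length);
        return out;
    }

    // Decode the page bytes the way a client would for a given charset.
    static String decode(byte[] page, Charset charset) {
        return new String(page, charset);
    }

    public static void main(String[] args) {
        byte[] page = withBom("It\u2019s"); // curly apostrophe, 3 bytes in UTF-8

        String asUtf8 = decode(page, StandardCharsets.UTF_8);
        String asLatin1 = decode(page, StandardCharsets.ISO_8859_1);

        // One char per byte in ISO-8859-1, so the text grows.
        System.out.println(asUtf8.length());
        System.out.println(asLatin1.length());
        // The BOM bytes show up as the characters U+00EF U+00BB U+00BF.
        System.out.println(asLatin1.startsWith("\u00EF\u00BB\u00BF"));
    }
}
```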

  • Java App on Linux : Unable to read iso-8859-1 encoded file correctly.

    I have a file which is encoded as iso-8859-1, and contains characters such as ô .
    I am reading this file with java code, something like:
    File in = new File("myfile.csv");
    InputStream fr = new FileInputStream(in);
    byte[] buffer = new byte[4096];
    while (true) {
        int byteCount = fr.read(buffer, 0, buffer.length);
        if (byteCount <= 0) {
            break;
        }
        // Decode this chunk of bytes as ISO-8859-1
        String s = new String(buffer, 0, byteCount, "ISO-8859-1");
        System.out.println(s);
    }
    However the ô character is always garbled, usually printing as a ? .
    I am running this on a Linux machine. It works fine on my XP machine.
    I have verified that I can see the correct characters when I cat the file on the terminal.
    (Interestingly, but I think maybe only by co-incidence, it works when I run with the -Dfile.encoding=UTF16 option, but not with UTF8, although this appears a hack rather than a fix since this option was not intended for developer use by sun - but I thought mentioning it may provide some clues as to what is going on)

    I think your main problem is with the console. When you send text to the console, it's sent in the system default encoding. On an English-locale system that might be ASCII, ISO-8859-1, windows-1252, UTF-8, MacRoman, and probably several other possibilities. Then the console decodes the bytes using whatever encoding it feels like using; on my WinXP machine, it uses cp437 by default (just for laughs, as far as I can tell). If the text happens to be pure, seven-bit ASCII, there's no problem, since all those encodings are identical in that range.
    But if you need to output anything other than ASCII characters, avoid the console. Send the output to a file and specify an encoding that you know will be able to handle your characters--UTF-8 can handle anything. Then open the file with an editor that can read that encoding; most of them can handle UTF-8 these days, and many will even detect it automatically. You also need to be using a font that can display your characters.
    However, you're also going about the reading part wrong. Instead of reading the text in as bytes and passing them to a String constructor, you should use an InputStreamReader and read it as text from the beginning:
    BufferedReader br = new BufferedReader(
      new InputStreamReader(
        new FileInputStream("myfile.csv"), "ISO-8859-1"));
    I am curious about your statement that "it works" when you run with the -Dfile.encoding=UTF16 option. I wouldn't be surprised to see it output the correct characters (ASCII characters, anyway), but I would expect to see the characters interspersed with blank spaces or rectangles.
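
Putting the two suggestions from that reply together, a sketch might look like the following: read the file as ISO-8859-1 text through an InputStreamReader and write it to a UTF-8 file instead of the console. The class name, the method names and the use of temporary files in demo are assumptions made only to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Latin1ToUtf8 {
    // Read the source as ISO-8859-1 text and write the same text as UTF-8.
    static void transcode(Path source, Path target) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(Files.newInputStream(source), StandardCharsets.ISO_8859_1));
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    // Self-contained demo: writes "hôtel" as ISO-8859-1 bytes, transcodes it,
    // and returns what the UTF-8 output file contains.
    static String demo() {
        try {
            Path source = Files.createTempFile("latin1", ".csv");
            Path target = Files.createTempFile("utf8", ".csv");
            try {
                Files.write(source, new byte[] {'h', (byte) 0xF4, 't', 'e', 'l', '\n'});
                transcode(source, target);
                return new String(Files.readAllBytes(target), StandardCharsets.UTF_8).trim();
            } finally {
                Files.deleteIfExists(source);
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Compare against an escaped literal so the result does not depend on the console.
        System.out.println(demo().equals("h\u00f4tel"));
    }
}
```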

  • Problems reading Latin2 (ISO 8859-2) characters

    Hello!
    I want to read the content of an MS Access table (in an MDB file) using the JDBC:ODBC driver.
    The program works well but there is a character conversion problem when I read text fields from the table.
    The Latin2 (ISO 8859-2) characters like áéíóőűüöÁÉÍÓÜÖŰŐ are replaced by the "?" character.
    I use the ResultSet object's getString() method.
    Any idea about how to solve this problem?

    Try changing the session encoding from the default to ISO-8859-2.
    This probably would help:
    http://download.oracle.com/javase/1.4.2/docs/guide/jdbc/bridge.html
    >
    What's New with the JDBC-ODBC Bridge?
    * A jdbc:odbc: connection can now have a charSet property, to specify a Character Encoding Scheme other than the client default.
    For possible values, see the Internationalization specification on the Web Site.
    The following code fragment shows how to set 'Big5' as the character set for all character data.
    // Load the JDBC-ODBC bridge driver
    Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
    // setup the properties
    java.util.Properties prop = new java.util.Properties();
    prop.put("charSet", "Big5");
    prop.put("user", username);
    prop.put("password", password);
    // Connect to the database
    con = DriverManager.getConnection(url, prop);

  • ISO-8859-1: %22 equals & # 3 4 ; ?

    NOTE: In this topic, & # 3 4 ; should contain no spaces. But when I remove the spaces, it automatically converts it to " , so that's why I put spaces.
    Hey all,
    Sorry for the 'difficult' topic title. I experience a problem when I try to parse HTML downloaded from imdb.com.
    I have this code to search on IMDB:
    URL imdbURL = new URL("http://us.imdb.com/Find?" + URLEncoder.encode(page1.getSearch(), "ISO-8859-1"));
    Then I read the incoming HTML through this BufferedReader:
    BufferedReader in = new BufferedReader(new InputStreamReader(imdbURL.openStream(), "ISO-8859-1"));
    The problem is that when I search for "Mr. Bean", I get back a title that reads & # 3 4 ;Mr. Bean& # 3 4 ; (on the website it displays as "Mr. Bean"). So on the website the & # 3 4 ; is translated into " , which is what I thought ISO-8859-1 was meant to do. I expected my InputStreamReader to do the same, but it doesn't.
    When I search for " , I can see that my encoding line encoded the " as %22. So this is probably why it doesn't convert & # 3 4 ; into " : it would only convert %22 into " . Does anyone know how I can solve this problem?
    Thanks in advance!!

    Thanks for your reply,
    Using the BufferedReader, I read the content line by line. At some stage, I get to this line:
    <p><b>Popular Titles</b> (Displaying 3 Results)<ol><li>  <a href="/title/tt0096657/" onclick="set_args('tt0096657',1,1)">&#34;Mr. Bean&#34;</a> (1990)</li>
    As you can see (or maybe not, because this forum parses it automatically), the title is & # 3 4 ;Mr. Bean& # 3 4 ; . This is also the String my JList displays when I put it in there. What I want, of course, is for that part of the String to be replaced with " . I could achieve that using:
    sName = sName.replaceAll("& # 3 4 ;", "\"");
    But there may be more 'codes' like this in the web page, so I want all the HTML entities to be parsed automatically.
    Hopefully I made myself clear.
    Regards,
    Peter
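
One possible way to handle all numeric references at once, sketched with nothing but java.util.regex. The class name NumericEntityDecoder is made up for illustration, and the sketch only covers numeric references such as the decimal 34 and hexadecimal x27 forms; named entities such as &amp;quot; would need a lookup table or an HTML parser. Note that this step is independent of the charset given to the InputStreamReader, since the references are plain ASCII text in the page source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumericEntityDecoder {
    // Matches decimal (&#34;) and hexadecimal (&#x27;) character references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    static String decode(String html) {
        Matcher m = NUMERIC_REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            // Leave references to non-existent code points as they are.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("&#34;Mr. Bean&#34;"));
        System.out.println(decode("Poole&#x27;s"));
    }
}
```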

  • Western iso 8859-1 character set is gone, "other (including Western" does not work. Why did you take it out?

    Some websites need Western ISO 8859-1 character set to run properly and you have taken it away in favour of "other(including Western)", now the site does not work properly. Why did you take it out and can you please put it back.

    ISO-8859-1 and Windows-1252 should be equivalent. Can you provide an example of a page that doesn't display properly?
    If you have to manually select a character encoding to view the page correctly, then the site is broken and you should notify its owner that it needs to be fixed. Websites specify the character encoding in one of two ways:
    1. The Content-Type response header.
    2. The meta tag in the page source.
    See: Handling character encodings in HTML and CSS | W3C, http://www.w3.org/International/tutorials/tutorial-char-enc/
    Henri Sivonen (:hsivonen) wrote:
    > We are in the process of implementing http://encoding.spec.whatwg.org/ . The process involves removing support for legacy character decoders that aren’t really necessary for supporting existing Web content.

  • Flash movie breaks table

    Hello NG,
    In the hope that someone is still reading here, I'll ask the following question:
    I built a website with Photoshop and, inside the table that contains the slices, replaced two images lying one above the other with Flash movies of the same size.
    Now the table structure is split by two horizontal 5px-high strips, one below each movie, as if each movie were 5px too tall.
    This happens in Firefox 3; in IE 7 everything is fine.
    As I just noticed when switching (only in IE 7) to the other pages (same table but without Flash movies), the table/layout, which is centered vertically and horizontally, jumps 10px up and to the left each time.
    How can I get rid of the problem? Unfortunately, Google had nothing to say...
    Regards,
    Christian
    (oh yes, MM Studio 8)

    Hi Christian,
    Funny, I didn't know that until now either.
    Either you give your embed tag a style with
    <embed style="display:block" .....
    or, and I find this really wild, you write your
    <td ..... </td>
    on a single line.
    Then it works!
    Where did I get that? Well, as so often, Google knows everything:
    https://bugzilla.mozilla.org/attachment.cgi?id=103143&action=view
    Regards, Dominik (IPXLAN) franzrahe
    Christian Lifka wrote:
    > Hi Dominik,
    >
    > I have slimmed down the "test setup" a little, but it has the same effect:
    >
    >
    > <html xmlns="http://www.w3.org/1999/xhtml">
    > <head>
    > <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    >
    > <title>ueber_uns</title>
    > <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    > <style type="text/css">
    > <!--
    > body {
    > background-color: #666666;
    > }
    > -->
    > </style>
    >
    > </head>
    > <body leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
    > <table width="100%" height="100%" border="0" cellpadding="5"
    > cellspacing="0">
    > <tr>
    > <td align="center" valign="middle">
    >
    > <!-- ImageReady Slices (ueber_uns.psd) -->
    > <div align="center">
    > <!-- End ImageReady Slices -->
    >
    > <table width="500" border="0" cellpadding="0" cellspacing="0"
    > bgcolor="#00FFFF">
    > <tr>
    > <td width="17"> </td>
    > <td width="265"><object
    > classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
    > codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,19,0"
    > width="265" height="135" title="1">
    > <param name="movie" value="flash/anwalt_1_klein.swf">
    > <param name="quality" value="high">
    > <embed src="flash/anwalt_1_klein.swf" quality="high"
    > pluginspage="http://www.macromedia.com/go/getflashplayer"
    > type="application/x-shockwave-flash" width="265" height="135"></embed>
    > </object></td>
    > <td width="218" bordercolor="0"> </td>
    > </tr>
    > <tr>
    > <td> </td>
    > <td><object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
    > codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=7,0,19,0"
    > width="265" height="135" title="2">
    > <param name="movie" value="flash/anwalt_2_klein.swf">
    > <param name="quality" value="high">
    > <embed src="flash/anwalt_2_klein.swf" quality="high"
    > pluginspage="http://www.macromedia.com/go/getflashplayer"
    > type="application/x-shockwave-flash" width="265" height="135"></embed>
    > </object></td>
    > <td> </td>
    > </tr>
    > </table>
    > </div></td>
    > </tr>
    > </table>
    >
    >
    > </body>
    > </html>
    >
    > Regards,
    > Christian
    >

  • How do I make "iso-8859-1" the default encoding instead of "utf-8" in the browser?

    When I log in to my credit card bank's site, their first page is encoded in "UTF-8" and the HTML also contains "charset=utf-8". After I enter the card number, it redirects to another page that is encoded in "iso-8859-1" but declares no encoding in the HTML.
    I found that the browser uses "utf-8" by default when the HTML declares no encoding, so I would like to change that default to "iso-8859-1".
    The only place I found for this setting was behind the Advanced button under "Fonts & Colors", but it did not work.
    My browser version is 23.0.1, but I have been running into this problem since version 18.

    It is possible that the server sends the file(s) as UTF-8 by default, and in such a case the encoding sent by the server prevails over all other settings.
    You can see the encoding here: Tools > Page Info > General
    *Press the F10 key or tap the Alt key to bring up the hidden "Menu Bar" temporarily.

  • Mail Receiver - Send file in ISO-8859-1 encoding

    Hi,
    I'm sending mail with an attachment using the mail adapter, but instead of the specified ISO-8859-1 the file is converted to UTF-8 without a BOM. Because of that, some characters (ñ, ç, etc.) are not transferred properly.
    Settings:
    Message protocol: XIPAYLOAD
    No mail package.
    Transform.ContentType: multipart/mixed; boundary=--AaZz; charset=ISO-8859-1
    Payload:
    multipart/mixed; boundary=AaZz; charset=ISO-8859-1</Content_Type><Content>--AaZz
    Content-Type: text/plain; charset=ISO-8859-1
    Content-Disposition: inline
    File attachment
    AaZz
    Content-Type: text/plain; charset= ISO-8859-1
    Content-Disposition: attachment; filename=TestFile
    iso-8859 characters ñ ç ñ ñ
    AaZz--
    </Content></ns:Mail>
    I need advice on how to force the file to be created with ISO-8859-1 encoding.
    Thanks in advance.
    Regards,
    Iván.

    Hi Jean-Philippe,
    Yes, please check my first post, if you use same settings, and create message as mine, it should work, the TestFile is created as an attachment.
    Include this line in the module configuration with transform key:
    Transform.ContentType: multipart/mixed; boundary=--AaZz;
    If you still have issues, please give me a description of the error.
    Regards,
    Ivan.

  • Codepage coverting error utf-8 from System codepage to iso-8859-1 (PI 7.1)

    Hello Experts,
    In our process, we receive an IDoc from an IS-U system and then send this IDoc, with some header information, via the HTTP adapter to a Seeburger system.
    In the outbound communication channel we have an XI payload manipulation with the XML code page ISO-8859-1.
    We get the error "Codepage coverting error utf-8 from System codepage to iso-8859-1", and only for this IDoc; other similar IDocs run correctly.
    Is it possible that the IDoc contains non-UTF-8 characters, and that this causes the error?
    PS: Another XI in our landscape uses an HTTP channel with the same configuration in a similar process, and it works, so I guess the problem is not in the communication channel.
    thanks,
    best regards

    > Is it possible, that the Idoc contains non-utf-8 chars so the error occurs?
    I would rather think that there is a non-ISO-8859-1 character in the IDoc, for example a Czech or Polish character.
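
If the payload text can be pulled into a Java program, one way to test that suspicion is to ask an ISO-8859-1 encoder which character it cannot encode. This is only a sketch; the class and method names are made up, and it says nothing about how the adapter itself performs the conversion.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Latin1Check {
    // Returns the index of the first char that ISO-8859-1 cannot represent,
    // or -1 if the whole text can be encoded.
    static int firstUnencodable(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        for (int i = 0; i < text.length(); i++) {
            if (!encoder.canEncode(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // German umlauts and sharp s are in ISO-8859-1.
        System.out.println(firstUnencodable("Gr\u00fc\u00dfe"));
        // Czech r with caron (U+0159) is in ISO-8859-2, not ISO-8859-1.
        System.out.println(firstUnencodable("P\u0159\u00edklad"));
    }
}
```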

  • File adapter ISO-8859-1 encoding problems in XI 3.0

    We are using the XI 3.0 file adapter and are experiencing some XML encoding troubles.
    An SAP R/3 system delivers an outbound IDoc. XI picks up the IDoc and converts it to an externally defined .xml file. The .xml file is sent to a connected FTP server. At the remote FTP server the file generates an error, as it is expected to arrive in ISO-8859-1 encoding. The Transfer Mode is set to Binary, File Type to Text, and Encoding to ISO-8859-1.
    The .xml file is encoded correctly in ISO-8859-1, but the problem is that the XML encoding declaration has the wrong value 'UTF-8'.
    Does anybody know of a work around, to change the encoding declaration to ‘ISO-8859-1’ in the message mapping program?

    An example of the XSL code might be as follows:
    <?xml version='1.0'?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method='xml' encoding='ISO-8859-1' />
    <xsl:template match="/">
         <xsl:copy-of select="*" />
    </xsl:template>
    </xsl:stylesheet>

  • Converting from CP1252 (Windows) to ISO 8859-1 doesn't work with java.nio?

    Hi
    I'm trying to write some code that checks whether an InputStream contains only characters with a given encoding. I'm using java.nio for that. For tests, I downloaded some character set examples from http://www.columbia.edu/kermit/csettables.html
    When creating the CharsetDecoder, I want to get all errors:
        Charset charset = Charset.forName( encoding );
        CharsetDecoder decoder = charset.newDecoder();
        decoder.onMalformedInput( CodingErrorAction.REPORT );
        decoder.onUnmappableCharacter( CodingErrorAction.REPORT );
    I then read an InputStream and try to decode it. If that fails, the stream can't be in the desired encoding:
        boolean isWellEncoded = true;
        ByteBuffer inBuffer = ByteBuffer.allocate( 1024 );
        ReadableByteChannel channel = Channels.newChannel( inputStream );
        while ( channel.read( inBuffer ) != -1 ) {
          CharBuffer decoded = null;
          try {
            inBuffer.flip();
            decoded = decoder.decode( inBuffer );
          } catch ( MalformedInputException ex ) {
            isWellEncoded = false;
          } catch ( UnmappableCharacterException ex ) {
            isWellEncoded = false;
          } catch ( CharacterCodingException ex ) {
            isWellEncoded = false;
          }
          if ( decoded != null ) {
            LOG.debug( decoded.toString() );
          }
          if ( !isWellEncoded ) {
            break;
          }
          inBuffer.compact();
        }
        channel.close();
        return isWellEncoded;
    Now I want to check whether a file containing Windows-1252 characters is ISO-8859-1. From my point of view, the code above should fail when it gets to the Euro symbol (decimal 128), since that's not defined in ISO-8859-1.
    But all I get is a ? character instead:
    (})  125  07/13  175  7D                 RIGHT CURLY BRACKET, RIGHT BRACE
    (~)  126  07/14  176  7E                 TILDE
    [?]  128  08/00  200  80  EURO SYMBOL
    [?]  130  08/02  202  82  LOW 9 SINGLE QUOTE
    I also tried to replace the faulty character, using
        decoder.onUnmappableCharacter( CodingErrorAction.REPLACE );
        decoder.replaceWith("!");
    but I still get the question marks.
    I'm probably doing something fundamentally wrong, but I don't get it :-)
    Any help is greatly appreciated!
    Eric

    As a suggestion: create a complete example demonstrating the problem. It shouldn't involve a channel, since that doesn't appear to be the problem (decoding is). Create the byte array in the example code and populate it with the byte sequence that you think should work. Your code should then demonstrate that it doesn't. Then post that.
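
Along the lines of that suggestion, a self-contained example without any channel could look like the sketch below (class and method names are made up). As far as I know, Java's ISO-8859-1 decoder maps every byte value to the code point with the same number, so bytes 0x80 to 0x9F decode to C1 control characters rather than being reported as errors; the "?" in the log is then probably just how the unprintable character gets displayed. A US-ASCII decoder is included for contrast, because it does reject byte 0x80.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo {
    // Returns the decoded text, or null if the decoder reports an error.
    static String strictDecode(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // '}' and '~' followed by the Windows-1252 bytes for the Euro sign and low-9 quote.
        byte[] data = {'}', '~', (byte) 0x80, (byte) 0x82};

        String latin1 = strictDecode(data, StandardCharsets.ISO_8859_1);
        // ISO-8859-1 accepts the bytes; 0x80 becomes the control character U+0080.
        System.out.println(latin1 != null);
        System.out.println(Integer.toHexString(latin1.charAt(2)));

        // US-ASCII reports the same bytes as malformed input.
        System.out.println(strictDecode(data, StandardCharsets.US_ASCII) == null);
    }
}
```

So a strict ISO-8859-1 decode cannot tell Windows-1252 input apart from ISO-8859-1 input; checking the decoded text for characters in the range U+0080 to U+009F would be one way to flag it.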

  • Xslt ecc6  ISO-8859-1 problem when download xml file

    Hello,
    I created an ABAP test program:
    *& Report Z_ABAP_TO_XML                                             *
    *& Write the data from an internal ABAP table into an XML document, *
    *& and write it onto your frontend computer                         *
    REPORT z_abap_to_xml.
    TYPE-POOLS: abap.
    CONSTANTS gs_file TYPE string VALUE 'C:\Users\Marco Consultant\Desktop\test.xml'.
    * This is the structure for the data to go into the XML file
    TYPES: BEGIN OF ts_person,
      cust_id(4)    TYPE n,
      firstname(20) TYPE c,
      lastname(20)  TYPE c,
    END OF ts_person.
    * Table for the XML content
    DATA: gt_itab        TYPE STANDARD TABLE OF char2048.
    * Table and work area for the data to fill the XML file with
    DATA: gt_person      TYPE STANDARD TABLE OF ts_person,
          gs_person      TYPE ts_person.
    * Source table that contains references
    * of the internal tables that go into the XML file
    DATA: gt_source_itab TYPE abap_trans_srcbind_tab,
          gs_source_wa   TYPE abap_trans_resbind.
    * For error handling
    DATA: gs_rif_ex      TYPE REF TO cx_root,
          gs_var_text    TYPE string.
    * Fill the internal table
    gs_person-cust_id   = '3'.
    gs_person-firstname = 'Bill'.
    gs_person-lastname  = 'Gates'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '4'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    * Fill the source table with a reference to the data table.
    * Within the XSLT stylesheet, the data table can be accessed with
    * "IPERSON".
    GET REFERENCE OF gt_person INTO gs_source_wa-value.
    gs_source_wa-name = 'IPERSON'.
    APPEND gs_source_wa TO gt_source_itab.
    * Perform the XSLT stylesheet
    TRY.
        CALL TRANSFORMATION z_abap_to_xml
        SOURCE (gt_source_itab)
        RESULT XML gt_itab.
      CATCH cx_root INTO gs_rif_ex.
        gs_var_text = gs_rif_ex->get_text( ).
        MESSAGE gs_var_text TYPE 'E'.
    ENDTRY.
    * Download the XML file to your client
    CALL METHOD cl_gui_frontend_services=>gui_download
      EXPORTING
        filename                = gs_file
      CHANGING
        data_tab                = gt_itab
      EXCEPTIONS
        file_write_error        = 1
        no_batch                = 2
        gui_refuse_filetransfer = 3
        invalid_type            = 4
        no_authority            = 5
        unknown_error           = 6
        header_not_allowed      = 7
        separator_not_allowed   = 8
        filesize_not_allowed    = 9
        header_too_long         = 10
        dp_error_create         = 11
        dp_error_send           = 12
        dp_error_write          = 13
        unknown_dp_error        = 14
        access_denied           = 15
        dp_out_of_memory        = 16
        disk_full               = 17
        dp_timeout              = 18
        file_not_found          = 19
        dataprovider_exception  = 20
        control_flush_error     = 21
        not_supported_by_gui    = 22
        error_no_gui            = 23
        OTHERS                  = 24.
    IF sy-subrc <> 0.
      MESSAGE ID sy-msgid TYPE sy-msgty NUMBER sy-msgno
      WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
    ENDIF.
    and I created an XSLT test transformation:
    <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output encoding="iso-8859-1" indent="yes" method="xml" version="1.0"/>
      <xsl:strip-space elements="*"/>
      <xsl:template match="/">
        <CUSTOMERS>
          <xsl:apply-templates select="//IPERSON/item"/>
        </CUSTOMERS>
      </xsl:template>
      <xsl:template match="IPERSON/item">
        <item>
          <customer_id>
            <xsl:value-of select="CUST_ID"/>
          </customer_id>
          <first_name>
            <xsl:value-of select="FIRSTNAME"/>
          </first_name>
          <last_name>
            <xsl:value-of select="LASTNAME"/>
          </last_name>
        </item>
      </xsl:template>
    </xsl:transform>
    Everything seems correct; in fact, the program downloads an XML file. However, the file has encoding="utf-16" even though I specified "iso-8859-1", and when I try to open the XML file it is reported as invalid because it is generated with "#" as its first character. Why?
    Below is the generated XML.
    What do I have to do to generate a correct XML file without errors?
    #<?xml version="1.0" encoding="utf-16"?>
    <CUSTOMERS>
      <item>
        <customer_id>0003</customer_id>
        <first_name>Bill</first_name>
        <last_name>Gates</last_name>
      </item>
      <item>
        <customer_id>0004</customer_id>
        <first_name>Frodo</first_name>
        <last_name>Baggins</last_name>
      </item>
    </CUSTOMERS>

    Hello all!
    I resolved the problem using:
    * Perform the XSLT stylesheet
      g_ixml = cl_ixml=>create( ).
      g_stream_factory = g_ixml->CREATE_STREAM_FACTORY( ).
      g_encoding = g_ixml->create_encoding( character_set = 'utf-16' "unicode
        byte_order = 0 ).
      resstream = g_stream_factory->CREATE_OSTREAM_ITABLE( table = gt_xml_itab ).
      call method resstream->set_encoding
        exporting encoding = g_encoding.
    I think it's the right way. Here is my complete updated ABAP program:
    *& Report Z_ABAP_TO_XML                                             *
    *& Write the data from an internal ABAP table into an XML document, *
    *& and write it onto your frontend computer                         *
    REPORT z_abap_to_xml.
    TYPE-POOLS: abap.
    CONSTANTS gs_file TYPE string VALUE 'C:\Users\Marco Consultant\Desktop\test.xml'.
    data:  g_ixml type ref to if_ixml.
    data:  g_stream_factory type ref to IF_IXML_STREAM_FACTORY.
    data:  resstream type ref to if_ixml_ostream.
    data:  g_encoding type ref to if_ixml_encoding.
    * This is the structure for the data to go into the XML file
    TYPES: BEGIN OF ts_person,
      cust_id(4)    TYPE n,
      firstname(20) TYPE c,
      lastname(20)  TYPE c,
    END OF ts_person.
    * Table for the XML content
    DATA: gt_xml_itab        TYPE STANDARD TABLE OF char2048.
    * Table and work area for the data to fill the XML file with
    DATA: gt_person      TYPE STANDARD TABLE OF ts_person,
          gs_person      TYPE ts_person.
    * Source table that contains references
    * of the internal tables that go into the XML file
    DATA: gt_source_itab TYPE abap_trans_srcbind_tab,
          gs_source_wa   TYPE abap_trans_resbind.
    * For error handling
    DATA: gs_rif_ex      TYPE REF TO cx_root,
          gs_var_text    TYPE string.
    * Fill the internal table
    gs_person-cust_id   = '3'.
    gs_person-firstname = 'Bill'.
    gs_person-lastname  = 'Gates'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '4'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '5'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '6'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '7'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '8'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '9'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '10'.
    gs_person-firstname = 'Frodo'.
    gs_person-lastname  = 'Baggins'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '11'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    gs_person-cust_id   = '88'.
    gs_person-firstname = 'Frodoèé'.
    gs_person-lastname  = 'Baggins~¦Üu0192'.
    APPEND gs_person TO gt_person.
    * Fill the source table with a reference to the data table.
    * Within the XSLT stylesheet, the data table can be accessed with
    * "IPERSON".
    GET REFERENCE OF gt_person INTO gs_source_wa-value.
    gs_source_wa-name = 'IPERSON'.
    APPEND gs_source_wa TO gt_source_itab.
    * Perform the XSLT stylesheet
      g_ixml = cl_ixml=>create( ).
      g_stream_factory = g_ixml->CREATE_STREAM_FACTORY( ).
      g_encoding = g_ixml->create_encoding( character_set = 'utf-16' "unicode
        byte_order = 0 ).
      resstream = g_stream_factory->CREATE_OSTREAM_ITABLE( table = gt_xml_itab ).
      call method resstream->set_encoding
        exporting encoding = g_encoding.
    TRY.
        CALL TRANSFORMATION z_abap_to_xml
        SOURCE (gt_source_itab)
        RESULT XML gt_xml_itab.
      CATCH cx_root INTO gs_rif_ex.
        gs_var_text = gs_rif_ex->get_text( ).
        MESSAGE gs_var_text TYPE 'E'.
    ENDTRY.
    * Download the XML file to your client
    CALL METHOD cl_gui_frontend_services=>gui_download
      EXPORTING
        filename                = gs_file
        FILETYPE                  = 'BIN'
      CHANGING
        data_tab                = gt_xml_itab
      EXCEPTIONS
        file_write_error        = 1
        no_batch                = 2
        gui_refuse_filetransfer = 3
        invalid_type            = 4
        no_authority            = 5
        unknown_error           = 6
        header_not_allowed      = 7
        separator_not_allowed   = 8
        filesize_not_allowed    = 9
        header_too_long         = 10
        dp_error_create         = 11
        dp_error_send           = 12
        dp_error_write          = 13
        unknown_dp_error        = 14
        access_denied           = 15
        dp_out_of_memory        = 16
        disk_full               = 17
        dp_timeout              = 18
        file_not_found          = 19
        dataprovider_exception  = 20
        control_flush_error     = 21
        not_supported_by_gui    = 22
        error_no_gui            = 23
        OTHERS                  = 24.
    IF sy-subrc <> 0.
      MESSAGE ID sy-msgid TYPE sy-msgty NUMBER sy-msgno
      WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
    ENDIF.
    *-- we don't need the stream any more, so let's close it...
    CALL METHOD resstream->CLOSE( ).
    CLEAR resstream.

  • Not a valid SOAP Content-Type: text/html; charset=iso-8859-1

    Friends
    JDEV and SOA suite 10134
    I have multiple domains on my BPEL server. In one of the domains, since I deployed a new process, all the processes of that domain fail on execution with the following error in the opmn/soa_instance/*.err log files. There are no errors in domain.log.
    "Caused by: java.security.PrivilegedActionException: oracle.j2ee.ws.saaj.ContentTypeException: Not a valid SOAP Content-Type: text/html; charset=iso-8859-1"
    At the same time we get an Internal Server Error on the BPEL Console.
    I have sync processes with one or two invokes, so I am generally losing the instances and cannot provide details of the process execution. All BPEL processes invoke Siebel Web Services; that is the common part.
    When I restart my system, it may or may not work; even when it works, the same error comes back within a few instance executions. I can see that after the errors a few instances still go through and complete successfully. All these processes were working successfully earlier.
    Any ideas about this?
    Thanks

    Thanks Anirudh,
    I don't use compensation handlers. Moreover, I have properly defined the scopes and sequences throughout the BPEL process. My processes are synchronous, and I am not able to say at exactly which step they fail and throw the SOAP Content-Type error, though the instances sometimes complete after a delay.
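
    The error text says the response arrived with Content-Type text/html rather than a SOAP media type, which usually means the endpoint returned an HTML page (for example an error page) instead of a SOAP envelope. As a rough sketch only, the Java below shows how a caller could classify a Content-Type header before handing the body to a SOAP parser. The class and method names are hypothetical and are not an Oracle BPEL or SAAJ API, and the check ignores multipart/related (SOAP with attachments).

```java
import java.util.Locale;

// Hypothetical helper for illustration; not an Oracle BPEL or SAAJ API.
final class SoapContentType {

    /** Returns the media type without parameters, lower-cased, e.g. "text/html". */
    static String mediaType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return "";
        }
        int semi = contentTypeHeader.indexOf(';');
        String type = semi >= 0 ? contentTypeHeader.substring(0, semi) : contentTypeHeader;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    /** SOAP 1.1 uses text/xml; SOAP 1.2 uses application/soap+xml. */
    static boolean isSoap(String contentTypeHeader) {
        String type = mediaType(contentTypeHeader);
        return type.equals("text/xml") || type.equals("application/soap+xml");
    }

    public static void main(String[] args) {
        // The header value from the error message in the post above.
        System.out.println(isSoap("text/html; charset=iso-8859-1"));
        // A typical SOAP 1.1 response header.
        System.out.println(isSoap("text/xml; charset=utf-8"));
    }
}
```

    Logging the raw body whenever such a check fails would show which HTML page the called service or an intermediate server is actually returning.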
