AQ Adapter character encoding error

Hi,
We have two composites that exchange data through an XMLType Oracle queue (AQ). The messages are enqueued and dequeued using AQ Adapters running on SOA Suite 11.1.1.7.0.
Everything works fine in the composite that enqueues messages. However, the composite that dequeues messages gets encoding problems in the message content: every accented character is replaced by Ã.
When I look at the message content directly in the database using SQL Developer, it looks OK.
Below is an example of the error:
Message being enqueued by composite one:
<process>
     <requisitor>ECatalogo</requisitor>
     <identificacao>7267</identificacao>
     <codigoProduto>9268208000</codigoProduto>
     <categoria1>
          <id>13</id>
          <descricao>FERRAMENTAS ELÉTRICAS</descricao>
     </categoria1>
     <categoria2>
          <id>386</id>
          <descricao>ACESSÓRIOS PARA MICRORRETÍFICA</descricao>
     </categoria2>
     <categoria3>
          <id>3396</id>
          <descricao>CAIXA PLÁSTICA DE ACESSÓRIOS</descricao>
     </categoria3>
</process>
Message being received by composite two:
<process>
     <requisitor>ECatalogo</requisitor>
     <identificacao>7267</identificacao>
     <codigoProduto>9268208000</codigoProduto>
     <categoria1>
          <id>13</id>
          <descricao>FERRAMENTAS ELÃTRICAS</descricao>
     </categoria1>
     <categoria2>
          <id>386</id>
          <descricao>ACESSÃ?RIOS PARA MICRORRETÍFICA</descricao>
     </categoria2>
     <categoria3>
          <id>3396</id>
          <descricao>CAIXA PLÁSTICA DE ACESSÃRIOS</descricao>
     </categoria3>
</process>
Any ideas about what could be the problem?
Thanks in advance.
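
One pattern consistent with the output above (a hypothesis, not a confirmed diagnosis of this AQ Adapter setup) is UTF-8 bytes being decoded with a single-byte charset such as ISO-8859-1: each accented letter turns into Ã followed by a second, often invisible, character. A minimal Java sketch of that effect; the class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decode UTF-8 bytes with the wrong charset (ISO-8859-1), as a misconfigured reader might.
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: recover the original bytes and decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "EL\u00C9TRICAS"; // "ELÉTRICAS", written with an escape to avoid source-encoding issues
        String garbled = garble(original);
        // The single letter É became two characters: Ã (U+00C3) plus a control character (U+0089).
        System.out.println(garbled.length() - original.length());
        System.out.println(repair(garbled).equals(original));
    }
}
```

If the dequeued text can be repaired this way, that would suggest the bytes in the queue are fine and the charset used on the reading side is the thing to check.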

Similar Messages

  • Character encoding error

    Hi,
    I have a problem using Oracle XML Parser v2 and XSLT.
    The XML source file uses the ISO-8859-1 character set and I want to create UTF-8 output in the result file, but the parser complains.
    Here's some output of the exception:
    oracle.xml.parser.v2.XMLParseException: Missing entity 'auml'.
    at oracle.xml.parser.v2.XMLError.flushErrors(XMLError.java:208)
    Another strange thing: in the file I encode the character '&' as &amp; and the output is &#38;, which works fine, but if I encode the character '>' as &gt; the output is the character '>' and not &#62;
    /thanx in advance
    Hoa Tran
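
For context on the "Missing entity 'auml'" message: XML itself only predefines five entities (amp, lt, gt, apos, quot), so a name like auml has to be declared in a DTD, while a numeric character reference such as &#228; needs no declaration. The sketch below illustrates this with the JDK's built-in parser rather than the Oracle parser from the post; the class and method names are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EntityDemo {
    // Returns the root element's text, or null if the parser rejects the document.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // rethrows fatal errors without printing them
            return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                          .getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null; // not well-formed, e.g. an undeclared entity
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootText("<a>&#228;</a>") != null);  // numeric reference: accepted
        System.out.println(rootText("<a>&auml;</a>") != null);  // undeclared named entity: rejected
    }
}
```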

    Hi, you look like you have more knowledge about UTF-8. I am getting the following error in the browser when I execute a demo XSQL file. Can you please tell me what could be wrong? Thanks ..// Manohar //
    Environment: Windows NT, Sun's Java Web Server, Oracle XML Parser V2 for Java, XSQL Utility.
    Database: Oracle 8i on Linux.
    The error (appears in the browser window when I type the following URL: http://sacdev03:8080/servlet/XSQLServlet/demo/emp.xsql):
    500 Internal Server Error
    The servlet named XSQLServlet at the requested URL
    http://sacdev03:8080/servlet/XSQLServlet/demo/helloworld.xsql
    reported this exception: sun.io.CharToByteUTF-8. Please report this to the administrator of the web server.
    java.lang.IllegalArgumentException: sun.io.CharToByteUTF-8
        at sun.io.CharToByteConverter.getConverterClass(CharToByteConverter.java:79)
        at sun.io.CharToByteConverter.getConverter(CharToByteConverter.java:109)
        at java.io.OutputStreamWriter.<init>(OutputStreamWriter.java:75)
        at com.sun.server.servlet.http.HttpResponse.getWriter(HttpResponse.java:672)
        at oracle.xml.xsql.XSQLServletPageRequest.setupWriter(XSQLServletPageRequest.java:129)
        at oracle.xml.xsql.XSQLServletPageRequest.setContentType(XSQLServletPageRequest.java:156)
        at oracle.xml.xsql.XSQLPageProcessor.process(XSQLPageProcessor.java:268)
        at oracle.xml.xsql.XSQLServlet.doGet(XSQLServlet.java:124)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:715)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:840)
        at com.sun.server.ServletState.callService(ServletState.java:226)
        at com.sun.server.ServletManager.callServletService(ServletManager.java:936)
        at com.sun.server.http.servlet.InvokerServlet.service(InvokerServlet.java:137)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:840)
        at com.sun.server.ServletState.callService(ServletState.java:226)
        at com.sun.server.ServletManager.callServletService(ServletManager.java:936)
        at com.sun.server.ProcessingState.invokeTargetServlet(ProcessingState.java:423)
        at com.sun.server.http.HttpProcessingState.execute(HttpProcessingState.java:79)
        at com.sun.server.http.stages.Runner.process(Runner.java:79)
        at com.sun.server.ProcessingSupport.process(ProcessingSupport.java:294)
        at com.sun.server.Service.process(Service.java:204)
        at com.sun.server.http.HttpServiceHandler.handleRequest(HttpServiceHandler.java:374)
        at com.sun.server.http.HttpServiceHandler.handleRequest(HttpServiceHandler.java:166)
        at com.sun.server.HandlerThread.run(HandlerThread.java:162)
    Thanks for the help in advance. .....// Manohar //

  • Bizarre character encoding error; %7F character inserted frequently???

    Hi there,
    I've been experiencing an issue with my Mac (Macbook, 2.0 GHz) for at least the last six to nine months. While typing, sometimes if I use home-select (ctrl+shift+left arrow key, to select an entire line of text), and then press "delete", the text will appear to delete but a %7F character will appear in its place. This character is invisible on the Mac, but visible to all non-Mac readers as a square box. If I am going back through a text box, the %7F character is totally invisible to me, but if I press left or right the cursor will move behind or ahead of the character.
    I normally only notice it when someone I am talking to says "What's that weird box in your typing" or if I do it in a place like the address bar of Safari such that when I press enter it tries to resolve the domain %7fhttp://www.google.ca or some such.
    This happens in all applications for me, although I notice it most in Safari since that's the place where I'm doing the most line selecting and text editing like that.
    I wish I could drum up an example of this happening to me, but I still haven't figured out consistently what behaviour of mine causes this, only that it happens fairly regularly but not consistently when I select text with the keyboard and press delete.
    To make matters worse, I can't seem to find any reference to this happening to anyone else. Of course it's impossible to Google the error message, since Google ignores %7F as a search term.

    That makes sense. Is there some particular character combination that would insert this character (again, it comes up as a box) while I'm typing, rather than actually performing a Delete?
    I'm assuming that's what's happening--that whatever particular key combination I'm pressing to do the delete, I sometimes fumble and press another key combination instead, which inserts the character rather than doing the delete.

  • Wrong character encoding in error messages

    The Java compiler can be adjusted to source file encoding with the option javac -encoding ...
    The Java runtime can be adjusted to terminal encoding with java -Dfile.encoding=...
    While this appears somewhat inconsistent, it works and can be used e.g. when running the tools from Cygwin (the POSIX layer on Windows), which uses UTF-8 by default, while Java, following the Windows mechanism, uses some other character encoding by default (this works more seamlessly on Unix/Linux, by the way).
    Now if I compile UTF-8 source with non-ASCII characters, and there is an error message related to them, the error message printed to the console will not be UTF-8 encoded, resulting in mangled text output.
    (Arguably, source and terminal encoding could be different, but then there is no option available to the compiler to adjust this;
    it does not accept -Dfile.encoding=....)
    Example: Error message looks like this:
    FM.java:1: error: class, interface, or enum expected
    b▒h
    While the string is actually "bäh" in the source.
    This is a bug. Any proper place to actually report a bug?
    Edited by: 994195 on 15-Mar-2013 09:42
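
As a side note for application code (this does not change how javac prints its own diagnostics): a program can avoid depending on -Dfile.encoding by wrapping its output stream in a PrintStream with an explicit charset. A minimal sketch, with hypothetical class and method names:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class ConsoleEncodingDemo {
    // Wrap any stream in a PrintStream that encodes text with the named charset.
    static PrintStream withEncoding(OutputStream out, String charsetName) {
        try {
            return new PrintStream(out, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Helper for inspection: the bytes a PrintStream with that charset produces for the text.
    static byte[] encodeViaPrintStream(String text, String charsetName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream stream = withEncoding(buffer, charsetName);
        stream.print(text);
        stream.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // Assumes the terminal expects UTF-8, as in the Cygwin case described above.
        PrintStream utf8Out = withEncoding(System.out, "UTF-8");
        utf8Out.println("b\u00E4h"); // "bäh"
    }
}
```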

    I'll ignore you just blatantly assuming it is a bug because you say so, you did not think to type "java report bug" into Google?

  • What every developer should know about character encoding

    This was originally posted (with better formatting) at Moderator edit: link removed/what-every-developer-should-know-about-character-encoding.html. I'm posting because lots of people trip over this.
    If you write code that touches a text file, you probably need this.
    Let's start off with two key items
    1. Unicode does not solve this issue for us (yet).
    2. Every text file is encoded. There is no such thing as an unencoded file or a "general" encoding.
    And let's add a codicil to this – most Americans can get by without having to take this into account – most of the time. That is because the characters for the first 127 bytes in the vast majority of encoding schemes map to the same set of characters (more accurately called glyphs), and because we only use A-Z without any other characters, accents, etc. – we're good to go. But the second you use those same assumptions in an HTML or XML file that has characters outside the first 127 – then the trouble starts.
    The computer industry started with disk space and memory at a premium. Anyone who suggested using 2 bytes for each character instead of one would have been laughed at. In fact we're lucky that the byte worked best as 8 bits or we might have had fewer than 256 values for each character. There of course were numerous character sets (or codepages) developed early on. But we ended up with most everyone using a standard set of codepages where the first 127 bytes were identical on all and the second were unique to each set. There were sets for America/Western Europe, Central Europe, Russia, etc.
    And then for Asia, because 256 characters were not enough, some of the range 128 – 255 had what was called DBCS (double byte character sets). For each value of a first byte (in these higher ranges), the second byte then identified one of 256 characters. This gave a total of 128 * 256 additional characters. It was a hack, but it kept memory use to a minimum. Chinese, Japanese, and Korean each have their own DBCS codepage.
    And for awhile this worked well. Operating systems, applications, etc. mostly were set to use a specified code page. But then the internet came along. A website in America using an XML file from Greece to display data to a user browsing in Russia, where each is entering data based on their country – that broke the paradigm.
    Fast forward to today. The two file formats where we can explain this the best, and where everyone trips over it, are HTML and XML. Every HTML and XML file can optionally have the character encoding set in its header metadata. If it's not set, then most programs assume it is UTF-8, but that is not a standard and not universally followed. If the encoding is not specified and the program reading the file guesses wrong – the file will be misread.
    Point 1 – Never treat specifying the encoding as optional when writing a file. Always write it to the file. Always. Even if you are willing to swear that the file will never have characters out of the range 1 – 127.
    Now let's look at UTF-8 because, as the standard and given the way it works, it gets people into a lot of trouble. UTF-8 was popular for two reasons. First, it matched the standard codepages for the first 127 characters and so most existing HTML and XML would match it. Second, it was designed to use as few bytes as possible, which mattered a lot back when it was designed and many people were still using dial-up modems.
    UTF-8 borrowed from the DBCS designs from the Asian codepages. The first 128 bytes are all single byte representations of characters. Then for the next most common set, it uses a block in the second 128 bytes to be a double byte sequence giving us more characters. But wait, there's more. For the less common there's a first byte which leads to a series of second bytes. Those then each lead to a third byte and those three bytes define the character. The original design went up to 6 byte sequences, although the current standard stops at 4. Using the MBCS (multi-byte character set) you can write the equivalent of every Unicode character. And assuming what you are writing is not a list of seldom used Chinese characters, do it in fewer bytes.
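
To make the variable-width idea concrete, here is a small Java sketch that counts how many bytes UTF-8 uses for a few sample characters. The helper name is made up; the sample characters are arbitrary picks from the 1, 2, 3 and 4 byte ranges:

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Number of bytes the given text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // ASCII letter
        System.out.println(utf8Length("\u00DF"));       // ß, U+00DF
        System.out.println(utf8Length("\u20AC"));       // euro sign, U+20AC
        System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600, a surrogate pair in Java strings
    }
}
```

On a standard JVM these come out as 1, 2, 3 and 4 bytes respectively.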
    But here is what everyone trips over – they have an HTML or XML file, it works fine, and they open it up in a text editor. Then, in their text editor, using the codepage for their region, they insert a character like ß and save the file. Of course it must be correct – their text editor shows it correctly. But feed it to any program that reads according to the encoding and that is now the first character of a 2 byte sequence. You either get a different character or, if the second byte is not a legal value for that first byte – an error.
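
The failure described here can be reproduced with a strict decoder. The sketch below assumes the editor saved the text as ISO-8859-1 (where ß is the single byte 0xDF) and then checks whether the bytes are valid UTF-8; the class and method names are illustrative only:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecodeDemo {
    // True if the bytes form valid UTF-8; malformed input is reported instead of silently replaced.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "Stra\u00DFe"; // "Straße"
        byte[] savedAsLatin1 = text.getBytes(StandardCharsets.ISO_8859_1); // what the regional editor wrote
        byte[] savedAsUtf8 = text.getBytes(StandardCharsets.UTF_8);
        // 0xDF starts a 2 byte sequence in UTF-8, but the next byte ('e') is not a continuation byte.
        System.out.println(isValidUtf8(savedAsLatin1));
        System.out.println(isValidUtf8(savedAsUtf8));
    }
}
```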
    Point 2 – Always create HTML and XML in a program that writes it out correctly using the encoding. If you must create with a text editor, then view the final file in a browser.
    Now, what about when the code you are writing will read or write a file? We are not talking binary/data files where you write it out in your own format, but files that are considered text files. Java, .NET, etc. all have character encoders. The purpose of these encoders is to translate between a sequence of bytes (the file) and the characters they represent. Let's take what is actually a very difficult example – your source code, be it C#, Java, etc. These are still by and large "plain old text files" with no encoding hints. So how do programs handle them? Many assume they use the local code page. Many others assume that all characters will be in the range 0 – 127 and will choke on anything else.
    Here's a key point about these text files – every program is still using an encoding. It may not be setting it in code, but by definition an encoding is being used.
    Point 3 – Always set the encoding when you read and write text files. Not just for HTML & XML, but even for files like source code. It's fine if you set it to use the default codepage, but set the encoding.
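
In Java, Point 3 comes down to never calling a reader or writer constructor that silently picks the platform default. A minimal sketch using java.nio.file, where the temp file and the helper name are just for illustration:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ExplicitCharsetIo {
    // Write text to a temp file and read it back, naming the charset on both sides.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("encoding-demo", ".txt");
            try {
                try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                    writer.write(text);
                }
                StringBuilder result = new StringBuilder();
                try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                    int c;
                    while ((c = reader.read()) != -1) {
                        result.append((char) c);
                    }
                }
                return result.toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "ACESS\u00D3RIOS"; // "ACESSÓRIOS"
        System.out.println(roundTrip(text).equals(text));
    }
}
```

Because both sides name the same charset, the result does not depend on the machine's default code page.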
    Point 4 – Use the most complete encoder possible. You can write your own XML as a text file encoded for UTF-8. But if you write it using an XML encoder, then it will include the encoding in the meta data and you can't get it wrong. (it also adds the endian preamble to the file.)
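
As one way to follow Point 4 in Java, the JDK's Transformer can serialize a DOM document and write the encoding into the XML declaration itself. This is a sketch under the assumption that the JDK's default XSLT implementation is in use; the element name is borrowed from the example message at the top of this thread and the class name is hypothetical:

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlWriterDemo {
    // Build a one-element document and serialize it with the requested encoding.
    static byte[] serialize(String text, String encoding) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("descricao");
            root.setTextContent(text);
            doc.appendChild(root);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = serialize("FERRAMENTAS EL\u00C9TRICAS", "UTF-8");
        // The declaration at the start of the output names the encoding the bytes were written in.
        System.out.println(new String(xml, "UTF-8"));
    }
}
```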
    Ok, you're reading & writing files correctly but what about inside your code. What there? This is where it's easy – unicode. That's what those encoders created in the Java & .NET runtime are designed to do. You read in and get unicode. You write unicode and get an encoded file. That's why the char type is 16 bits and is a unique core type that is for characters. This you probably have right because languages today don't give you much choice in the matter.
    Point 5 – (For developers on languages that have been around awhile) – Always use unicode internally. In C++ this is called wide chars (or something similar). Don't get clever to save a couple of bytes, memory is cheap and you have more important things to do.
    Wrapping it up
    I think there are two key items to keep in mind here. First, make sure you are taking the encoding into account on text files. Second, this is actually all very easy and straightforward. People rarely screw up how to use an encoding, it's when they ignore the issue that they get into trouble.
    Edited by: Darryl Burke -- link removed

    DavidThi808 wrote:
    This was originally posted (with better formatting) at Moderator edit: link removed/what-every-developer-should-know-about-character-encoding.html. I'm posting because lots of people trip over this.
    If you write code that touches a text file, you probably need this.
    Let's start off with two key items
    1. Unicode does not solve this issue for us (yet).
    2. Every text file is encoded. There is no such thing as an unencoded file or a "general" encoding.
    And let's add a codicil to this – most Americans can get by without having to take this into account – most of the time. That is because the characters for the first 127 bytes in the vast majority of encoding schemes map to the same set of characters (more accurately called glyphs), and because we only use A-Z without any other characters, accents, etc. – we're good to go. But the second you use those same assumptions in an HTML or XML file that has characters outside the first 127 – then the trouble starts.
    Pretty sure most Americans do not use character sets that only have a range of 0-127. I don't think I have ever used a desktop OS that did. I might have used some big iron boxes before that but at that time I wasn't even aware that character sets existed.
    They might only use that range but that is a different issue, especially since that range is exactly the same as the UTF8 character set anyways.
    The computer industry started with disk space and memory at a premium. Anyone who suggested using 2 bytes for each character instead of one would have been laughed at. In fact we're lucky that the byte worked best as 8 bits or we might have had fewer than 256 values for each character. There of course were numerous character sets (or codepages) developed early on. But we ended up with most everyone using a standard set of codepages where the first 127 bytes were identical on all and the second were unique to each set. There were sets for America/Western Europe, Central Europe, Russia, etc.
    And then for Asia, because 256 characters were not enough, some of the range 128 – 255 had what was called DBCS (double byte character sets). For each value of a first byte (in these higher ranges), the second byte then identified one of 256 characters. This gave a total of 128 * 256 additional characters. It was a hack, but it kept memory use to a minimum. Chinese, Japanese, and Korean each have their own DBCS codepage.
    And for awhile this worked well. Operating systems, applications, etc. mostly were set to use a specified code page. But then the internet came along. A website in America using an XML file from Greece to display data to a user browsing in Russia, where each is entering data based on their country – that broke the paradigm.
    The above is only true for small volume sets. If I am targeting a processing rate of 2000 txns/sec with a requirement to hold data active for seven years then a column with a size of 8 bytes is significantly different than one with 16 bytes.
    Fast forward to today. The two file formats where we can explain this the best, and where everyone trips over it, are HTML and XML. Every HTML and XML file can optionally have the character encoding set in its header metadata. If it's not set, then most programs assume it is UTF-8, but that is not a standard and not universally followed. If the encoding is not specified and the program reading the file guesses wrong – the file will be misread.
    The above is out of place. It would be best to address this as part of Point 1.
    Point 1 – Never treat specifying the encoding as optional when writing a file. Always write it to the file. Always. Even if you are willing to swear that the file will never have characters out of the range 1 – 127.
    Now let's look at UTF-8 because, as the standard and given the way it works, it gets people into a lot of trouble. UTF-8 was popular for two reasons. First, it matched the standard codepages for the first 127 characters and so most existing HTML and XML would match it. Second, it was designed to use as few bytes as possible, which mattered a lot back when it was designed and many people were still using dial-up modems.
    UTF-8 borrowed from the DBCS designs from the Asian codepages. The first 128 bytes are all single byte representations of characters. Then for the next most common set, it uses a block in the second 128 bytes to be a double byte sequence giving us more characters. But wait, there's more. For the less common there's a first byte which leads to a series of second bytes. Those then each lead to a third byte and those three bytes define the character. The original design went up to 6 byte sequences, although the current standard stops at 4. Using the MBCS (multi-byte character set) you can write the equivalent of every Unicode character. And assuming what you are writing is not a list of seldom used Chinese characters, do it in fewer bytes.
    The first part of that paragraph is odd. The first 128 characters of unicode, all unicode, is based on ASCII. The representational format of UTF8 is required to implement unicode, thus it must represent those characters. It uses the idiom supported by variable width encodings to do that.
    But here is what everyone trips over – they have an HTML or XML file, it works fine, and they open it up in a text editor. Then, in their text editor, using the codepage for their region, they insert a character like ß and save the file. Of course it must be correct – their text editor shows it correctly. But feed it to any program that reads according to the encoding and that is now the first character of a 2 byte sequence. You either get a different character or, if the second byte is not a legal value for that first byte – an error.
    Not sure what you are saying here. If a file is supposed to be in one encoding and you insert invalid characters into it then it is invalid. End of story. It has nothing to do with html/xml.
    Point 2 – Always create HTML and XML in a program that writes it out correctly using the encoding. If you must create with a text editor, then view the final file in a browser.
    The browser still needs to support the encoding.
    Now, what about when the code you are writing will read or write a file? We are not talking binary/data files where you write it out in your own format, but files that are considered text files. Java, .NET, etc. all have character encoders. The purpose of these encoders is to translate between a sequence of bytes (the file) and the characters they represent. Let's take what is actually a very difficult example – your source code, be it C#, Java, etc. These are still by and large "plain old text files" with no encoding hints. So how do programs handle them? Many assume they use the local code page. Many others assume that all characters will be in the range 0 – 127 and will choke on anything else.
    I know java files have a default encoding - the specification defines it. And I am certain C# does as well.
    Point 3 – Always set the encoding when you read and write text files. Not just for HTML & XML, but even for files like source code. It's fine if you set it to use the default codepage, but set the encoding.
    It is important to define it. Whether you set it is another matter.
    Point 4 – Use the most complete encoder possible. You can write your own XML as a text file encoded for UTF-8. But if you write it using an XML encoder, then it will include the encoding in the meta data and you can't get it wrong. (it also adds the endian preamble to the file.)
    Ok, you're reading & writing files correctly but what about inside your code. What there? This is where it's easy – unicode. That's what those encoders created in the Java & .NET runtime are designed to do. You read in and get unicode. You write unicode and get an encoded file. That's why the char type is 16 bits and is a unique core type that is for characters. This you probably have right because languages today don't give you much choice in the matter.
    Unicode character escapes are replaced prior to actual code compilation. Thus it is possible to create strings in java with escaped unicode characters which will fail to compile.
    Point 5 – (For developers on languages that have been around awhile) – Always use unicode internally. In C++ this is called wide chars (or something similar). Don't get clever to save a couple of bytes, memory is cheap and you have more important things to do.
    No. A developer should understand the problem domain represented by the requirements and the business and create solutions that are appropriate to that. Thus there is absolutely no point for someone who is creating an inventory system for a stand-alone store to craft a solution that supports multiple languages.
    And another example is with high volume systems moving/storing bytes is relevant. As such one must carefully consider each text element as to whether it is customer consumable or internally consumable. Saving bytes in such cases will impact the total load of the system. In such systems incremental savings impact operating costs and marketing advantage with speed.

  • UTF-8 Encoding errors during nightly batch runs

    My boss recently tasked me with researching (and hopefully resolving) why our XML frequently has UTF-8 encoding errors.
    I've been in the IS world for less than a year now so please bear with me when it comes to terms, data flow, etc.
    Overview:
    Our Oracle DB spits out XML for the nightly batch runs into a file location, let's say C:\xPression\CustomerData\Certificate.xml. The XML is in Courier New font, but some characters that aren't supported make their way into the XML. The big one is the elongated ' - ' character. Just one instance of this and the entire XML fails.
    When the batch job is run, sometimes there are encoding errors (¿, ¡, -, etc.) and every morning I have to come in, find the invalid character, fix it and have the job re-run.
    I want to know if there's a way to make the XML always come out in the Courier New font, or whether there is a way to convert it.

    I want to know if there's a way so that the XML that comes out is always in the Courier New font, or is there a way to convert it.
    First thing first, an XML file is a text file, it doesn't have a "font" but an encoding.
    The font is the graphical representation of characters and it is related to whatever client tool you're using to view the content, not to the content itself.
    That being said, a lot of fonts do not support the full range of unicode characters so you may get replacement characters in some case.
    We're missing some information to provide an answer :
    - what's the database version?
    - what's the character set of the database?
    - how are you generating and writing the XML to the file ? UTL_FILE, dbms_xslprocessor, dbms_xmldom?
    If the file is generated using UTF-8 encoding then the issue might just be that you're not using a UTF-8-enabled editor.

  • "character conversion error" while parsing xml files

    Hello,
    I'm trying to parse MusicXML (Recordare) files, but I'm getting an exception.
    I'm using the SAX parser (javax.xml.parsers.SAXParser).
    Here is the code I use to instantiate it:
    final javax.xml.parsers.SAXParserFactory saxParserFactory = javax.xml.parsers.SAXParserFactory.newInstance();
    final javax.xml.parsers.SAXParser saxParser = saxParserFactory.newSAXParser();
    final org.xml.sax.XMLReader parser = saxParser.getXMLReader();
    I'm using my own handler, but I get the same exception even if I use org.xml.sax.helpers.DefaultHandler.
    The error I get is:
    Character conversion error: "Illegal ASCII character, 0xc2" (line number may be too low).
    The first few lines of my xml files look like this:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <!DOCTYPE score-partwise
    PUBLIC "-//Recordare//DTD MusicXML 0.6 Partwise//EN"
    "http://www.musicxml.org/dtds/partwise.dtd">
    <score-partwise>
    [...etc...]
    If I delete the <!DOCTYPE ...> line, then I don't get the exception anymore. But the MusicXML files I get (from some other program) always contain this line, and it would be quite some work to delete them from every file manually.
    So does anyone know if there is a way to avoid deleting that line in every file, while still being able to parse the xml files without exceptions?
    Or maybe does anyone know what the exact cause of the exception is? (because I don't know what exactly causes it)
    Thank you in advance.
    Greetz,
    Jipo

    So does anyone know if there is a way to avoid
    deleting that line in every file, while still being
    able to parse the xml files without exceptions?
    OK, this is side-stepping the real problem, but I've used this code to filter out DTD references for other reasons:
       // Requires java.io.* and java.nio.charset.StandardCharsets.
       // Assumes the XML is UTF-8 (as declared in the file above); the original relied on the platform default charset.
       public static InputStream filterOutDTDRef(InputStream in) throws IOException {
          BufferedReader iniReader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
          StringBuilder newXML = new StringBuilder();
          for (String line = iniReader.readLine(); line != null; line = iniReader.readLine())
             newXML.append(line).append("\n");
          in.close();
          // Cut out the first <!DOCTYPE ...> declaration, if present (assumes it has no internal subset containing '>').
          int s = newXML.indexOf("<!DOCTYPE ");
          if (s != -1)
             newXML.replace(s, newXML.indexOf(">", s) + 1, "");
          return new ByteArrayInputStream(newXML.toString().getBytes(StandardCharsets.UTF_8));
       }
    and it actually speeds up the parsing phase too (since the DTD refs were on the web and the XML standard mandates that there is a fetch for each xml file parsed).
    You can feed the above into the InputSource constructor that takes an InputStream argument.
    Now for the real problem... 0xc2 is "LATIN CAPITAL LETTER A WITH CIRCUMFLEX" according to a unicode chart, which is not an ASCII character (as the error message correctly reports). I'm not sure why the file is being parsed as ASCII though? You could try passing a FileReader to the InputSource and hope it picks up the default character encoding of your system, and that that character encoding matches the file. Or you could try passing in a Reader constructed with an explicit character encoding (e.g. an InputStreamReader with "UTF8") and see if that does the trick?
    asjf
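
The second suggestion above, a reader with an explicit encoding, could look roughly like the following sketch. It uses the same SAX API as the question; the class name, helper name and sample document are made up, and it does not attempt to diagnose why the DTD line triggers the original error:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ExplicitEncodingParse {
    // Decode the bytes ourselves so the parser receives characters and never guesses the encoding.
    static String allText(byte[] xmlBytes, Charset charset) {
        final StringBuilder text = new StringBuilder();
        try {
            Reader reader = new InputStreamReader(new ByteArrayInputStream(xmlBytes), charset);
            SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(reader), new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        byte[] xml = "<note>caf\u00E9</note>".getBytes(StandardCharsets.UTF_8);
        System.out.println(allText(xml, StandardCharsets.UTF_8).equals("caf\u00E9"));
    }
}
```

For a file on disk, the ByteArrayInputStream would be replaced by a FileInputStream; the charset passed in still has to match how the file was actually written.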

  • Socket Adapter Request-Reply ~ Error occured in processing client request ~

    Dear Friends,
    Need your help in resolving a Issue regarding Socket Adapter Request/Reply.
    We have a requirement to receive messages from an external vendor using socket-based communication. To achieve this, I created a process which has a Socket Adapter inbound synchronous Request-Reply. I have also created a sample outbound service to test it, and it was able to send and receive messages successfully.
    When I try to receive a message from the external vendor on port 8008, I am unable to receive the message coming to the SOA Server (no BPEL instance is created). In the logs I can see the error below:
    Socket Adapter ClientProcessor:run() Error occured in processing client request
    Socket Schema Translation Error.
    Error while trying to translate from native.
    Please ensure that the schemas are set up with native annotations and comply with the output XML. Contact Oracle support if error is not fixable.
    Please find my input XSD below:
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:nxsd="http://xmlns.oracle.com/pcbpel/nxsd"
    xmlns:tns="http://TargetNamespace.com/InboundService"
    targetNamespace="http://TargetNamespace.com/InboundService"
    elementFormDefault="qualified"
    attributeFormDefault="unqualified"
    nxsd:version="NXSD"
    nxsd:stream="chars"
    nxsd:encoding="US-ASCII"
    >
    <xsd:element name="R1">
    <xsd:complexType>
    <xsd:sequence>
    <xsd:element name="C1" type="xsd:string" nxsd:style="terminated" nxsd:terminatedBy="${eol}" nxsd:quotedBy="&quot;" />
    </xsd:sequence>
    </xsd:complexType>
    </xsd:element>
    </xsd:schema>
    Sample incoming message:
    <?xml version="1.0" encoding="UTF-8"?>
    <ExitRequest>
    <ExitRequestID>1234</ExitRequestID>
    <Timestamp>28-11-2012 01:19:11</Timestamp>
    <ActiveTagData>23456</ActiveTagData>
    <DriverID>5555</DriverID>
    <LicensePlate>6546</LicensePlate>
    <DriverName> Sujit Nair</DriverName>
    <DriverDOB>06-06-2012</DriverDOB>
    <DriverEmployer>Testing</DriverEmployer>
    <DriverSex>Male</DriverSex>
    <DriverLang>ENGLISH</DriverLang>
    <DriverNationality>TEST</DriverNationality>
    <LaneID>Gate Testing</LaneID>
    <CardReaderID>07700</CardReaderID>
    </ExitRequest>
    Can anyone please help me with this? I still can't understand what I am doing wrong. Please let me know if any other details are required.
    Thanks,
    Sujit Nair

    Hi,
    You don't need an NXSD (Native Format Builder schema) to receive a message that is already XML, and the NXSD you listed above has nothing to do with the sample incoming message. So it is no surprise that it complains "Error while trying to translate from native.".
    You have to fix the incoming message element on the Socket Adapter.
    Cheers,
    Vlad

  • JAXB: Encoding error (Malformed UTF-8, trying to set ISO)

    Hi, I get this error when doing an Unmarshaller.unmarshal( anURL );
    DefaultValidationEventHandler: [WARNING]: Declared encoding "ISO-8859-1" does not match actual one "UTF-8"; this might not be an error.
    DefaultValidationEventHandler: [FATAL_ERROR]: Character conversion error: "Malformed UTF-8 char -- is an XML encoding declaration missing?" (line number may be too low).
    org.xml.sax.SAXParseException: Character conversion error: "Malformed UTF-8 char -- is an XML encoding declaration missing?" (line number may be too low).
         at org.apache.crimson.parser.InputEntity.fatal(InputEntity.java:1100)
    (...) etc.
    So far, I have set the ISO-8859-1 encoding in 4 places:
    1. the xml response from the servlet: xml.append("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n\n")
    2. the original XSD from which the JAXB classes were generated
    3. In the client, after receiving the string containing the XML: byte b[] = xmlString.getBytes("ISO-8859-1");
    4. In the client, when posting to the server: httpConn.setRequestProperty("Content-Type","text/xml;charset=ISO-8859-1");
    Yet, JAXB still thinks I'm doing UTF-8 here! Did I forget something?
    thanks,
    bjorn

    bump .. no one knows why this happens?
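
    Not a confirmed diagnosis, but the warning in the log suggests that the bytes reaching the parser are being treated as UTF-8 while the declaration says ISO-8859-1. One thing worth checking is that every place that turns the XML string into bytes uses the same charset that the declaration names. The sketch below illustrates that rule with the plain JAXP DOM parser instead of JAXB, to stay self-contained; DeclaredEncodingCheck and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class DeclaredEncodingCheck {

    // Build a document whose declaration names the given charset and encode
    // it with that same charset, so declaration and bytes agree.
    // The text is assumed to contain no XML markup characters (no escaping here).
    static byte[] toXmlBytes(String text, Charset charset) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>"
                + "<msg>" + text + "</msg>";
        return xml.getBytes(charset);
    }

    // Parse from a byte stream: in this case the parser relies on the declaration.
    static String readText(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        byte[] latin1 = toXmlBytes("r\u00e9sum\u00e9", StandardCharsets.ISO_8859_1);
        System.out.println(readText(latin1));
    }
}
```

    If the string were encoded with one charset while the declaration named another, the parser would either fail or silently produce wrong characters, which is the kind of symptom reported above.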

  • Character Encoding for IDOC to JMS scenario with foreign characters

    Dear Experts,
    The scenario is described as follows:
    Issue Description:
    There is an IDOC which is created after extracting data from different countries (but only one country at a time). So, for instance, the first time the data is picked up in Greek and Latin and the corresponding IDOC is created and sent to PI, the next time in plain English, the next in Chinese, and so on. As of now, every time this IDOC reaches PI it comes with UTF-8 character encoding, as seen in the IDOC XML.
    I am converting this IDOC XML into a single-string flat file (currently using the default encoding, UTF-8) and sending it to the receiver JMS queue (MQ Series). When the end recipient picks this data up from the corresponding queue in MQ Series, they see ? wherever there are Greek/Latin characters (maybe because that data should have a different encoding, such as ISO-8859-7). This is causing issues at their end.
    My Understanding
    The SAP system should trigger the IDOC with the right code page, i.e. if the IDOC is sent with Greek/Latin characters the code page should be ISO-8859-7, if the same IDOC is sent with Chinese characters it should be the corresponding code page, and otherwise UTF-8 or the default code page.
    Once this is sent correctly from SAP, the Java mapping would have to use the correct code page when writing the bytes to the output stream, and then we would also need to set the right code page as a JMS header before putting the message in the JMS queue, so that the receiver can interpret it.
    Queries:
    1. Is my approach for the scenario correct? If not, please guide me to the right approach.
    2. Does SAP support a different code page being picked for the same IDOC based on different data sets? If so, how is it achieved?
    3. What is the JMS header property to set the right code page? (I think there should be some JMS header defined by MQ Series for character encoding which I should be setting correctly.) I find that there is a property to set the CCSID in the JMS Receiver Adapter, but that only refers to non-ASCII names and doesn't refer to the payload content.
    I would appreciate it if anybody could give me pointers on how to resolve this issue.
    Thanks,
    Pratik

    Hi Pratik,
         http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42?quicklink=index&overridelayout=true
    This link might help.
    regards
    Anupam

  • Character encoding conversion for marshall/unmarshall?

    Hello, Java Web Services gurus,
    I am wondering if there is an easy/pluggable way to do character encoding conversion transparently in the process of marshalling/unmarshalling.
    Basically, my input/output will always be UTF-8 XML. As the backend database is ISO encoded, I hope the result of unmarshalling will give me ISO strings. And when it comes to marshalling, the ISO strings can be transparently turned into a UTF-8 XML response. Right now I'm using JAXB's annotations to parse XML into objects.
    I understand there may be characters in the input file that cannot be converted; if so, I'd expect an error/exception that flags the failure.
    I hope that is clear. This has been a headache for a while. I really hope someone can help out a bit. Thanks a million in advance.

    [Duplicate Post|http://forums.sun.com/thread.jspa?messageID=10971554&tstart=0#10971554]
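
    A note on the "error/exception that flags the failure" part of the question: a Java String has no encoding of its own (it is always UTF-16 internally), so the conversion only happens where characters become bytes. A minimal sketch, assuming the target is ISO-8859-1 and that a failure should be raised before the value is written to the database; StrictLatin1 is a hypothetical helper, not part of JAXB.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictLatin1 {

    // Encode to ISO-8859-1 and fail loudly, instead of silently writing '?',
    // when a character has no ISO-8859-1 representation.
    static byte[] encode(String value) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer out = encoder.encode(CharBuffer.wrap(value));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // First value fits in ISO-8859-1, second one (Chinese) does not.
        for (String s : new String[] {"Jos\u00e9", "\u4e2d\u6587"}) {
            try {
                System.out.println(s + " -> " + encode(s).length + " bytes");
            } catch (CharacterCodingException e) {
                System.out.println(s + " cannot be represented in ISO-8859-1");
            }
        }
    }
}
```

    By contrast, String.getBytes(charset) replaces unmappable characters with the charset's default replacement byte and raises no error, which is why the explicit CharsetEncoder is used here.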

  • Character Encoding in XML

    Hello All,
    I am not clear about how to solve this problem.
    We have a Java application on NT that is supposed to communicate with the same application on an MVS mainframe through XML.
    We specify a character encoding for the XML commands we send for communication.
    The problem is that on MVS the parser does not understand the US-ASCII character encoding, and so we are getting the infamous "illegal character" error.
    The mainframe's file.encoding = CP1047 and
    NT's file.encoding = us-ascii.
    Is there any character encoding that is common to these two machines, mainframe and NT?
    If it is Unicode, what is the correct notation for it?
    Or is there any way of telling the parsers which character encoding should be used?
    thanks,
    Sridhar

    On the mainframe end, maybe something like:
    FileInputStream fris = new FileInputStream("C:\\whatever.xml");
    InputStreamReader is = new InputStreamReader(fris, "ASCII"); // or maybe "us-ascii" / "US-ASCII"
    BufferedReader brin = new BufferedReader(is);
    Or give the input stream/buffered reader to whatever application you are using to parse the XML. The InputStreamReader should let you set the encoding even if it is not the system's native encoding. It depends, though, on which/whose JVM you are using; JDK 1.2 at least supports the encodings listed on this page: http://as400bks.rochester.ibm.com/pubs/html/as400/v4r4/ic2924/info/java/rzaha/javaapi/intl/encoding.doc.html

  • c:import character encoding problem (utf-8)

    Aloha @ all,
    I am currently importing a file using the <c:import> functionality (<c:import url="module/item.jsp" charEncoding="UTF-8">), but it seems that the returned data is not encoded in UTF-8 and hence is not displayed correctly. The overall response header is:
    HTTP/1.1 200 OK
    Server: Apache-Coyote/1.1
    Set-Cookie: JSESSIONID=E67F9DAF44C7F96C0725652BEA1713D8;
    Content-Type: text/html;charset=UTF-8
    Content-Length: 6861
    Date: Thu, 05 Jul 2007 04:18:39 GMT
    Connection: close
    I've set the file encoding on all pages to:
    <%@ page contentType="text/html;charset=UTF-8" %>
    <%@ page pageEncoding="UTF-8"%>
    but the error remains... is this a known bug and is there a workaround?

    Partially, yes. It turns out that I created the documents in Eclipse with a different character encoding. Hence the documents were not actually UTF-8 encoded...
    So I changed each document's encoding in Eclipse to UTF-8 and got it working just fine...

  • Character conversion error: "Unconvertible UTF-8 character beginning with 0

    Hi All,
    I developed an Adapter Module and added to Adapter Framework.
    package sample;
    import java.io.InputStream;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import javax.ejb.CreateException;
    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;
    import com.sap.aii.af.mp.module.Module;
    import com.sap.aii.af.mp.module.ModuleContext;
    import com.sap.aii.af.mp.module.ModuleData;
    import com.sap.aii.af.mp.module.ModuleException;
    import com.sap.aii.af.ra.ms.api.Message;
    import com.sap.aii.af.ra.ms.api.XMLPayload;
    /**
     * @ejbHome <{com.sap.aii.af.mp.module.ModuleHome}>
     * @ejbLocal <{com.sap.aii.af.mp.module.ModuleLocal}>
     * @ejbLocalHome <{com.sap.aii.af.mp.module.ModuleLocalHome}>
     * @ejbRemote <{com.sap.aii.af.mp.module.ModuleRemote}>
     * @stateless
     */
    public class SetAttachmentName implements SessionBean, Module {
         private SessionContext myContext;
         private String mailFileName = "UStN";
         public void ejbRemove() {
         }
         public void ejbActivate() {
         }
         public void ejbPassivate() {
         }
         public void setSessionContext(SessionContext context) {
              myContext = context;
         }
         public void ejbCreate() throws CreateException {
         }
         public ModuleData process(
              ModuleContext moduleContext,
              ModuleData inputModuleData)
              throws ModuleException {
              // create a second attachment for the receiver mail adapter
              try {
                   //                  get the XI message from the environment
                   Message msg = (Message) inputModuleData.getPrincipalData();
                   //               creating parsable XML document
                   InputStream XIStreamData = null;
                   XMLPayload xmlpayload = msg.getDocument();
                   XIStreamData = xmlpayload.getInputStream();
                   DocumentBuilderFactory docBuilderFactory =
                        DocumentBuilderFactory.newInstance();
                   DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
                   Document doc = docBuilder.parse(XIStreamData);
                   //            finding the tag's name from the Modules tab in the Directory that will hold the attachment's name
                   String absenderIDTag = null;
                   absenderIDTag = moduleContext.getContextData("<RCVPRN>");
                   //            finding the content of the tag that will be used as the attachment's name (assuming it's the only tag with this name)
                   Element element = doc.getDocumentElement();
                   NodeList list = doc.getElementsByTagName(absenderIDTag);
                   mailFileName += "_" + list.item(0).getFirstChild().toString();
                   String anIDTag = null;
                   anIDTag = moduleContext.getContextData("<CREDAT>");
                   element = doc.getDocumentElement();
                   list = doc.getElementsByTagName(anIDTag);
                   mailFileName += "_" + list.item(0).getFirstChild().toString();
                   Date date = new Date(System.currentTimeMillis());
                   //            Add date to the Message
                   SimpleDateFormat dateFormat = new SimpleDateFormat("yyyyMMdd");
                   mailFileName += "_" + dateFormat.format(date);
                   String belegNummerTag = null;
                   belegNummerTag = moduleContext.getContextData("<BULK_REF>");
                   element = doc.getDocumentElement();
                   list = doc.getElementsByTagName(belegNummerTag);
                   mailFileName += "_" + list.item(0).getFirstChild().toString();
                   //               creating the attachment
                   byte by[] = xmlpayload.getText().getBytes();
                   XMLPayload attachmentPDF = msg.createXMLPayload();
                   attachmentPDF.setName(mailFileName);
                   attachmentPDF.setContentType("application/pdf");
                   attachmentPDF.setContent(by);
                   //adding the message to the attachment
                   msg.addAttachment(attachmentPDF);
                   //                  provide the XI message for returning
                   inputModuleData.setPrincipalData(msg);
              } catch (Exception e) {
                   // raise exception, when an error occurred
                   ModuleException me = new ModuleException(e);
                   throw me;
              }
              // return XI message
              return inputModuleData;
         }
    }
    I get the following error
    Character conversion error: "Unconvertible UTF-8 character beginning with 0xaa" (line number may be too low).
    Any tips, pointers ?
    Thanks in Advance
    Mukhtar

    Hi Henrique,
    I am using .getNodeValue()
    import java.io.InputStream;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import javax.ejb.CreateException;
    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;
    import com.sap.aii.af.mp.module.*;
    import com.sap.aii.af.ra.ms.api.*;
    /**
     * @ejbHome <{com.sap.aii.af.mp.module.ModuleHome}>
     * @ejbLocal <{com.sap.aii.af.mp.module.ModuleLocal}>
     * @ejbLocalHome <{com.sap.aii.af.mp.module.ModuleLocalHome}>
     * @ejbRemote <{com.sap.aii.af.mp.module.ModuleRemote}>
     * @stateless
     */
    public class UStNAttachmentName3 implements SessionBean, Module {
         private SessionContext myContext;
         private String mailFileName = "UStN";
         public void ejbRemove() {
         }
         public void ejbActivate() {
         }
         public void ejbPassivate() {
         }
         public void setSessionContext(SessionContext context) {
              myContext = context;
         }
         public void ejbCreate() throws CreateException {
         }
         public ModuleData process(
              ModuleContext moduleContext,
              ModuleData inputModuleData)
              throws ModuleException {
              // create a second attachment for the receiver mail adapter
              try {
                   // get the XI message from the environment
                   Message msg = (Message) inputModuleData.getPrincipalData();
                   // creating parsable XML document
                   InputStream XIStreamData = null;
                   XMLPayload xmlpayload = msg.getDocument();
                   XIStreamData = xmlpayload.getInputStream();
                   DocumentBuilderFactory docBuilderFactory =
                        DocumentBuilderFactory.newInstance();
                   DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
                   Document doc = docBuilder.parse(XIStreamData);
                   // finding the tag's name from the Modules tab in the Directory that will hold the attachment's name
                   String absenderIDTag = null;
                   absenderIDTag = moduleContext.getContextData("<RCVPRN>");
                   // finding the content of the tag that will be used as the attachment's name (assuming it's the only tag with this name)
                   Element element = doc.getDocumentElement();
                   NodeList list = doc.getElementsByTagName(absenderIDTag);
                   mailFileName += "_" + list.item(0).getFirstChild().getNodeValue();
                   String anIDTag = null;
                   anIDTag = moduleContext.getContextData("<CREDAT>");
                   element = doc.getDocumentElement();
                   list = doc.getElementsByTagName(anIDTag);
                   mailFileName += "_" + list.item(0).getFirstChild().getNodeValue();
                   Date date = new Date(System.currentTimeMillis());
                   // Add date to the Message
                   SimpleDateFormat dateFormat = new SimpleDateFormat("yyyyMMdd");
                   mailFileName += "_" + dateFormat.format(date);
                   String belegNummerTag = null;
                   belegNummerTag = moduleContext.getContextData("<BULK_REF>");
                   element = doc.getDocumentElement();
                   list = doc.getElementsByTagName(belegNummerTag);
                   mailFileName += "_" + list.item(0).getFirstChild().getNodeValue();
                   // creating the attachment
                   byte by[] = xmlpayload.getText().getBytes();
                   XMLPayload attachmentPDF = msg.createXMLPayload();
                   attachmentPDF.setName(mailFileName);
                   attachmentPDF.setContentType("application/pdf");
                   attachmentPDF.setContent(by);
                   //adding the message to the attachment
                   msg.addAttachment(attachmentPDF);
                   // provide the XI message for returning
                   inputModuleData.setPrincipalData(msg);
              } catch (Exception e) {
                   // raise exception, when an error occurred
                   ModuleException me = new ModuleException(e);
                   throw me;
              }
              // return XI message
              return inputModuleData;
         }
    }
    Still I get the same error.
    org.xml.sax.SAXParseException: Character conversion error: "Unconvertible UTF-8 character beginning with 0xaa" (line number may be too low).
    Adapter-Framework: Character conversion error: "Unconvertible UTF-8 character beginning with 0xaa" (line number may be too low).
    Regards,
    Mukhtar
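
    The message says the parser expected UTF-8 and met the byte 0xaa where a character should start; in UTF-8, 0xaa can only appear as a continuation byte, so the payload is probably in some other encoding (or lacks a matching encoding declaration). That is a guess from the error text, not a confirmed cause. A minimal sketch for checking a byte array outside the Adapter Framework, using only the standard java.nio.charset API; Utf8Check is a hypothetical helper name.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {

    // Returns the offset of the first byte that is not valid UTF-8,
    // or -1 if the whole array decodes cleanly.
    static int firstInvalidOffset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(1024);
        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                // The input position is left at the start of the bad sequence.
                return in.position();
            }
            if (result.isUnderflow()) {
                return -1; // all input consumed without errors
            }
            out.clear(); // output buffer full: discard decoded chars and go on
        }
    }

    public static void main(String[] args) {
        byte[] good = "Stra\u00dfe".getBytes(StandardCharsets.UTF_8);
        byte[] bad = {'<', 'a', '>', (byte) 0xAA, '<', '/', 'a', '>'};
        System.out.println(firstInvalidOffset(good)); // -1
        System.out.println(firstInvalidOffset(bad));  // 3
    }
}
```

    Separately, note that getBytes() without an argument, as used in the module when creating the attachment, encodes with the JVM's default charset, so the attachment bytes may differ from the original payload bytes.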

  • Character encoding in Netweaver Developer Studio

    Hi all!
    I've migrated an EP5E project to P6 and it worked fine. But now I am using another workstation, and while trying to open a Java file of the migrated project I got
    "Error Encoding Problem, this file is unreading using the UTF-8 character encoding".
    The Java file contains German characters like "ä".
    I'm using SAP NetWeaver Developer Studio Version: 2.0.5
    Build id: 200404200353
    Does anyone know, how to set Character Encoding in NetWeaver Developer Studio?
    Thank You

    I've found the solution:
    Changing the encoding used to show the source
    To change the encoding used by the Java editor to display source files:
    With the Java editor open, select Edit > Encoding from the menu bar
    Select an encoding from the menu or select Others and, in the dialog that appears, type in the encoding's name.
    Note: this setting affects only the way the source is presented.
    To change the encoding that the Java editor uses when saving files, specify a text file encoding preference on Window > Preferences > Workbench > Editors.
