10g AS & FOP SAX Parsers

Hi,
I'm deploying an application that uses FOP on 10g AS. FOP uses SAX parsers to generate PDF documents. On 10g I get the following error:
2008-01-28 18:05:22,648[<UNKNOWN>][DEBUG] FOIOHelper.convertFOToPDF run
2008-01-28 18:05:22,649[<UNKNOWN>][INFO] Using oracle.xml.parser.v2.SAXParser as SAX2 Parser
2008-01-28 18:05:22,649[<UNKNOWN>][INFO] building formatting object tree
2008-01-28 18:05:22,649[<UNKNOWN>][INFO] setting up fonts
2008-01-28 18:05:22,652[<UNKNOWN>][ERROR] Exception>>
java.lang.NullPointerException
at org.apache.fop.fo.PropertyManager.getTextDecoration(PropertyManager.java:365)
at org.apache.fop.fo.FObjMixed.<init>(FObjMixed.java:72)
Sometimes, when the batch process that calls this code runs on the same server without a restart, I get the following error instead:
2008-01-28 15:56:44,798[<UNKNOWN>][DEBUG] FOIOHelper.convertFOToPDF run
2008-01-28 15:56:44,799[<UNKNOWN>][INFO] Using org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser as SAX2 Parser
2008-01-28 15:56:44,809[<UNKNOWN>][INFO] building formatting object tree
2008-01-28 15:56:44,812[<UNKNOWN>][INFO] setting up fonts
2008-01-28 15:56:44,907[<UNKNOWN>][ERROR] Exception>>
java.lang.NullPointerException
at org.apache.fop.fo.PropertyManager.getTextDecoration(PropertyManager.java:365)
at org.apache.fop.fo.FObjMixed.<init>(FObjMixed.java:72)
This code works in the dev env which is RAD on Websphere, using xerces, and I have been told it used to work on Oracle 9.
I have the xerces jar in the orion-application.xml as a library path, and in the manifest classpath of the application jar in the ear, but Oracle seems to be randomly running different SAX parsers. Is there a way to 100% guarantee only the jar files in the ear are run by the application, as I know these work?
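
One way to narrow this down is to log which factory JAXP actually resolves, and to ask for a specific factory class by name instead of relying on class-loader lookup order. The following is only a sketch: the class name `ParserCheck` is made up for illustration, `SAXParserFactory.newInstance(String, ClassLoader)` needs Java 6 or later, and whether 10g AS class loading lets the application's own Xerces jar win is something to verify on the server rather than assume.

```java
import javax.xml.parsers.SAXParserFactory;

/** Hypothetical helper: reports the JAXP SAX factory and can request one by name. */
final class ParserCheck {

    /** Name of the factory class JAXP resolves right now. */
    static String resolvedFactory() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    /**
     * Requests one specific factory class instead of going through the usual
     * lookup (system property, service-provider files, platform default).
     */
    static SAXParserFactory pinned(String factoryClassName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return SAXParserFactory.newInstance(factoryClassName, cl);
    }

    public static void main(String[] args) {
        System.out.println("JAXP resolved: " + resolvedFactory());
        // Assumption: xercesImpl.jar is visible to the context class loader.
        // SAXParserFactory f = pinned("org.apache.xerces.jaxp.SAXParserFactoryImpl");
    }
}
```

On older JVMs the same request can be made with the `javax.xml.parsers.SAXParserFactory` system property, but that setting is JVM-wide and affects every application in the container.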


Similar Messages

  • SAX parsers hanging (crimson and xerces) on InputStream

    Hi,
    While searching I've seen a lot of postings about sax parsers hanging on InputStreams. Many have no replies, some do. So far I've tried several fixes mentioned to no avail. I've seen it hang in the startDocument() and hang after reading the entire document. All sorts of hangs . . . Oh and in some situations it works fine, but that's not good enough.
    1. I tried wrapping the InputStream as described at the jdom.org FAQ with my own InputStream that hides the close() call. Then I pass my wrapper InputStream to the saxparser.parse() method. This didn't work, because you need to wrap SocketInputStream, not InputStream and SocketInputStream has package permissions which keep me from inheriting from it. If somebody can shed some light on how to do this, I'll try again, but for now that approach is shelved.
    2. I tried adding an EOF character, 0x1A, to the end of the xml string being sent over the socket. The parser still hangs.
    Any more suggestions?
    Thanks,
    Steve

    Here's a faq question and answer from jdom.org. Can anybody explain how to do this in more detail? He mentions two workarounds. I've done the second and it works fine, but I'd prefer to get the first workaround working instead -- the wrapper InputStream idea.
    Why does passing a document through a socket sometimes hang the parser?
    The problem is that several XML parsers close the input stream when they read EOF (-1). This is true of Xerces, which is JDOM's default parser. It is also true of Crimson. Unfortunately, closing a SocketInputStream closes the underlying SocketImpl, setting the file descriptor to null. The socket's output stream is useless after this, so your application will be unable to send a response. To workaround, protect your socket's input stream with an InputStream wrapper that doesn't close the underlying stream (override the close() method), or read everything into a buffer before handing off to the JDOM builder:
    byte[] buf = new byte[length];
    new DataInputStream(inputStream).readFully(buf);
    InputStream in = new ByteArrayInputStream(buf);
    (Contributed by Joseph Bowbeer)
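
    The first workaround does not require subclassing SocketInputStream. A wrapper around any InputStream is enough, because the parser only ever sees the wrapper. A minimal sketch, assuming the class name `NonClosingInputStream` (made up here) and that the caller closes the socket itself once the response has been sent:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Wrapper whose close() leaves the wrapped stream (and its socket) open. */
final class NonClosingInputStream extends FilterInputStream {

    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() throws IOException {
        // Deliberately empty: the parser may call close() at EOF, but the
        // owner of the socket decides when the underlying stream closes.
    }
}
```

    It would be used as `parser.parse(new NonClosingInputStream(socket.getInputStream()), handler)`. Note that this only addresses the closed-socket problem the FAQ describes. A parser that blocks because the sender never signals the end of the document is a separate issue, and the buffering workaround handles that one only because the length is known in advance.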

  • Meaningful error handling with Sax Parsers (Apache Xerces)

    Hello All,
    I am using the Apache Xerces Parser 2.6.2 to parse some XML documents with a W3C XML Schema document.
    My schema has a lot of rules enforced on the different elements and I was looking at some way to get userfriendly/meaningful prescriptive messages as output back to the user so that he/she can correct the errors in their XML files and have files conformant to the schema.
    If I give it an invalid XML file (with errors I introduced myself), the parser throws a SAXParseException with a message like:
    XML Document has Error: cvc-type.3.1.3: The value 'ANZc9901' of element 'uniqueid' is not valid. line number 6, column number 42
    My XSD file
    <xs:element name="uniqueid">
    <xs:simpleType>
    <xs:restriction base="xs:token">
    <xs:pattern value="\s*ANZ[0-9]{4}\s*"/>
    </xs:restriction>
    </xs:simpleType>
    </xs:element>
    My XML File's uniqueID element
    <uniqueid>ANZc9901</uniqueid>
    As you can see, the error message isn't of much use. I was looking at a way to handle them generically, and give out a suggestive error message based on the type of the error. Any help in this regard would be highly appreciated

    well, it seems pretty obvious to me... Not sure how you are going to do a whole lot better.
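
    One generic approach is to key off the error code at the start of the message (here `cvc-type.3.1.3`) and map it to a hint written for end users. A rough sketch follows; the class `FriendlyErrors` and the hint texts are invented for illustration, and it assumes the parser's message begins with the error key followed by a colon, as in the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical translator from schema-validation messages to user hints. */
final class FriendlyErrors {

    // Illustrative hints keyed by error-key prefix; extend as needed.
    private static final Map<String, String> HINTS = new LinkedHashMap<>();
    static {
        HINTS.put("cvc-pattern-valid", "The value does not match the required format.");
        HINTS.put("cvc-type", "The value is not allowed for this element.");
        HINTS.put("cvc-complex-type", "The element has missing, extra or misplaced content.");
    }

    /** Returns the text before the first ':' (the error key), or "" if absent. */
    static String errorKey(String message) {
        int colon = message.indexOf(':');
        return colon < 0 ? "" : message.substring(0, colon).trim();
    }

    /** Builds a user-facing line; falls back to the raw message for unknown keys. */
    static String describe(String message, int line, int column) {
        String key = errorKey(message);
        String where = "Line " + line + ", column " + column + ": ";
        for (Map.Entry<String, String> e : HINTS.entrySet()) {
            if (key.startsWith(e.getKey())) {
                return where + e.getValue() + " (" + message + ")";
            }
        }
        return where + message;
    }
}
```

    Inside an `org.xml.sax.ErrorHandler`, the `error(SAXParseException e)` callback could pass `e.getMessage()`, `e.getLineNumber()` and `e.getColumnNumber()` to `describe`.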

  • PDF generation with FOP throws NPE

    Hi,
    Stack trace
    at oracle.xml.jaxp.JXTransformer.reportXSLException(JXTransformer.java:769)
    at oracle.xml.jaxp.JXTransformer.transform(JXTransformer.java:342)
    I'm using standard Java 1.4 API calls to create a PDF document from a DOM with Apache FOP 0.20.5.
    The servlet is deployed to Oracle 10g Application Server 9.0.4.0.0. With Tomcat 4.1.30 the same code executes without exceptions
    Because I have to port our web application to Oracle for a new customer it is important for me to get this fixed soon.
    Thank you
    Eckard

    Light at the end of the tunnel: it seems to work when FO and PDF generation are separated. Until now the DOM was piped into FOP, which internally created the FO from SAX events and then seemed to parse it again to create the PDF.
    I separated the two steps: There is no problem in FO generation, but due to a namespace problem subsequent parsing fails
    ("org.apache.fop.apps.FOPException: null:2:10 Root element is missing the namespace declaration: http://www.w3.org/1999/XSL/Format")
    Now the correct exception comes up. Whether FOP SAX event listener is buggy or XML parser's error handling has a problem, I don't know
    Thank you anyway
    Eckard

  • Is there any possibility to combine XPath with SAX in Java?

    HI Gentlemen,
    I have an XML instance to parse. One solution works with XPath and Document builder. However, the tree in memory is too big so that I can not build it in my storage (8 GB). Does anyone of you know a method where I use an XPath expression (Java) to select a node but with a better parser (e g SAX) which is not so space hungry? Direct access of nodes is obligatory.
    Thanks, kind regards from
    Miklos HERBOLY

    SAX parsers do not build a DOM structure, and standard XPath evaluation needs a tree to select nodes from, so XPath cannot be used directly with SAX. Some libraries, however, let you register XPath expressions before invoking the SAX parser and then report the matches as the document streams through.
    Refer
    https://code.google.com/p/xpath4sax/
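
    For simple absolute paths, the same idea can be sketched by hand: track the current element path in a SAX handler and collect text only while the path matches. This is not XPath (no predicates, axes or wildcards), and the class `PathCollector` is made up for illustration, but memory use stays proportional to the matched text rather than to the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Collects the text of every element whose path equals a simple /a/b/c path. */
final class PathCollector extends DefaultHandler {
    private final String target;
    private final StringBuilder path = new StringBuilder();
    private final List<String> matches = new ArrayList<>();
    private StringBuilder text; // non-null only while inside a matching element

    PathCollector(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        path.append('/').append(qName);
        if (path.toString().equals(target)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (text != null) {
            text.append(ch, start, length); // chunks are appended, never assumed complete
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (text != null && path.toString().equals(target)) {
            matches.add(text.toString());
            text = null;
        }
        path.setLength(path.length() - qName.length() - 1); // pop "/qName"
    }

    static List<String> select(String xml, String target) throws Exception {
        PathCollector handler = new PathCollector(target);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.matches;
    }
}
```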

  • Problem with sax parser

    Hello..
    I have the following problem. When I parse an XML document containing blank spaces and decimal numbers, a value sometimes comes out as one string and sometimes as two. For example, "First A" sometimes comes out as "First" and "A" and sometimes as "First A", which is how it's stored in the XML file. The same happens with numbers like 19.20. I'm enclosing a little of my code:
    public void characters(char buf[], int offset, int len)
    throws SAXException
    if (textBuffer != null) {
    SaveString = ""+textBuffer;
    if(i>-1)
    numbers = SaveString;
    What's wrong, and how do I fix it?
    Best Regards Dan
    PS: I have more code, input data and output data if needed. DS

    Hello,
    I do not know if this is your problem, yet please find hereafter an excerpt of the SAX API:
    public void characters(char[] ch,
                           int start,
                           int length)
                    throws SAXException
    ... SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks;...
    ... Note that some parsers will report whitespace in element content using the ignorableWhitespace method rather than this one (validating parsers must do so)...
    In other words, I am afraid that your issue is the "standard behaviour" of a SAX parser.
    I hope it helps.

  • SAX XML Parser problem

    Hi,
    I have a standard .xml file and a file called TemplateContentHandler that parses the info in the xml doc but there is an error in it. See below:
    TemplateContentHandler.java
    if ( strElementName.equals( TemplateConstants.RNC_Module_LDN_TU1 ) ) {
        timingUnit1TagExistInDefaultTemplateFile = true;
        System.err.println("timingUnit1TagExistInDefaultTemplateFile for TU1 == "+timingUnit1TagExistInDefaultTemplateFile);
        System.err.println("strValue for TU1 == "+strValue);
        tTemplate.setRNC_Module_LDN_TU1( strValue );
        Trace.exit();
        return;
    }
    if ( strElementName.equals( TemplateConstants.RNC_Module_LDN_TU2 ) ) {
        timingUnit2TagExistInDefaultTemplateFile = true;
        System.err.println("timingUnit2TagExistInDefaultTemplateFile for TU2 == "+timingUnit2TagExistInDefaultTemplateFile);
        System.err.println("strValue for TU2 == "+strValue);
        tTemplate.setRNC_Module_LDN_TU2( strValue );
        Trace.exit();
        return;
    }
    DefaultTemplate
    <RNC_Module_LDN_TU2>ManagedElement=1,Equipment=1,Subrack=MS,Slot=5,PlugInUnit=1</RNC_Module_LDN_TU2>
    <RNC_Module_LDN_TU1>ManagedElement=1,Equipment=1,Subrack=MS,Slot=4,PlugInUnit=1</RNC_Module_LDN_TU1>
    OutPut
    root@atrcus13> timingUnit2TagExistInDefaultTemplateFile for TU2 == true
    strValue for TU2 == ManagedElement=1,Equipment=1,Subrack=MS,Slot=5,PlugInUnit=1
    timingUnit1TagExistInDefaultTemplateFile for TU1 == true
    strValue for TU1 == ManagedElement=1,Equipment=1,Subrack=MS,SlottimingUnit1TagExistInDefaultTemplateFile for TU1 == true
    strValue for TU1 == =4,PlugInUnit=1
    As you can see, it parses the tag <RNC_Module_LDN_TU1> twice and splits it into 2 pieces. Does anyone know why this is? Also, I added a dirty hack which works:
    private int timimngUnit1Tagcounter = 0;
    private int timimngUnit2Tagcounter = 0;
    if( timimngUnit1Tagcounter == 0) {
        tempTU1 = strValue;
        timimngUnit1Tagcounter++;
        Trace.message("FIRST TIME STRVALUE === "+strValue);
        tTemplate.setRNC_Module_LDN_TU1( strValue );
    }
    else {
        Trace.message("SECOND TIME STRVALUE === " + tempTU1+strValue);
        tTemplate.setRNC_Module_LDN_TU1( tempTU1+strValue );
        timimngUnit1Tagcounter = 0;
    }
    but now the parser is screwing up other tags in the file; anyone have any ideas?
    Thanks,
    Vinnie

    In which method is the code you've posted called?
    Given your results, I'd suspect that it's in characters, and strValue is a string created from the character data passed in.
    in that case, read
    http://java.sun.com/j2se/1.4.2/docs/api/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int)
    in particular
    The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
    The accepted pattern when you want to process all character data between tags in one go is to use a StringBuffer to accumulate the characters, then process them in the following endElement or startElement method.
    Pete
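
    That pattern might look like the sketch below. The class `TextAccumulator` and its "name=value" output are invented for illustration, and it assumes the values of interest sit in leaf elements (for mixed content the buffer would have to be managed per nesting level).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Accumulates character chunks and handles the full value in endElement. */
final class TextAccumulator extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        buffer.setLength(0); // a new element starts: discard text seen so far
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        // Only here is the element's text known to be complete.
        String value = buffer.toString().trim();
        if (!value.isEmpty()) {
            values.add(qName + "=" + value);
        }
        buffer.setLength(0);
    }

    static List<String> parse(String xml) throws Exception {
        TextAccumulator handler = new TextAccumulator();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

    With this shape it no longer matters whether the parser delivers a value in one call or in several.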

  • SAX parser problem (very odd)

    Hi,
    I'm trying to parse an XML file using SAX. It worked fine until I tested with a larger file (about 12MB). In my characters() implementation I load the value into an object, but the value that arrives in characters() (the value of the element) is wrong: it arrives, but with fewer bytes.
    Explanation:
    I print the offset and length of each element value with System.out. Most values are fine, but some arrive one byte short:
    value : blabla , offset : 456 , length : 6
    value : blabl , offset : 6662 , length : 5
    Does anyone know what the hell is going on in this class?
    PS: I've extended the class org.xml.sax.helpers.DefaultHandler.
    PS2: the XML file is fine!! The values are OK!!!

    From the documentation for the characters method of org.xml.sax.ContentHandler:
    "SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks..."

  • SAX Parser method characters trouble

    Hi,
    I am using a SAX parser to load an XML file of over 1 GB. The problem is that the content gets truncated at random.
    The truncation happens in the characters method of the SAX parser. I am sure of this because I logged the content I get from the characters method before starting my processing. For example:
    <content>signal1,signal960</content>
    <content>signal1,signal970</content>
    On parsing the above snippet, the first content "signal1,signal960"
    is extracted completely, but the next one is truncated, and the truncation is random. This happens for some tags, then extraction resumes normally, and the truncation starts occurring again after a few more tags.
    Also, the first truncation occurred only after parsing around 200 MB of the 1 GB file.
    Could anyone tell me whether there is some limitation in Xerces or Crimson,
    or whether there is another parser I can use?
    Regards,
    R.

    To quote from the Documentation of ContentHandler.characters:
    -- start quote --
    The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
    -- end quote --
    So providing the content in two separate calls is perfectly valid and must be handled by your code. It's probably a result of the internal workings of the XML parser, and allowing it in the parser specification probably permitted some optimizations that would otherwise be impossible (a constant buffer size, for example).

  • SAX and DOM - treating encoding differently ??

    Hello!
    I have run into a strange problem - I was parsing my xml document using
    WebLogic's DOM and SAX parsers to compare their performance, and I found
    that while using their DOM parser I was able to parse the document just
    fine, using the SAX parser gave me an error:
    org.xml.sax.SAXParseException: Declared encoding "UTF-8" does not match
    actual one "ISO8859_1"
    I don't use any extended characters in the xml document, and even if I
    did - I'm puzzled as to why DOM parser would process it without
    complaining but the SAX parser would not??
    Thanks,
    Marina
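
    A guess at the cause, not a confirmed diagnosis: this kind of message tends to appear when the parser is handed a Reader that has already decoded the file with the platform default encoding (ISO8859_1 here), while the XML declaration says UTF-8. If the two code paths build their input differently, that would explain why only one complains. The sketch below hands the parser raw bytes so that it applies the declared encoding itself; the class name `ByteStreamParse` is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Parses from a byte stream instead of a pre-decoded Reader. */
final class ByteStreamParse {

    /** Returns all character data; the parser decodes the bytes itself. */
    static String textOf(byte[] xmlBytes) throws Exception {
        final StringBuilder text = new StringBuilder();
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }
}
```

    For a file on disk the equivalent would be `new InputSource(new FileInputStream(file))` rather than wrapping a `FileReader`.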


  • SAX Parser Gotcha

    For what it's worth, Java SAX XML parsing had/has a killer gotcha that one must compensate for, or it does not work.
    Essentially, a SAX parser has five routines one has to program:
    startDocument
    startElement
    characters
    endElement
    endDocument
    The characters routine is where the contents of a tag appear.
    Suppose the underlying disk read routine does a read of the
    input stream that does not terminate at a tag; that has to happen
    occasionally. Then all the characters routine will see is part
    of a tag's contents. The next read of the input stream will
    cause characters to be called again with the second half of
    the tag's contents in the buffer. The application has to be
    smart enough to append the second half of the tag's contents
    to the first half, or there will be an error. My solution was
    to initialize a string in startElement, always append to it in
    characters, and call it the contents of the tag in endElement.
    This seems to work for me.
    It took weeks to find and fix this error. And as far as I know
    there is no defense against it, other than the one I outlined above.
    I have read several books on XML and have never seen this problem
    described.
    Has anyone else ever seen this problem and have a better solution
    for it?
    Charles Elliott

    From the API documentation for the characters() method of org.xml.sax.ContentHandler:
    The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
    You ask if anyone else has ever seen the problem? Actually it's an FAQ in this forum. Appears every couple of weeks. You've found the correct way to handle it. (Although using a StringBuilder instead of a String to accumulate the data might be a tiny bit better.)

  • Problem with SAX parser

    Hi everybody!
    I'm trying to parse an XML file using a SAX parser, but I've run into a problem.
    When the method
    public void characters(char[] ch, int start, int length)
    is called, I try to print the content this way:
    for (int i = start; i <= length; i++)
    System.out.print(ch[i]);
    but nothing is printed except in 2 cases.
    The parsed file starts like this:
    <BatchMaint xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.ncr.com/bosphoenix/WSMaintenance_5.0.xsd">
    <Id>1000950</Id>
    <DateTime>2009-02-11T11:37:09</DateTime>
    <FromPC>TD185008-1J7</FromPC>
    <FromAppl>BatchToPosWS.dll</FromAppl>
    <FromVersion>001.000.000.000</FromVersion>
    The method prints only "1000950" and "2009-02-11T11:37:09".
    Does anybody have any idea how to fix the problem?
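
    A note on the loop as posted: `start` and `length` describe a window inside `ch`, so the loop has to run from `start` up to, but not including, `start + length`. With the condition `i <= length`, nothing is printed whenever `start` is larger than `length`, which would fit output appearing only for a couple of early elements. A minimal sketch of the corrected bounds (the class name `ChunkText` is made up for illustration):

```java
/** Extracts exactly the characters reported by one characters() call. */
final class ChunkText {

    /** Copies ch[start] .. ch[start + length - 1] into a String. */
    static String chunk(char[] ch, int start, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = start; i < start + length; i++) {
            sb.append(ch[i]);
        }
        return sb.toString(); // same result as new String(ch, start, length)
    }
}
```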

    From the API documentation for the characters() method:
    "The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information."

  • Writing contents of XML-documents with SAX

    I have the following problem: I use the SAX API to parse an XML document and write the contents of several tags into field variables.
    For that, I use the method 'characters(char cbuf[], int start, int len)' and read the interesting part of the string with
    'new String(cbuf,start,len)', where cbuf is the character buffer, start the offset in the file and len the length of the string within the tag.
    I noticed the following problem when I load my application from a jar file (with a 13 KB XML file):
    The offset runs up to 8192 bytes (8 KB), then resets to 1 and runs up again. If offset 8192 falls inside the string content of a tag, the string is split in two: the first part up to this offset and the second part after it.
    I have already tried to solve this problem by setting some features of the SAX parser, but I had no success.
    The problem does not occur if I start my application from Oracle JDeveloper.
    Does anyone have an idea? I would be glad of any information or hint that might help.
    Thank you.

    From the javadocs for org.xml.sax.ContentHandler.characters(char[] ch, int start, int length):
    The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
    To get around this:
    - Initialize a StringBuffer field in startElement().
    characters = new StringBuffer();
    - Each time characters() is called, append the section of the char[] to the StringBuffer.
    characters.append(buf, offset, len);
    - Work with the entire resulting value in endElement().
    -Scott
    http://www.swiftradius.com

  • SAX, incor. data for XML element, XML 1.1 doc (1.0 ok) in JRE 6 (JRE 5 ok)

    Hello,
    for some reason, the test case below fails when run with Java 6. It looks like, when long string data is stored in an XML element, DocumentHandler's characters() method is given incorrect data.
    With XML 1.0, parsing works fine in all my tests: Apache Xerces, JRE 5, JRE 6.
    With XML 1.1, parsing works fine with Apache Xerces and JRE 5, but not with JRE 6.
    I've checked it with the latest JRE 6 update 4.
    Is anyone else experiencing such problems with XML 1.1 parsing when using the JAXP bundled with Java 6?
    I've tried to file a bug at bugdatabase, but for some reason I got no response for my issue - so I'm trying the forum now ;-)
    Thanx for comments
    Merten
    import java.io.StringReader;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;
    import junit.framework.TestCase;
    import junit.framework.TestResult;
    import junit.framework.TestSuite;
    /** snippet to demonstrate problem in SAXParser */
    public class ProblemWithSAXParser extends TestCase {
      public static void main(String[] args) {
        TestResult result=junit.textui.TestRunner.run(suite());
        System.exit(result.wasSuccessful() ? 0 : 1);
      }
      private static junit.framework.Test suite() {
        final TestSuite ts=new TestSuite();
        ts.addTestSuite(ProblemWithSAXParser.class);
        return (ts);
      }
      /** small DocumentHandler, just waiting for one and only XML element <c></c> */
      class MySAXDocumentHandler extends DefaultHandler {
        boolean listening=false;
        public void startElement(final String uri, final String localName, final String qName, final Attributes attributes)
          throws SAXException {
          System.out.println("startElement uri " + uri + " localName " + localName + " qName " + qName);
          if("c".equals(qName))
            listening=true;
        }
        public void endElement(final String uri, final String localName, final String qName) throws SAXException {
          System.out.println("endElement uri " + uri + " localName " + localName + " qName " + qName);
          if("c".equals(qName))
            listening=false;
        }
        public void characters(final char ch[], final int start, final int length) throws SAXException {
          if(listening) {
            final String str=new String(ch, start, length);
            System.out.println("<c> element, start==" + start + " length==" + length + " ch.length==" + ch.length + " ch=="
              + str);
            sb.append(str);
          }
        }
        private final StringBuffer sb=new StringBuffer();
        /** return what I got for XML element <c></c> */
        public String toString() {
          return (sb.toString());
        }
      }
      /** test an XML document with an element <c>, use XML 1.0 and XML 1.1 and some (more)
        * content in the XML element */
      public void testWithSunAndApache() throws Exception {
        // test XML element content for <c> element
        final StringBuffer plain_str=new StringBuffer("");
        for(int i=0; i < 500; i++)
          plain_str.append("0123456789 This is some text ").append(i).append(". ");
        final String xml_orig=plain_str.toString();
        // SAX parsers, one Sun, one (original) Apache Xerces
        // accept that Apache Xerces is not in path ...
        final String prop_name="javax.xml.parsers.SAXParserFactory";
        final String prop_val_sun="com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl";
        final String prop_val_apache="org.apache.xerces.jaxp.SAXParserFactoryImpl";
        System.setProperty(prop_name, prop_val_sun);
        final SAXParser sax_sun=SAXParserFactory.newInstance().newSAXParser();
        System.out.println("SaxParser Sun= " + sax_sun);
        assertTrue(("" + sax_sun).indexOf("com.sun.") >= 0);
        System.setProperty(prop_name, prop_val_apache);
        SAXParser sax_apache;
        try {
          sax_apache=SAXParserFactory.newInstance().newSAXParser();
        } catch(final Throwable t) {
          System.err.println("no Apache Xerces in path? " + t);
          sax_apache=null;
        }
        System.out.println("SaxParser Apache= " + sax_apache);
        assertTrue(sax_apache==null || ("" + sax_apache).startsWith("org.apache."));
        // i==0: XML 1.0, i==1: XML 1.1
        for(int i=0; i <= 1; i++) {
          assert i == 0 || i == 1;
          final String xml_version=(i == 0 ? "1.0" : "1.1");
          final StringBuffer sb=new StringBuffer("<?xml version=\"" + xml_version + "\" encoding=\"UTF-8\"?><c>");
          sb.append(xml_orig);
          sb.append("</c>");
          final String xml=sb.toString();
          // parse it!
          final StringReader string_reader_sun, string_reader_apache;
          string_reader_sun=new StringReader(xml);
          final InputSource input_source_sun=new InputSource(string_reader_sun);
          string_reader_apache=new StringReader(xml);
          final InputSource input_source_apache=new InputSource(string_reader_apache);
          final MySAXDocumentHandler my_handler_sun, my_handler_apache;
          my_handler_sun=new MySAXDocumentHandler();
          my_handler_apache=new MySAXDocumentHandler();
          sax_sun.parse(input_source_sun, my_handler_sun);
          if(sax_apache!=null)
            sax_apache.parse(input_source_apache, my_handler_apache);
          final String xml_sun=my_handler_sun.toString();
          final String xml_apache=my_handler_apache.toString();
          assertNotNull(xml_sun);
          assertNotNull(xml_apache);
          System.out.println("xml version " + xml_version);
          System.out.println("xml " + xml);
          System.out.println("xml_orig " + xml_orig);
          System.out.println("xml_sun " + xml_sun);
          System.out.println("xml_apache " + xml_apache);
          // test the data returned from DocumentHandler
          if(sax_apache!=null)
            assertEquals(xml_orig, xml_apache);
          assertEquals(xml_orig.length(), xml_sun.length()); // length seems to be okay
          assertEquals(xml_orig, xml_sun); // content seems to be not okay for XML 1.1
        }
      }
    }

    thanx, DrClap, haven't seen the "code" button for some reason, code is attached again
    The output of my test is like this for JRE6 (truncated)
    xml_sun 456789 This is is some text 0. 0123456789 This is some text 1.
    xml_apache 0123456789 This is some text 0. 0123456789 This is some text 1.
    Notice, xml_sun not only starts wrong ("0123" missing) - for some reason, the text there is not "This is some text" but "This is is some text" - very interesting in a way. I guess there's an issue with repeated text chunks ...
    So, the data in my XML element is kind of changed, the length of the string is okay but the characters are "misplaced" or so :-) I thought about a problem with StringBuffer first, but the StringBuffer works fine for Apache ... so I really think there's an issue with JRE 6's parser for XML 1.1
    Merten
    import java.io.StringReader;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;
    import junit.framework.TestCase;
    import junit.framework.TestResult;
    import junit.framework.TestSuite;
    /** snippet to demonstrate problem in SAXParser */
    public class ProblemWithSAXParser extends TestCase
    public static void main(String[] args)
      TestResult result=junit.textui.TestRunner.run(suite());
      System.exit(result.wasSuccessful() ? 0 : 1);
    private static junit.framework.Test suite()
      final TestSuite ts=new TestSuite();
      ts.addTestSuite(ProblemWithSAXParser.class);
      return (ts);
    /** small DocumentHandler, just waiting for one and only XML element <c></c> */
    class MySAXDocumentHandler extends DefaultHandler
      boolean listening=false;
      public void startElement(final String uri, final String localName, final String qName, final Attributes attributes)
        throws SAXException
       System.out.println("startElement uri " + uri + " localName " + localName + " qName " + qName);
       if("c".equals(qName))
        listening=true;
      public void endElement(final String uri, final String localName, final String qName) throws SAXException
       System.out.println("endElement uri " + uri + " localName " + localName + " qName " + qName);
       if("c".equals(qName))
        listening=false;
      public void characters(final char ch[], final int start, final int length) throws SAXException
       if(listening)
        final String str=new String(ch, start, length);
        System.out.println("<c> element, start==" + start + " length==" + length + " ch.length==" + ch.length + " ch=="
          + str);
        sb.append(str);
      private final StringBuffer sb=new StringBuffer();
      /** return what I got for XML element <c></c> */
      public String toString()
       return (sb.toString());
    /** test an XML document with an element <c>, use XML 1.0 and XML 1.1 and some (more)
      * content in the XML element */
    public void testWithSunAndApache() throws Exception
    {
      // test XML element content for <c> element
      final StringBuffer plain_str=new StringBuffer("");
      for(int i=0; i < 500; i++)
       plain_str.append("0123456789 This is some text ").append(i).append(". ");
      final String xml_orig=plain_str.toString();
      // SAX parsers, one Sun, one (original) Apache Xerces
      // accept that Apache Xerces is not in path ...
      final String prop_name="javax.xml.parsers.SAXParserFactory";
      final String prop_val_sun="com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl";
      final String prop_val_apache="org.apache.xerces.jaxp.SAXParserFactoryImpl";
      System.setProperty(prop_name, prop_val_sun);
      final SAXParser sax_sun=SAXParserFactory.newInstance().newSAXParser();
      System.out.println("SaxParser Sun= " + sax_sun);
      assertTrue(("" + sax_sun).indexOf("com.sun.") >= 0);
      System.setProperty(prop_name, prop_val_apache);
      SAXParser sax_apache;
      try
      {
       sax_apache=SAXParserFactory.newInstance().newSAXParser();
      }
      catch(final Throwable t)
      {
       System.err.println("no Apache Xerces in path? " + t);
       sax_apache=null;
      }
      System.out.println("SaxParser Apache= " + sax_apache);
      assertTrue(sax_apache==null || ("" + sax_apache).startsWith("org.apache."));
      // i==0: XML 1.0, i==1: XML 1.1
      for(int i=0; i <= 1; i++)
      {
       assert i == 0 || i == 1;
       final String xml_version=(i == 0 ? "1.0" : "1.1");
       final StringBuffer sb=new StringBuffer("<?xml version=\"" + xml_version + "\" encoding=\"UTF-8\"?><c>");
       sb.append(xml_orig);
       sb.append("</c>");
       final String xml=sb.toString();
       // parse it!
       final StringReader string_reader_sun, string_reader_apache;
       string_reader_sun=new StringReader(xml);
       final InputSource input_source_sun=new InputSource(string_reader_sun);
       string_reader_apache=new StringReader(xml);
       final InputSource input_source_apache=new InputSource(string_reader_apache);
       final MySAXDocumentHandler my_handler_sun, my_handler_apache;
       my_handler_sun=new MySAXDocumentHandler();
       my_handler_apache=new MySAXDocumentHandler();
       sax_sun.parse(input_source_sun, my_handler_sun);
       if(sax_apache!=null)
        sax_apache.parse(input_source_apache, my_handler_apache);
       final String xml_sun=my_handler_sun.toString();
       final String xml_apache=my_handler_apache.toString();
       assertNotNull(xml_sun);
       assertNotNull(xml_apache);
       System.out.println("xml version " + xml_version);
       System.out.println("xml        " + xml);
       System.out.println("xml_orig   " + xml_orig);
       System.out.println("xml_sun    " + xml_sun);
       System.out.println("xml_apache " + xml_apache);
       // test the data returned from DocumentHandler
       if(sax_apache!=null)
        assertEquals(xml_orig, xml_apache);
       assertEquals(xml_orig.length(), xml_sun.length()); // length seems to be okay
       assertEquals(xml_orig, xml_sun); // content seems to be not okay for XML 1.1
      }
    }

  • Some strange SAX behavior

    Hi,
    I'm somewhat new to SAX processing. I have a simple program that uses the SAX parser. In testing this code, I've noticed some behavior that I find strange. Can anyone verify whether what I describe is expected behavior?
    1) Parsing an element with no data, like <element></element>. I'm finding that the characters method is still being called when it sees elements like this. Furthermore, when I extract the supposed data using:
    String data = new String(p_buff, p_start, p_length);
    What I get is a newline, '\n'.
    2) SAX appears to drop data when it parses an element that contains an entity. For example, the characters method returns only an "R" when it parses an XML field like:
    <genre>R&B/Soul</genre>
    Where & is actually the ampersand character, followed by "amp;"
    Are these bugs/features, or is more likely that I'm coding it incorrectly?

    1) Parsing an element with no data, like <element></element>. I'm finding that the characters
    method is still being called when it sees elements like this. Furthermore, when I extract the
    supposed data using:
    String data = new String(p_buff, p_start, p_length);
    What I get is a newline, '\n'.
    The newline and space characters that sit between tags inside the root element are called ignorable whitespace. A SAX parser passes them to your handler through either the characters() or the ignorableWhitespace() callback. Your example <element></element> does not generate any ignorable whitespace, however, so let's look at this as an example (4 spaces before each child tag):
    <root>
        <child>
        </child>
    </root>
    Here, the ignorable whitespace is:
    a newline after <root> - will be attributed to element <root>
    4 spaces before <child> - will be attributed to element <root>
    a newline after <child> - will be attributed to element <child>
    4 spaces before </child> - will be attributed to element <child>
    a newline after </child> - will be attributed to element <root>
    There are two categories of SAX parsers: validating and non-validating. Validating parsers must report ignorable whitespace through the ignorableWhitespace() callback. Non-validating parsers can use the same method or the characters() callback (yours apparently does the latter).
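    To see this for yourself, here is a minimal sketch that records which whitespace is reported while each element is the innermost open one. It assumes the default JAXP parser that ships with the JDK and no DTD; the class WhitespaceDemo and the method textByElement are made-up names for illustration, not part of SAX. Both callbacks feed the same buffer, so it does not matter which one your parser picks.

    ```java
    import java.io.StringReader;
    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.helpers.DefaultHandler;

    class WhitespaceDemo {
        /** Hypothetical helper: maps each element name to the text reported while it was innermost. */
        static Map<String, String> textByElement(String xml) {
            final Map<String, StringBuilder> text = new LinkedHashMap<>();
            final Deque<String> open = new ArrayDeque<>(); // stack of currently open elements
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    open.push(qName);
                    text.putIfAbsent(qName, new StringBuilder());
                }
                @Override
                public void endElement(String uri, String localName, String qName) {
                    open.pop();
                }
                @Override
                public void characters(char[] ch, int start, int length) {
                    if (!open.isEmpty()) {
                        text.get(open.peek()).append(ch, start, length);
                    }
                }
                @Override
                public void ignorableWhitespace(char[] ch, int start, int length) {
                    // Treat both callbacks alike; a validating parser would end up here.
                    characters(ch, start, length);
                }
            };
            try {
                SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            } catch (Exception e) {
                throw new IllegalStateException("parse failed", e);
            }
            Map<String, String> result = new LinkedHashMap<>();
            text.forEach((name, sb) -> result.put(name, sb.toString()));
            return result;
        }

        public static void main(String[] args) {
            String xml = "<root>\n    <child>\n    </child>\n</root>";
            // Print newlines as \n so the whitespace is visible.
            textByElement(xml).forEach((name, value) ->
                System.out.println(name + " -> [" + value.replace("\n", "\\n") + "]"));
        }
    }
    ```

    If you do not care about this whitespace, the usual approach is to trim or skip whitespace-only text in your handler rather than expect the parser to drop it.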
    2) SAX appears to drop data when it parses an element that contains an entity. For example,
    the characters method returns only an "R" when it parses an XML field like:
    <genre>R&B/Soul</genre>
    Where & is actually the ampersand character, followed by "amp;"
    SAX parser implementations have some freedom here as well. A parser can report an element's content in a single characters() callback or split it across several (Xerces, which I use, calls the method 3 times for your example: once for R, once for &, and once for B/Soul). If you only see R and nothing else, look at your characters() implementation first; a common cause is keeping only the data from one call instead of appending every call. A faulty parser is possible but unlikely.
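    A rough sketch of the usual fix, assuming the JDK's default JAXP parser: append every characters() chunk to a buffer and only read the buffer once the element has ended. The class GenreTextDemo and the method elementText are made-up names for illustration. How many chunks you get depends on the parser, so the code prints them instead of relying on a count.

    ```java
    import java.io.StringReader;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.helpers.DefaultHandler;

    class GenreTextDemo {
        /** Hypothetical helper: returns the accumulated text of the elements named elementName. */
        static String elementText(String xml, final String elementName) {
            final StringBuilder buffer = new StringBuilder();
            DefaultHandler handler = new DefaultHandler() {
                private boolean inside = false;

                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    if (elementName.equals(qName)) {
                        inside = true;
                    }
                }
                @Override
                public void endElement(String uri, String localName, String qName) {
                    if (elementName.equals(qName)) {
                        inside = false;
                    }
                }
                @Override
                public void characters(char[] ch, int start, int length) {
                    if (inside) {
                        // May be called several times per element: append, never overwrite.
                        System.out.println("chunk: " + new String(ch, start, length));
                        buffer.append(ch, start, length);
                    }
                }
            };
            try {
                SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            } catch (Exception e) {
                throw new IllegalStateException("parse failed", e);
            }
            return buffer.toString();
        }

        public static void main(String[] args) {
            System.out.println("text: " + elementText("<genre>R&amp;B/Soul</genre>", "genre"));
        }
    }
    ```

    A handler that does something like data = new String(ch, start, length) on each call keeps only one chunk, which would explain seeing a lone "R" if that is the only chunk you look at.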
    So, relax, everything is normal SAX parser behaviour.
