StringIndexOutOfBoundsException with SAX

Hi,
I am facing this error while parsing a file with the Java SAX parser: "StringIndexOutOfBoundsException - string index out of range -84".
I am calling the Google AdWords API and writing the response data to an XML file, which I then parse to extract the data and load it into a database.
I think there is some bad data in the response, but I am not sure how to handle this error, and I need to parse the file. The file is also fairly large (more than 30 MB), and using DOM gives an out-of-memory error, so I have to use SAX.
Can anyone suggest a solution or workaround so that I can parse the complete file successfully?

With very little to go on, let me risk a guess.
Check very carefully how you have implemented the characters method. There is no guarantee that the parser will only call that method once. It may make several calls to the method in order to process what looks to you like it should be a single String. Check the javadocs for that method and you will see a suggestion to use a StringBuilder or StringBuffer to which you append the contents of each call. There is a version of the append method that uses the same three parameters (char[], int, int) as is provided in the characters method. When you get to the endElement method, you can use toString() to convert the StringBuilder or StringBuffer to a String.
A suggestion. Since you are processing a large file, instantiate the StringBuilder or StringBuffer once, perhaps in the constructor of the class that implements the ContentHandler interface. Specify a capacity in the constructor that is larger than the length of any of the Strings that will be processed. Then use "setLength( 0 )" in the startElement method to reset the length without repeatedly creating a new object.
Dave Patterson
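
A minimal sketch of the approach described above, in case it helps. The class name TextCollectingHandler, the "name=value" output format, and the sample XML are all made up for illustration; the only point is that characters() appends to one reusable StringBuilder and the text is read in endElement().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of each element as "qName=text".
class TextCollectingHandler extends DefaultHandler {
    // One buffer, allocated once and reused for every element.
    private final StringBuilder text = new StringBuilder(256);
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // reset without creating a new object
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one run of text, so always append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(qName + "=" + value);
        }
        text.setLength(0); // avoid reporting the same text again for the parent
    }

    static List<String> parse(String xml) throws Exception {
        TextCollectingHandler handler = new TextCollectingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<rows><row>a &amp; b</row><row>c</row></rows>"));
    }
}
```

Building a String directly from a single characters() call and then indexing into it is a common way to end up with a StringIndexOutOfBoundsException, because the value may arrive in pieces (entity references such as &amp; are a typical split point).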

Similar Messages

  • Is there any possibility to combine XPath with SAX in Java?

    Hi Gentlemen,
    I have an XML instance to parse. One solution works with XPath and a DocumentBuilder. However, the in-memory tree is too big to build in my storage (8 GB). Does anyone know a method where I can use an XPath expression (Java) to select a node, but with a less memory-hungry parser (e.g. SAX)? Direct access to nodes is obligatory.
    Thanks, kind regards from
    Miklos HERBOLY

    As SAX parsers do not build a DOM structure, and XPath normally needs a tree to select elements from, XPath is not directly usable with SAX. Some tools, however, let you register the XPath expressions to evaluate before invoking the SAX parser, and then report the matches as the document streams through.
    Refer to:
    https://code.google.com/p/xpath4sax/
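
    If only simple absolute paths are needed, a hand-rolled handler can stand in for XPath. The sketch below is not XPath and does not use the library linked above; PathMatchingHandler, its select method, and the sample path are hypothetical names. It tracks the current element path during the SAX pass and collects the text of elements whose path equals the target.

    ```java
    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.helpers.DefaultHandler;

    // Hypothetical helper: matches plain paths such as "/library/book/title".
    // No predicates, axes, or wildcards are supported.
    class PathMatchingHandler extends DefaultHandler {
        private final String targetPath;
        private final StringBuilder path = new StringBuilder(); // path of the open element
        private final StringBuilder text = new StringBuilder();
        private final List<String> matches = new ArrayList<>();
        private boolean insideMatch;

        PathMatchingHandler(String targetPath) {
            this.targetPath = targetPath;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.append('/').append(qName);
            if (targetPath.contentEquals(path)) {
                insideMatch = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (insideMatch) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (targetPath.contentEquals(path)) {
                matches.add(text.toString());
                insideMatch = false;
            }
            // Drop "/qName" from the end of the current path.
            path.setLength(path.length() - qName.length() - 1);
        }

        static List<String> select(String xml, String targetPath) throws Exception {
            PathMatchingHandler handler = new PathMatchingHandler(targetPath);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.matches;
        }
    }
    ```

    Memory use stays proportional to the nesting depth and the matched text rather than to the document size, which is the reason to prefer this over building a tree.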

  • Changing xmlschema while parsing with sax

    Hello world,
    I'm using the SAX parser (Xerces) and I want to combine different schema files to parse an xml-String:
    <root>
         <intervall>1,5</intervall>
         <nix>jetzt echt nix!</nix>
         <testtag>
              <text>huhu joe</text>
         </testtag>
         <theLast>das letzte element</theLast>
    </root>
    E.g. I want to parse the tag testtag with another schema.
    Does anyone have an idea or some sample code?
    Thank you very much

    Thank you for the reply, but I want to change the schema file while parsing the xmlString with SAX, without manipulating the xmlString;
    the xml-String I get is fixed;
    wish you all a great Sunday

  • NullPointerException with SAX

    I have developed a CSV to XML parser using a JAXP with SAX Events to parse the CSV file into a DOM tree.
    Inside the parse() method I have the following code:
    public void parse(InputSource input) throws IOException, SAXException
    {
        BufferedReader br = null;
        // Use whichever source the caller supplied, in order of preference.
        if( input.getCharacterStream() != null )
            br = new BufferedReader( input.getCharacterStream() );
        else if( input.getByteStream() != null )
            br = new BufferedReader( new InputStreamReader( input.getByteStream() ) );
        else if( input.getSystemId() != null )
        {
            URL url = new URL( input.getSystemId() );
            br = new BufferedReader( new InputStreamReader( url.openStream() ) );
        }
        else
            throw new SAXException( "Objeto InputSource invalido" ); // "Invalid InputSource object"
        ContentHandler ch = getContentHandler();
        ch.startDocument();
        ch.startElement( "", "", "file", new AttributesImpl() );
        this.parseInput( br );
        ch.endElement( "", "", "file" );
        ch.endDocument();
    }
    The problem is that whenever the app gets to the ch.startDocument() statement it throws a java.lang.NullPointerException. I have no idea why this is happening. I have tested the very same code with the Xalan 2 and Xerces 2 parsers and it works without problems, but the Oracle XML parser v2 throws the exception.
    Is this a bug? Should I set some of the Transformer's attributes to a specific value to avoid this? Where could I find more info on processing SAX events?
    Thanks,
    Fedro

    Fedro,
    Did you try it using XDK v10?

  • Edit an XML file with SAX

    Dear all, I am so confused...
    I have been trying for the last few days to understand how SAX works... The only thing I understood is:
    DefaultHandler handler = new Echo01();
    SAXParserFactory factory = SAXParserFactory.newInstance();
    try {
        out = new OutputStreamWriter(System.out, "UTF8");
        SAXParser saxParser = factory.newSAXParser();
        saxParser.parse(file, handler);
    } catch (Throwable t) {
        t.printStackTrace();
    }
    System.exit(0);
    OK, I assign the SAXParser the XML file and a handler. The parser parses and fires events that the handler catches. By implementing some handler interface or overriding the methods of an existing handler (e.g. the DefaultHandler class) I get to do stuff...
    But still, suppose I have implemented the startElement() method of the DefaultHandler class and I know that the parser is currently positioned on an element, e.g. <name>bob</name>. How do I get the value of the element, and if I manage to do that, how can I replace "bob" with "tom"?
    I would really appreciate any help given... just don't recommend http://java.sun.com/webservices/jaxp/dist/1.1/docs/tutorial/ because although there is interesting stuff in there, it does not solve my problem...

    Maybe SAX is not the right tool for you.
    With SAX, you implement methods like startElement and characters that get called as XML data is encountered by the parser. If you want to catch it or not, the SAX parser does not care. In your case, the "bob" part will be passed in one or more calls to characters. To safely process the data, you need to do something like build a StringBuffer or StringBuilder in the constructor of the class, and then in the startElement, if the name is one you want to read, set the length to zero. In the characters method, append the data to the StringBuilder or StringBuffer. In the endElement, do a toString to keep the data wherever you want.
    This works for simple XML, but may need to be enhanced if you have nested elements with string values that contain other elements.
    On the other hand, if your file is not huge, you could use DOM. With DOM, (or with JDOM, and I would expect with Dom4J -- but I have only used the first two) you do a parse and get a Document object with the entire tree. That allows you to easily (at least it is easy once you figure out how to do it) find a node like the "name" element and change the Text object that is its child from a value of "bob" to "tom". With DOM, you can then serialize the modified Document tree and save it as an XML file. SAX does not have any way to save your data. That burden falls to you entirely.
    Dave Patterson
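
    For the DOM route mentioned above, a rough sketch might look like the following. DomTextEditor and replaceText are invented names, and it assumes the document is small enough to hold in memory; it parses, replaces the text of every element with the given tag name, and serializes the tree again with an identity Transformer.

    ```java
    import java.io.StringReader;
    import java.io.StringWriter;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    // Hypothetical helper for editing element text through DOM.
    class DomTextEditor {
        static String replaceText(String xml, String tagName, String newValue) throws Exception {
            // Parse the whole document into a tree.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Replace the text content of every element with the given tag name.
            NodeList nodes = doc.getElementsByTagName(tagName);
            for (int i = 0; i < nodes.getLength(); i++) {
                nodes.item(i).setTextContent(newValue);
            }

            // Serialize the modified tree back to XML.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        }

        public static void main(String[] args) throws Exception {
            System.out.println(replaceText("<person><name>bob</name></person>", "name", "tom"));
        }
    }
    ```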

  • How to add attribute to Element with SAX

    Hi,
    I'm parsing XML document with SAX using DefaultHandler.
    How can I add attribute to start tag?

    Is this right????????????
    Yes, it's right. Everything everybody except you has said in this thread has been right.
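
    Since the thread itself shows no code, here is one common way to do it, offered as a sketch rather than as what the posters had in mind. A SAX handler cannot modify the source document, but an XMLFilterImpl subclass can sit between the parser and the next handler and pass along a modified copy of the attributes. AddAttributeFilter, the "item" element, and the "reviewed" attribute are made-up names.

    ```java
    import java.io.StringReader;
    import java.io.StringWriter;
    import javax.xml.parsers.SAXParserFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.sax.SAXSource;
    import javax.xml.transform.stream.StreamResult;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.AttributesImpl;
    import org.xml.sax.helpers.XMLFilterImpl;

    // Hypothetical filter: adds reviewed="true" to every <item> start tag.
    class AddAttributeFilter extends XMLFilterImpl {
        AddAttributeFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            // The Attributes object passed in is read-only, so work on a copy.
            AttributesImpl copy = new AttributesImpl(atts);
            if ("item".equals(qName) && copy.getIndex("reviewed") < 0) {
                copy.addAttribute("", "reviewed", "reviewed", "CDATA", "true");
            }
            super.startElement(uri, localName, qName, copy);
        }

        // Runs the filter and serializes the resulting event stream back to XML.
        static String rewrite(String xml) throws Exception {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader filter = new AddAttributeFilter(factory.newSAXParser().getXMLReader());
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        }
    }
    ```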

  • Processing unfinished stream with SAX

    Hi,
    I'm writing a kind of Jabber plugin in Java. I've decided to use SAX for parsing server responses, but I've run into a problem with it.
    saxParser.parse(inputStream, this);
    The problem is that events (such as startElement) are only raised after the connection is closed (the stream's read method returns -1). Is there any way to force SAX to raise an event as soon as the tag is read?
    Any help will be greatly appreciated.
    Regards
    Versor
    Edited by: versor on Nov 2, 2007 3:20 AM

    versor wrote:
    ... Problem is, that events (such as startElement) are called after the connection (streams read method returns with -1) is closed. Is there any way to force sax to raise event as soon as the tag is read? ...
    Fully circumvent the problem parsing the buffered stream.

  • Can I parse non-wellformed XML with SAX at all?

    Hi all,
    I was wondering whether it's possible at all to parse XML that is not well-formed with SAX,
    e.g. an HTML file that doesn't close tags and the like.
    I tried implementing the fatal() method of the handler in a way that it consumes the exception but does not rethrow it.
    I also tried setting the validation property to false. Neither worked.
    Any help would be appreciated.
    thx
    philipp

    Your experiments tell you the answer.
    If you have HTML tag soup, why not just run it through JTidy or HTMLTidy to make it into well-formed XHTML?
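
    To see the behaviour for yourself, a small check along these lines may help. WellFormednessCheck is an invented name. The SAX specification lets a parser stop after it has reported a fatal error, and as far as I know the parser bundled with the JDK does stop, even when the handler's fatalError() swallows the exception.

    ```java
    import java.io.StringReader;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXParseException;
    import org.xml.sax.helpers.DefaultHandler;

    // Hypothetical helper: reports whether the input parses as well-formed XML.
    class WellFormednessCheck {
        static boolean isWellFormed(String xml) throws Exception {
            DefaultHandler swallowing = new DefaultHandler() {
                @Override
                public void fatalError(SAXParseException e) {
                    // Deliberately does not rethrow, as tried in the question.
                }
            };
            try {
                SAXParserFactory.newInstance().newSAXParser()
                        .parse(new InputSource(new StringReader(xml)), swallowing);
                return true;
            } catch (SAXParseException e) {
                // The parser gave up despite the handler swallowing the error.
                return false;
            }
        }
    }
    ```

    Input such as "<p>text<br></p>" is rejected this way, which is why cleaning the markup up first (as suggested above) is the usual route.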

  • BUG? using own EntityResolver with SAX doesn't work

    Hello,
    I was experimenting with the oracle.xml.parser.XMLParser using
    the SAX interface.
    I've written a test program that instantiates a driver and
    registers my own handlers (which just print to System.out).
    I also have my own org.xml.sax.EntityResolver, it looks like
    this:
    public class SAXEntityResolver implements EntityResolver {
        public InputSource resolveEntity(String publicId,
                String systemId) throws SAXException, IOException {
            System.out.println("<<Call to resolveEntity>>");
            try { //assume it's a URL of some sort
                URL url=new URL(systemId);
                return new InputSource(url.openStream());
            } catch (MalformedURLException e1) {
                try { //it's not a URL, assume a file spec
                    FileInputStream fin=new FileInputStream(systemId);
                    return new InputSource(fin);
                } catch (FileNotFoundException e2) {
                    //don't understand it, let the parser handle it.
                    return null;
                }
            }
        }
    }
    When I parse the following XML file:
    <?xml version="1.0"?>
    <!DOCTYPE dinner SYSTEM "dinner.dtd">
    <dinner>
    <location planet="Earth">Alma 3</location>
    <time>12:30</time>
    <date>Vandaag</date>
    </dinner>
    The parser generates an error to my org.xml.sax.ErrorHandler
    which prints it to the screen. The output looks like this:
    [C:\temp\xml]java -cp c:\TEMP\xml\oracle\lib\xmlparser.jar;.
    SAXParseXML oracle.xml.parser.XMLParser dinner.xml
    Locater accepted: oracle.xml.parser.SAXLocator@6ba51a96
    document parsing start
    [error: Couldn't find external DTD 'dinner.dtd']
    element dinner start: null:4:1
    (other output follows with no more errors)
    It seems as if the Oracle XMLParser doesn't use my EntityResolver
    to resolve its external entities (the dinner.dtd file in this
    case, the file is indeed there, trust me!), otherwise it would
    have printed the message seen in the code above (<<Call to
    resolveEntity>>). If you're wondering how I configured the
    systemId in the SAX parser, here's how:
    File f=new File(args[1]);
    InputSource src=new InputSource(new FileInputStream(f));
    src.setSystemId(f.toURL().toString());
    p.parse(src);
    Can you tell me why this is? (I use NT4 with jdk 1.2)
    I've tested the same thing with the IBM, Microstar and Sun
    parsers, and they all seem to work fine with this example...
    Hope to hear from you! (cc in with mail please)
    Erwin.

    Thanks for the post. You have identified a bug which will be
    fixed in a maintenance release. Until that time you can parse a
    String type rather than an InputSource type in SAXParseXML.java as
    a workaround.
    Oracle XML Team
    http://technet.oracle.com
    Erwin Vervaet (guest) wrote:
    : Oracle XML Team wrote:
    : : Which version of the parser are you using? If not 1.0.0.3 (the
    : : latest) try that version. If the problem still exists it would
    : : help if you could provide your test program.
    : The readme.html in the xmlparser_v1_0_0_3.zip file (I downloaded it
    : on Monday 8/2/1999) says: 'Oracle XML Parser 1.0.0.3.0'.
    : So that's not the problem. Below are all the files of the test
    : program. The command I use to start the program is the following
    : (note that there cannot be a classpath clash problem! I use Sun
    : jdk1.2 on NT4 SP4):
    : [C:\temp\xml]dir
    : Volume in drive C is unlabeled Serial number is 2C90:8BDE
    : Directory of C:\temp\xml\*
    : 11/02/99 10:50 <DIR> .
    : 11/02/99 10:50 <DIR> ..
    : 9/02/99 16:34 <DIR> aelfred
    : 9/02/99 21:56 <DIR> oracle
    : 8/02/99 17:44 <DIR> xml-ea2
    : 8/02/99 14:10 <DIR> xml4j
    : 9/02/99 22:44 <DIR> xp
    : 9/02/99 16:42 215 dinner.dtd
    : 9/02/99 23:03 167 dinner.xml
    : 8/02/99 15:23 438 ParseXml.java
    : 11/02/99 10:50 2.402 SAXDocHandler.class
    : 9/02/99 21:14 1.585 SAXDocHandler.java
    : 11/02/99 10:50 1.129 SAXEntityResolver.class
    : 9/02/99 22:04 737 SAXEntityResolver.java
    : 11/02/99 10:50 976 SAXErrHandler.class
    : 9/02/99 15:39 495 SAXErrHandler.java
    : 11/02/99 10:50 1.261 SAXParseXML.class
    : 9/02/99 22:09 629 SAXParseXML.java
    : 10.034 bytes in 11 files and 7 dirs 12.800 bytes allocated
    : 201.152.512 bytes free
    : [C:\temp\xml]java -cp c:\temp\xml\oracle\lib\xmlparser.jar;.
    : SAXParseXML oracle.xml.parser.XMLParser dinner.xml
    : Here are the files:
    : //file SAXErrHandler.java
    : import org.xml.sax.*;
    : public class SAXErrHandler implements ErrorHandler {
    :   public void warning(SAXParseException exception) throws SAXException {
    :     System.err.println("[warning: " + exception + "]");
    :   }
    :   public void error(SAXParseException exception) throws SAXException {
    :     System.err.println("[error: " + exception + "]");
    :   }
    :   public void fatalError(SAXParseException exception) throws SAXException {
    :     System.err.println("[fatal error: " + exception + "]");
    :   }
    : }
    : //file SAXEntityResolver.java
    : import org.xml.sax.*;
    : import java.net.*;
    : import java.io.*;
    : public class SAXEntityResolver implements EntityResolver {
    :   public InputSource resolveEntity(String publicId, String systemId)
    :       throws SAXException, IOException {
    :     System.out.println("<<Call to resolveEntity>> " + publicId + " " + systemId);
    :     try { //assume it's a URL of some sort
    :       URL url=new URL(systemId);
    :       return new InputSource(url.openStream());
    :     } catch (MalformedURLException e1) {
    :       try { //it's not a URL, assume a file spec
    :         FileInputStream fin=new FileInputStream(systemId);
    :         return new InputSource(fin);
    :       } catch (FileNotFoundException e2) {
    :         return null; //don't understand it, let the parser handle it.
    :       }
    :     }
    :   }
    : }
    : //file SAXDocHandler.java
    : import org.xml.sax.*;
    : public class SAXDocHandler implements DocumentHandler {
    :   private Locator locator=null;
    :   public void startDocument() throws SAXException {
    :     System.out.println("document parsing start");
    :   }
    :   public void setDocumentLocator(Locator locator) {
    :     System.out.println("Locater accepted: " + locator);
    :     this.locator=locator;
    :   }
    :   public void startElement(String name, AttributeList atts) throws SAXException {
    :     System.out.println("element " + name + " start: " + locate());
    :     for (int i = 0; i < atts.getLength(); i++)
    :       System.out.println("attribute " + atts.getName(i) + "=" + atts.getValue(i) + " (" + atts.getType(i) + ")");
    :   }
    :   public void characters(char[] ch, int start, int length) throws SAXException {
    :     System.out.println("char data: " + new String(ch,start,length));
    :   }
    :   public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
    :     System.out.println("ignoring some whitespace: " + new String(ch,start,length));
    :   }
    :   public void endElement(String name) throws SAXException {
    :     System.out.println("element " + name + " end: " + locate());
    :   }
    :   public void processingInstruction(String target, String data) throws SAXException {
    :     System.out.println("PI: " + target + "=" + data);
    :   }
    :   public void endDocument() throws SAXException {
    :     System.out.println("document parsing end");
    :   }
    :   private String locate() {
    :     if (locator!=null) {
    :       return locator.getSystemId() + ":" + locator.getLineNumber() + ":" + locator.getColumnNumber();
    :     }
    :     return "";
    :   }
    : }
    : //file SAXParseXML.java
    : import org.xml.sax.*;
    : import org.xml.sax.helpers.ParserFactory;
    : import java.io.*;
    : public class SAXParseXML {
    :   public static void main(String[] args) {
    :     if (args.length>1) {
    :       try {
    :         Parser p=ParserFactory.makeParser(args[0]);
    :         p.setDocumentHandler(new SAXDocHandler());
    :         p.setErrorHandler(new SAXErrHandler());
    :         p.setEntityResolver(new SAXEntityResolver());
    :         File f=new File(args[1]);
    :         InputSource src=new InputSource(new FileInputStream(f));
    :         src.setSystemId(f.toURL().toString());
    :         p.parse(src);
    :       } catch (Exception e) {
    :         e.printStackTrace();
    :       }
    :     }
    :   }
    : }
    : //file dinner.xml
    : <?xml version="1.0"?>
    : <!DOCTYPE dinner SYSTEM "dinner.dtd">
    : <dinner>
    : <location planet="Earth">Alma 3</location>
    : <time>12:30</time>
    : <date>Vandaag</date>
    : </dinner>
    : //file dinner.dtd
    : <?xml version="1.0" encoding="UTF-8"?>
    : <!ELEMENT dinner (location, time, date?)>
    : <!ELEMENT location (#PCDATA)>
    : <!ELEMENT time (#PCDATA)>
    : <!ELEMENT date (#PCDATA)>
    : <!ATTLIST location country CDATA "Belgium">
    Oracle Technology Network
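
    For comparison, this is roughly how the same check looks with the SAX2 XMLReader interface and the parser bundled with current JDKs. It is only a sketch: InMemoryDtdResolver and parseWithResolver are hypothetical names, and the DTD is served from a string instead of a file so the example has no external dependencies. A parser that honours the resolver should record exactly one request for dinner.dtd.

    ```java
    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.EntityResolver;
    import org.xml.sax.InputSource;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.DefaultHandler;

    // Hypothetical resolver: serves dinner.dtd from memory and records each request.
    class InMemoryDtdResolver implements EntityResolver {
        final List<String> requested = new ArrayList<>();
        private final String dtdText;

        InMemoryDtdResolver(String dtdText) {
            this.dtdText = dtdText;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            requested.add(systemId);
            if (systemId != null && systemId.endsWith("dinner.dtd")) {
                return new InputSource(new StringReader(dtdText));
            }
            return null; // not ours, let the parser use its default lookup
        }

        // Parses the document and returns the system ids the parser asked for.
        static List<String> parseWithResolver(String xml, String dtd) throws Exception {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            InMemoryDtdResolver resolver = new InMemoryDtdResolver(dtd);
            reader.setEntityResolver(resolver);
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
            return resolver.requested;
        }
    }
    ```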

  • Possible to override parse method in DOM with SAX Handler?

    Hi,
    Is it possible to override the parse method within DOM?
    What I want to do is:
         private Node parseXml( String text ) throws SAXParseException, SAXException, IOException, ParserConfigurationException
         {
              SAXParserFactory factory = SAXParserFactory.newInstance();
              factory.setNamespaceAware(true);
              SAXParser parser = factory.newSAXParser();
              TestHandler handler = new TestHandler();
              // Parse the text
              parser.parse(new InputSource(new StringReader(text)), handler);
              return (Node)handler.getRoot();
         } //end parseXml()
    The reason I want to use this is that within SAX I can write my own line counter, so I keep a count of which node is on which line of text.
    What I was thinking is that I could use SAX to return DOM Nodes with the line number attached, possibly using the node.setUserData() method?!
    I have tried to play around with it and it doesn't seem to work. Can anyone help?
    Cheers Alex

    I have managed to rewrite my SAX parser to create a JTree. This works perfectly: I can move through the tree, and each line of text that corresponds to the node is highlighted.
    My problem, however, is that in my application I have used the DOM structure throughout, so some of the functionality is lost.
    As I understand it, JAXP uses both SAX and DOM together, so I was wondering if it is possible to combine my SAX parse method with the DOM?
    If anything is unclear please say so and I will try to explain better. The main reason for doing this is that I want to keep a reference to which line of text each node of the DOM tree represents. I have not been able to implement that in DOM, but using SAX I have managed.
    Many thanks,
    Alex
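
    One way this could be done, sketched under a few assumptions: the handler below (LineNumberDomBuilder, an invented name, as is the "lineNumber" key) builds a plain DOM tree from SAX events and stores the line reported by the SAX Locator on each element with setUserData(). It ignores namespaces, comments, and processing instructions, and the line is the one on which the start tag ends.

    ```java
    import java.io.StringReader;
    import java.util.ArrayDeque;
    import java.util.Deque;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.SAXParserFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.Locator;
    import org.xml.sax.helpers.DefaultHandler;

    // Hypothetical handler: builds a DOM tree and tags each element with its line.
    class LineNumberDomBuilder extends DefaultHandler {
        static final String LINE_KEY = "lineNumber"; // arbitrary user-data key

        private final Document doc;
        private final Deque<Node> open = new ArrayDeque<>(); // currently open nodes
        private Locator locator;

        LineNumberDomBuilder(Document doc) {
            this.doc = doc;
            open.push(doc);
        }

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Element el = doc.createElement(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                el.setAttribute(atts.getQName(i), atts.getValue(i));
            }
            int line = (locator != null) ? locator.getLineNumber() : -1;
            el.setUserData(LINE_KEY, Integer.valueOf(line), null);
            open.peek().appendChild(el);
            open.push(el);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            open.pop();
        }

        static Document parse(String xml) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new LineNumberDomBuilder(doc));
            return doc;
        }
    }
    ```

    The rest of the application can keep working with org.w3c.dom nodes and call node.getUserData("lineNumber") wherever the source line is needed.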

  • How to Parse XML with SAX and Retrieving the Information?

    Hiya!
    I have written this code in one of my classes:
    /**Parse XML File**/
              SAXParserFactory factory = SAXParserFactory.newInstance();
              GameContentHandler gameCH = new GameContentHandler();
              try
              {
                   SAXParser saxParser = factory.newSAXParser();
                   saxParser.parse(recentFiles[0], gameCH);
              }
              catch(javax.xml.parsers.ParserConfigurationException e)
              {
                   e.printStackTrace();
              }
              catch(java.io.IOException e)
              {
                   e.printStackTrace();
              }
              catch(org.xml.sax.SAXException e)
              {
                   e.printStackTrace();
              }
              /**Parse XML File**/
              games = gameCH.getGames();
    And here is the content handler:
    import java.util.ArrayList;
    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;
    class GameContentHandler extends DefaultHandler
    {
         private ArrayList<Game> games = new ArrayList<Game>();
         public void startDocument()
         {
              System.out.println("Start document.");
         }
         public void endDocument()
         {
              System.out.println("End document.");
         }
         public void startElement(String namespaceURI, String localName, String qualifiedName, Attributes atts) throws SAXException
         {
         }
         public void endElement(String namespaceURI, String localName, String qualifiedName) throws SAXException
         {
         }
         public void characters(char[] ch, int start, int length) throws SAXException
         {
              /**for (int i = start; i < start+length; i++)
                   System.out.print(ch[i]);**/
         }
         public ArrayList<Game> getGames()
         {
              return games;
         }
    }
    And here is the XML I am trying to parse:
    <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
    <Database>
         <Name></Name>
         <Description></Description>
         <CurrentGameID></CurrentGameID>
         <Game>
              <gameID></gameID>
              <name></name>
              <publisher></publisher>
              <platform></platform>
              <type></type>
              <subtype></subtype>
              <genre></genre>
              <serial></serial>
              <prodReg></prodReg>
              <expantionFor></expantionFor>
              <relYear></relYear>
              <expantion></expantion>
              <picPath></picPath>
              <notes></notes>
              <discType></discType>
              <owner></owner>
              <location></location>
              <borrower></borrower>
              <numDiscs></numDiscs>
              <discSize></discSize>
              <locFrom></locFrom>
              <locTo></locTo>
              <onLoan></onLoan>
              <borrowed></borrowed>
              <manual></manual>
              <update></update>
              <mods></mods>
              <guide></guide>
              <walkthrough></walkthrough>
              <cheats></cheats>
              <savegame></savegame>
              <completed></completed>
         </Game>
    </Database>
    I have been trying for ages and just can't get the content handler class to extract a gameID and instantiate a Game to add to my ArrayList! How do I extract the information from my file?
    I have tried so many things in the startElement() method that I can't actually remember what I've tried and what I haven't! If you need to know, the Game class is instantiated with
    new Game(int gameID)
    and the rest of the variables are public.
    Please help someone...

    OK, how's this?
         public void startElement(String namespaceURI, String localName, String qualifiedName, Attributes atts) throws SAXException
         {
              current = "";
         }
         public void endElement(String namespaceURI, String localName, String qualifiedName) throws SAXException
         {
              try
              {
                   if(qualifiedName.equals("Game") || qualifiedName.equals("Database"))
                        {return;}
                   else if(qualifiedName.equals("gameID"))
                        {games.add(new Game(Integer.parseInt(current)));}
                   else if(qualifiedName.equals("name"))
                        {games.get(games.size()-1).name = current;}
                   else if(qualifiedName.equals("publisher"))
                        {games.get(games.size()-1).publisher = current;}
                   // etc...
                   else
                        {System.out.println("ERROR - Qualified Name found in xml that does not exist as database field: " + qualifiedName);}
              }
              catch (Exception e) {} //Ignore
         }
         public void characters(char[] ch, int start, int length) throws SAXException
         {
              // characters() may be called more than once per element, so append
              current += new String(ch, start, length);
         }

  • Generating XML content with SAX including schema reference

    Hi all, XML newbie question here.
    I'm trying to generate an XML document from a certain file format using SAX, but I can't figure out how to get the generated XML document to include a schema reference.
    Here's the code, it's taken from posts on this forum, so it should be familiar:
    StreamResult streamResult = new StreamResult(out);
            SAXTransformerFactory tf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            try {
                TransformerHandler hd = tf.newTransformerHandler();
                Transformer serializer = hd.getTransformer();
                serializer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
                serializer.setOutputProperty(OutputKeys.INDENT, "yes");
                hd.setResult(streamResult);
                hd.startDocument();
                AttributesImpl atts = new AttributesImpl();
                hd.startElement("http://maul.ddm.apm.bpm.eds.com", "DTR_XML", "DTR_XML", atts);
                hd.endElement("http://maul.ddm.apm.bpm.eds.com", "DTR_XML", "DTR_XML");
                hd.endDocument();
            } catch (SAXException e) {
                e.printStackTrace();
            } catch (TransformerConfigurationException e) {
                e.printStackTrace();
            }
    Here's the output I get:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <DTR_XML/>
    And I'm looking for output like this (I think - basically I have this dtr_xml.xsd located at the root web directory on http://maul.ddm.apm.bpm.eds.com and I want the generated XML file to reference that schema for validation when parsing):
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <DTR_XML xmlns="http://maul.ddm.apm.bpm.eds.com"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maul.ddm.apm.bpm.eds.com dtr_xml.xsd"/>

    Yes, you've led me along the right track with the attributes stuff. Here's what I've got right now...
            StreamResult streamResult = new StreamResult(out);
            SAXTransformerFactory tf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            try {
                TransformerHandler hd = tf.newTransformerHandler();
                Transformer serializer = hd.getTransformer();
                serializer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
                serializer.setOutputProperty(OutputKeys.INDENT, "yes");
                hd.setResult(streamResult);
                hd.startDocument();
                AttributesImpl atts = new AttributesImpl();
                atts.addAttribute("", "xmlns", "xmlns", "CDATA", "http://maul.ddm.apm.bpm.eds.com");
                atts.addAttribute("", "xsi", "xmlns:xsi", "CDATA", "http://www.w3.org/2001/XMLSchema-instance");
                atts.addAttribute("", "schemaLocation", "xsi:schemaLocation", "CDATA", "http://maul.ddm.apm.bpm.eds.com dtr_xml.xsd");
                hd.startElement("", "DTR_XML", "DTR_XML", atts);
                hd.endElement("", "DTR_XML", "DTR_XML");
                hd.endDocument();
            } catch (SAXException e) {
                e.printStackTrace();
            } catch (TransformerConfigurationException e) {
                e.printStackTrace();
            }
    This produces the output I wanted earlier...
    As for the org.xml.sax.helpers.NamespaceSupport, I can't seem to find any documentation on using it anywhere, and the javadoc is cryptic to me. Maybe it's used internally in SAX for tracking namespaces or something like that.
    Another interesting thing to me is that if I use the code you gave:
    atts.addAttribute("http://www.w3.org/2001/XMLSchema-instance", "schemaLocation",
         "xsi:schemaLocation", "CDATA", "http://maul.ddm.apm.bpm.eds.com dtr_xml.xsd");
    I don't see "http://www.w3.org/2001/XMLSchema-instance" anywhere in the output. Is SAX ignoring the namespace URI argument? It appears so. The javadoc states that the uri argument is "The Namespace URI, or the empty string if none is available or Namespace processing is not being performed." It would appear that Namespace processing is not being done... but I don't know how to turn it on.

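One way to approach the namespace question above is to report the prefix mappings as SAX events instead of adding xmlns attributes by hand. The following is a minimal sketch, assuming the serializer bundled with the JDK, which to my understanding writes an xmlns declaration for each mapping passed to startPrefixMapping; other TransformerHandler implementations may behave differently. The class name SchemaRefWriter and the write() method are made up for illustration, and the output goes to a StringWriter only to keep the example self-contained.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper class; only the element name and URIs come from the thread.
public class SchemaRefWriter {
    static final String NS = "http://maul.ddm.apm.bpm.eds.com";
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    public static String write() throws Exception {
        StringWriter out = new StringWriter();
        SAXTransformerFactory tf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler hd = tf.newTransformerHandler();
        Transformer serializer = hd.getTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        hd.setResult(new StreamResult(out));

        hd.startDocument();
        // Declare the default namespace and the xsi prefix before the element starts.
        hd.startPrefixMapping("", NS);
        hd.startPrefixMapping("xsi", XSI);

        // The attribute carries its real namespace URI plus the prefixed qualified name.
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute(XSI, "schemaLocation", "xsi:schemaLocation", "CDATA",
                NS + " dtr_xml.xsd");

        hd.startElement(NS, "DTR_XML", "DTR_XML", atts);
        hd.endElement(NS, "DTR_XML", "DTR_XML");
        hd.endPrefixMapping("xsi");
        hd.endPrefixMapping("");
        hd.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write());
    }
}
```

With this approach the namespace URI arguments are actually used, so the element and attribute are in the right namespaces for any downstream consumer of the SAX events, not only for the serialized text.
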
  • Problem with SAX parser - entity must finish with a semi-colon

    Hi,
    I'm pretty new to the complexities of using SAXParserFactory and its cousins of XMLReaderAdapter, HTMLBuilder, HTMLDocument, entity resolvers and the like, so wondered if perhaps someone could give me a hand with this problem.
    In a nutshell, my code is really nothing more than a glorified HTML parser - a web page editor, if you like. I read in an HTML file (only one that my software has created in the first place), parse it, then produce a Swing representation of the various tags I've parsed from the page and display this on a canvas. So, for instance, I would convert a simple <TABLE> of three rows and one column, via an HTMLTableElement, into a Swing JPanel containing three JLabels, suitably laid out.
    I then allow the user to amend the values of the various HTML attributes, and I then write the HTML representation back to the web page.
    It works reasonably well, albeit a bit heavy on resources. Here's a summary of the code for parsing an HTML file:
    htmlBuilder = new HTMLBuilder();
    parserFactory = SAXParserFactory.newInstance();
    parserFactory.setValidating(false);
    parserFactory.setNamespaceAware(true);
    FileInputStream fileInputStream = new FileInputStream(htmlFile);
    InputSource inputSource = new InputSource(fileInputStream);
    DoctypeChangerStream changer = new DoctypeChangerStream(inputSource.getByteStream());
    changer.setGenerator(
       new DoctypeGenerator()
       {
          public Doctype generate(Doctype old)
          {
             return new DoctypeImpl
             (
                old.getRootElement(),
                old.getPublicId(),
                old.getSystemId(),
                old.getInternalSubset()
             );
          }
       }
    );
    resolver = new TSLLocalEntityResolver("-//W3C//DTD XHTML 1.0 Transitional//EN", "xhtml1-transitional.dtd");
    readerAdapter = new XMLReaderAdapter(parserFactory.newSAXParser().getXMLReader());
    readerAdapter.setDocumentHandler(htmlBuilder);
    readerAdapter.setEntityResolver(resolver);
    readerAdapter.parse(inputSource);
    htmlDocument = htmlBuilder.getHTMLDocument();
    htmlBody = (HTMLBodyElement)htmlDocument.getBody();
    traversal = (DocumentTraversal)htmlDocument;
    walker = traversal.createTreeWalker(htmlBody, NodeFilter.SHOW_ELEMENT, null, true);
    rootNode = new DefaultMutableTreeNode(new WidgetTreeRootNode(htmlFile));
    createNodes(walker);
    However, I'm having a problem parsing a piece of HTML for a streaming video widget. The key part of this HTML is as follows:
                <object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
                  id="client"
            width="100%"
            height="100%"
                  codebase="http://fpdownload.macromedia.com/get/flashplayer/current/swflash.cab">
                  <param name="movie" value="client.swf?user=lkcl&stream=stream2&streamtype=live&server=rtmp://192.168.250.206/oflaDemo" />
             etc....
    You will see that the <param> tag in the HTML has a value attribute which is a URL plus three URL parameters. It looks absolutely standard, and in fact works correctly in a browser. However, when my readerAdapter.parse() call gets to this point, it throws an exception saying that there should be a semi-colon after the entity 'stream'. I can see what's happening: the SAXParser thinks that the ampersand marks the start of a new entity. When it finds '&stream' it expects it to finish with a semi-colon (much like &nbsp; and other such HTML entities). The only way I can get the parser past this point is to encode all the relevant ampersands as %26, but then the web page stops working! Aaargh....
    Can someone explain what my options are for getting around this problem? Some property I can set on the parser? A different DTD? Not to use SAX at all? Override the parser's exception handler? A completely different approach?!
    Could you provide a simple example to explain what you mean?
    Thanks in anticipation!

    You probably don't have the ampersands in your "value" attribute escaped properly. It should look like this:
    value="client.swf?user=lkcl&amp;stream=...
    Most HTML processors (i.e. browsers) will overlook that omission, because almost nobody does it right when they are generating HTML by hand, but XML processors won't.
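
To illustrate the point about escaping, here is a small sketch that parses the same kind of param element twice: once with the ampersands written as &amp; and once with them left bare. It assumes the default JDK SAX parser; the class AmpersandDemo and its two methods are hypothetical names, and the naive escapeAmpersands helper is only safe on text that has not been escaped already.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original poster's code.
public class AmpersandDemo {

    // Returns the value attribute of the first <param> element.
    // Throws a SAXParseException if the markup is not well-formed XML.
    public static String paramValue(String xml) throws Exception {
        final String[] found = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("param".equals(qName) && found[0] == null) {
                    // The parser has already turned &amp; back into a plain ampersand here.
                    found[0] = atts.getValue("value");
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found[0];
    }

    // Naive escaping: replaces every ampersand, so it must only be applied
    // to raw text that contains no entity references yet.
    public static String escapeAmpersands(String raw) {
        return raw.replace("&", "&amp;");
    }

    public static void main(String[] args) throws Exception {
        String url = "client.swf?user=lkcl&stream=stream2&streamtype=live";
        String escaped = "<object><param name=\"movie\" value=\""
                + escapeAmpersands(url) + "\"/></object>";
        // The handler sees the original URL, with plain ampersands.
        System.out.println(paramValue(escaped));
    }
}
```

Because the parser decodes &amp; before handing the attribute to the handler, the application still sees the original URL, and a browser reading the escaped page sees the same URL too.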

  • Problem with sax parser

    Hello..
    I have the following problem. When I parse an xml document containing blank spaces and numbers with decimals, the text sometimes comes out as one string and sometimes as two. For example, "First A" sometimes comes out as "First" and "A" and sometimes as "First A", which is how it's stored in the xml file. The same happens with numbers like 19.20. I'm enclosing a little of my code..
    public void characters(char buf[], int offset, int len)
    throws SAXException
    if (textBuffer != null) {
    SaveString = ""+textBuffer;
    if(i>-1)
    numbers = SaveString;
    What's wrong and how do I fix it?
    Best Regards Dan
    PS I have more code, input data and output data if needed. DS

    Hello,
    I do not know if this is your problem, yet please find hereafter an excerpt of the SAX API:
    public void characters(char[] ch,
                           int start,
                           int length)
                    throws SAXException
    ... SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks;...
    ... Note that some parsers will report whitespace in element content using the ignorableWhitespace method rather than this one (validating parsers must do so)...
    In other words, I am afraid that your issue is the "standard behaviour" of a SAX parser.
    I hope it helps.
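
The usual way to cope with split chunks is to append every characters() call to a buffer and read the buffer only in endElement. Below is a minimal sketch of that pattern, assuming the text of interest sits in leaf elements (mixed content would need a stack of buffers). TextCollector and collect() are made-up names for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that gathers the text of each leaf element.
public class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Reuse one buffer; just reset its length for each new element.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text node, so always append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the text of the element guaranteed to be complete.
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
        text.setLength(0);
    }

    public static List<String> collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<rows><name>First A</name><price>19.20</price></rows>"));
    }
}
```

The result is the same however the parser decides to split the character data, which is the property the original code was missing.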

  • What to do with SAX events

    I want to iterate over a database recordset and generate sax events to create a virtual xml document. But I'm struggling to see how the events are consumed.
    What do I do with the events that are generated by the start/end document and element handlers? How do I send them to a file, or better still, pass the events on to some tool to output as html/xml pages?
    Cheers again
    -thanks 4earlier code @Trejkaz

    All the examples I have ever seen of SAX are like this:
    You take an XML document and give it to a SAX parser. The SAX parser turns it into a stream of SAX events and calls your handler's startElement() etc. methods, which generally write to a file or something like that.
    Your requirement is the reverse, namely you want to input from the "something like that", make a stream of SAX events, and have those turned into an XML document. I have never seen a decent example of this so I had to work it out for myself. I posted my solution in this forum several months ago but I can't find it now. So here it is again:
    SAXTransformerFactory factory = (SAXTransformerFactory)TransformerFactory.newInstance();
    TransformerHandler handler = factory.newTransformerHandler();
    // if you want to use XSL to transform what you produce then
    // you need the version that takes a Templates argument.
    handler.setResult(new StreamResult(response.getWriter()));
    // in my case I send the resulting XML document to the servlet
    // response, but you could send it somewhere else.
    SAXParserFactory spf = SAXParserFactory.newInstance();
    XMLReader reader = spf.newSAXParser().getXMLReader();
    reader.setContentHandler(handler);
    reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
    reader.setFeature("http://xml.org/sax/features/namespaces", true);
    reader.setFeature("http://xml.org/sax/features/namespace-prefixes", false);
    handler.startDocument();
    startElement(handler, "Doc");
    // I am producing an XML document whose root is a Doc element.
    // Send more SAX events here.
    endElement(handler, "Doc");
    handler.endDocument();
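
The startElement(handler, "Doc") and endElement(handler, "Doc") helpers used above are not shown in the post. The sketch below guesses at what they might look like and applies the same idea to a list of rows, which stands in for the database recordset from the question. RowsToXml, toXml and the Row element name are assumptions made for illustration, and column names are assumed to be valid XML element names. As far as I can tell, no XMLReader is needed when the events are generated by hand, so it is left out here.

```java
import java.io.StringWriter;
import java.util.List;
import java.util.Map;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical class: turns in-memory rows into XML by firing SAX events.
public class RowsToXml {

    // Guessed helper: element with no namespace and no attributes.
    static void startElement(TransformerHandler h, String name) throws SAXException {
        h.startElement("", name, name, new AttributesImpl());
    }

    static void endElement(TransformerHandler h, String name) throws SAXException {
        h.endElement("", name, name);
    }

    static void text(TransformerHandler h, String s) throws SAXException {
        char[] chars = s.toCharArray();
        h.characters(chars, 0, chars.length);
    }

    public static String toXml(List<Map<String, String>> rows) throws Exception {
        StringWriter out = new StringWriter();
        SAXTransformerFactory factory = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        // The handler serializes every event it receives to this result.
        handler.setResult(new StreamResult(out));

        handler.startDocument();
        startElement(handler, "Doc");
        for (Map<String, String> row : rows) {
            startElement(handler, "Row");
            for (Map.Entry<String, String> column : row.entrySet()) {
                startElement(handler, column.getKey());
                text(handler, column.getValue()); // the serializer escapes & and <
                endElement(handler, column.getKey());
            }
            endElement(handler, "Row");
        }
        endElement(handler, "Doc");
        handler.endDocument();
        return out.toString();
    }
}
```

To write to a file or a servlet response instead, pass a different StreamResult to setResult; to produce HTML through a stylesheet, use the newTransformerHandler variant that takes a Templates argument, as the comments in the post mention.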
