Parsing a large XML file (1 GB) using a SAX parser

We have very large XML files generated by other processes. Each file must be validated first and then parsed. Our current implementation works for small files. My question is:
how can we validate and parse large XML files with a SAX parser?

The same way as parsing a small XML file, no? Why don't you try it? Then if you have problems that you can't solve, ask about them here.
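
For reference, here is a minimal sketch of validating and parsing in one streaming pass, so that memory use does not grow with file size. It assumes the documents are described by a W3C XML Schema; the class name, the `record` element, and the counting handler are hypothetical stand-ins for your own schema and processing.

```java
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class StreamingValidatingParse {

    /** Hypothetical handler: counts <record> elements as they stream past. */
    static class CountingHandler extends DefaultHandler {
        long records;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("record".equals(localName)) {
                records++;
            }
        }

        // DefaultHandler ignores validation errors unless this is overridden;
        // rethrowing makes the parse stop at the first invalid construct.
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Validates against the given XSD text while parsing; returns the record count. */
    static long validateAndCount(InputStream xml, String xsd) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // schema validation needs namespace awareness
        factory.setSchema(schema);       // validate during the same SAX pass
        SAXParser parser = factory.newSAXParser();
        CountingHandler handler = new CountingHandler();
        parser.parse(xml, handler);
        return handler.records;
    }
}
```

For a real 1 GB file you would pass a `FileInputStream` (or use the `parse(File, DefaultHandler)` overload). Note that with a single pass the handler has already seen the earlier part of the file when a later validation error is reported. If nothing may be processed until the whole file is known to be valid, one option is a separate first pass with `javax.xml.validation.Validator.validate` on a `StreamSource`, followed by a second, plain SAX pass.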

Similar Messages

  • Building xml file using SAX parser of JAXP

    Please send me an example of building an XML file using the SAX parser. This is an urgent requirement and I cannot work out how to solve it, so could anybody help with a good example?

    You don't build an XML file with a parser. A parser reads an XML file and converts it to some internal representation. Try reading this tutorial:
    http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/

  • SAX: How to create new XML file using SAX parser

    Hi,
    Can anybody help me create an XML file using the packages in Java 5.0? I have successfully created one with DOM, reading the tag names and values from a database, but can I do this using SAX?
    I can already read XML using SAX; now I want to create a new XML file containing some tags and their values using SAX.
    How can I do this?
    Sachin Kulkarni

    "SAX is a parser, not a generator."
    Well, you can use it to create an XML file too. It takes care of proper encoding, which makes it much better than a plain text writer.
    See the following code snippet (out is an OutputStream):
    // Pass the OutputStream directly so the serializer itself encodes the
    // output as UTF-8 (a PrintWriter would apply the platform default encoding).
    StreamResult streamResult = new StreamResult(out);
    SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
    // TransformerHandler is a SAX 2.0 ContentHandler.
    TransformerHandler hd = tf.newTransformerHandler();
    Transformer serializer = hd.getTransformer();
    serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    // The second DOCTYPE_SYSTEM call overrides the first; keep whichever
    // system identifier applies.
    serializer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "pdfBookmarks.xsd");
    serializer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "http://schema.inplus.de/pdf/1.0");
    serializer.setOutputProperty(OutputKeys.METHOD, "xml");
    serializer.setOutputProperty(OutputKeys.INDENT, "yes");
    hd.setResult(streamResult);
    hd.startDocument();
    // Emit a processing instruction
    hd.processingInstruction("xml-stylesheet", "type=\"text/xsl\" href=\"mystyle.xsl\"");
    AttributesImpl atts = new AttributesImpl();
    // Arguments: namespace URI, local name, qualified name, type, value
    atts.addAttribute("", "", "someattribute", "CDATA", "test");
    atts.addAttribute("", "", "moreattributes", "CDATA", "test2");
    hd.startElement("", "", "MyTag", atts);
    String curTitle = "Something inside a tag";
    hd.characters(curTitle.toCharArray(), 0, curTitle.length());
    hd.endElement("", "", "MyTag");
    hd.endDocument();
    You are responsible for proper nesting. SAX takes care of encoding.
    Hth
    ;-) stw

  • Read any XML File Elements using SAX Parser in J2se

    Hi all,
    I am able to parse one XML file with a fixed structure using SAX.
    Sample code:
    // ===========================================================
    // SAX DocumentHandler methods
    // ===========================================================
    public void startDocument() throws SAXException {
        logger.info("Start of document");
    }

    public void endDocument() throws SAXException {
        logger.info("End of document");
    }

    public void startElement(String namespaceURI, String localName, // local name
            String qualName, // qualified name
            Attributes attrs) throws SAXException {
        elemName = new String(localName); // element name
        if (elemName.equals(""))
            elemName = new String(qualName); // namespaceAware = false
        tagPosition = TAG_START;
        // Set the string for accumulating the text in a tag to empty
        elemChars = "";
        // If the element name is "row", create a new row instance
        // If the element is "indexxid", "ModelPrice", or "ModelSpread",
        // the value will be read in the method "characters" and stored.
        if (elemName.equals("row")) {
            row = new IndexRow();
            numRows++;
        }
        // logger.info("Number of numRow:"+numRows);
    } // end method startElement

    public void endElement(String namespaceURI, String simpleName, // simple name
            String qualName // qualified name
    ) throws SAXException {
        elemName = new String(simpleName);
        if (elemName.equals(""))
            elemName = new String(qualName); // namespaceAware = false
        tagPosition = TAG_END;
        String indexId = new String();
        Double dblVal = new Double(0);
        // If element name is "row", put the current row in the map for row
        // instances
        if (elemName.equals("row")) {
            if (numRows <= 5) { logger.info("Row is: " + row.toString()); }
            //ABX
            //indexRows.put(row.getIndexxId(), row);
            if (family.equals("ABX.HE")) {
                indexRows.put(row.getIndexREDId(), row);
            }
            else {
                //CDX ITRXX
                indexRows.put(row.getIndexxId(), row);
            }
        } else if (elemName.equals("IndexID")) {
            row.setIndexxId(elemChars);
            // Leave double value at default of zero if there are no chars
            if (elemChars.trim().length() != 0) {
                dblVal = new Double(elemChars);
                row.setCompositeSpread(dblVal);
                indexId = row.getIndexxId();
            }
        } else if (elemName.equals("REDCode")) {
            row.setRedCode(elemChars);
        }
        else if (elemName.equals("Name")) {
            row.setRowName(elemChars);
        } else if (elemName.equals("Series")) {
            row.setSeries(elemChars);
        } else if (elemName.equals("Version")) {
            row.setVersion(elemChars);
        } else if (elemName.equals("Term")) {
            row.setTerm(elemChars);
        } else if (elemName.equals("Maturity")) {
            row.setMaturity(elemChars);
        } else if (elemName.equals("OnTheRun")) {
            row.setOnTheRun(elemChars);
        } else if (elemName.equals("Date")) {
            row.setRowDate(elemChars);
        } else if (elemName.equals("Depth")) {
            row.setDepth(elemChars);
        }
        else if (elemName.equals("Heat")) {
            // logger.info("Chars for element " + elemName + " are '" +
            // elemChars + "'");
            // Leave double value at default of zero if there are no chars
            if (elemChars.trim().length() != 0) {
                dblVal = new Double(elemChars);
                row.setHeat(dblVal);
                indexId = row.getIndexxId();
            }
        }
        // ABX.HE
        else if (elemName.equals("IndexREDId")) {
            row.setIndexREDId(elemChars);
        }
        else if (elemName.equals("Coupon")) {
            row.setCoupon(elemChars);
        }
        if (elemName.equals("Ontherun")) {
            row.setOnTheRun(elemChars);
        }
    } // end method endElement

    public void characters(char buf[], int offset, int len) throws SAXException {
        // If at end of element, there will be no characters
        if (tagPosition == TAG_END) {
            return;
        }
        // The characters method may be called more than once
        // for an element if the internal buffer fills up.
        // Append the characters until the end of the element.
        String strVal = new String(buf, offset, len);
        elemChars = elemChars + strVal;
    } // end method characters
    } // end class MarkItIndexLoader
    The problem is that I want to parse any XML file, whose elements may change at any time, using SAX. In the example above, branches such as
    else if (elemName.equals("Heat")) {
    else if (elemName.equals("IndexREDId")){
    } else if (elemName.equals("Maturity")) {
    hard-code the element names in order to read their values. I do not want to hard-code element names; I want to read every element name and value dynamically.
    Given any one of the XML files below, I want to read the elements and print them to the console without changing any code.
    Example:
    Student.XML: <root>..</StName>..</StAge>...</root>
    Employee.XML: <root>..</EmpName>..</EmpAge>...</root>
    CdCatalog.XML: <root>..</Cdtitle>...</CdNumber>...</root>
    I need one Java program that can read the elements of any XML file and send them to a database table.
    If anyone has done a task like this, please suggest reference links, books, or a sample snippet that would help me develop a program for this requirement.
    Thanks in advance.
    Regards,
    satish

    You should ask in the Java forum.
    Regards
    Stefan
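
    For what it is worth, a handler does not need to know element names in advance, because SAX passes the name to startElement and endElement. The following is a minimal sketch under that assumption; the class and method names are hypothetical, and it collects "name=value" strings where the real program would print them or write them to a database.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class GenericElementDump {

    /** Returns "elementName=text" for every element that directly contains text. */
    static List<String> dump(String xml) throws Exception {
        final List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    out.add(qName + "=" + value); // the name comes from the parser, not from the code
                }
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }
}
```

    Calling `GenericElementDump.dump(...)` on a Student, Employee, or CdCatalog document needs no code change. The sketch ignores attributes and drops text that precedes a child element in mixed content; both would need handling if your files use them.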

  • Code to read xml file  and display that data using sax parser

    Hi,
    My problem: I have to read an XML file and display its contents on the console using a SAX parser.

    here you go

  • How to Create XML file with SAX parser instead of DOM parser

    Hi all,
    I need to create an XML file using a SAX parser only. As far as I know, DOM can be used for this purpose (with createElement, createAttribute, and so on). Can anyone tell me whether there is a way to create an XML file using only SAX? That is, does SAX provide any API for creating an element, an attribute, and so on? I know that SAX is for event-based parsing, but my requirement is to create an XML file using only SAX.
    Any help would be appreciated.
    Thanks in advance,
    Kaushik

    Hi,
    You must write an XMLWriter class yourself. That class extends DefaultHandler and overrides startDocument(), startElement(uri, localName, qName, attributes), endElement(), and so on.
    In startElement, write your own logic for emitting a new element and its attribute list.
    In startDocument, write your own logic for emitting the document header: encoding, DTD, and so on.
    Usage (XMLWriter is the class you write yourself; AttributesImpl comes from org.xml.sax.helpers):
    XMLWriter out = new XMLWriter();
    out.startDocument();
    AttributesImpl attr1 = new AttributesImpl();
    attr1.addAttribute("", "", "name", "CDATA", "value");
    out.startElement("", "", "Element1", attr1);
    AttributesImpl attr2 = new AttributesImpl();
    attr2.addAttribute("", "", "name", "CDATA", "value");
    out.startElement("", "", "Element2", attr2);
    out.endElement("", "", "Element2");
    out.endElement("", "", "Element1");
    out.endDocument();

  • How can we get  tag of XML file using SAX

    Hi,
    I'm writing a SAX parser and the parsing is almost done. I have a problem with one case: how can I get a tag name from the XML file using a SAX parser?
    The XML file is:
    <DFProperties>
    <AccessType>
    <Get/>
    </AccessType> <Description>
    gdhhd
    </Description>
    <DFFormat>
    <chr/>
    </DFFormat>
    <Scope>
    <Permanent/>
    </Scope>
    <DFTitle>gsgd</DFTitle>
    <DFType>
    <MIME>text/plain</MIME>
    </DFType>
    </DFProperties>
    I want output like Get and Permanent, that is, the name of a tag that sits inside another tag.
    The handler class looks like this:
    public void startElement(String namespaceURI, String localName,
            String qName, Attributes atts) throws SAXException {
        if (_ACCESSTYPE.equals(localName)) {
            _accessTypeElement = _ACCESSTYPE;
        }
    }

    public void characters(char[] ch, int start, int length)
            throws SAXException {
        if (_ACCESSTYPE.equals(_accessTypeElement)) {
            String strValue = new String(ch, start, length);
            System.out.println("Accestype-----------------------------> " + strValue);
            //System.out.println(" " + strValue);
        }
    }

    public void endElement(String namespaceURI, String localName, String qName)
            throws SAXException {
        if (_ACCESSTYPE.equals(localName)) {
            _accessTypeElement = "";
        }
    }
    Please, can anybody help me?

    Hi,
    I have one problem, please help me.
    1. How can I identify exactly where my Node has ended? That is, how can I find the NodeName that corresponds to a given end tag?
    <Node> .............starttag1
    <NodeName>Test</NodeName>
    <Node>................starttag2
    <NodeName>test1</NodeName>
    </Node>..................endtag2
    <Node>.....................starttag3
    <NodeName></NodeName>
    <Node> .........................starttag4
    <NodeName>test4</NodeName>
    </Node>.......endtag4
    </Node>...........endtag3
    </Node>............endtag1
    My code is below:
    private final String _NODENAME = "NodeName";
    private final String _NODE = "Node";
    private String _nodeElement = "";
    private String _NodeNameElement = "";

    public void startElement(String namespaceURI, String localName,
            String qName, Attributes atts) throws SAXException {
        if (_NODENAME.equals(localName)) {
            _NodeNameElement = _NODENAME;
        }
        if (_NODE.equals(localName)) {
            System.out.println("start");
            if (_NODENAME.equals(localName)) {
                _NodeNameElement = _NODENAME;
            }
        }
    }

    public void characters(char[] ch, int start, int length)
            throws SAXException {
        if (_NODENAME.equals(_NodeNameElement)) {
            String strValue = new String(ch, start, length);
            String sttt = strValue;
            System.out.println("NODENAME: ************* " + strValue);
        }
        if (_NODE.equals(_nodeElement)) {
            if (_NODENAME.equals(_NodeNameElement)) {
                String strValue = new String(ch, start, length);
                String sttt = strValue;
                System.out.println("nodevalue********** " + strValue);
            }
        }
    }

    public void endElement(String namespaceURI, String localName, String qName)
            throws SAXException {
        if (_NODENAME.equals(localName)) {
            _NodeNameElement = "";
        }
        if (_NODE.equals(localName)) {
            System.out.println("NODENAME: %%%%%%%%%");
        }
    }
    Please help me. How can I tell which NodeName a particular Node end tag belongs to?
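
    One common way to match an end tag to its start tag is to keep a stack: push an entry when a Node opens and pop it when a Node closes, so the entry popped always belongs to the Node that just ended. Below is a minimal sketch of that idea; the class name and the closeOrder helper are hypothetical, and it assumes each NodeName is a direct child of its Node, as in the sample above.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NestedNodeTracker extends DefaultHandler {
    // One entry per currently open <Node>; "" until its <NodeName> has been read.
    private final Deque<String> openNodes = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inNodeName;
    final List<String> closed = new ArrayList<>(); // names in the order their Nodes ended

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Node".equals(qName)) {
            openNodes.push("");
        } else if ("NodeName".equals(qName)) {
            inNodeName = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inNodeName) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("NodeName".equals(qName)) {
            inNodeName = false;
            if (!openNodes.isEmpty()) {
                openNodes.pop();                        // replace the placeholder of the
                openNodes.push(text.toString().trim()); // innermost open Node with its name
            }
        } else if ("Node".equals(qName)) {
            closed.add(openNodes.pop()); // this is the Node that has just ended
        }
    }

    static List<String> closeOrder(String xml) throws Exception {
        NestedNodeTracker handler = new NestedNodeTracker();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.closed;
    }
}
```

    The same bookkeeping helps with the earlier DFProperties question: empty elements such as <Get/> and <Permanent/> are reported through startElement and endElement rather than characters, so their names can be read in startElement while the enclosing AccessType or Scope is the element on top of the stack.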

  • Problem in parsing an XML using SAX parser

    Hi all,
    I have a problem parsing an XML file using a SAX parser.
    I have an XML file (sample below) which needs to be parsed:
    <line-items>
    <item num="1">
         <part-number>PN1234</part-number>
         <quantity uom="ea">10</quantity>
         <lpn>LPN1060</lpn>
         <reference num="1">Line ref 1</reference>
         <reference num="2">Line ref 2</reference>
         <reference num="3">Line ref 3</reference>
    </item>
    <item num="2">
         <part-number>PN1527</part-number>
         <quantity uom="lbs">5</quantity>
         <lpn>LPN2152</lpn>
         <reference num="1">Line ref 1</reference>
         <reference num="2">Line ref 2</reference>
         <reference num="3">Line ref 3</reference>
    </item>
    <item num="n">
    </item>
    </line-items>
    There can be any number of items (1 to n). I need to parse these
    item values using a SAX parser and invoke a stored procedure for
    each item with its
    values (partnumber, qty, lpn, refnum1, refnum2, refnum3).
    If there are 100 items, I need to invoke the stored
    procedure sp1() 100 times, once for each item.
    I need to invoke the stored procedure in the endDocument() method of
    the SAX event handler and not in the endElement() method.
    What is the best way to store those values and invoke the stored
    procedure in the endDocument() method?
    Any help would be greatly appreciated.
    Thanks in advance,
    Pooja.

    VO or ValueObject is a trendy new name for Beans.
    So just create an item class with a variable for each of the sub-elements.
    <item>
    <part-number>PN1234</part-number>
    <quantity uom="ea">10</quantity>
    <lpn>LPN1060</lpn>
    <reference num="1">Line ref 1</reference>
    <reference num="2">Line ref 2</reference>
    <reference num="3">Line ref 3</reference>
    </item>
    public class ItemVO
    {
        String partNumber;
        int quantity;
        String quantityType; // the uom attribute of <quantity>
        String lpn;
        List references = new ArrayList();

        /** @return Returns the lpn. */
        public String getLpn() { return this.lpn; }

        /** @param lpn The lpn to set. */
        public void setLpn(String lpn) { this.lpn = lpn; }

        /** @return Returns the partNumber. */
        public String getPartNumber() { return this.partNumber; }

        /** @param partNumber The partNumber to set. */
        public void setPartNumber(String partNumber) { this.partNumber = partNumber; }

        /** @return Returns the quantity. */
        public int getQuantity() { return this.quantity; }

        /** @param quantity The quantity to set. */
        public void setQuantity(int quantity) { this.quantity = quantity; }

        /** @return Returns the quantityType. */
        public String getQuantityType() { return this.quantityType; }

        /** @param quantityType The quantityType to set. */
        public void setQuantityType(String quantityType) { this.quantityType = quantityType; }

        /** @return Returns the references. */
        public List getReferences() { return this.references; }

        /** @param references The references to set. */
        public void setReferences(List references) { this.references = references; }
    }
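
    To connect such a value object to the handler, one approach is to build one object per item in startElement/endElement, keep the finished objects in a list, and walk the list in endDocument(). The sketch below illustrates this under some assumptions: the nested Item class is a trimmed-down stand-in for ItemVO, and the calls list is a hypothetical placeholder for the real sp1() invocation (for example through a JDBC CallableStatement).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class LineItemCollector extends DefaultHandler {

    /** Trimmed-down stand-in for the ItemVO bean. */
    static class Item {
        String partNumber;
        int quantity;
        String uom;
        String lpn;
        final List<String> references = new ArrayList<>();
    }

    final List<Item> items = new ArrayList<>();
    final List<String> calls = new ArrayList<>(); // placeholder for stored procedure calls
    private final StringBuilder text = new StringBuilder();
    private Item current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0);
        if ("item".equals(qName)) {
            current = new Item();
        } else if ("quantity".equals(qName) && current != null) {
            current.uom = atts.getValue("uom");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return; // outside any <item>
        }
        String value = text.toString().trim();
        switch (qName) {
            case "part-number": current.partNumber = value; break;
            case "quantity": current.quantity = value.isEmpty() ? 0 : Integer.parseInt(value); break;
            case "lpn": current.lpn = value; break;
            case "reference": current.references.add(value); break;
            case "item": items.add(current); current = null; break;
            default: break;
        }
    }

    @Override
    public void endDocument() {
        // All items are known here; the real code would invoke sp1() once per item.
        for (Item item : items) {
            calls.add("sp1(" + item.partNumber + ", " + item.quantity + " " + item.uom
                    + ", " + item.lpn + ", " + item.references + ")");
        }
    }

    static LineItemCollector parse(String xml) throws Exception {
        LineItemCollector handler = new LineItemCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }
}
```

    Holding every item until endDocument() means memory grows with the number of items, which is fine for hundreds of lines but worth reconsidering for very large files.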

  • XML to CSV using SAX Parser

    Hello,
    I need to convert XML files to CSV format using a SAX parser. The XML file, the code, and the output are shown below.
    XML file:
    <Library>
    <Book>
         <Title>Professional JINI</Title>
         <Author>bs</Author>
         <Publisher>Oreilly Publications</Publisher>
    </Book>
    <Book>
         <Title>XML Programming</Title>
         <Author>java</Author>
         <Publisher>Mann Publications</Publisher>
    </Book>
    </Library>
    public class BooksLibrary extends DefaultHandler
    {
        protected static final String XML_FILE_NAME = "C:\\library1.xml";

        public static void main (String argv [])
        {
            // Use the default (non-validating) parser
            SAXParserFactory factory = SAXParserFactory.newInstance();
            try {
                FileOutputStream fos = new FileOutputStream("C:/test.txt");
                // Set up output stream
                out = new OutputStreamWriter (fos, "UTF8");
                // Parse the input
                SAXParser saxParser = factory.newSAXParser();
                saxParser.parse( new File(XML_FILE_NAME), new BooksLibrary() );
            } catch (Throwable t) {
                t.printStackTrace ();
            }
            System.exit (0);
        }

        static private Writer out;

        //===========================================================
        // Methods in SAX DocumentHandler
        //===========================================================
        public void startDocument ()
        throws SAXException
        {
            showData ("<?xml version='1.0' encoding='UTF-8'?>");
            newLine();
        }

        public void endDocument ()
        throws SAXException
        {
            try {
                newLine();
                out.flush ();
            } catch (IOException e) {
                throw new SAXException ("I/O error", e);
            }
        }

        public void startElement (String name, Attributes attrs)
        throws SAXException
        {
            showData ("<"+name);
            if (attrs != null) {
                for (int i = 0; i < attrs.getLength (); i++) {
                    showData (" ");
                    showData (attrs.getLocalName(i)+"=\""+attrs.getValue (i)+"\"");
                }
            }
            showData (">");
        }

        public void endElement (String name)
        throws SAXException
        {
            showData ("</"+name+">");
        }

        public void characters (char buf [], int offset, int len)
        throws SAXException
        {
            String s = new String(buf, offset, len);
            showData (s);
        }

        //===========================================================
        // Helper Methods
        //===========================================================
        // Wrap I/O exceptions in SAX exceptions, to
        // suit handler signature requirements
        private void showData (String s)
        throws SAXException
        {
            try {
                out.write (s);
                out.flush ();
            } catch (IOException e) {
                throw new SAXException ("I/O error", e);
            }
        }

        // Start a new line
        private void newLine ()
        throws SAXException
        {
            //String lineEnd = System.getProperty("line.separator");
            try {
                out.write (", ");
            } catch (IOException e) {
                throw new SAXException ("I/O error", e);
            }
        }
    }
    The output is as follows:
    <?xml version='1.0' encoding='UTF-8'?>,
         Professional JINI
         bs
         Oreilly Publications
         XML Programming
         java
         Mann Publications
    Can anyone please tell me how to remove the indentation and line breaks and get the output as:
    <?xml version='1.0' encoding='UTF-8'?>, Professional JINI, bs, Oreilly Publications, XML Programming, java, Mann Publications
    Thanks

    By the way, there is a new feature in Java 5.0 (Tiger) called "Annotations."
    Since your code extends DefaultHandler, you could put a line with
    @Override
    before the definition of each of your methods. If you had done so, the compiler would have reported an error, because your startElement(String, Attributes) and endElement(String) methods do not override any method of DefaultHandler. That is why the parser never calls them and no tags appear in your output.
    (If your code implemented ContentHandler directly, by contrast, @Override would be rejected in Java 5, which allows it only on methods that override a superclass method. From Java 6 onward it is also accepted on methods that implement an interface.)
    The other comment is that the safest way to handle characters() data is to use a StringBuilder or StringBuffer (StringBuilder is available from 5.0 onward; StringBuffer has been around since release 1.0) that you reset in the startElement method. Use the append method to gather the data presented to you in the characters() method and use toString() to harvest the data in the endElement method.
    Dave Patterson
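
    Putting those two comments together, a handler along the following lines would produce the comma-separated values without the stray indentation. This is only a sketch: the class name and the convert helper are hypothetical, and it writes to a string instead of C:/test.txt to stay self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class XmlTextToCsv extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // discard whitespace that preceded this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // trim() removes the indentation and line breaks around the value
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
        text.setLength(0);
    }

    /** Returns the text values of the document joined with ", ". */
    static String convert(String xml) throws Exception {
        XmlTextToCsv handler = new XmlTextToCsv();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return String.join(", ", handler.values);
    }
}
```

    With @Override in place the compiler confirms that all three methods really are the DefaultHandler callbacks. Values that themselves contain commas or quotes would need CSV quoting, which this sketch does not attempt.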

  • Problem in using SAX parser.

    Hi all,
    I have a problem using a SAX parser.
    My XML looks like this:
    <authorizer>
    <first-name>HP</first-name>
    <last-name>Services</last-name>
    <phone>800-22-1984</phone>
    </authorizer>
    <destination>
    <first-name>John</first-name>
    <last-name>Doe</last-name>
    <company>John Doe Enterprises, Inc.</company>
    <department>Manufacturing</department>
    <phone>800-555-1234</phone>
    <address>
    <street-one>1654 Peachtree Str</street-one>
    <street-two>Suite Y</street-two>
    <city>Atlanta</city>
    <province>GA</province>
    <country>US</country>
    <postal-code>30326</postal-code>
    </address>
    </destination>
    The relevant part of my SAX parser code is:
    public void startElement (String name, AttributeList attrs)
    throws SAXException
    {
        accumulator.setLength(0);
    }

    public void characters (char buf [], int offset, int len)
    throws SAXException
    {
        accumulator.append(buf, offset, len);
    }

    public void endElement (String name)
    throws SAXException
    {
        if (name.equals("first-name")) {
            firstname = accumulator.toString().trim();
        }
        if (name.equals("last-name")) {
            lastname = accumulator.toString().trim();
        }
    }
    My problem is that I have to store the values of first-name and last-name,
    but they appear in both the
    <authorizer> </authorizer> tag and the
    <destination> </destination> tag.
    I need to retrieve the authorizer's first name and last name and the
    destination's first name and last name.
    In other words, I need to store authorizerFirstName, authorizerLastName,
    destinationFirstname, and destinationLastname.
    Please let me know how to do that.
    Thanks in advance,
    Pooja.

    Hi Pooja,
    I think you are using DocumentHandler for parsing. It is deprecated; try using ContentHandler instead. The handler is told the name of each element when it starts. For example, for
    <first-name>sdfs</first-name>
    startElement is called with first-name, and the next method invoked is characters, which receives the text associated with the element. Here is some sample code for your problem; try it.
    boolean m_boolinAuth = false;
    boolean m_boolinDest = false;
    boolean m_bFirstName = false;
    boolean m_bLastName = false;
    String m_strAuthFirstName, m_strAuthLastName;
    String m_strDestFirstName, m_strDestLastName;

    public void startElement(String namespaceURI, String elementName, String qName, Attributes atts)
    {
        // Remember which parent section we are in
        if (qName.equals("authorizer")) {
            m_boolinAuth = true;
            m_boolinDest = false;
        } else if (qName.equals("destination")) {
            m_boolinDest = true;
            m_boolinAuth = false;
        }
        // Remember which name element has just started
        if (qName.equals("first-name"))
            m_bFirstName = true;
        if (qName.equals("last-name"))
            m_bLastName = true;
    }

    public void characters(char[] ch, int start, int length)
    {
        // Store the text under the field that matches the current section
        String str = new String(ch, start, length);
        if (m_bFirstName) {
            if (m_boolinAuth)
                m_strAuthFirstName = str;
            else if (m_boolinDest)
                m_strDestFirstName = str;
            m_bFirstName = false;
        }
        if (m_bLastName) {
            // same as the first-name case
            if (m_boolinAuth)
                m_strAuthLastName = str;
            else if (m_boolinDest)
                m_strDestLastName = str;
            m_bLastName = false;
        }
    }
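
    An alternative to one boolean flag per section is to keep the names of the open elements on a stack and key each stored value by its parent, for example "authorizer/first-name". The following is a minimal sketch of that variant; the class name, the values map, and the wrapping root element used when calling it are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ContextAwareNames extends DefaultHandler {
    private final Deque<String> path = new ArrayDeque<>(); // names of the open elements
    private final StringBuilder text = new StringBuilder();
    final Map<String, String> values = new LinkedHashMap<>(); // "parent/child" -> text

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        path.push(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // accumulate; may arrive in several chunks
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        path.pop();                  // remove the element that just ended
        String parent = path.peek(); // what remains on top is its parent (null at the root)
        boolean isName = "first-name".equals(qName) || "last-name".equals(qName);
        if (isName && parent != null) {
            values.put(parent + "/" + qName, text.toString().trim());
        }
        text.setLength(0);
    }

    static Map<String, String> parse(String xml) throws Exception {
        ContextAwareNames handler = new ContextAwareNames();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }
}
```

    After parsing, `values.get("authorizer/first-name")` and `values.get("destination/first-name")` hold the two first names separately, and adding a third section needs no new flags.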

  • How to use SAX parser in J2ME

    Please help me how to use SAX parser in J2ME.
    Is there any function to find the value of a particular element from a XML file?
    I am able to get element names, values, and attributes. But the control flow is not in my hands: DefaultHandler automatically invokes the characters method, and only then am I able to get values. I don't know when this method gets invoked.
    Is there any way or method so that I can get value of any element or attribute just by passing element name as parameter in SAX parser?
    Is there any other parser through which I can perform this task in J2ME?
    Thanks in advance.

    Hi..
    have a look at this.
    http://www-128.ibm.com/developerworks/library/wi-parsexml/
    MeTitus

  • How to retain prolog in output xml after parse the input xml

    Hi,
    I am using the com.bea.xml.XmlObject.Factory.parse(String) method to parse an XML document.
    The input XML has a prolog before the root node, but after parsing it with the above method the prolog is missing from the output XML.
    Can anyone help me retain the prolog in the output XML as it is in the input XML?
    Thanks in advance..
    Regards,
    Deba

    Hi,
    The Input XML is like
    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
         <SOAP-ENV:Body>
         </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
    But after parsing, the output XML becomes
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
         <SOAP-ENV:Body>
         </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
    Because of a project requirement, I want to keep the prolog (<?xml version="1.0" encoding="UTF-8"?>) unchanged in the output XML after parsing.
    Can I pass some XmlOptions setting when calling the parse() method, or does anyone know another way to retain the prolog after parsing?
    Please help me with this.
    Thanks in advance.
    Regards,
    Deba

  • Updating XML file using DOM parser

    Hi,
    Can someone help me update the following XML file using a DOM parser?
    This is my XML file:
    <students>
         <student>
              <id>1</id>
              <name>abc</name>
         </student>
         <student>
              <id>2</id>
              <name>xyz</name>
         </student>
         <student>
              <id>3</id>
              <name/>
         </student>
         <student>
              <id>4</id>
              <name>ijk</name>
         </student>
         <student>
              <id>5</id>
              <name></name>
         </student>
    </students>
    Suppose I supply two fields, id and name. For the matching id, the name has to be updated.
    I have achieved this, but I am unable to update the value for the 3rd and 5th records (id=3 and id=5), since their names are blank.
    Thanks.

    Some <name> elements have a child node which is a text node. From what you say it appears you know how to change those text nodes.
    The other <name> elements don't have any child nodes. But you want one. This suggests to me that you need code that creates a text node and adds it to the <name> element as its child.
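
    That suggestion might look like the following in code. This is a minimal sketch rather than the poster's actual program: the class and method names are hypothetical, and it returns the updated document as a string instead of writing it back to a file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class StudentNameUpdater {

    /** Sets the <name> of the <student> whose <id> matches, creating the text node if needed. */
    static String update(String xml, String id, String newName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList students = doc.getElementsByTagName("student");
        for (int i = 0; i < students.getLength(); i++) {
            Element student = (Element) students.item(i);
            Element idEl = (Element) student.getElementsByTagName("id").item(0);
            Element nameEl = (Element) student.getElementsByTagName("name").item(0);
            if (idEl == null || nameEl == null || !id.equals(idEl.getTextContent().trim())) {
                continue;
            }
            Node first = nameEl.getFirstChild();
            if (first == null) {
                // <name/> or <name></name>: there is no text node yet, so create one
                nameEl.appendChild(doc.createTextNode(newName));
            } else {
                // assumes the existing first child is the text node holding the name
                first.setNodeValue(newName);
            }
        }
        // Serialize the modified tree back to text
        StringWriter sw = new StringWriter();
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.transform(new DOMSource(doc), new StreamResult(sw));
        return sw.toString();
    }
}
```

    On a DOM Level 3 implementation (Java 5 and later), `nameEl.setTextContent(newName)` covers both the empty and the non-empty case in a single call.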

  • Cannot close an XML file used for parsing

    Hi All,
    I appear to have difficulty closing (and possibly flushing first) an XML file that is subsequently parsed without success. The error generated is:
    org.jdom.input.JDOMParseException: Error on line 23: The element type "form" must be terminated by the matching end-tag "</form>".
    Below is the code of readData(), which retrieves (HTML) data from a website, saves it to a file, and then converts it to XML format before returning the new filename:
    public String readData() {
        try {
              URL url  = new URL("http://www.abc.com");
              URLConnection connection = url.openConnection();      
              InputStream isInHtml = url.openStream();   // throws an IOException    
              disInHtml = new DataInputStream(new BufferedInputStream(isInHtml));         
              System.out.flush();
              FileOutputStream fosOutHtml = null;
              fosOutHtml = new FileOutputStream("C:\\Temp\\ABC.html");
              int oneChar, count=0;
              while ((oneChar=disInHtml.read()) != -1)
                  fosOutHtml.write(oneChar);
              isInHtml.close();
              disInHtml.close();
              fosOutHtml.flush();    // optional
              fosOutHtml.close();
        try {
              File fileInHtml = new File("C:\\Temp\\ABC.html");
              FileReader frInHtml = new FileReader(fileInHtml);
              BufferedReader brInHtml = new BufferedReader(frInHtml);
              String string = "";
              while (brInHtml.ready())
                  string += brInHtml.readLine() + "\n";
              fwOutXml  = new FileWriter("C:\\Temp\\ABC.xml");
              pwOutXml  = new PrintWriter(fwOutXml);
              light_html2xml html2xml = new light_html2xml();
              pwOutXml.print(html2xml.Html2Xml(string));
              system.out.flush()     // optional
              fwOutXml.flush();      // optional
              fwOutXml.close();
              pwOutXml.flush();      // optional
              pwOutXml.close();
              return fileInHtml.getAbsolutePath();
    // parseData reads the XML file using the name returned by readData()
    public void parseData(String XMLFilename)
        try
            FileReader frInXml = new FileReader(FileName);
            BufferedReader brInXml = new BufferedReader(frInXml);
            SAXBuilder saxBuilder = new SAXBuilder("org.apache.xerces.parsers.SAXParser"); // JDOMParseException generated.
    }
    This code worked when it was all in a single method, but I have since added some structure by splitting it across a number of methods.
    This issue has come up in the past, when I was able to close the XML file before reading it again. However, I don't have a solution this time round.
    I am running JDK 1.6.0_10, Netbeans 6.1, JDOM 1.1 on Windows XP platform.
    Any assistance would be appreciated.
    Many thanks,
    Jack
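
    A minimal sketch of the write-then-read pattern described above, assuming plain java.io on JDK 1.6 (so try/finally rather than try-with-resources). The class name CopyThenRead and the methods copyToFile and readAll are hypothetical, not part of the posted code. Each file is closed in a finally block before anything else opens it, and the read loop stops when readLine() returns null instead of polling ready().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;

// Hypothetical helper class, for illustration only.
class CopyThenRead {

    // Copies everything from 'in' to 'target' and closes the file before returning.
    static void copyToFile(InputStream in, File target) throws IOException {
        OutputStream out = new FileOutputStream(target);
        try {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } finally {
            out.close(); // releases the file handle even if the copy fails
        }
    }

    // Reads a whole text file, appending "\n" after every line.
    static String readAll(File source, String charset) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(source), charset));
        try {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
            return sb.toString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("abc", ".html");
        tmp.deleteOnExit();
        byte[] page = "<html>\n</html>".getBytes("UTF-8");
        copyToFile(new ByteArrayInputStream(page), tmp);
        System.out.print(readAll(tmp, "UTF-8"));
    }
}
```

    Separately, it may be worth checking two details in the posted snippet: readData() returns fileInHtml.getAbsolutePath(), which is the path of the HTML file rather than ABC.xml, and parseData() opens FileName rather than its XMLFilename parameter.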

    Hi Alain,
    I have added the additional I/O statements to the finally clauses as follows, but the problem persists:
    readData()
    // reading data (html) from the webpage and save it in html format.
    try {
    catch { …. }
    finally {
           System.out.flush();
           isInHtml.close();
           disInHtml.close();
           fosOutHtml.flush();
           fosOutHtml.getFD().sync();
           fosOutHtml.close();
    // convert the html webpage format to xml format
    try {
    catch { …. }
    finally {
           System.out.flush();
           fwOutXml.flush();
           fwOutXml.close();
           pwOutXml.flush();
           pwOutXml.close();
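
    One possible shape for the second finally block, as a sketch under assumptions: WriteXmlFile and write are hypothetical names, not part of the posted code. Closing the PrintWriter also closes the FileWriter it wraps, so a single close() on the outermost writer is enough, and the null check guards against the case where the file could not be opened at all.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Hypothetical helper class, for illustration only.
class WriteXmlFile {

    // Writes 'xml' to 'target' and guarantees the file is closed on return.
    static void write(File target, String xml) throws IOException {
        PrintWriter out = null;
        try {
            out = new PrintWriter(new FileWriter(target));
            out.print(xml);
            // PrintWriter swallows IOExceptions; checkError() flushes and reports them.
            if (out.checkError()) {
                throw new IOException("write failed: " + target);
            }
        } finally {
            if (out != null) { // null if the FileWriter constructor threw
                out.close();   // also closes the wrapped FileWriter
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("abc", ".xml");
        tmp.deleteOnExit();
        write(tmp, "<html/>");
        System.out.println("wrote " + tmp.length() + " bytes to " + tmp);
    }
}
```

    Once write() returns, no handle from this method is still open on the file, so a later reader in the same JVM should not be blocked by it.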
    Below is a short listing of the new XML file:
      <?xml version="1.0" encoding="iso-8859-1" ?>
    - <html>
    - <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      <meta name="keywords" content="California, cities, towns, villages, list, zipcodes, postal codes, united states, ca" />
      <meta name="description" content="Cities, towns and suburbs in California, United States (CA) starting with A" />
      <title>Cities and Towns in California starting with A – ABC Company</title>
      <link rel="stylesheet" href="http://www.abc.com/style.css" type="text/css" media="screen" />
      </head>
    - <body>
      <a name="top" />
    - <div id="container">
    - <div id="header">
      <div id="postmark" />
    - <a href="http://www.abc.com/" class="imglink">
      <img id="logoimg" src="http://www.abc.com/images/zipcodes.gif" width="192" height="33" alt="Zipcodes America Logo" />
      </a>
      <hr />
      </div>
    - <div id="nav">
    - <ul>
    - <li>
      <a href="http://www.abc.com/" title="Home Page">Home</a>
      </li>
    - <li>
      <strong>Search</strong>
      (zipcode or suburb)
    - <div class="hide">
      <form method="post" action="http://www.abc.com/search" />        // line 23
      </div>
      <input type="text" name="q" class="searchbox" alt="Search query" />
      <br />
      <input type="submit" value="find!" class="searchbutton" alt="Perform search" />
      <div class="hide" />
      </li>
    …
    What I find interesting is that the above XML file can be parsed by the same parseData() from another class without any problem. As a result, I have come to the following conclusions so far:
    ( i ) There is some file locking that prevents saxBuilder from parsing the XML file at that time.
    ( ii ) light_html2xml does not appear to have converted the original HTML to XML correctly; somehow this is picked up by the parser in the same class, but not by the same parser called from another class.
    ( iii ) I would like to try another conversion tool such as TagSoup in place of light_html2xml to determine where the issue is coming from. Would you or anyone else be able to help me with a few lines of conversion code using TagSoup, since I am not familiar with this tool?
    ( iv ) light_html2xml is good in that it strips out all namespaces, DTDs, entity resolvers, etc. and returns only what I need. JTidy converts correctly but includes the namespace, DTD and entity resolver, which makes parsing difficult.
    Many thanks again,
    Jack
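
    To tell conclusions ( i ) and ( ii ) apart, a standalone well-formedness check can help. The sketch below uses the JAXP SAX parser bundled with the JDK rather than JDOM or an explicitly named Xerces class (a deliberate substitution so that it needs no extra jars); WellFormedCheck and check are hypothetical names. A file that cannot be opened or read surfaces as an IOException, whereas malformed content surfaces as a SAXParseException carrying a line and column number.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class WellFormedCheck {

    // Returns null if the input is well-formed XML, otherwise a
    // "line N, column M: message" description of the first fatal error.
    static String check(InputStream xml)
            throws IOException, ParserConfigurationException, SAXException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            // DefaultHandler ignores all content but rethrows fatal errors.
            parser.parse(xml, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // A <form> that is never closed before its parent <div> ends.
        String broken = "<div>\n<form action=\"x\">\n</div>";
        String problem = check(new ByteArrayInputStream(broken.getBytes("UTF-8")));
        System.out.println(problem == null ? "well-formed" : problem);
    }
}
```

    Running the same check against C:\Temp\ABC.xml through a FileInputStream, from both classes, would show whether the two code paths are really reading the same bytes.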

  • How can I parse the document in WebI using sdk?

    I want to parse a document in Web Intelligence using the SDK. My questions are:
    1) Which SDK can I use to parse the document? The 'Report Application Server SDK'?
    2) I want to parse the 'Self-Defined SQL' and 'Query' components of the document. Does the SDK support this?
    My environment is BO XI Release 2.
    Thanks all.

    Hi shao,
    1) By which sdk, I can parse the document. 'Report Application Server SDK' ?
    The 'Report Application Server SDK' is for Crystal Reports; for Web Intelligence or Desktop Intelligence reports, use the "Report Engine SDK".
    If you want to do more with these reports, the "BusinessObjects Enterprise SDK" can also be used.
    You can find more information for XI R2 at the link below.
    http://devlibrary.businessobjects.com/BusinessObjectsXIR2SP2/en/devsuite.htm
    For question 2,
    I am not sure, but the Report Engine SDK provides classes and interfaces for data providers,
    i.e. "Building and Editing Data Providers" and "Working with Recordsets".
    You can also have a look at the Report Engine SDK's "Query" interface.
    Hope this helps.
    Thanks,
    Praveen.

Maybe you are looking for

  • Multiple Item Information - problem

    I updated to iTunes 8. Now, whenever I try to "Get Info" on more than one music file, the content in the "Multiple Item Information" window gets displaced. Please refer to this jpeg to understand what I mean: http://img.photobucket.com/albums/v294/mk

  • Are there permissions required to delete table_names in user_tables

    I have written a stored procedure that contains the line "Delete from User_tables where TABLE_NAME like v_TableName;". This query deletes the contents of the TABLE_NAME column of USER_TABLES in the database, but when I executed it, it showed the error ": ORA-0103

  • Logging in WebLogic

    Hi, we are using WebLogic 7.01 and TopLink 9.03. When deploying our application, TopLink outputs to the console: [TopLink] Logging level is set to INFO. How can we change this logging level, and what levels can we set it to? I would like to set it to DE

  • My new iPhone 5 keeps texting me with the texts I send to my wife's iPhone 4

    I have just got an iPhone 5 and my wife has my iPhone 4. We seem to be connected: texts I send via iMessage to her, and to anyone else on an iPhone, come back to me. Our Apple IDs are different. How do I stop this?

  • Series in CR

    In Crystal report 10, after creating a report on payment voucher with the following query SELECT T0.[TransId], T0.[Series], T5.[SeriesName], T0.[CardCode], T0.[CardName], T0.[Address], T0.[CounterRef], T0.[DocNum], T0.[DocDate], T4.[BankName], T1.[Ac