Using the SAX parser to split up a document to be processed by a DOMParser

I need to process a potentially large document which could prove too large to be read into memory at once by the DOM parser. The document would look something like..
(sorry about the formatting - can't use tabs!)
<Root>
<Parent>
<Child>
<Blah.....................
</Blah>
</Child>
</Parent>
<Parent>
<Child.......
</Child>
</Parent>
etc...
etc...
</Root>
The number of Parent elements could potentially be in the thousands. What I would like to do would be to use the SAX parser to read the document and when a Parent element is encountered, write the parent element and all its children to a stream in order that an input source can be created. The input source can then be parsed by a DOM parser. Once complete, the next Parent element encountered by the SAX parser could be passed to the DOM parser and so on until all of the Parent elements have been processed.
This way I can combine the ease of parsing the document using the DOM parser without having to worry about the overhead on memory.
Does anyone have any ideas as to the best approach? I could use the SAX parser for the whole thing but the XML is quite complex and lends itself to DOM parsing much more conveniently.

Can you read the file line by line:
start the reading at <child>
pause reading at </child>
copy the string to a var
wrap it with respective tags (<?xml<doctype etc...) and parse
proceed to the next <child>xx</child>
and repeat till you hit the closing of the file...
- Ravi

Similar Messages

  • Using the SAX Parser

    As I reported in this thread: Carriage Returns in XSLT output I am trying to produce a text output file with each line terminating in cr-lf (using the output from ViewObject.writeXML() passed through the XSLProcessor.processXSL() method to apply a XSLT stylesheet transformation to it), but the cr characters in the stylesheet are being converted to nl characters, so I am ending up with two nl characters at the end of each line.
    I found a 2002 query (here: http://www.xslt.com/html/xsl-list/2002-04/msg00193.html) by someone with the same problem, and a reply by Steve Muench explaining it and stating that the Oracle SAX parser does not have the same problem.
    So my question now is - does the SAX parser still retain cr characters (or is it too now converting them to nl) and if it does, how do I make XSLProcessor.processXSL() use it rather than the default XML parser? Can I do it from within my code, or does it need to be set in some sort of environment variable? We are on version 9.0.5.2 of JDeveloper(10g)

    Repost

  • Can multiple APEX application use the same parsing schema?

    Hi,
    I have 4.2 APEX thru pl/sql Gatewat, 11gr2 DB and using theme 24.
    Due to the APEX limitation for version control I would be splitting 1 big ERP applications into 24 different APEX applications and each application would be considered as 1 unit for version control.
    I have about 800 tables and I would assume that all of these would need to be stored in 1 schema since a lot of these table are linked thru FK.
    Can I have multiple APEX APPS using the same parsing schema? or is there a better way to do this?
    Thanks in advance!

    Hi,
    Multiple applications can have same (one) parsing schema.
    You can test that on e.g. apex.oracle.com, where normally you have only one schema and you create multiple applications and all use that same schema.
    Regards,
    Jari
    My Blog: http://dbswh.webhop.net/htmldb/f?p=BLOG:HOME:0
    Twitter: http://www.twitter.com/jariolai
    Edited by: jarola on Jan 28, 2013 7:15 PM

  • Validating xml file against a specific schema using jaxp sax parser

    Hi,
    I would like to validate a soap xml given below
    <?xml version="1.0" encoding="UTF-8"?>
    <Envelope>
         <Body>
              <name>pavan</name>
              <number>123</number>
         </Body>
    </Envelope>
    to validate against the schema of soap from url "http://www.w3.org/2003/05/soap-envelope". I dont want to have this namespace in the xml. How can I validate against the specified schema using jaxp sax parser?
    Please help.
    Thanks,
    T.Pavan kumar

    Any one, please help.

  • Encoding error using Oracle SAX Parser, please help

    Hi, All,
    I have problem while I am trying to parse an XML file using Oracle SAX Parser.
    The following is from the trace file:
    java.io.UTFDataFormatException: Invalid UTF8 encoding.
    java.io.UTFDataFormatException: Invalid UTF8 encoding.
    at java.lang.Throwable.&lt;init&gt;(Compiled Code)
    at java.lang.Exception.&lt;init&gt;(Compiled Code)
    at java.io.IOException.&lt;init&gt;(Compiled Code)
    at java.io.UTFDataFormatException.&lt;init&gt;(Compiled Code)
    at oracle.xml.parser.v2.XMLUTF8Reader.checkUTF8Byte(Compiled Code)
    at oracle.xml.parser.v2.XMLUTF8Reader.readUTF8Char(Compiled Code)
    at oracle.xml.parser.v2.XMLUTF8Reader.fillBuffer(Compiled Code)
    at oracle.xml.parser.v2.XMLByteReader.saveBuffer(Compiled Code)
    at oracle.xml.parser.v2.XMLReader.fillBuffer(Compiled Code)
    at oracle.xml.parser.v2.XMLReader.tryRead(Compiled Code)
    at oracle.xml.parser.v2.XMLReader.scanXMLDecl(Compiled Code)
    at oracle.xml.parser.v2.XMLReader.pushXMLReader(Compiled Code)
    at oracle.xml.parser.v2.XMLReader.pushXMLReader(Compiled Code)
    at oracle.xml.parser.v2.XMLParser.parse(Compiled Code)
    at data_loader.main(Compiled Code)
    The XML file is a pure ascii text file, with encoding set to UTF-8.
    I can parse the file correctly on NT, but I got problem when I ran the code on a SUN solaris 2.6 machine.
    The parser version is 9.0.2.
    Thanks in advance.
    Peter

    You are right, I modified the codes and got it to work on Oracle
    But when I did your suggestion:
    <cftransaction>
      <cfstoredproc ...>
      <cfquery>
        SELECT ...
      </cfquery>
    </cftransaction>
    I got the same error as before which is:
    Error Executing Database Query.
    [Macromedia][Oracle JDBC Driver]Unhandled sql type
    But this time it points to the CF callling the str proc using cfprocparam
    Here is how I call the str. proc:
         <CFTRANSACTION>   
              <cfstoredproc procedure="MyReport" datasource="#Trim(application.dsn)#" returncode="True">  
                  <cfprocparam type="In" cfsqltype="CF_SQL_VARCHAR" variable="ins_name" value="#Trim(session.instname)#">
                  <cfprocparam type="In" cfsqltype="CF_SQL_VARCHAR" variable="aca_year" value="#Trim(ayr)#"> <------- POINT TO THIS LINE
              </cfstoredproc>
         </CFTRANSACTION>

  • Trying to use XML SAX parser with JDK2 ...

    Hi,
    I'm pretty new to Java.
    I'm trying to write and use java sample using XML SAX parser. I try with XP and XML4J.
    Each time I compile it give me a error message like
    "SAX01.java:23: unreported exception java.lang.Exception; must be caught or declared to be
    thrown
    (new SAX01()).countBooks();
    ^
    Note: SAX01.java uses or overrides a deprecated API.
    Note: Recompile with -deprecation for details.
    1 error"
    For what I found, it seems that "deprecated" mean usage of old classes or methods. What should I do ?
    Wait for the XML parser to be rewrite ? ....
    Thank's
    Constant

    "SAX01.java:23: unreported exception
    java.lang.Exception; must be caught or declared to be
    thrown
    (new SAX01()).countBooks();
    ^Do this
    public static void main(String[] args) throws Exception {
    or do this
         try
                       first part of expression?(new SAX0()).countBooks();
         catch(Exception ex)     
              System.out.println("problem "+ ex);
    Note: SAX01.java uses or overrides a deprecated API.
    Note: Recompile with -deprecation for details.
    1 error"deprecation is just a warning if there are no errors elsewhere in your program it should compile fine, but if you want to find out how to change it do this:
    javac YourClassName.java -deprecation
    then look in the docs for an alternative.

  • HOW TO: Use the XML parser in Oracle 8.1.7

    I am trying to figure out how to use the xml parser provided in oracle 8.1.7. all i want to do is parse a xml report that is defined using a schema, and place the data into the proper tables. i am totally unfamiliar with the xml parser and how it works. i have done some reading on the subject, but seem to be getting some conflicting infromation about which utilites i need and how to invoke them. can someone please tell me what utilities i need, how to invoke them, and what i need to do to get a xml document to parse and insert to a table? I would greatly appreciate any help anybody could offer. thanks.

    You can parse the XML Document with XML Parser and place the data into database using XSU(XML SQL Utility).
    Both of these are included in XDK for Java at:
    http://otn.oracle.com/tech/xml/xdk_java
    The following document could also help:
    Oracle9i XML Developer's Guide--XDK [PDF] at http://otn.oracle.com/tech/xml/doc.html

  • How to use the kxml parser?

    I am getting problem in using the kxml parser.Actually my application is showing the java.lang.NoClassDefFoundError: org/kxml/parser/XmlParser error.
    Can you tell me how to use the kxml parser with our j2me application or what are the steps for this?
    Whether we have to use the kxml.zip or some other JAR file for this?
    Again whether we have to set any classpath for this?

    I was getting same error using kxml parser. The reason is losing interface in which constants are declared.
    You need to correct code of kxml a bit.
    In every place where constant from XmlPullParser interface is referenced only by its name (without reference to its class XmlPullParser) you need to change for use full name. it means changes like:
    START_DOCUMENT to XmlPullParser.START_DOCUMENT
    Everywhere.
    That's all!

  • Why the SAX parser cannot support the special character like "¡"

    I do not understand why the SAX parser cannot support the special character like &iexcl; but it can replace the &quot; &amp; &lt; &gt;   to ", &, <, >, ,, but other characters will be replaced to empty charater.
    can somebody give me any suggestions or solutions. THX.
    Edited by: 844086 on 2011-3-14 上午2:27
    Edited by: 844086 on 2011-3-14 上午2:27

    I quote:
    Alternatively implement an EntityResolver that resolves the desired escapes.You are again an example that people only read/register the first thing written in a post.

  • When i use the apple store, Your request is temporarily unable to be processed. what is wrong with it? can someone help me?

    when i use the apple store, Your request is temporarily unable to be processed. what is wrong with it? can someone help me?

    Try turning off the Firewall in System Preferences > Security (or Security & Privacy) > Firewall
    Disable anti virus software if you have that installed before downloading media from the App Store.
    When you post for help, please tell us which Mac OS X you have installed. Thanks !

  • HT4889 Is it possible to use the migration assistant to transfer only some documents and data? For example, I was to transfer my music, but not my word documents. There doesn't seem to be any options to customise what you are transferring - its all or not

    Is it possible to use the migration assistant to transfer only some documents and data? For example, I was to transfer my music, but not my word documents. There doesn't seem to be any options to customise what you are transferring - its all or nothing

    LBraidwood wrote:
    Is it possible to use the migration assistant to transfer only some documents and data? For example, I was to transfer my music, but not my word documents. There doesn't seem to be any options to customise what you are transferring - its all or nothing
    Yes, that's why I recommend one have more than TimeMachine as a backup because if certain files or folders are corrupted on the TM drive, your not logging into your migrated user account or not be able to transfer it what so ever.
    Then it's a REAL PAIN and costs $99 to bit read the TM drive to get the files out, then it's mess to go through and sort all the empty placeholders from the real files as TM doesn't save duplicates in each state, just uses the previous state's copy if there are no changes.
    If you create the same named user account on the new machine, and simply transfer files via a USB thumb drive and place say music contents into the Music folder, then it will work too.
    Look into making clones of your boot drive, the benefit here is it's bootable, you can access the drive directly to pick and choose files, run the computer just like before etc.
    Most commonly used backup methods explained

  • ? - Is there a way to validate 1 XML record at a time, using the SAX or oth

    Hello!
    Before running into space problems, i generated an XML file from a 'pipe delimited' file and and then processed that XML file thru a SAXParser 'validator' and the data was correctly validated, using the RELAXNG Schema patterns as the validation criteria!
    But as feared, the XML file was huge! (12 billion XML recs. generated from 1 billion 'pipe' recs.) and i am now trying to find a way to process 1 'pipe' record at a time, (ie) read 1 record from the 'pipe delimited' file, convert that rec. to an XML rec. and then send that 1 XML rec. thru the SAXParser 'validator', avoiding the build of a huge temporary XML file!
    After testing this approach, its looks like the SAXParser 'validator' (sp.parse) is expecting only (1) StringBufferInputStream as input,and after opening, reading and closing just (1) of the returned StringBufferInputStream objects, the validator wants to close up operations!
    Where i have the "<<<<<" you can see where i'm calling the the object.method that creates the 'pipe>XML' records 'sb.createxml' and will be returning many occurances of the StringBufferInputStream object, where (1) StringBufferInputStream object represents (1) 'pipe>XML' record!
    So what i'm wondering, is if there is a form of 'inputStream' class that can be loaded and processed at the same time! ie instead of requiring that the 'inputStream' object be loaded in it's entirety, before going to validation?
    Or if there is another XML 'validator' that can validate 1 XML record at a time, without requiring that the entire XML file be built first?
    1. ---------------------------------class: (SX2) ---------------------------------------------------------------------------------------------------------
    import ............
    public class SX2
    public static void main(String[] args) throws Exception
    MyDefaultHandler dh = new MyDefaultHandler();
    SX1 sx = new SX1();
    SAXParser sp = sx.getParser(args[0]);
    stbuf1 sb = new stbuf1();
    sp.parse(sb.createxml(args[1]),dh); <<<<<< createxml( ) see <<<<<<< below
    class MyDefaultHandler extends DefaultHandler {
    public int errcnt;
    "SX2.java" 87 lines, 2563 characters
    2. ----------------------------------class: (stbuf1) method: (createxml) ----------------------------------------------------------------------------
    public stbuf1 () { }
    public StringBufferInputStream createxml( String inputFile ) <<<<<< createxml(
    BufferedReader textReader = null;
    if ( (inputFile == null) || (inputFile.length() <= 1) )
    {     throw new NullPointerException("Delimiter Input File does not exist");
    String ele = new String();
    try {
    ele = new String();
    textReader = new BufferedReader(new FileReader(inputFile));
    String line = null; String SEPARATOR = "\\|"; String sToken = null;
    String hdr1=("<?xml version=#1.0# encoding=#UTF-8#?>"); hdr1=hdr1.replace('#','"');
    String hdr2=("<hlp_data>");
    String hdr3=("</hlp_data>");
    String hdr4=("<"+TABLE_NAME+">");
    String hdr5=("</"+TABLE_NAME+">");
    while ( (line = textReader.readLine()) != null )
    String[] sa = line.split(SEPARATOR);
    String elel = new String();
    for (int i = 0; i < NUM_COLS; i++)
    if (i>(sa.length-1)) { sToken = new String(); } else { sToken = sa; }
    elel="<"+_columnNames[i]+">"+sToken+"</"+_columnNames[i]+">";
    if (i==0) {
    ele=ele.concat(hdr1);ele=ele.concat(hdr2);ele=ele.concat(hdr4);ele=ele.concat(elel);
    else
    if (i==NUM_COLS - 1) {
    ele=ele.concat(elel);ele=ele.concat(hdr5);ele=ele.concat(hdr3);
    else {
    ele=ele.concat(elel);
    textReader.close();
    catch (IOException e) {
    return (new StringBufferInputStream(ele));
    public static void main( String args[] ) {
    stbuf1 genxml_obj = new stbuf1 ();
    String ptxt=new String(args[0]);
    genxml_obj.createxml(ptxt); }}

    Well,i think you can use the streaming API for xml processing provided by weblogic.It is pull model,not push model like SAX.with it,you can select the events you want without having to react to every event,and you can filter the events out.
    Sun also provide such streaming API for xml processing,and i got an very simple introduction about it on the Chinese Sun developer site.but i couldn't find any other infomation about it elsewhere! If you have such materials,please send to my email:[email protected],and if I have it,i will be sure to post the links here.hope it helps more or less:)
    @smile@

  • I need help using a SAX parser

    Hello,
    I have a class that goes out to the web and retrieves an xml document w/o ever writing it to disk. I like to parse that object using SAX and a class that extends HandlerBase. Can any one show me how to parse an xml document that lives in memory rather than in a file? I've found many examples, but all of them assume you are reading from a file system. Any help is greatly appreciated.
    Here is the jist of my code :
    // Class 1 makes a request to another server based on some variable data. I've created a messenger object that handles all of the communications and returns the data in a DataInputStream
    XMLMessage xml = new XMLMessage();
    xml.createRequest(args[0],args[1],args[2]);
    XMLMessenger messenger = new XMLMessenger(xml.getDoc());
    DataInputStream dis = messenger.transferDocument();
    ResponseObj resp = new ResponseObj();
    resp.read(dis);
    System.out.println(resp.getvalue1() + "*" + resp.getvalue2()+ "*");
    // Class 2 is the ResponseObj and I want to use something similar to this read method, but
    // I can't get it to work with a data stream, instead of a string that represents a file.
    public void read(String filename) throws java.lang.Exception {
    Class c = Class.forName("com.ibm.xml.parser.SAXDriver");
         org.xml.sax.Parser parser = (org.xml.sax.Parser)c.newInstance();
         parser.setDocumentHandler(this);
         parser.parse(filename);

    Thanks for you help. I've modified my read() method to work like this :
    org.xml.sax.Parser parser = new com.ibm.xml.parser.SAXDriver();
    parser.setDocumentHandler(this);
    InputSource is = new InputSource(dis);
    parser.parse(is);
    This makes sense, but I'm still not getting any thing returned from the startElement(),characters(), or endElement() methods. It reports no errors, it just doesn't happen. Do you have any suggestions on how to debug this? I'm stumped, but really need to get this working.

  • I've a problem with errors message in the sax parser

    Hello,
    I'm quite new to Java.
    I'm trying to create a SAX parser for my XML file, with a XML schema.
    When my pars run a XML file, the following mistakes go out:
    1)org.sax.SAXParseException:s4s-elt-must-match: the content of 'restriction' must match (annotation?,.....)
    2)org.sax.SAXParseException: src-ct.0.1:Complex Type Definition Rappresentation Error for type..... Element 'sequence' is invalid,misplaced or occurs to often
    but I don't understand where I must go in the schema to correct.
    Many thanks!

    Hi,
    Perhaps it would help if you post your XML file.

  • Use the same style of type in different documents

    Hello,
    I was exploring the type styles and I get the impression that if you save a style in a particular file you cannot use that saved style in another file, am i correct?
    I have several files and i want to use the same consistent style in all of them, what the best way to do it?
    Thank you!

    You can load both the Paragraph and Character Styles from one AI file into another by using Window>Type>Paragraph Styles. Open your new document and go here

Maybe you are looking for

  • Custom Script using Classifications

    Hi. I am wondering if it is possible and how I would go about using custom scripts to use classifications to display particular things. This is the goal: To classify events (back end) and then based on the classification of an event, on the detail pa

  • IPod Touch/iPhone 4s in a BMW - Screen size problems while playing videos since iOS5

    Hi, I have an iPod Touch 4th generation 64GB that i use connected to my 2011 BMW M5 purchased with all multimedia and iPod/iPhone options available from factory. The car lets me play music and video playlists from the iPod and the videos can be seen

  • Difference b/w Java Class and Bean class

    hi, can anybody please tell me the clear difference between ordinary java class and java Bean class. i know that bean is also a java class but i donno the exact difference between the both. can anybody please do help me in understanding the concept b

  • Basic question: Can OID map LDAP query to custom SQL query?

    Hi all, I have custom data in my Oracle Database and I wat to give them LDAP interface. Is it possible to use OID to achieve this or OID is for other purposes? To be more specific: I have schema MYSCHEMA and table MYSCHEMA.MYTABLE. Is it possible to

  • Field order issues after export to DV Pal from uncompressed

    We are experiencing some difficulty exporting clips from FCP to our Omneon Server for TX. We are capturing 8 bit uncompressed (AJA and Blackmagic) from digibeta for our edit and all editing stays in uncompressed until final output. Our transmission s