Getting an iterator over a huge XML file

Hi !
I have to persist data from some huge xml-files.
The structure of the xml is simple, like in:
<items>
<item>
</item>
<item>
</item>
</items>
The best way for me would be to get an iterator over the item elements, each represented as a DOM structure for easy access.
Because the files are so huge (80-150 MB) and I have no control over how they are created, the underlying parser should be SAX-oriented.
So I need a solution which parses until one item element (maybe specified through an XPath query) is complete and builds a DOM tree from it.
Then I can pass this DOM element to my persistence layer. When this item is persisted and the iterator's hasNext() returns true, I can simply call next() to get the DOM representation of the next item.
Is there such a solution out there? Or do I have to 'invent' this kind of parsing myself?
It would be great to get some hints ... this problem is driving me crazy :)
Thank you in advance,
Gerald

Vaguely familiar, http://lists.xml.org/archives/xml-dev/200201/msg00032.html
I can't think of a DOM builder that would work off a fragment; you would probably have to create one yourself.
There's also http://drizzle.stanford.edu/~peastman/pax.html which might solve your problem.
Pete
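
For what it's worth, here is a minimal sketch of the kind of iterator asked for above. It is not taken from either of the links; it assumes the StAX pull parser (javax.xml.stream) that ships with the JDK instead of SAX, and the class name ItemIterator is made up for illustration. Namespaces, comments and processing instructions are ignored to keep it short.

```java
import java.io.Reader;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical helper: yields each matching subtree of a large file as a small DOM element. */
class ItemIterator implements Iterator<Element> {
    private final XMLStreamReader reader;
    private final DocumentBuilder builder;
    private final String itemName;
    private Element next; // pre-fetched item, or null when the input is exhausted

    ItemIterator(Reader in, String itemName) {
        this.itemName = itemName;
        try {
            this.reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            this.builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (XMLStreamException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        advance();
    }

    @Override public boolean hasNext() { return next != null; }

    @Override public Element next() {
        if (next == null) throw new NoSuchElementException();
        Element result = next;
        advance();
        return result;
    }

    // Scans forward to the next matching start tag and builds a DOM tree for it.
    private void advance() {
        next = null;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(itemName)) {
                    next = readSubtree(builder.newDocument());
                    return;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Called on a START_ELEMENT; consumes events up to the matching END_ELEMENT.
    private Element readSubtree(Document doc) throws XMLStreamException {
        Element root = startElement(doc);
        doc.appendChild(root);
        Node current = root;
        while (current != doc) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    Element child = startElement(doc);
                    current.appendChild(child);
                    current = child;
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    current = current.getParentNode();
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    current.appendChild(doc.createTextNode(reader.getText()));
                    break;
                default: // comments, processing instructions etc. are skipped
                    break;
            }
        }
        return root;
    }

    // Creates an element for the current start tag (namespaces ignored in this sketch).
    private Element startElement(Document doc) {
        Element e = doc.createElement(reader.getLocalName());
        for (int i = 0; i < reader.getAttributeCount(); i++) {
            e.setAttribute(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
        }
        return e;
    }
}
```

Only one item subtree is held in memory at a time (plus the one pre-fetched for hasNext()), so the 80-150 MB file is never loaded as a whole. Typical use would be a loop such as while (it.hasNext()) persist(it.next()); where persist stands for your own persistence call.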

Similar Messages

  • Getting error when try to upload xml file into Data Template

    Hi,
    I get an error when I try to upload an XML file into a Data Template. Error: "The uploaded file XXSLARPT.xml is invalid. The file should be in XML-DATA-TEMPLATE format." Please, can anybody help me?
    Thanks,
    Prasad.

    Hi,
    Anybody Help Plzzzzzz.
    thx,
    Prasad

  • How to get the attribute value of an XML file??

    How to get the attribute value of an XML file??
    For example, how to get name and age attributes?
    <student name="Joe" age="20" />

    What are you using to read the XML file?
    Assuming JDOM (www.jdom.org), something along these lines:
    SAXBuilder builder = new SAXBuilder(true); // true = validate while parsing
    Document doc = builder.build(filename);
    Element root = doc.getRootElement();
    List children = root.getChildren();
    Element thisElement = (Element) children.get(n); // n = index of the <student> element
    String name = thisElement.getAttributeValue("name");
    try {
        int age = Integer.parseInt(thisElement.getAttributeValue("age"));
    } catch (NumberFormatException ex) {
        // InvalidElementException stands for an application-defined exception
        throw new InvalidElementException("Expected an int.....");
    }
    Ben

  • How to get the real path of the xml file

    I have a java application
    following is the package structure
    com>>gts>>xml
    having file---------> MyXML.xml
    com>>gts>>java
    having java program to read the file
    The problem is that if I use File file = new File("..\\xml\\MyXML.xml"); I get:
    java.io.FileNotFoundException: E:\LEARNING_WORK_SPACE\JavaXml\..\xml\MyXml.xml (The system cannot find the path specified)
         at java.io.FileInputStream.open(Native Method)
         at java.io.FileInputStream.<init>(Unknown Source)
         at java.io.FileInputStream.<init>(Unknown Source)
         at sun.net.www.protocol.file.FileURLConnection.connect(Unknown Source)
    How do I get the real path of the xml file.
    Edited by: shashiwagh on Jan 29, 2010 11:46 AM

    Hi,
    if your XML file is inside a package you can easily get it from the classloader.
    Note that your application may be packaged inside a jar, so it is not safe to use java.io.File for this purpose.
    You have an XML file in:
    com/gts/xml/MyXML.xml
    In a class of the same module (one that will be packaged in the same jar), for example com.gts.java.XmlLoader:
    // To get the stream:
    InputStream is = this.getClass().getResourceAsStream("/com/gts/xml/MyXML.xml");
    // Or the URL:
    URL xml = this.getClass().getResource("/com/gts/xml/MyXML.xml");
    Hope it helps.

  • Reading  huge xml files in OSB11gR1(11.1.1.6.0)

    Hi,
    I want to read a huge XML file (1 GB) in OSB (11.1.1.6.0).
    I will be creating a JCA file adapter in JDeveloper and importing the artifacts into OSB.
    Please let me know the maximum file size that can be handled in OSB.
    Thanks in advance.
    Regards,
    Suresh

    Depends on what you intend to do after reading the file.
    Do you want to parse the file contents and maybe do some transformation? Or do you just have to move the file from one place to another, e.g. reading from the local system and moving to a remote system using FTP?
    If you just have to move the file, I would suggest using the JCA File/FTP adapter's Move operation.
    If you have to parse and process the file contents within OSB, then it may be possible, depending on the file type and what logic you need to implement. For example, for very large CSV files you can use JCA File Adapter batching to read a few records at a time.

  • How to break up a huge XML file and generate serialized JSP pages

    I have a huge xml file, with about 100 <item> nodes.
    I want to process this xml file with pagination and generate jsp pages.
    For example:
    Display items from 0 to 9 on page 1 page1.jsp , 10 to 19 on page2.jsp, and so on...
    Is it possible to generate JSP pages like this?
    I have heard of Velocity but don't know if it is the right technology for this kind of job.

    Thank you for your reply, I looked at the display tag library and it looks pretty neat with a lot of features, I could definitely use it in a different situation.
    The xml file size is about 1.35 MB, and the size is unpredictable it could shrink or grow periodically depending on the number of items available.
    I was hoping to create a documentation-style set of static pages from the XML feed instead of having one JSP with dynamic pages.
    I was looking at Anakia: http://jakarta.apache.org/velocity/docs/anakia.html . Maybe it has features that would let me create static pages, but I'm not sure.
    I think for now, I will transform the xml with an xsl file and pass the page numbers as input parameters to the xsl file
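
A rough sketch of that last idea, assuming the JDK's built-in XSLT 1.0 processor (javax.xml.transform). The class name XmlPager, the parameter names page and pageSize, and the plain-text output are all made up for illustration; a real stylesheet would emit HTML for each static page.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: selects one page of item elements via stylesheet parameters. */
class XmlPager {
    // Emits the text of each item on the requested page, one per line.
    private static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:param name='page' select='1'/>"
        + "<xsl:param name='pageSize' select='10'/>"
        + "<xsl:template match='/items'>"
        + "<xsl:for-each select='item[position() &gt; ($page - 1) * $pageSize"
        + " and position() &lt;= $page * $pageSize]'>"
        + "<xsl:value-of select='.'/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns the item texts on the given 1-based page. */
    static List<String> page(String xml, int page, int pageSize) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setParameter("page", page);
            t.setParameter("pageSize", pageSize);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            String text = out.toString();
            // \R matches any line break, in case the serializer uses the platform separator
            return text.isEmpty() ? new ArrayList<String>() : Arrays.asList(text.split("\\R"));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling page(xml, 2, 10) would return items 11 to 20. To produce static pages, loop over the page numbers and write each result to its own file.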

  • Can anyone get LPX to actually read FCPX XML files properly?

    Can anyone get LPX to actually read FCPX XML files properly?  I'm in the latest versions of FCPX, LPX, and OS X Mavericks, and it simply doesn't work at all.  Total failure.  Can anyone get this to work properly?

    I haven't used it on a current project, but I did test it, and it worked fine. I was able to save an XML file, render a proxy video, and import them both into LPX. I was able to make some tweaks and export it all back to FCPX.
    While Roles are not addressed as elegantly as the app X2Pro does for Pro Tools, the process works.
    I'm running Mavericks on a MBP and a "CustoMac".

  • How to compare two huge xml files(50MB+) using Java Code

    I want to compare two huge XML files using Java code and find the differences between them.
    Is there any API for that?

    You should look for a third-party API.
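
If a third-party library (XMLUnit is one that exists for this) is not an option, a streaming comparison can be sketched with the JDK's StAX API alone. This is only an illustration under simplifying assumptions: it stops at the first difference, ignores whitespace-only text, comments and processing instructions, and treats element order as significant. The class name XmlStreamDiff is made up.

```java
import static javax.xml.stream.XMLStreamConstants.CDATA;
import static javax.xml.stream.XMLStreamConstants.CHARACTERS;
import static javax.xml.stream.XMLStreamConstants.END_DOCUMENT;
import static javax.xml.stream.XMLStreamConstants.END_ELEMENT;
import static javax.xml.stream.XMLStreamConstants.START_ELEMENT;

import java.io.Reader;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: walks two XML streams in lockstep without loading either fully. */
class XmlStreamDiff {
    /** Returns a description of the first difference, or null if the streams match. */
    static String firstDifference(Reader left, Reader right) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Merge adjacent text chunks so both sides split text the same way.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader a = factory.createXMLStreamReader(left);
            XMLStreamReader b = factory.createXMLStreamReader(right);
            while (true) {
                int ea = nextSignificant(a);
                int eb = nextSignificant(b);
                String da = describe(a, ea);
                String db = describe(b, eb);
                if (!da.equals(db)) {
                    return "left has " + da + " (line " + a.getLocation().getLineNumber()
                            + "), right has " + db + " (line " + b.getLocation().getLineNumber() + ")";
                }
                if (ea == END_DOCUMENT) return null;
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Advances to the next start tag, end tag, non-blank text, or the end of the document.
    private static int nextSignificant(XMLStreamReader r) throws XMLStreamException {
        while (r.hasNext()) {
            int event = r.next();
            if (event == START_ELEMENT || event == END_ELEMENT || event == END_DOCUMENT) return event;
            if ((event == CHARACTERS || event == CDATA) && !r.isWhiteSpace()) return CHARACTERS;
        }
        return END_DOCUMENT;
    }

    // Renders the current event as a string; attributes are sorted so their order is irrelevant.
    private static String describe(XMLStreamReader r, int event) {
        switch (event) {
            case START_ELEMENT:
                TreeMap<String, String> attrs = new TreeMap<>();
                for (int i = 0; i < r.getAttributeCount(); i++) {
                    attrs.put(r.getAttributeName(i).toString(), r.getAttributeValue(i));
                }
                return "<" + r.getName() + " " + attrs + ">";
            case END_ELEMENT:
                return "</" + r.getName() + ">";
            case CHARACTERS:
                return "text \"" + r.getText().trim() + "\"";
            default:
                return "end of document";
        }
    }
}
```

Because both files are read as event streams, memory use stays roughly constant for 50 MB+ inputs as long as no single text node is huge.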

  • Trying to parse a huge XML file

    I'm trying to parse a large (~200 MB) XML file. I get out-of-memory errors with DOM, and so far using SAX has been a huge disaster. My program has to run in IE and Netscape as an applet. I've run into a HUGE number of errors. While I have been doing Java programming for about 2 years, this is the first time I've ever worked with XML. The error I am currently getting is an AbstractMethodError associated with the following code:
    SAXParserFactory spf = new com.sun.xml.parser.SAXParserFactoryImpl();
    spf.setValidating(false);
    try {
        out.write("Gets past SAXParserFactory creation");
        out.flush();
    } catch (Exception e) {}
    xmlReader = null;
    try {
        // Create a JAXP SAXParser
        saxParser = spf.newSAXParser();
        // Get the encapsulated SAX XMLReader
        out.write("gets to assigning an XMLReader");
        // *** THIS LINE OF CODE raises the error ***
        xmlReader = saxParser.getXMLReader();
        out.write("gets past assigning an XMLReader");
        out.flush();
    I've run out of ideas and would greatly appreciate any help anyone could offer on this most frustrating problem.

    "With XML, we don't have to check each line of the file, only the data in those tags of each element that we are searching on."
    That's true, once you have read the entire XML and parsed it. Reading the entire XML would of course take much longer than reading the text file it was generated from, so you would want to pre-parse it into a DOM and store that somehow. But then to do a search you'd have to load that DOM, which is in turn much larger than the XML it was generated from...
    What you describe sounds to me a lot like a database application. Of course I don't know anything about the specifics of what you are doing with the text, but storing it in a database with indexes that support the searches seems to me like a better approach. Of course you then have the challenge of distributing the database and its supporting software in a form that doesn't require installation. But I'd respectfully suggest that banning any installation at all is a bit extreme. After all, where I live you can buy an encyclopedia on a CD, from a gas station, for $10. And when you load the CD, it runs through an installation procedure. No big deal these days.

  • How to get ALL validate-errors while insert xml-file into xml_schema_table

    How can I get all validation errors when inserting into an XML schema table an XML instance that contains more than one error?
    Hi,
    I can validate an XML file by using the isSchemaValid() function to get the validation status 0 or 1.
    To get an error message stating the reason, I validate
    the XML file against the XDB schema by inserting it into the schema table.
    When there is more than one validation error in the XML file,
    the exception shows me only the first error.
    How can I get all the errors at once?
    regards
    Norbert
    ... for example like this matter:
    declare
         xmldoc CLOB;
         vStatus varchar2(4000);
    begin     
    -- ... create xmldoc by using DBMS_XMLGEN ...
    -- validate by using insert ( I do not need insert ;-) )      
         begin
         -- there is the xml_schema in xdb with defaultTable XML_SCHEMA_DEFAULT_TABLE     
         insert into XML_SCHEMA_DEFAULT_TABLE values (xmltype(xmldoc) ) ;
         vStatus := 'XML-Instance is valid ' ;
         exception
         when others then
         -- it's only the first error while parsing the xml-file :     
              vStatus := 'Instance is NOT valid: '||sqlerrm ;
              dbms_output.put_line( vStatus );      
         end ;
    end ;

    If I am not mistaken, you could probably google this one using "Steven Feuerstein Validation" or similar. I know I have seen a very decent approach to validation / error handling from Steven about this.

  • How can I get iWeb to regenerate my rss.xml file

    I use Leopard/ iWeb to create a podcast. When published to the web, the podcast consists of two pages - a home page and an archive page. The home page contains the last 5 or so postings, and the archive page contains all the other postings. Both the home page and the archive page have "Subscribe" icons/links enabling folks to subscribe to my podcast (both icons/links point to same address). So far, all pretty standard stuff.
    The focus of my problem is subscribe icon/link, or more accurately, the rss.xml file that it links to. In the past, the rss.xml feed would contain all my postings in the correct date order. Then I entered some new entries retrospectively (i.e. with dates a long time in the past) however they appear in the wrong place in the RSS feed. How can I force the RSS feed to show entries in correct date order?

    Hi,
    I've asked for this post to be moved to the iLife > iWeb Forum (http://discussions.apple.com/category.jspa?categoryID=188) for you, as that is where you are more likely to get an answer.
    Regards,
    Colin R.

  • URGENT --  Performance issue while creating Huge XML file

    All XML Experts Please Help....Thanks a lot in Advance
    We are trying to create an XML file for a huge table (5 million rows) and the performance is very, very bad. Can somebody help by giving me an idea of what my best approach could be, or what I am doing wrong in the code below?
    CREATE OR REPLACE PROCEDURE Sales_1_Generate_Xml IS
    temp_clob CLOB;
    temp_buffer VARCHAR2(1);
    amount BINARY_INTEGER := 1;
    position INTEGER := 1;
    filehandle utl_file.file_type;
    error_number NUMBER;
    error_message VARCHAR2(100);
    length_count INTEGER;
    qryctx dbms_xmlgen.ctxhandle;
    BEGIN
    qryctx := dbms_xmlgen.newcontext('select /* INDEX UF_SALES(UF_SALES_IX16) */
    TRANSACTION_NUMBER     "Transaction_Number",
    TRANSACTION_TYPE_ID     "Transaction_Type_ID",
    PROCESS_FISCAL_DATE_ID     "Process_Fiscal_Date_ID",
    INVOICE_FISCAL_DATE_ID     "Invoice_Fiscal_Date_ID",
    ORDER_FISCAL_DATE_ID     "Order_Fiscal_Date_ID",
    PROCESS_CALENDAR_DATE_ID     "Process_Calendar_Date_ID",
    INVOICE_CALENDAR_DATE_ID     "Invoice_Calendar_Date_ID",
    ORDER_CALENDAR_DATE_ID     "Order_Calendar_Date_ID",
    CURRENT_TM_ID     "Current_TM_ID",
    CUSTOMER_ID     "Customer_ID",
    CUSTOMER_TYPE_ID     "Customer_Type_ID",
    CUSTOMER_LEVEL_ID     "Customer_Level_ID",
    ACCOUNT_TYPE_ID     "Account_Type_ID",
    TRADE_CLASS_ID     "Trade_Class_ID",
    DISTRIBUTOR_ID     "Distributor_ID",
    PRODUCT_ID     "Product_ID",
    ORDERED_PRODUCT_ID     "Ordered_Product_ID",
    BRAND_TYPE_ID     "Brand_Type_ID",
    LABEL_TYPE_ID     "Label_Type_ID",
    BRAND_LABEL_ID     "Brand_Label_ID",
    PRICED_BY_ID     "Priced_By_ID",
    SALES_UOM_ID     "Sales_UOM_ID",
    PURCHASING_UOM_ID     "Purchasing_UOM_ID",
    PRICING_UOM_ID     "Pricing_UOM_ID",
    NET_COST     "Net_Cost",
    NPA_S     "NPA_S",
    CMA_S     "CMA_S",
    NOT_S     "NOT_S",
    TOTAL_NATIONAL_ALLOWANCE_S     "Total_National_Allowance_S",
    LPA_S     "LPA_S",
    LMA_S     "LMA_S",
    LOT_S     "LOT_S",
    TOTAL_LOCAL_ALLOWANCE_S     "Total_Local_Allowance_S",
    TOTAL_ALLOWANCES_S     "Total_Allowances_S",
    LPC     "LPC",
    LPC_EXTENDED     "LPC_Extended",
    LPF     "LPF",
    LPF_EXTENDED     "LPF_Extended",
    TRUE_COST     "True_Cost",
    CDE     "CDE",
    LPP     "LPP",
    SURCHARGE     "Surcharge",
    COMBINED_SURCHARGE     "Combined_Surcharge",
    TOTAL_SURCHARGES     "Total_Surcharges",
    MARKET_COST     "Market_Cost",
    INSIDE_PAD     "Inside_Pad",
    SALES_REP_COST     "Sales_Rep_Cost",
    SALES_REP_MARGIN     "Sales_Rep_Margin",
    SALES_PRICE     "Sales_Price",
    SALES_TRUE_MARGIN     "Sales_True_Margin",
    NVD     "NVD",
    LVD     "LVD",
    NID     "NID",
    LID     "LID",
    TOTAL_VD     "Total_VD",
    TOTAL_ID     "Total_ID",
    TOTAL_DEVIATIONS     "Total_Deviations",
    GP1     "GP1",
    GP2     "GP2",
    DEVIATED_COST     "Deviated_Cost",
    ACTUAL_COST     "Actual_Cost",
    SALES_TAX     "Sales_Tax",
    QUANTITY_ORDERED     "Quantity_Ordered",
    QUANTITY_SHIPPED     "Quantity_Shipped",
    QUANTITY_DEVIATED     "Quantity_Deviated",
    QUANTITY_SUBBED     "Quantity_Subbed",
    UNITS_ORDERED     "Units_Ordered",
    EACHES_ORDERED     "Eaches_Ordered",
    EACH_CONVERSION_FACTOR     "Each_Conversion_Factor",
    UNITS_SHIPPED     "Units_Shipped",
    EACHES_SHIPPED     "Eaches_Shipped",
    SHIP_WEIGHT     "Ship_Weight",
    ACTUAL_GP_DLR     "Actual_GP_Dlr",
    TRUE_GP_DLR     "True_GP_Dlr",
    LANDED_GP_DLR     "Landed_GP_Dlr",
    LANDED_ACTUAL_GP_DLR     "Landed_Actual_GP_Dlr",
    INVOICE_GP_DLR     "Invoice_GP_Dlr",
    INVOICE_ACTUAL_GP_DLR     "Invoice_Actual_GP_Dlr",
    ADJUSTED_ACTUAL_GP_DLR     "Adjusted_Actual_GP_Dlr",
    EB_S     "EB_S",
    MB_S     "MB_S",
    ACTUAL_TM_ID     "Actual_TM_ID",
    ACTUAL_TM_NAME     "Actual_TM_Name",
    ACTUAL_DSM_ID     "Actual_DSM_ID",
    ACTUAL_DSM_NAME     "Actual_DSM_Name",
    INVOICE_NUMBER      "Invoice_Number",
    CONTRACT_NUMBER     "Contract_Number",
    CUSTOMER_NUMBER     "Customer_Number",
    CUSTOMER     "Customer",
    PRODUCT_NUMBER     "Product_Number",
    MASTER_DISTRIBUTOR_ID     "Master_Distributor_ID",
    ORDERED_PRODUCT_NUMBER     "Ordered_Product_Number",
    NATIVE_PRODUCT_STATUS     "Native_Product_Status",
    NATIVE_PRICED_BY_INDICATOR     "Native_Priced_By_Indicator",
    EXTRACTION_TIME     "Extraction_Time"
    from uf_sales where distributor_id in (''5139'',
    ''5140'',
    ''5145'',
    ''5150'',
    ''5160'',
    ''5175'',
    ''5180'',
    ''5210'',
    ''5220'',
    ''5230'')');
    DBMS_XMLGen.setRowTag(qryctx,'Sales_Record');
    DBMS_XMLGen.setRowSetTag(qryctx,'Sales_Set');
    temp_clob:=dbms_xmlgen.getxml(qryctx);
    length_count := dbms_lob.getlength(temp_clob);
    dbms_output.put_line('Internal LOB size is: ' || length_count);
    filehandle := utl_file.fopen('DATA_EXTRACT','Sales_1.xml','Wb',32767);
    WHILE length_count <> 0 LOOP
    dbms_lob.read (temp_clob, amount, position, temp_buffer);
    --utl_file.put (filehandle, temp_buffer);
    utl_file.put_raw(filehandle, utl_raw.cast_to_raw(temp_buffer));
    position := position + 1;
    length_count := length_count - 1;
    temp_buffer := null;
    END LOOP;
    dbms_output.put_line('Exit the loop');
    utl_file.fclose(filehandle);
    DBMS_XMLGen.closeContext(qryctx);
    dbms_output.put_line('Close the file');
    EXCEPTION
    WHEN OTHERS THEN
    BEGIN
    error_number := sqlcode;
    error_message := substr(sqlerrm ,1 ,100);
    dbms_output.put_line('Error #: ' || error_number);
    dbms_output.put_line('Error Message: ' || error_message);
    utl_file.fclose_all;
    END;
    END;
    /

    OK, so you are writing the file with UTL_FILE. How long does the whole process take? Have you compared the time taken to generate temp_clob with the time taken to write the output to the file?

  • Xslt and getting data from a uri in xml file

    In my XML file, I have a node that contains a URI pointing to another XML file.
    Is it possible to use XSLT to follow that link and get the content of the other XML file?

    Possibly using the document() function.
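
A small sketch of how document() could be used here, driven from Java through the JDK's built-in XSLT processor. The element names doc, link, other and title are invented for the example, as is the class LinkedDocumentDemo; the demo() method writes a throwaway linked file so the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo: a stylesheet follows a URI stored in one XML file into another. */
class LinkedDocumentDemo {
    // document(/doc/link) loads the file whose URI is the text of the link element.
    private static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='document(/doc/link)/other/title'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms the given XML and returns the title found in the linked file. */
    static String followLink(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates a temporary linked file, points a document at it and resolves the link. */
    static String demo() {
        try {
            Path linked = Files.createTempFile("linked", ".xml");
            Files.write(linked, "<other><title>Linked</title></other>"
                    .getBytes(StandardCharsets.UTF_8));
            String result = followLink("<doc><link>" + linked.toUri() + "</link></doc>");
            Files.delete(linked);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The link here is an absolute file: URI. Relative URIs are resolved against a base URI, so the source document would need a system ID for those to work; processors with external access restricted may also refuse to load the linked file.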

  • Want parse huge xml file for CDATA

    Hi all,
    I want to parse a huge XML file to get the values of the CDATA sections.
    Can you please give me some sample code for that?
    Here is part of my XML file:
    <initParams>
    <param description="simulation mode " name="smtpSimulationMode" passOn="false" required="false" type="bool" varSubstitute="false">
    <![CDATA[false]]>
    </param>
    <param description="full name of smtp host" name="smtpHost" passOn="false" required="true" type="string" varSubstitute="false">
    <![CDATA[222222]]>
    </param>
    <param description="smtpUserName for authentication" name="smtpUserName" passOn="false" required="false" type="string" varSubstitute="false"/>
    <param description="smtpUserPassword for authentication" name="smtpUserPassword" passOn="false" required="false" type="string" varSubstitute="false"/>
    <param description="ip address for remote server" name="serverId" passOn="false" required="true" type="string" varSubstitute="false">
    <![CDATA[1111111]]>
    </param>
    <param description="location of file on remote server side" name="fileRemoteLocation" passOn="false" required="false" type="string" varSubstitute="false">
    <![CDATA[test/]]>
    </param>
    <param description="user name for authentication" name="userName" passOn="false" required="true" type="string" varSubstitute="false">
    <![CDATA[abc]]>
    </param>
    </initParams>
    thanks in advance

    There are several Java XML API's available. SAX, JXpath, DOM4J, Xerces, etcetera.
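
As one possible illustration with the StAX API bundled in the JDK (not one of the libraries named above), the sketch below streams through the file and collects each param element's text, CDATA included, keyed by its name attribute. The class name ParamReader is made up.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: maps each param name to the text (CDATA) it contains. */
class ParamReader {
    static Map<String, String> readParams(Reader in) {
        Map<String, String> params = new LinkedHashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "param".equals(reader.getLocalName())) {
                    // Read the attribute first: getElementText() moves past the element.
                    String name = reader.getAttributeValue(null, "name");
                    // getElementText() concatenates plain text and CDATA sections.
                    params.put(name, reader.getElementText().trim());
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return params;
    }
}
```

Empty elements such as the smtpUserName param map to an empty string. Since the file is read as a stream, its size does not matter much; only the resulting map is kept in memory.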

  • How does example field get populated in the results (pmd.xml) file ?

    How does the <example> element in the rulepack file, get populated in the pmd.xml file ?

    Hi,
    The example field does not get populated and won't be for two reasons:
    - The Xpmd products don't do it, and we want to keep the same XML schema as the others so that FlexPMD reports remain compatible with third-party tools (like the PMD Hudson plugin and XSL scripts)
    - That would have a high impact on the report size. Imagine that you have 2000+ violations in your report...
    That being said, it will be possible to retrieve the example section from the Flash Builder plugin in the outline view.
    Best regards,
    Xavier
