Indices configuration for XML document analysis (indexing time problems)

Hi all,
I'm currently developing a tool for XML document analysis using XQuery. We need to analyse the content of a large CMS dump, so I am adding all documents to a Berkeley DB XML container to be able to run XQuery queries against it.
In my last run I ran into indexing speed problems: single documents (typically 10-20 KB in size) were taking around 20 seconds to be added to the database after 6000 documents (I've got around 20000 in total). The time needed to add a document increases with the number of documents already in the database.
I suspect my index configuration to be the reason for this performance drop. Indeed, I've been very generous with indexes, as we have to analyse the data and don't know the structure in advance.
Currently my index configuration includes:
- 2 default indices: edge-element-presence-none and edge-attribute-presence-none, to be able to speed up every possible XQuery used to analyse data patterns: ex. collection()//table//p[contains(.,'help')]
- 8 edge-attribute-substring-string indices on attributes we use often (id, value, name, ...)
- 1 edge-element-substring-string index on the root element of the xml documents to be able to speed up document searches: ex. collection()//page[contains(.,'help')]
So here are my questions:
- Are there any possible performance optimisations in Database config (not index config)? I only set the following:
setTransactional(false);
envConf.setCacheSize(1024*64);
envConf.setCacheMax(1024*256);
- How can I test various index configurations on the fly? Are there any DB tools that allow setting/removing indexes?
- Is my index config suspect? ;-)
Greetings,
Nils

Hi Nils,
The edge-element-substring-string index on the document element is almost certainly the cause of the slow document inserts - that's really not a good idea. Substring indexes are used to optimize "=", contains(), starts-with() and ends-with() when they are applied to the named element that has the substring index, so I don't think that index will do what you want it to.
John
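
To put a rough number on why that index hurts: the string value of the root element is the concatenated text of the whole document, so a substring index on it has to index every character of every page. The sketch below assumes one index entry per 3-character window of the text, which is my understanding of how DB XML substring indexes are keyed, so treat that as an assumption rather than documented behaviour. The class and method names are made up for illustration and are not part of the DB XML API.

```java
import java.util.HashSet;
import java.util.Set;

public class SubstringKeyEstimate {
    // Assumption: the substring index stores one entry per 3-character window.
    static final int WINDOW = 3;

    /** Number of 3-character windows in the text (one index entry each). */
    static int countEntries(String text) {
        return Math.max(0, text.length() - WINDOW + 1);
    }

    /** Number of distinct 3-character windows in the text. */
    static int countDistinctKeys(String text) {
        Set<String> keys = new HashSet<>();
        for (int i = 0; i + WINDOW <= text.length(); i++) {
            keys.add(text.substring(i, i + WINDOW));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        // "help" has two windows: "hel" and "elp"
        System.out.println(countEntries("help"));

        // Hypothetical 15,000-character page, standing in for a 10-20 KB document
        StringBuilder page = new StringBuilder();
        while (page.length() < 15_000) {
            page.append("some page text with help in it ");
        }
        page.setLength(15_000);

        int perDocument = countEntries(page.toString());
        System.out.println(perDocument + " entries for one document");
        System.out.println((long) perDocument * 20_000 + " entries for 20000 documents");
    }
}
```

Under that assumption every inserted page adds roughly as many index entries as it has characters, whereas the same kind of index on a short attribute such as id adds only a handful.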

Similar Messages

  • Templates for XML documents  in Dreamweaver CS3?

    I have a strange question. Is it possible to create a DW
    template (.dwt) for an XML document with editable and repeatable
    regions? I would like to set up some templates that will allow my
    users to create new XML files and edit the appropriate regions with
    Contribute.
    Any advice would be greatly appreciated.
    Michael

  • Query language for XML documents

    Which is the better query language (in terms of memory efficiency) for interacting with XML documents? The query language should support 'insert', 'delete', 'update' and 'select' commands. How fast is a database compared to XML (with the database being replaced by XML) when only 'insert' and 'select' commands are issued?

    Hi,
    I suggest you use the Sunopsis JDBC for XML driver, which lets you run all kinds of SQL statements against your XML files. It is a type 4 driver, so it's very easy to use. You can find more information and download it here:
    http://www.sunopsis.com/corporate/us/products/jdbcforxml/
    Hope that will help
    Simo Fernandez

  • Document type definitions for xml documents exported from SBO?

    I am looking for DTD or XSD specifications for the documents that are exported from SBO via the DI API. It seems that this information is not available with the standard DI API documentation. Does anyone have a clue about where to look for these specifications ?
    Best regards,
                Henry Nordströ

    The DI Company object has a GetBusinessObjectXmlSchema method which allows you to retrieve the schema details.
    Below is the sample from the DI help file.
    Dim XMLStr As String
    Dim domDoc As DOMDocument
    Set domDoc = New DOMDocument
    XMLStr = vCmp.GetBusinessObjectXmlSchema(oInvoices)
    domDoc.loadXML (XMLStr)
    domDoc.save ("C:\ XMLFiles \MyXML2.xml")
    Regards,
    John.

  • IView creation/configuration for unread document in KM using EP6.0

    Hi,
    Can someone tell me how to configure/create an iView for unread documents to which the user has subscribed?
    Thanks in advance.
    Manish

    Hi,
    Specifying some properties for the document may help here; the property can be changed when the document is opened.
    Otherwise, maintain a history of the user's subscriptions, i.e. the documents opened by each user.
    regards
    geogi

  • TO PARSE XML RESPONSE AFTER SENDING XML DOCUMENT AS URL PARAMETER PROBLEM

    Hi
    I sent the XML document (a varchar variable) to the other site as a URL parameter.
    When I receive the XML response from the other site, how can I parse it using PL/SQL code? (DB version: Oracle 8.1.7)
    What methods are available?
    Try the following URL by pasting it in browser location box:
    http://testspos.isbank.com.tr/sanalpos/spos.asp?prmstr='<?xml version="1.0" encoding="UTF-8"?><ePaymentMsgVersionInfo="2.0" TT="Request" RM="Direct" CT="Money"><OperationActionType="LiveTest"><OpData><MerchantInfo MerchantId="200000845966"MerchantPassword="kangurum"/><ActionInfo><TrnxCommon TrnxID="'||v_sipno||'"Protocol="156"></TrnxCommon></ActionInfo><PANInfo></PANInfo><OrgTrnxInfo></OrgTrnxInfo><CustomData></CustoData></OpData></Operation></ePaymentMsg>'
    You will get the response:
    <html><head><title>Error</title></head><body>The parameter is incorrect. </body></html>

  • Problems during ADSUser Configuration  for Adobe Document services

    Dear NetWeaver experts,
    I'm trying to activate an Adobe Document Service on my local NetWeaver 2004 WebAS SP20.
    I realized that my system is missing the appropriate user and groups, which have to be configured
    using the Visual Administrator and the Security Provider mask.
    My problem is that I can't create a user or group in my Security Provider mask. The Create User and Create Group buttons are greyed out.
    So I don't know which settings are wrong on my system and prevent the creation of new users and groups within the Security Provider configuration.
    Can you provide any hints?
    Best regards,
    Daniel

    Hi Daniel,
    Check whether you have permission to access the Security Provider.
    If you have the permissions, you will find the pencil button above the Runtime tab in that screen. Click on it and it will allow you to create users.
    With regards,
    Pradeep.B

  • Searching for XML tags using Oracle Text

    I am using full text search to find documents based on a search text. It works fine for pdf, word documents, etc. However for XML documents, searching for a particular tag name does not find anything. Searching for text within tags works fine. Any thoughts?
    Edited by: miyer on Feb 21, 2011 6:25 PM

    Hi
    Try adding the following variable to the UCM config.cfg and then see whether a new XML check-in is returned by the full-text search:
    TextIndexerFilterFormats=xml
    Save the file, restart UCM and then test.
    If the new check-in gets the results as expected, run a Collection Rebuild cycle so that the existing content is also full-text indexed and searchable (for XML).
    Thanks
    Srinath

  • Best class to use to cache an XML document

    Hello all,
    I have a utility class representing an XML document, used within a web application. This document should be cached in the session, since within a request and during the session I read it often and do several other things with it, like XPath and XSLT.
    But I am having problems due to this bug:
    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6322678
    and keep getting "bad file descriptor" errors due to these stream "specialities". There is a standard way of using cached, precompiled XSL templates, and I wonder if there is something similar for XML documents.
    So I need an internal variable in my class containing the raw XML document. Using streams is not an ideal solution because of the bug: I am getting "bad file descriptor" errors from time to time.
    I am playing with the idea of storing the XML document in a simple String and creating the required class from it for operations like XPath evaluation and XSL transformation. But I think that is considerably more resource-hungry than reading the XML document from the filesystem each time...
    Has anyone some hints on how to keep an XML document in memory internally?
    Thanks and regards,
    Timo

    Rehi,
    So, everything works really fine! Thanks. As promised, here some examples
    with bad indentations etc...:
    Doing XSL transformation:
    import java.io.StringWriter;
    import org.w3c.dom.Document;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.Templates;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.stream.StreamResult;
    TransformerFactory tFac = TransformerFactory.newInstance();
    Templates compiledXslt = tFac.newTemplates(sheet);
    Transformer transformer = compiledXslt.newTransformer();
    StringWriter sw = new StringWriter();
    StreamResult sr = new StreamResult(sw);
    transformer.transform(new DOMSource(this.doc), sr);
    System.err.println(sw.toString());
    Doing XPath was a surprise and was originally my problem (I called evaluate() with an InputStream that occasionally failed). I also tried new DOMSource() here, which failed as well even though it was not null. Funnily enough, evaluate() can be called directly with an org.w3c.dom.Document:
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    (this.doc is an org.w3c.dom.Document)
    public String getXPathEvaluationAsString(String expr) {
         try {
              XPath xp = XPathFactory.newInstance().newXPath();
              return (String) xp.evaluate(
                   expr,
                   this.doc,
                   XPathConstants.STRING
              );
         } catch (Exception e) {
              e.printStackTrace();
              return " ";
         }
    }
    Thanks and regards,
    Timo
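
    The examples above use this.doc but do not show where it comes from. A minimal sketch of one way to build it, assuming the raw XML is available as a String: parse it once into an org.w3c.dom.Document and keep that object, so no stream stays open between requests. The class name CachedXmlDocument and its methods are hypothetical, not taken from the post. The evaluate method is synchronized because DOM implementations are generally not guaranteed to be safe for concurrent access.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CachedXmlDocument {
    // Parsed once in the constructor and then reused for every XPath call.
    private final Document doc;

    public CachedXmlDocument(String rawXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The StringReader is consumed during parse(); nothing stays open afterwards.
            this.doc = builder.parse(new InputSource(new StringReader(rawXml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Evaluates an XPath expression against the cached DOM and returns its string value. */
    public synchronized String evaluate(String expr) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            return xp.evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expr, e);
        }
    }

    public static void main(String[] args) {
        CachedXmlDocument cached =
            new CachedXmlDocument("<page><title>help</title></page>");
        System.out.println(cached.evaluate("/page/title"));
        System.out.println(cached.evaluate("count(//title)"));
    }
}
```

    An instance like this could be stored as a session attribute, and the same Document could be wrapped in a DOMSource for the transformation shown earlier.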

  • Serializing XML Documents(not Java Serialization)

    Hi,
    I am looking for a class that can serialize XML documents.
    Here's the problem in detail:
    - I need to create an XML string from scratch, taking data from a database.
    - I created the XML Document, adding the children and attributes.
    - I need an XML string from the document. I am not exactly sure how to do this, but the Apache Xerces package provides an XMLSerializer class with which we can convert the document into a string.
    Is there any similar functionality provided? If so, where can I find it?
    Thanks,
    -Rao

    Not sure if this is a bug or not (filing one just in case it is) but the following program demonstrates that with 2.0.2.9 the internal subset is serialized correctly if the document was parsed with validationMode set to true. If set to false only the entities show up in the internal subset.
    package xmlbugs;
    import java.io.*;
    import org.w3c.dom.*;
    import oracle.xml.parser.v2.*;
    public class TestSerializeLocalSubset {
    private static final String xml =
    "<?xml version='1.0' encoding='UTF-8'?>"+
    "<!DOCTYPE bar ["+
    "<!ENTITY bar 'baz'>"+
    "<!ELEMENT foo EMPTY >"+
    "<!ELEMENT bar (foo) >"+
    "]>"+
    "<bar><foo/></bar>";
    public static void main(String[] a_ ) throws Exception {
    System.out.println("Test with parser in validation mode = false");
    DOMParser d = new DOMParser();
    d.setPreserveWhitespace(false);
    d.setValidationMode(false);
    d.parse( new StringReader(xml));
    Document x = d.getDocument();
    XMLDocument xx = (XMLDocument) x;
    xx.print(System.out);
    System.out.println("Test with parser in validation mode = true");
    DOMParser d2 = new DOMParser();
    d2.setPreserveWhitespace(false);
    d2.setValidationMode(true);
    d2.parse( new StringReader(xml));
    x = d2.getDocument();
    xx = (XMLDocument) x;
    xx.print(System.out);
    }
    }
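
    For the original question (getting an XML string out of a DOM Document without Xerces' XMLSerializer), one parser-independent option is the identity transform from the standard javax.xml.transform API. The following is a minimal sketch under that approach; the class name DomToString and the sample element names are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToString {
    /** Serializes a DOM Document to a String using an identity transform. */
    static String serialize(Document doc) {
        try {
            // newTransformer() without a stylesheet copies the source to the result.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    /** Builds a small document from scratch, as one would from database rows. */
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("bar");
            root.setAttribute("id", "1");
            Element child = doc.createElement("foo");
            child.setTextContent("baz");
            root.appendChild(child);
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildSample()));
    }
}
```

    Setting OutputKeys.INDENT to "yes" on the transformer is the usual way to ask for pretty-printed output; the exact whitespace produced depends on the transformer implementation.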

  • Editing XML documents from iFS WebUI

    When I click on an XML document within the iFS WebUI, it opens the file and shows it to me within Internet Explorer.
    I want the document to be opened on my workstation with Arbortext Epic Editor, but IE keeps trying to display it.
    Has anyone had a problem like this? I don't want Internet Explorer to open the document, but rather whatever application the XML file is associated with in Windows...
    I have tried changing the MIME type for XML documents within JWS from default to application/octet-stream in an attempt to force IE to pass it to the operating system... but it didn't work, unless I changed the wrong mime.properties file...
    Anyone have a solution?

    Did you change your settings in IE for that mimetype ?

  • Generating mixed case columns for XML using Object views

    I am trying to model a query involving joins to generate hierarchical levels for an XML document. I model it with an object view with a multiset subquery, and the generated XML works fine except for the following:
    The generated XML creates a tag <view_column_name>_ITEM for the nested multiset subquery columns in the view, and all the nested subquery view columns appear in upper case because the underlying table columns are in upper case. To better illustrate, please see the following example:
    CREATE TYPE Tillinstance_t as object (
    "Tillinstanceid" number
    ,"Stationid" number
    ,"Comment" varchar2(2000)
    ,"DepositID" number(38)
    ,"TimestampCreate" date
    ,"UseridCreate" number
    ,"TimestampChange" date
    ,"UseridChange" number
    ,"TimestampClosed" date
    ,"UseridClosed" number
    ,"TimestampBalance" date
    ,"UserIDBalance" number )
    create type insts as table of Tillinstance_t ;
    CREATE OR REPLACE VIEW till_view AS
    SELECT t.tillid as "TillID"
    , t.description as "Descr"
    , t.word as "Word"
    , t.Scopetypeid as "ScopeTypeId"
    , t.displayOrder as "DisplayOrder"
    , t.useridcreate as "UserIDCreate"
    , t.newid as "NewID"
    , t.flagactive as "FlagActive"
    , t.timestampcreate as "TSCR"
    , t.useridcreate as "UIDCR"
    , t.timestampchange as "TSCH"
    , t.useridchange as "UIDCH"
    , CAST( MULTISET ( SELECT i.Tillinstanceid as "TillinstanceID"
    , i.stationid as "StationID"
    , i.ocomment as "Comment"
    , i.depositid as "DepositID"
    , i.Timestampcreate as "TSCR"
    , i.useridcreate as "UIDCR"
    , i.Timestampchange as "TSCH"
    , i.useridchange as "UIDCH"
    , i.timestampclosed as "TSCL"
    , i.useridclosed as "UIDCL"
    , i.timestampbalance as "TSBAL"
    , i.useridbalance as "UIDBAL"
    FROM TillInstance i
    WHERE t.tillid = i.tillid)
    AS Insts)
    AS "Insts"
    FROM ucTill t
    The generated XML shows up in the form of :
    <?xml version = '1.0'?>
    <Tills>
    <Till TillID="1002" Descr="Till #3" Word="Till3" ScopeTypeId="8"
    DisplayOrder="0" UserIDCreate="296" TSCR="3/26/2001 0:0:0" UIDCR="296"
    TSCH="5/4/2001 14:12:32" UIDCH="298">
    <Insts>
    <Insts_ITEM TILLINSTANCEID="1278" STATIONID="1057" OCOMMENT="Morning Till3"
    TIMESTAMPCREATE="3/26/2001 0:0:0" USERIDCREATE="296" TIMESTAMPCHANGE="6/7/2001
    8:26:49" USERIDCHANGE="99" TIMESTAMPCLOSED="6/7/2001 8:26:49"
    USERIDCLOSED="99"/>
    <Insts_ITEM TILLINSTANCEID="1362" STATIONID="1057" TIMESTAMPCREATE="6/7/2001
    8:27:13" USERIDCREATE="99" TIMESTAMPCHANGE="6/11/2001 11:32:58"
    USERIDCHANGE="320"/>
    </Insts>
    </Till>
    </Tills>
    Now how do I strip out the _ITEM from the generated XML and change the columns TIMESTAMPCREATE, USERIDCLOSED etc. to mixed case?
    Any ideas?

    I could generate the mixed case columns with no problem. It was my mistake, and sorry for the inconvenience.
    However I am running into problems modelling a nested hierarchical set of queries with levels more than 2.
    Please advise of any sample code available anywhere.
    For example I could do :
    create table x1 ( id number , f1 varchar2(10));
    create table x2 ( id number , id_x1 number references x1(id) , f1 varchar2(10)) ;
    create table x3 ( id number , id_x2 number references x2(id), f1 varchar2(10)) ;
    To model this, I did
    create type x3_typ as object ( id number, id_x2 number , f1 varchar2(10)) ;
    create type x3_typ_t is table of x3_typ ;
    create type x2_typ as object ( id number, id_x1 number , f1 varchar2(10), x3_list x3_typ_t ) ;
    create type x2_typ_t is table of ref x2_typ ;
    create type x1_typ as object ( id number
    , f1 varchar2(10) , x2_list x2_typ_t ) ;
    create or replace view x3_x2 as
    select id , f1 , cast(multiset(select * from x3 ) as x3_typ_t ) as x3_list
    If I try to use a view again like as given below, I get the Oracle inconsistent datatypes error.
    create or replace view x2_x1 as
    select id , f1 , cast(multiset(select * from x3_x2 ) as x2_typ_t ) as x2_list
    Is there a better way? Am I missing something? Please help.

  • Better hardware configuration for using JDeveloper

    Hi,
    What hardware configuration is better for using JDeveloper? It feels extremely slow every time I use it.
    Thks & Rgds,
    HuaMin

    HuaMin,
    Kind of hard to say what's better, since you don't tell us what you currently have. However, I find that RAM is your friend when running JDeveloper. I've run JDeveloper 10.1.3.x on various laptops (both single and dual core) with 2 GB of RAM and performance is quite good. The CPU speeds that I've used are generally in the 2 GHz range. My current setup is a dual core 2 GHz with 2 GB of RAM and I'm quite happy.
    John

  • Tool to compare 2 XML documents

    Hi,
    I am looking for some C++ or C tool or library to compare
    XML documents (something like WinDiff but for XML documents).
    Can anybody point me to the right location or name?
    Thanks,
    Jonas

    There's one at IBM's AlphaWorks http://www.alphaworks.ibm.com/tech/xmldiffmerge
    You might want to look also at http://www.xmlsoftware.com

  • How to do CHECK IN the files for the document?

    Hi ,
    I am uploading files for a document number using the BAPI_DOCUMENT_CHECKIN2 function module.
    My problem is that after uploading the files to the document number, the files are not in checked-in status.
    I want the files to be checked in after uploading them from the FM.
    Can anyone tell me what I am missing? How do I check in the files?
    I am passing the following values to the FM:
          ls_files-SOURCEDATACARRIER = ls_file-DATA_CARRIER.
          ls_files-WSAPPLICATION     = ls_file-application.
          ls_files-DESCRIPTION       = ls_file-file_desc.
          ls_files-STORAGECATEGORY   = ls_file-category.
          ls_files-DOCFILE           = ls_file-doc_path.
          ls_files-CHECKEDIN         = 'X'.
        CALL FUNCTION 'BAPI_DOCUMENT_CHECKIN2'
          EXPORTING
            DOCUMENTTYPE            = ls_file_data-dokar
            DOCUMENTNUMBER          = ls_file_data-doknr
            DOCUMENTPART            = ls_file_data-doktl
            DOCUMENTVERSION         = ls_file_data-dokvr
          IMPORTING
            RETURN                  =  ls_return
          TABLES
            DOCUMENTFILES           =  lt_files
    *       COMPONENTS              =
    *       DOCUMENTSTRUCTURE       =
            .
    I appreciate your response on this.
    Thanks in advance.

    Hi,
    Check the values you are passing into ls_files.
    You don't need to pass checkedin = 'X'.
    Also check the storagecategory field; this should not be blank.
    ls_files-SOURCEDATACARRIER = ls_file-DATA_CARRIER.
    ls_files-WSAPPLICATION = ls_file-application.
    ls_files-DESCRIPTION = ls_file-file_desc.
    ls_files-STORAGECATEGORY = ls_file-category.
    ls_files-DOCFILE = ls_file-doc_path.
    ls_files-CHECKEDIN = 'X'.
    This FM will work. No issues; just check the values you are passing into the LS_FILES structure.
    The rest of the fields are correct.
    Thanks,
    Murali.
