XMLTABLE / XQUERY performance

Hi all,
Below is a sample XML document representing a spreadsheet:
<Table>
<Row>
  <Cell><Data>1</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
</Row>
<Row>
  <Cell><Data>2</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
</Row>
<Row>
  <Cell><Data>3</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
</Row>
<Row>
  <Cell><Data>4</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
</Row>
<Row>
  <Cell><Data>5</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell Index="4"><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
</Row>
</Table>
which should be interpreted as:
cols --> 1 2 3 4 5
         1 A B   D
         2   B C
         3   B   D
         4 A B C D
         5 A   C D
As you can see, empty cells are simply omitted from each row of the document.
The next non-empty cell then carries an Index attribute giving its true position in the row.
My requirement is to query the document and access the values by cell position within the row.
Because of the empty cells, Data values cannot be accessed with a simple XPath like "/Cell[n]/Data".
So I came up with the following:
WITH t AS (
select xmltype(
'<Table>
<Row>
  <Cell><Data>1</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
</Row>
<Row>
  <Cell><Data>2</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
</Row>
<Row>
  <Cell><Data>3</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
</Row>
<Row>
  <Cell><Data>4</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
</Row>
<Row>
  <Cell><Data>5</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell Index="4"><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
</Row>
</Table>') doc from dual
)
SELECT x.*
FROM t, xmltable(
'for $j in /Table/Row
return <ROW> {
   for $i at $pos in $j/Cell
   let $x := $j/Cell[position()<=$pos and @Index][last()]/@Index
   let $x2 := if($x) then $x else 1
   let $p := count($j/Cell[@Index=$x]/preceding-sibling::*)+1
   return <DATA pos="{$pos - $p + $x2}">{$i/Data/text()}</DATA>
} </ROW>'
PASSING t.doc
COLUMNS cell1 number      PATH '/ROW/DATA[@pos="1"]',
         cell2 varchar2(1) PATH '/ROW/DATA[@pos="2"]',
         cell3 varchar2(1) PATH '/ROW/DATA[@pos="3"]',
         cell4 varchar2(1) PATH '/ROW/DATA[@pos="4"]',
         cell5 varchar2(1) PATH '/ROW/DATA[@pos="5"]'
) x;
Basically, the XQuery reconstructs each row and gives each cell its true position. We can then access the data with a simple position predicate.
It works well for small documents, but performance rapidly becomes awful as the size increases (which is understandable).
So my question: is there a better way to achieve the requirement (or to improve performance)?
Thanks a lot.
(DB version is 10.2.0.4)
Edited by: odie_63 on 28 Dec. 2009 19:21
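
For reference, the positioning rule described in the question can be stated as a single left-to-right pass over each row: a cell that carries an Index attribute jumps to that column, and any other cell takes the column right after the previous one. The sketch below only illustrates that rule in Python with the standard-library `xml.etree.ElementTree`; it is not an Oracle solution, and the names `true_positions` and `to_grid` as well as the fixed width of 5 columns are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The sample document from the post, one <Row> per line.
SAMPLE = """<Table>
<Row><Cell><Data>1</Data></Cell><Cell><Data>A</Data></Cell><Cell><Data>B</Data></Cell><Cell Index="5"><Data>D</Data></Cell></Row>
<Row><Cell><Data>2</Data></Cell><Cell Index="3"><Data>B</Data></Cell><Cell><Data>C</Data></Cell></Row>
<Row><Cell><Data>3</Data></Cell><Cell Index="3"><Data>B</Data></Cell><Cell Index="5"><Data>D</Data></Cell></Row>
<Row><Cell><Data>4</Data></Cell><Cell><Data>A</Data></Cell><Cell><Data>B</Data></Cell><Cell><Data>C</Data></Cell><Cell><Data>D</Data></Cell></Row>
<Row><Cell><Data>5</Data></Cell><Cell><Data>A</Data></Cell><Cell Index="4"><Data>C</Data></Cell><Cell><Data>D</Data></Cell></Row>
</Table>"""


def true_positions(row):
    """Map column number -> Data text for one <Row>, visiting each cell once."""
    cells = {}
    col = 0
    for cell in row.findall("Cell"):
        index = cell.get("Index")
        # An explicit Index wins; otherwise the cell follows the previous one.
        col = int(index) if index is not None else col + 1
        cells[col] = cell.findtext("Data")
    return cells


def to_grid(xml_text, width=5):
    """Return one list per row, padded with None for the omitted cells."""
    root = ET.fromstring(xml_text)
    grid = []
    for row in root.findall("Row"):
        cells = true_positions(row)
        grid.append([cells.get(c) for c in range(1, width + 1)])
    return grid


if __name__ == "__main__":
    for line in to_grid(SAMPLE):
        print(" ".join(value or " " for value in line))
```

Each cell is visited once here, whereas the XQuery above looks back over the earlier siblings for every cell, which, evaluated naively, costs more as rows get wider.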

What if you do the logic outside of the XQuery?
Like this:
SQL> set timi on;
SQL> WITH t AS (
 select xmltype(
'<Table>
 <Row>
  <Cell><Data>1</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>2</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>3</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>4</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>5</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell Index="4"><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
 </Row>
</Table>') doc from dual
)
SELECT x.*
FROM t, xmltable(
'for $j in /Table/Row
 return <ROW> {
   for $i at $pos in $j/Cell
   let $x := $j/Cell[position()<=$pos and @Index][last()]/@Index
   let $x2 := if($x) then $x else 1
   let $p := count($j/Cell[@Index=$x]/preceding-sibling::*)+1
   return <DATA pos="{$pos - $p + $x2}">{$i/Data/text()}</DATA>
 } </ROW>'
 PASSING t.doc
 COLUMNS cell1 number      PATH '/ROW/DATA[@pos="1"]',
         cell2 varchar2(1) PATH '/ROW/DATA[@pos="2"]',
         cell3 varchar2(1) PATH '/ROW/DATA[@pos="3"]',
         cell4 varchar2(1) PATH '/ROW/DATA[@pos="4"]',
         cell5 varchar2(1) PATH '/ROW/DATA[@pos="5"]'
) x;
     CELL1 C C C C                                                             
         1 A B   D                                                             
         2   B C                                                               
         3   B   D                                                             
         4 A B C D                                                             
         5 A   C D                                                             
Elapsed: 00:00:00.64
SQL>
SQL> WITH t AS (
 select xmltype('<Table>
 <Row>
  <Cell><Data>1</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>2</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>3</Data></Cell>
  <Cell Index="3"><Data>B</Data></Cell>
  <Cell Index="5"><Data>D</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>4</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell><Data>B</Data></Cell>
  <Cell><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
 </Row>
 <Row>
  <Cell><Data>5</Data></Cell>
  <Cell><Data>A</Data></Cell>
  <Cell Index="4"><Data>C</Data></Cell>
  <Cell><Data>D</Data></Cell>
 </Row>
</Table>') doc from dual)
select
  case when v.cell_1_index is null then v.cell_1 end cell_one
, case when nvl(v.cell_1_index,-1)=2 then v.cell_1
       when v.cell_2_index is null and v.cell_1_index is null then v.cell_2
  end cell_two
, case when nvl(v.cell_1_index,-1)=3 then v.cell_1
       when nvl(v.cell_2_index,-1)=3 then v.cell_2
       when v.cell_1_index=2 and v.cell_2_index is null then v.cell_2
       when coalesce(v.cell_1_index,v.cell_2_index,v.cell_3_index) is null then v.cell_3
  end cell_three
, case when nvl(v.cell_1_index,-1)=4 then v.cell_1
       when nvl(v.cell_2_index,-1)=4 then v.cell_2
       when nvl(v.cell_3_index,-1)=4 then v.cell_3
       when v.cell_1_index=2 and v.cell_2 is not null and v.cell_3_index is null then v.cell_3
       when v.cell_1_index=3 and v.cell_2 is not null and v.cell_4_index is null then v.cell_2
       when v.cell_2_index=3 and v.cell_3_index is null then v.cell_3
       when coalesce(v.cell_1_index,v.cell_2_index,v.cell_3_index,v.cell_4_index) is null then v.cell_4
  end cell_four
, case when nvl(v.cell_1_index,-1)=5 then v.cell_1
       when nvl(v.cell_2_index,-1)=5 then v.cell_2
       when nvl(v.cell_3_index,-1)=5 then v.cell_3
       when nvl(v.cell_4_index,-1)=5 then v.cell_4
       when v.cell_1_index=3 and v.cell_2_index is null then v.cell_3
       when v.cell_1_index=4 then v.cell_2
       when v.cell_3_index=4 then v.cell_4
       when v.cell_2_index=3 and v.cell_3_index is null and v.cell_4_index is null then v.cell_4
       when coalesce(v.cell_1_index,v.cell_2_index,v.cell_3_index,v.cell_4_index) is null then v.cell_5
  end cell_five
from
t,xmltable('/Table/Row'
passing t.doc
columns
 cell_1 varchar2(10) path 'Cell[1]/Data'
,cell_1_index number path 'Cell[1]/@Index'
,cell_2 varchar2(10) path 'Cell[2]/Data'
,cell_2_index number path 'Cell[2]/@Index'
,cell_3 varchar2(10) path 'Cell[3]/Data'
,cell_3_index number path 'Cell[3]/@Index'
,cell_4 varchar2(10) path 'Cell[4]/Data'
,cell_4_index number path 'Cell[4]/@Index'
,cell_5 varchar2(10) path 'Cell[5]/Data'
) v;
CELL_ONE   CELL_TWO   CELL_THREE CELL_FOUR  CELL_FIVE                          
1          A          B                     D                                  
2                     B          C                                             
3                     B                     D                                  
4          A          B          C          D                                  
5          A                     C          D                                  
Elapsed: 00:00:00.04
SQL> spool off;
Edited by: Ants Hindpere on Mar 10, 2010 1:03 AM

Similar Messages

  • EXTREMELY SLOW XQUERY PERFORMANCE AND SLOW DOCUMENT INSERTS

    Resolution History
    12-JUN-07 15:01:17 GMT
    ### Complete Problem Description ###
    A test file is being used to do inserts into a schemaless XML DB. The file is inserted and then links are made to 4 different collection folders under /public. The inserts are pretty slow (about 15 per second, and the file is small), but the XQuery doesn't even complete when there are 500 documents to query against.
    The same XQuery has been tested on a competitor's system and it has lightning-fast performance there. I know it should likewise be fast on Oracle, but I haven't been able to figure out what is going on, except that I suspect the query somehow results in a Cartesian product on Oracle.
    ### SQLXML, XQUERY, PL/SQL syntax used ###
    Here is the key plsql code that calls the DBMS_XDB procedures:
    CREATE OR REPLACE TYPE "XDB"."RESOURCEARRAY" AS VARRAY(500) OF VARCHAR2(256);
    PROCEDURE AddOrReplaceResource(
      resourceUri VARCHAR2,
      resourceContents SYS.XMLTYPE,
      public_collections in ResourceArray
    ) AS
      b BOOLEAN;
      privateResourceUri path_view.path%TYPE;
      resource_exists EXCEPTION;
      pragma exception_init(resource_exists,-31003);
    BEGIN
      /* Store the document in private folder */
      privateResourceUri := GetPrivateResourceUri(resourceUri);
      BEGIN
        b := dbms_xdb.createResource(privateResourceUri, resourceContents);
      EXCEPTION
        WHEN resource_exists THEN
          DELETE FROM resource_view WHERE equals_path(res, privateResourceUri)=1;
          b := dbms_xdb.createResource(privateResourceUri, resourceContents);
      END;
      /* add a link in /public/<collection-name> for each collection passed in */
      FOR i IN 1 .. public_collections.count LOOP
        BEGIN
          dbms_xdb.link(privateResourceUri,public_collections(i),resourceUri);
        EXCEPTION
          WHEN resource_exists THEN
            dbms_xdb.deleteResource(concat(concat(public_collections(i),'/'),resourceUri));
            dbms_xdb.link(privateResourceUri,public_collections(i),resourceUri);
        END;
      END LOOP;
      COMMIT;
    END;
    FUNCTION GetPrivateResourceUri(
      resourceUri VARCHAR2
    ) RETURN VARCHAR2 AS
    BEGIN
      return concat('/ems/docs/',REGEXP_SUBSTR(resourceUri,'[a-zA-z0-9.-]*$'));
    END;
    ### Info for XML Querying ###
    Here is the XQuery and a sample of the output follows:
    declare namespace c2ns="urn:xmlns:NCC-C2IEDM";
    for $cotEvent in collection("/public")/event
    return
    <cotEntity>
    {$cotEvent}
    {for $d in collection("/public")/c2ns:OpContextMembership[c2ns:Entity/c2ns:EntityIdentifier
    /c2ns:EntityId=xs:string($cotEvent/@uid)]
    return
    $d}
    </cotEntity>
    Sample output:
    <cotEntity><event how="m-r" opex="o-" version="2" uid="XXX541113454" type="a-h-G-" stale="2007-03-05T15:36:26.000Z" start="2007-03-05T15:36:26.000Z" time="2007-03-05T15:36:26.000Z">
    <point ce="" le="" lat="5.19098483230079" lon="-5.333597827082126" hae="0.0"/>
    <detail><track course="26.0" speed="9.26"/></detail></event></cotEntity>

    19-JUN-07 04:34:27 GMT
    UPDATE
    =======
    Hi Arnold,
    you wrote -
    Please use Sun JDK 1.5 java to perform the test case.
    Right now I have -
    $ which java
    /usr/bin/java
    $ java -version
    java version "1.4.2"
    gcj (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3)
    Sorry, as I told you before, I am not very knowledgeable in Java. Can you tell me what settings I need to change to make use of Sun JDK 1.5? Please note I am testing on Linux. Do I need to test this on a Sun box? Can it not be modified to run on Linux?
    Thanks,
    Rakesh
    STATUS
    =======
    @CUS -- Waiting for requested information

  • XQuery performance in Oracle 10gR2

    Hello
    I'm currently trying to measure the performance of XQuery FLWOR queries on Oracle 10gR2. For that, I've created a simple table with one integer field (ID) and one XMLType field. This XMLType field contains 10000 documents. The size of these documents varies between approximately 2 KB and 14 KB.
    A simple XQuery like the one below (without a "WHERE" or complex "RETURN" clause) runs quite well, and I get the query results in a reasonable time.
    SELECT xtab.COLUMN_VALUE
    FROM contractXDraft, XMLTABLE(
         'declare namespace ctxCD="contractX/contractXDraft";for $x in /ctxCD:contractXDraft
         return
         <response>
              Hello
         </response>'
    PASSING OBJECT_VALUE) xtab;
    On the other hand, if I add a "WHERE" clause to filter the results, the query execution time becomes very long (hours) and I always have to abort the query by killing the "oracle" process, because the page file memory used by this process increases linearly! Here is such a query:
    SELECT xtab.COLUMN_VALUE
    FROM contractXDraft, XMLTABLE(
         'declare namespace ctxCD="contractX/contractXDraft";for $x in /ctxCD:contractXDraft
    where $x/ctxCD:ReferenceDetails/ctxCD:ContractHeaderDetails/ctxCD:ContractNumber = "19163-contract657-2.xml"
         return
         <response>
              Hello
         </response>'
    PASSING OBJECT_VALUE) xtab;
    These above queries are executed on a server dedicated for these tests. This server is a Pentium IV 3.2 GHz with 1GB of RAM and Windows 2003 Server Enterprise SP1 is installed on it.
    I would like to know if someone can explain this huge difference in execution time between the two queries above. Is it a syntax mistake? Is it a hardware problem?
    Thank you very much for your help !!
    Regards
    MD

    MD,
    Sounds like your XMLType field is using unstructured storage (i.e., LOB-based) instead of the structured storage (i.e., O-R based). You can check out the XML DB online doc (http://download-west.oracle.com/docs/cd/B19306_01/appdev.102/b14259/xdb01int.htm#BABECDCF) to learn more about the differences between the two. Essentially, without any associated indexes, using XQuery over unstructured storage will result in a full table scan, which can be very slow when there are large number of rows.
    Regards,
    Geoff

  • 10g vs 11g xquery performance with XBRL

    Finally, I set up 11g on a small notebook with 1 GB of memory.
    The result was impressive compared to 10g, but I need more than that.
    I used this query, which generates 761 rows, for testing:
    SELECT c.seqno,xt.ns,xt.name,nvl(xt.lang,'na') as lang,xt.unit,xt.decimals,
    xt.value
    FROM FINES_CTX c,FINES_XBRL_CLOB r,
    XMLTABLE(
    XMLNAMESPACES(
    'http://www.xbrl.org/2003/linkbase' AS "link",
    'http://www.w3.org/1999/xlink' AS "xlink",
    'http://www.w3.org/2001/XMLSchema' AS "xsd",
    'http://www.xbrl.org/2003/instance' AS "xbrli",
    'http://fss.xbrl.or.kr/kr/br/f/aa/2007-06-30' AS
    "fines-f-aa",
    'http://fss.xbrl.or.kr/kr/br/b/aa/2007-06-30' AS
    "fines-b-aa",
    'http://fss.xbrl.or.kr/kr/br/f/ad/2007-06-30' AS
    "fines-f-ad",
    'http://fss.xbrl.or.kr/kr/br/b/ad/2007-06-30' AS
    "fines-b-ad",
    'http://fss.xbrl.or.kr/kr/br/f/af/2007-06-30' AS
    "fines-f-af",
    'http://fss.xbrl.or.kr/kr/br/b/af/2007-06-30' AS
    "fines-b-af",
    'http://fss.xbrl.or.kr/kr/br/f/ai/2007-06-30' AS
    "fines-f-ai",
    'http://fss.xbrl.or.kr/kr/br/b/ai/2007-06-30' AS
    "fines-b-ai",
    'http://fss.xbrl.or.kr/kr/br/f/ak/2007-06-30' AS
    "fines-f-ak",
    'http://fss.xbrl.or.kr/kr/br/b/ak/2007-06-30' AS
    "fines-b-ak",
    'http://fss.xbrl.or.kr/kr/br/f/bs/2007-06-30' AS
    "fines-f-bs",
    'http://fss.xbrl.or.kr/kr/br/b/bs/2007-06-30' AS
    "fines-b-bs",
    'http://xbrl.org/2005/xbrldt' AS "xbrldt",
    'http://www.xbrl.org/2004/ref' AS "ref",
    'http://www.xbrl.org/2003/XLink' AS "xl"),
    'for $item in $doc/xbrli:xbrl/*[not(starts-with(name(),"xbrli:")) and not(starts-with(name(),"link:"))]
    where $item/@contextRef
    return <item decimals="{$item/@decimals}" contextRef="{$item/@contextRef}" xml:lang="{$item/@xml:lang}" unitRef="{$item/@unitRef}" name="{local-name($item)}" ns="{namespace-uri($item)}">{$item/text()}</item>'
    PASSING r.xbrl as "doc"
    COLUMNS context_id varchar2(128) PATH '@contextRef',
    ns varchar2(128) PATH '@ns',
    name varchar2(128) PATH '@name',
    lang varchar2(2) PATH '@xml:lang',
    unit varchar2(16) PATH '@unitRef',
    decimals varchar2(64) PATH '@decimals',
    value varchar(256) PATH '.'
    ) xt
    WHERE c.report_cd = r.report_cd and c.finance_cd = r.finance_cd and
    c.base_month = r.base_month and c.gubn_cd = r.gubn_cd
    and c.seqno = 109299 and c.context_id = xt.context_id
    All the tables have 500 rows and a non-schema-based XMLType column.
    FINES_XBRL_CLOB - xmltype stored as clob
    FINES_XBRL_BINARY - xmltype stored as binary with xml index
    FINES_XBRL_BINARY_NI - xmltype stored as binary without xml index.
    case 1 : run on 10g with XMLType stored as CLOB
    time: took 1270 secs.- quite disappointed.
    plan: 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=26 Card=82 Bytes=173K)
    1 0 NESTED LOOPS (Cost=26 Card=82 Bytes=173K)
    2 1 NESTED LOOPS (Cost=2 Card=1 Bytes=2K)
    3 2 TABLE ACCESS (BY INDEX ROWID) OF 'FINES_CTXB' (TABLE) (Cost=1 Card=1 Bytes=119)
    4 3 INDEX (UNIQUE SCAN) OF 'PK_FINES_CTXB' (INDEX (UNIQUE)) (Cost=1 Card=1)
    5 2 TABLE ACCESS (BY INDEX ROWID) OF 'FINES_XBRLB' (TABLE) (Cost=1 Card=82 Bytes=164K)
    6 5 INDEX (UNIQUE SCAN) OF 'PK_FINES_XBRLB' (INDEX (UNIQUE)) (Cost=0 Card=1)
    7 1 COLLECTION ITERATOR (PICKLER FETCH) OF 'SYS.XQSEQUENCEFROMXMLTYPE' (PROCEDURE)
    case 2: run on 11g with XMLType stored as CLOB
    time: took 27 secs. - almost 50 times faster
    plan:
    0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=114 Card=1 Bytes=2K)
    1 0 FILTER
    2 1 NESTED LOOPS (Cost=32 Card=82 Bytes=173K)
    3 2 NESTED LOOPS (Cost=3 Card=1 Bytes=2K)
    4 3 TABLE ACCESS (BY INDEX ROWID) OF 'FINES_CTX' (TABLE) (Cost=2 Card=1 Bytes=119)
    5 4 INDEX (UNIQUE SCAN) OF 'PK_FINES_CTX' (INDEX (UNIQUE)) (Cost=1 Card=1)
    6 3 TABLE ACCESS (BY INDEX ROWID) OF 'FINES_XBRL_CLOB' (TABLE) (Cost=1 Card=5K Bytes=10M)
    7 6 INDEX (UNIQUE SCAN) OF 'PK_FINES_XBRL_CLOB' (INDEX (UNIQUE)) (Cost=0 Card=1)
    8 2 COLLECTION ITERATOR (PICKLER FETCH) OF 'SYS.XMLSEQUENCEFROMXMLTYPE' (PROCEDURE)
    9 1 COLLECTION ITERATOR (PICKLER FETCH) OF 'SYS.XQSEQUENCEFROMXMLTYPE' (PROCEDURE)
    case 3: run on 11g with XMLType stored as BINARY no XMLIndex
    time: 10 secs (9.6 sec exactly) , 120 times faster..
    0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=113 Card=1 Bytes=2K)
    1 0 FILTER
    2 1 NESTED LOOPS (Cost=33 Card=80 Bytes=169K)
    3 2 NESTED LOOPS (Cost=3 Card=1 Bytes=2K)
    4 3 TABLE ACCESS (BY INDEX ROWID) OF 'FINES_CTX' (TABLE) (Cost=2 Card=1 Bytes=119)
    5 4 INDEX (UNIQUE SCAN) OF 'PK_FINES_CTX' (INDEX (UNIQUE)) (Cost=1 Card=1)
    6 3 TABLE ACCESS (BY INDEX ROWID) OF 'FINES_XBRL_BINARY_NI' (TABLE) (Cost=1 Card=82 Bytes=164K)
    7 6 INDEX (UNIQUE SCAN) OF 'PK_FINES_BINARY_XBRL_NI' (INDEX (UNIQUE)) (Cost=0 Card=1)
    8 2 XPATH EVALUATION
    9 1 XPATH EVALUATION
    CREATE INDEX fines_xbrl_binary_ix ON fines_xbrl_binary (xbrl) INDEXTYPE IS XDB.XMLIndex
    case 4: run on 11g with XMLType stored as BINARY and XMLIndex
    time: 574 secs. - oops...not good.
    plan: quite long..
    0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=16 Card=1 Bytes=5K)
    1 0 FILTER
    2 1 NESTED LOOPS
    3 2 NESTED LOOPS (Cost=4 Card=1 Bytes=4K)
    4 3 TABLE ACCESS (BY INDEX ROWID) OF 'XDB.X$PT1MP1MWL3978FCE0G24J0CM85AM' (TABLE) (Cost=0 Card=1 Bytes=1008)
    5 4 INDEX (RANGE SCAN) OF 'XDB.X$PR1MP1MWL3978FCE0G24J0CM85AM' (INDEX (UNIQUE)) (Cost=0 Card=1)
    6 3 INDEX (RANGE SCAN) OF 'SYS69876_FINES_XBRL_PATHID_IX' (INDEX) (Cost=2 Card=3)
    7 2 TABLE ACCESS (BY INDEX ROWID) OF 'SYS69876_FINES_XBRL_PATH_TABLE' (TABLE) (Cost=4 Card=1 Bytes=3K)
    8 0 FILTER
    9 8 NESTED LOOPS
    10 9 NESTED LOOPS (Cost=4 Card=1 Bytes=4K)
    11 10 TABLE ACCESS (BY INDEX ROWID) OF 'XDB.X$PT1MP1MWL3978FCE0G24J0CM85AM' (TABLE) (Cost=0 Card=1 Bytes=1008)
    12 11 INDEX (RANGE SCAN) OF 'XDB.X$PR1MP1MWL3978FCE0G24J0CM85AM' (INDEX (UNIQUE)) (Cost=0 Card=1)
    13 10 INDEX (RANGE SCAN) OF 'SYS69876_FINES_XBRL_PATHID_IX' (INDEX) (Cost=2 Card=3)
    14 9 TABLE ACCESS (BY INDEX ROWID) OF 'SYS69876_FINES_XBRL_PATH_TABLE' (TABLE) (Cost=4 Card=1 Bytes=3K)
    15 0 FILTER
    16 15 NESTED LOOPS
    17 16 NESTED LOOPS (Cost=4 Card=1 Bytes=4K)
    18 17 TABLE ACCESS (BY INDEX ROWID) OF 'XDB.X$PT1MP1MWL3978FCE0G24J0CM85AM' (TABLE) (Cost=0 Card=1 Bytes=1008)
    19 18 INDEX (RANGE SCAN) OF 'XDB.X$PR1MP1MWL3978FCE0G24J0CM85AM' (INDEX (UNIQUE)) (Cost=0 Card=1)
    20 17 INDEX (RANGE SCAN) OF 'SYS69876_FINES_XBRL_PATHID_IX' (INDEX) (Cost=2 Card=3)
    21 16 TABLE ACCESS (BY INDEX ROWID) OF 'SYS69876_FINES_XBRL_PATH_TABLE' (TABLE) (Cost=4 Card=1 Bytes=3K)
    22 0 FILTER
    23 22 NESTED LOOPS
    24 23 NESTED LOOPS (Cost=4 Card=1 Bytes=4K)
    25 24 TABLE ACCESS (BY INDEX ROWID) OF 'XDB.X$PT1MP1MWL3978FCE0G24J0CM85AM' (TABLE) (Cost=0 Card=1 Bytes=1008)
    26 25 INDEX (RANGE SCAN) OF 'XDB.X$PR1MP1MWL3978FCE0G24J0CM85AM' (INDEX (UNIQUE)) (Cost=0 Card=1)
    27 24 INDEX (RANGE SCAN) OF 'SYS69876_FINES_XBRL_PATHID_IX' (INDEX) (Cost=2 Card=3)
    28 23 TABLE ACCESS (BY INDEX ROWID) OF 'SYS69876_FINES_XBRL_PATH_TABLE' (TABLE) (Cost=4 Card=1 Bytes=3K)
    29 0 SORT (AGGREGATE) (Card=1 Bytes=3K)
    30 29 FILTER
    31 30 TABLE ACCESS (BY INDEX ROWID) OF 'SYS69876_FINES_XBRL_PATH_TABLE' (TABLE) (Cost=5 Card=32 Bytes=110K)
    32 31 INDEX (RANGE SCAN) OF 'SYS69876_FINES_XBRL_ORDKEY_IX' (INDEX) (Cost=3 Card=92)
    33 0 FILTER
    34 33 NESTED LOOPS
    35 34 NESTED LOOPS (Cost=4 Card=1 Bytes=4K)
    36 35 TABLE ACCESS (BY INDEX ROWID) OF 'XDB.X$PT1MP1MWL3978FCE0G24J0CM85AM' (TABLE) (Cost=0 Card=1 Bytes=1008)
    37 36 INDEX (RANGE SCAN) OF 'XDB.X$PR1MP1MWL3978FCE0G24J0CM85AM' (INDEX (UNIQUE)) (Cost=0 Card=1)
    38 35 INDEX (RANGE SCAN) OF 'SYS69876_FINES_XBRL_PATHID_IX' (INDEX) (Cost=2 Card=3)
    39 34 TABLE ACCESS (BY INDEX ROWID) OF 'SYS69876_FINES_XBRL_PATH_TABLE' (TABLE) (Cost=4 Card=1 Bytes=3K)
    40 0 FILTER
    41 40 NESTED LOOPS
    42 41 NESTED LOOPS (Cost=4 Card=1 Bytes=4K)
    43 42 TABLE ACCESS (BY INDEX ROWID) OF 'XDB.X$PT1MP1MWL3978FCE0G24J0CM85AM' (TABLE) (Cost=0 Card=1 Bytes=1008)
    44 43 INDEX (RANGE SCAN) OF 'XDB.X$PR1MP1MWL3978FCE0G24J0CM85AM' (INDEX (UNIQUE)) (Cost=0 Card=1)
    45 42 INDEX (RANGE SCAN) OF 'SYS69876_FINES_XBRL_PATHID_IX' (INDEX) (Cost=2 Card=3)
    46 41 TABLE ACCESS (BY INDEX ROWID) OF 'SYS69876_FINES_XBRL_PATH_TABLE' (TABLE) (Cost=4 Card=1 Bytes=3K)
    -- continue....
    With a very limited test case, I personally concluded that oracle 11g's engine related XML is much better than 10g, especially when using the binary type, which gives an additional performance boost.
    An XBRL document is basically flat, not hierarchically structured, which makes XMLIndex inefficient, I guess.
    Is there a good way to use XMLIndex more efficiently in this kind of case?
    Please point out anything more I can do.
    Thanks.

    I guess that instead of "...oracle 11g's engine related XML is much better than 10g..." you meant to say "oracle 11g's XQuery engine related XML is much better than 10g"...
    Did you create the XMLIndex as described (case 4)?
    CREATE INDEX fines_xbrl_binary_ix ON fines_xbrl_binary (xbrl) INDEXTYPE IS XDB.XMLIndex
    In other words, you didn't use "path subsetting" (http://www.liberidu.com/blog/?p=242)?
    I guess you created statistics?
    Thanks for sharing !!!

  • XQuery Performance in BerkeleyDB

    We are migrating from IPedo to Berkeley DB.
    IPedo did not support multiple indices in their XQueries, so we had to
    concatenate some fields into one field and index that field; the
    XQueries were really fast.
    Unfortunately the same XQuery does not perform well in BerkeleyDB.
    This is how we create the index for this field (ContentKey) in
    BerkeleyDB:
    addIndex '' 'ContentKey' edge-element-equality-string
    and this is how I query using the Java API:
    queryContext.setEvaluationType(XmlQueryContext.Eager);
    queryContext.setVariableValue("ContentKey", new XmlValue(
    "a0a0188000001115348efcc00000003XXXXXXXXXXXXXYYYYYYYYYYYYYY"));
    // Declare the query string
    String myQuery = "collection('db/title')/Record[ContentKey=$ContentKey]";
    // Prepare (compile) the query
    XmlQueryExpression xmlQueryExpression =
    dbManager.prepare(myQuery,queryContext);
    1. What is wrong with the index or the way I am using the Java API ?
    Changing the evaluation type to Lazy did not help at all.
    2. The Query performs OK in dbxml.
    3. Are there any other commercial/open source tools to evaluate the
    performance of a Xquery in BerkeleyDB? Stylus Studio does not support
    BDB - 2.3.10 yet.
    Any help would be appreciated.
    Thanks,
    Suresh

    Hi John:
    Thanks for your mail.
    I did declare variable as xs:string external, it did not work. I heard from other engineers in the group that since 2.3.8, “external” variables in BerkeleyDB stopped working. We are using 2.3.10.
    I also noticed that the query plan when using external variables is not valid XML (<GlobalVar name="var external="true">). I hope this is just a toString() issue and nothing major.
    I have attached the query plans; I do not see anything different between the two. I would really appreciate your help on this.
    Thanks,
    Suresh
    Here are the query plans:
    Query that executes in 2 ms (which has the hard coded value):
    Query:
    String myQuery = "declare namespace tf = \"http://aplaud.com/ns/0.1/tts/format\";" +
    "count (collection('db/title')/Record[ContentKey=\"a0a0188000001115348efcc00000003http://daxweb.org/ns/1.0/taxonomy/Product Type/Gift Receipt\"]/tf:TitleDocument/tf:Title/Content/Detail/GiftInfo/Gift)";
    Query Plan:
    <XQuery>
    <Function name="{http://www.w3.org/2005/xpath-functions}:count">
    <DocumentOrder>
    <DbXmlNav>
    <LookupIndex container="db/title">
    <ValueQP index="edge-element-equality-string" operation="eq" parent="Record" child="ContentKey" value="a0a0188000001115348efcc00000003http://daxweb.org/ns/1.0/taxonomy/Product Type/Gift Receipt"/>
    </LookupIndex>
    <Join type="parent-of-child" return="argument">
    <DbXmlNav>
    <QueryPlanFunction result="collection" container="db/title">
    <OQPlan>V(edge-element-equality-string,Record.ContentKey,=,'a0a0188000001115348efcc00000003http://daxweb.org/ns/1.0/taxonomy/Product Type/Gift Receipt')</OQPlan>
    </QueryPlanFunction>
    <DbXmlStep axis="child" name="Record" nodeType="element"/>
    </DbXmlNav>
    </Join>
    <DbXmlStep axis="child" prefix="tf" uri="http://aplaud.com/ns/0.1/tts/format" name="TitleDocument" nodeType="element"/>
    <DbXmlStep axis="child" prefix="tf" uri="http://aplaud.com/ns/0.1/tts/format" name="Title" nodeType="element"/>
    <DbXmlStep axis="child" name="Content" nodeType="element"/>
    <DbXmlStep axis="child" name="Detail" nodeType="element"/>
    <DbXmlStep axis="child" name="GiftInfo" nodeType="element"/>
    <DbXmlStep axis="child" name="Gift" nodeType="element"/>
    </DbXmlNav>
    </DocumentOrder>
    </Function>
    </XQuery>
    Query that takes 4 seconds (which has the variable declared as xs:string external):
    Query:
    String myQuery = "declare namespace tf = \"http://aplaud.com/ns/0.1/tts/format\"; declare variable $var as xs:string external;" + "count (collection('db/title')/Record[ContentKey=$var]/tf:TitleDocument/tf:Title/Content/Detail/GiftInfo/Gift)";
    Query Plan:
    <XQuery>
    <GlobalVar name="var external="true">
    <SequenceType occurrence="exactly_one" testType="atomic-type" type="http://www.w3.org/2001/XMLSchema:string"/>
    </GlobalVar>
    <Function name="{http://www.w3.org/2005/xpath-functions}:count">
    <DocumentOrder>
    <DbXmlNav>
    <LookupIndex container="db/title">
    <ValueQP index="edge-element-equality-string" operation="eq" parent="Record" child="ContentKey">
    <Variable name="var"/>
    </ValueQP>
    </LookupIndex>
    <Join type="parent-of-child" return="argument">
    <DbXmlNav>
    <QueryPlanFunction result="collection" container="db/title">
    <OQPlan>P(edge-element-equality-string,prefix,Record.ContentKey)</OQPlan>
    </QueryPlanFunction>
    <DbXmlStep axis="child" name="Record" nodeType="element"/>
    </DbXmlNav>
    </Join>
    <DbXmlStep axis="child" prefix="tf" uri="http://aplaud.com/ns/0.1/tts/format" name="TitleDocument" nodeType="element"/>
    <DbXmlStep axis="child" prefix="tf" uri="http://aplaud.com/ns/0.1/tts/format" name="Title" nodeType="element"/>
    <DbXmlStep axis="child" name="Content" nodeType="element"/>
    <DbXmlStep axis="child" name="Detail" nodeType="element"/>
    <DbXmlStep axis="child" name="GiftInfo" nodeType="element"/>
    <DbXmlStep axis="child" name="Gift" nodeType="element"/>
    </DbXmlNav>
    </DocumentOrder>
    </Function>
    </XQuery>

  • XQuery Performance in BerkeleyDB

    We are migrating from IPedo to Berkeley DB.
    IPedo did not support multiple indices in their XQueries, so we had to
    concatenate some fields into one field and index that field; the
    XQueries were really fast.
    Unfortunately the same XQuery does not perform well in BerkeleyDB.
    This is how we create the index for this field (ContentKey) in
    BerkeleyDB:
    addIndex '' 'ContentKey' edge-element-equality-string
    and this is how I query using the Java API:
    queryContext.setEvaluationType(XmlQueryContext.Eager);
    queryContext.setVariableValue("ContentKey", new XmlValue(
    "a0a0188000001115348efcc00000003XXXXXXXXXXXXXYYYYYYYYYYYYYY"));
    // Declare the query string
    String myQuery = "collection('db/title')/Record[ContentKey=$ContentKey]";
    // Prepare (compile) the query
    XmlQueryExpression xmlQueryExpression =
    dbManager.prepare(myQuery,queryContext);
    1. What is wrong with the index or the way I am using the Java API ?
    Changing the evaluation type to Lazy did not help at all.
    2. The Query performs OK in dbxml.
    3. Are there any other commercial/open source tools to evaluate the
    performance of a Xquery in BerkeleyDB? Stylus Studio does not support
    BDB - 2.3.10 yet.
    Any help would be appreciated.
    Thanks,
    Suresh

    Hi,
    I'm sorry, you're in the wrong forum. Please post to the Berkeley DB XML forum:
    Berkeley DB XML
    Thanks,
    Mark

  • XQuery Performance

    Hi,
    I have a question about XQuery's performance and its Java applications.
    I have a bulky flat file, about 500 MB that I have to parse/use. I already made an XML representation for its entries. The problem is that the project is still new, and it would be easier for me to manipulate a 'data definition' in XML rather than in a DBMS. I have to admit that the tree structure of a 'node' from this file is not really very deep.
    - Is it faster to search an XML file using XQuery than to search a flat file using techniques like regular expressions, String class methods, ...etc?
    - Would a DBMS be faster than both?
    - Any free and reliable java APIs for XQuery (feedback from someone who has actually used it)?
    Thanks.

    Your questions about performance can't be answered because they are highly dependent on your data and the code you write.
    As for the DBMS versus XML question, I would prefer a DBMS if my data fit nicely into tables, but if it were tree-structured I would consider XML. But most XML search software likes to load the entire tree into memory, so 500 MB is going to be hard to deal with. In this case I would seriously consider a database.
    As for implementations of XQuery, if I wanted one I would use Michael Kay's SAXON product which implements XQuery and XSLT 2. I haven't used it myself but I have used its earlier incarnation which was XSLT only, and following its mailing list leads me to believe it is reliable. The schema-aware version costs money but there's a free version that doesn't do schemas.
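    To illustrate the memory concern raised above: a streaming parser reads the file one event at a time instead of building the whole tree. The following is only a minimal sketch using the JDK's StAX API (javax.xml.stream), not XQuery and not SAXON; the element and attribute names (entry, id) and the sample document are made up for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StreamingSearch {

    // Counts <entry> elements whose "id" attribute equals wantedId.
    // Only the current parse event is held in memory, never the whole tree.
    // The names "entry" and "id" are hypothetical.
    static int countMatches(Reader xml, String wantedId) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        int matches = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "entry".equals(reader.getLocalName())
                        && wantedId.equals(reader.getAttributeValue(null, "id"))) {
                    matches++;
                }
            }
        } finally {
            reader.close();
        }
        return matches;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Tiny in-memory sample; a real run would pass a reader over the large file.
        String sample = "<entries><entry id=\"a\"/><entry id=\"b\"/><entry id=\"a\"/></entries>";
        System.out.println(countMatches(new StringReader(sample), "a"));
    }
}
```

    Whether this beats a regular-expression scan of the flat file still depends on the data, as noted above; the sketch only shows that a 500 MB document does not have to be held in memory to be searched.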

  • Oracle XQuery performance issue in XMLType column

    Dear All,
    I'm using Oracle 11g to measure performance.
    I'm using data from the XMark project, which is more than 100 MB of XML intended for benchmarking.
    I created a table that contains an XMLType column and loaded the data into that column. After that I ran a query like this:
    select xmlquery(
    'for $i in /site/people/person
    where $i/id = "person0"
    return $i/name'
    passing BookXMLContent Returning Content)
    from Book;
    The purpose of this query is to retrieve the name of the person whose id is 'person0'.
    My questions are:
    1. Did I do something wrong with my query?
    2. Is there any setting on the database that I should change before running the query to get significantly better results?
    3. Is there any other approach that is much better than the one I currently use?
    Regards,
    Anthony Steven
    Edited by: mdrake on Nov 4, 2009 6:01 AM

    Anthony
    First, please read the licensing terms for Oracle (and, I suspect, DB2 and MSFT). You are not allowed to publish externally (in any form, including forum posts :) ) the results of any benchmarking activities. I have edited your post accordingly. I hope this research is not part of a thesis or similar work that you intend to make public, as you and your institution would be in violation of your licence agreement were you to do so.
    Now back to your question: how can you improve performance for XMark?
    #1. Can you show us the create table statement you used, so we can see how you created your XMLType column BOOKXMLCONTENT?
    #2. Did you create any indexes?
    #3. Did you look at the explain plan output?
    -Mark
    Edited by: mdrake on Nov 4, 2009 6:06 AM

  • XQuery performance vs. Lucene

    I'm trying to tune some databases for better query performance. I'm using the BDB XML Java API. The queries I'm trying to tune all return counts, i.e. count(...). Presently a query might take about 500ms to execute, but ideally it would take 50-100ms. I have indexes defined on all the elements/attributes I'm referencing in the queries, but still I can't get them to execute any faster. I was hoping to achieve the 50-100ms query time by comparing the execution time of Lucene queries on a similar dataset, i.e. I take the same set of data and then both index it with Lucene and also store it in BDB XML, then run equivalent queries in each. Lucene consistently can execute in the 50ms range, and BDB XML consistently 10x-20x slower.
    Is this just an inherent property of BDB XML? Should the btree indices in BDB execute in the same order of time as Lucene indices, or are my expectations too high? I realize this is highly dependent on the queries, but again BDB is using indexes in all its lookups, and the query can be expressed as a simple XPath.
    I have tweaked my BDB cache settings, but db_stat lists 99% cache hits, like this:
    31MB 256KB 740B Total cache size
    1 Number of caches
    31MB 264KB Pool individual cache size
    0 Maximum memory-mapped file size
    0 Maximum open file descriptors
    0 Maximum sequential buffer writes
    0 Sleep after writing maximum sequential buffers
    55 Requested pages mapped into the process' address space
    169M Requested pages found in the cache (99%)
    145924 Requested pages not found in the cache
    24572 Pages created in the cache
    145924 Pages read into the cache
    97563 Pages written from the cache to the backing file
    148863 Clean pages forced from the cache
    17646 Dirty pages forced from the cache
    0 Dirty pages written by trickle-sync thread
    3971 Current total page count
    3952 Current clean page count
    19 Current dirty page count
    4099 Number of hash buckets used for page location
    168M Total number of times hash chains searched for a page (168995875)
    6 The longest hash chain searched for a page
    274M Total number of hash chain entries checked for page (274808266)
    0 The number of hash bucket locks that required waiting (0%)
    0 The maximum number of times any hash bucket lock was waited for
    0 The number of region locks that required waiting (0%)
    170665 The number of page allocations
    334509 The number of hash buckets examined during allocations
    9 The maximum number of hash buckets examined for an allocation
    166509 The number of pages examined during allocations
    2 The max number of pages examined for an allocation
    Is there anything I'm missing, configuration-wise perhaps?
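    When comparing figures such as the 500 ms and 50 ms timings above, it can help to discard a few warm-up runs and report a median rather than a single measurement, since JIT compilation and cache state skew the first executions. The sketch below is one way to do that with plain JDK calls; the class name, method names and the stand-in workload are illustrative assumptions, and the real BDB XML or Lucene call would go where the Runnable is.

```java
import java.util.Arrays;

public class MedianTimer {

    // Runs the task `warmup` times untimed, then `runs` times timed,
    // and returns the median elapsed time in milliseconds.
    static double medianMillis(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return median(elapsed) / 1_000_000.0;
    }

    // Median of the samples; the mean of the two middle values when the count is even.
    static double median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        // Stand-in workload; a real test would execute the prepared query here.
        Runnable work = () -> Math.sqrt(12345.678);
        System.out.println(medianMillis(work, 5, 21) + " ms");
    }
}
```

    Running both engines through the same harness at least makes the two numbers comparable, although it says nothing about why one is slower.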

    Thanks for the info, John. Without changing the index definitions, changing the query to
    count(collection('sales.dbxml')//als:match-back-matches[@sale-month=200512 and @sale-model='Jetta'])
    with a query plan of
    n(V(node-attribute-equality-string,@sale-model,=,'Jetta'),V(node-attribute-equality-decimal,@sale-month,=,'200512'),P(node-element-presence-none,=,match-back-matches:http://autoleadservice.com/xml/als))
    seems to make the query consistently on the low-end of the previous query's speed, meaning around 3100ms for 11415 results.
    Changing the query to
    count(collection('sales.dbxml')//als:match-back-matches[@sale-month=200512][@sale-model='Jetta'])
    with a query plan of
    n(V(node-attribute-equality-decimal,@sale-month,=,'200512'),V(node-attribute-equality-string,@sale-model,=,'Jetta'),P(node-element-presence-none,=,match-back-matches:http://autoleadservice.com/xml/als))
    does not seem to make any difference in speed.
    Then I added the edge indices as you described, and for the same previous query the query plan becomes
    n(V(edge-attribute-equality-decimal,match-back-matches:http://autoleadservice.com/xml/als.@sale-month,=,'200512'),V(edge-attribute-equality-string,match-back-matches:http://autoleadservice.com/xml/als.@sale-model,=,'Jetta'))
    I saw this execute in as little as 2625ms... still 187x longer than Lucene.
    I appreciate that Lucene and BDB XML are quite different, and the things pointed out in this thread have been very helpful. I only mean to compare them in this very simplified view of direct index lookups, and I wanted to know whether it would be reasonable to expect BDB XML index lookups, for queries as similar as possible to a Lucene index query, to perform in the same order of time.
    For reference, I have an XML collection with 170,000-ish documents loaded that look similar to the XML below. The Lucene index I'm querying contains all of the same 170,000 documents, plus some more data, bringing its collection to about 195,000 documents.
    <als:match-back-matches xmlns:als="http://autoleadservice.com/xml/als" direct-sale="true"
    has-match="true" sale-area="18 " sale-date="2005-12-14-05:00" sale-day="20051214"
    sale-dealer="409460" sale-model="Jetta" sale-month="200512" sale-region="MAR"
    sale-year-month="2005-12-05:00" vin="XXX">
    <als:match lead-area="18 " lead-date="2005-12-12-05:00" lead-day="20051212"
    lead-dealer="409460" lead-id="196973" lead-model="Jetta" lead-month="200512"
    lead-region="MAR" lead-source="cobalt-vw" lead-unique-all="true" lead-unique-area="true"
    lead-unique-dealer="true" lead-unique-region="true" lead-year-month="2005-12-05:00"
    match-range="0-30" owner-address="1200 Main St." owner-alternate-phone="555-863-7264"
    owner-email="[email protected]" owner-first-name="rani" owner-last-name="adzarne"
    owner-phone="703-742-0900" owner-postal-code="10191"/>
    <als:match lead-area="18 " lead-date="2005-12-12-05:00" lead-day="20051212"
    lead-dealer="409460" lead-id="197007" lead-model="Jetta" lead-month="200512"
    lead-region="MAR" lead-source="vw.com" lead-unique-all="false" lead-unique-area="true"
    lead-unique-dealer="false" lead-unique-region="false" lead-year-month="2005-12-05:00"
    match-range="0-30" owner-address="1200 Main St." owner-email="[email protected]"
    owner-first-name="rani" owner-last-name="zarnegar" owner-postal-code="20191"/>
    </als:match-back-matches>
    <als:match-back-matches xmlns:als="http://autoleadservice.com/xml/als" direct-sale="true"
    has-match="true" sale-area="29 " sale-date="2005-12-29-05:00" sale-day="20051229"
    sale-dealer="425213" sale-model="Jetta" sale-month="200512" sale-region="SER"
    sale-year-month="2005-12-05:00" vin="YYY">
    <als:match lead-area="29 " lead-date="2005-12-14-05:00" lead-day="20051214"
    lead-dealer="425213" lead-id="199347" lead-model="Jetta" lead-month="200512"
    lead-region="SER" lead-source="edmunds" lead-unique-all="true" lead-unique-area="true"
    lead-unique-dealer="true" lead-unique-region="true" lead-year-month="2005-12-05:00"
    match-range="0-30" owner-email="[email protected]" owner-first-name="Monique"
    owner-last-name="single" owner-phone="555-495-8933" owner-postal-code="60130"/>
    </als:match-back-matches>
    and the indexes I have defined presently are:
    Default Index: node-element-presence-none
    Index: node-attribute-equality-boolean for node {}:captured-sale
    Index: node-attribute-equality-boolean for node {}:direct-sale
    Index: node-attribute-equality-boolean for node {}:has-match
    Index: node-attribute-equality-string for node {}:lead-area
    Index: node-attribute-equality-date for node {}:lead-date
    Index: node-attribute-equality-decimal for node {}:lead-day
    Index: node-attribute-equality-string for node {}:lead-dealer
    Index: edge-attribute-equality-decimal for node {}:lead-id
    Index: node-attribute-equality-string for node {}:lead-model
    Index: node-attribute-equality-decimal for node {}:lead-month
    Index: node-attribute-equality-string for node {}:lead-region
    Index: node-attribute-equality-string for node {}:lead-source
    Index: node-attribute-equality-yearMonth for node {}:lead-year-month
    Index: node-attribute-equality-string for node {}:match-range
    Index: unique-node-metadata-equality-string for node {http://www.sleepycat.com/2002/dbxml}:name
    Index: node-attribute-equality-string for node {}:sale-area
    Index: node-attribute-equality-date for node {}:sale-date
    Index: node-attribute-equality-decimal for node {}:sale-day
    Index: node-attribute-equality-string for node {}:sale-dealer
    Index: node-attribute-equality-string edge-attribute-equality-string for node {}:sale-model
    Index: node-attribute-equality-decimal edge-attribute-equality-decimal for node {}:sale-month
    Index: node-attribute-equality-string for node {}:sale-region
    Index: node-attribute-equality-yearMonth for node {}:sale-year-month
    Index: node-attribute-equality-string for node {}:vin

  • XMLTYPE insert performance

    I am experiencing performance problems when inserting a 30 MB XML file into an XMLTYPE field. Under Oracle 11, with the schema I am using, the minimum time I can achieve is around 9 minutes, which is too long. Can anyone comment on whether this performance is normal, and possibly suggest how it could be improved while retaining the benefits of structured storage? Thanks in advance for the help :)

    Sorry for the late reply; I didn't notice that you had replied to my earlier post.
    To answer your questions in order:
    - I am using "structured" storage because I read (in this article: [http://www.oracle.com/technology/pub/articles/jain-xmldb.html]) that this would result in higher XQuery performance.
    - The schema isn't very large, but it is complex (as discussed in the above article).
    I built my table by first registering the schema and then adding the XML elements to the table such that they would be stored in structured storage, i.e.:
    --// Register schema /////////////////////////////////////////////////////////////
    begin
    dbms_xmlschema.registerSchema(
    schemaurl=>'fof_fob.xsd',
    schemadoc=>bfilename('XFOF_DIR','fof_fob.xsd'),
    local=>TRUE,
    gentypes=>TRUE,
    genbean=>FALSE,
    force=>FALSE,
    owner=>'FOF',
    csid=>nls_charset_id('AL32UTF8')
    );
    end;
    /
    COMMIT;
    and then created the table using ...
    --// Create the XCOMP table /////////////////////////////////////////////////////////////
    create table "XCOMP" (
         "type" varchar(128) not null,
         "id" int not null,
         "idstr1" varchar(50),
         "idstr2" varchar(50),
         "name" varchar(255),
         "rev" varchar(20) not null,
         "tstamp" varchar(30) not null,
         "xmlfob" xmltype)
    XMLTYPE "xmlfob" STORE AS OBJECT RELATIONAL
    XMLSCHEMA "fof_fob.xsd"
    ELEMENT "FOB";
    No indexing was specified for this table. Then I inserted the offending 30 MB XML file using (in C#, using ODP.NET under .NET 3.5):
    void test(string myName, XElement myXmlElem)
    {
        // Connection string not shown
        OracleConnection connection = new OracleConnection();
        connection.Open();
        string statement = "INSERT INTO XCOMP ( \"name\", \"xmlfob\" ) values( :1, :2 )";
        XDocument xDoc = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), myXmlElem);
        OracleCommand insCmd = new OracleCommand(statement, connection);
        OracleXmlType xmlinfo = new OracleXmlType(connection, xDoc.CreateReader());
        insCmd.Parameters.Add(FofDbCmdInsert.Name, OracleDbType.Varchar2, 255);
        insCmd.Parameters.Add(FofDbCmdInsert.Xmldoc, OracleDbType.XmlType);
        insCmd.Parameters[0].Value = myName;
        insCmd.Parameters[1].Value = xmlinfo;
        insCmd.ExecuteNonQuery();
        connection.Close();
    }
    It took around 9 minutes to execute the ExecuteNonQuery statement, using Oracle 11 Standard Edition running under Windows 2008 64-bit with 8 GB RAM and a single 2.5 GHz core (of a quad-core, running under VMware).
    I would much appreciate any suggestions that could speed up the insert performance here. As a temporary solution I chopped some of the information out of the XML document and stored it separately in another table, but this approach has the disadvantage that using XQueries is a bit inflexible, although the performance is now in seconds rather than minutes.
    I can't see any reason why Oracle's shredding mechanism should be less efficient than manually shredding the information.
    Thanks in advance for any helpful hints you can provide!

  • Generating large amounts of XML without running out of memory

    Hi there,
    I need some advice from the experienced XDB users around here. I'm trying to map large amounts of data inside the DB (Oracle 11.2.0.1.0), and by large I mean files up to several GB. I compared the "low level" mapping via PL/SQL in combination with ExtractValue/XMLQuery against the more elegant XML view mapping, and the view mapping using XMLTABLE XQuery PATH constructs performed best. So now I have one view over several binary XMLTYPE columns (where the XML files are stored) that does the mapping, and a second view on top of this mapping view that constructs the nested XML result document via XMLELEMENT(), XMLAGG(), etc. Example code for better understanding:
    CREATE OR REPLACE VIEW MAPPING AS
    SELECT  type, (...)  FROM XMLTYPE_BINARY,  XMLTABLE ('/ROOT/ITEM' passing xml
         COLUMNS
          type       VARCHAR2(50)          PATH 'for $x in .
                                                                let $one := substring($x/b012,1,1)
                                                                let $two := substring($x/b012,1,2)
                                                                return
                                                                    if ($one eq "A")
                                                                      then "A"
                                                                    else if ($one eq "B" and not($two eq "BJ"))
                                                                      then "AA"
                                                                    else if (...)
    CREATE OR REPLACE VIEW RESULT AS
    select XMLELEMENT("RESULTDOC",
                     (SELECT XMLAGG(
                             XMLELEMENT("ITEM",
                                          XMLFOREST(
                                               type "ITEMTYPE",
    ) as RESULTDOC FROM MAPPING;
    Now all I want to do is materialize this document by inserting it into an XMLTYPE table/column.
    insert into bla select * from RESULT;
    This sounds pretty easy, but I can't get it to work: the DB seems to load a full DOM representation into RAM every time I perform a select, an insert into, or use the xmlgen tool. This representation takes more than 1 GB for a 200 MB XML file, and eventually I run out of memory with
    ORA-19202: Error occurred in XML PROCESSING
    ORA-04030: out of process memory
    My question is how can I get the result document into the table without memory exhaustion. I thought the db would be smart enough to generate some kind of serialization/datastream to perform this task without loading everything into the RAM.
    Best regards

    The file import is performed via JDBC. CLOB and binary storage work for files up to several GB, but the OR storage gives me ORA-22813 when loading files larger than 100 MB. I use a plain prepared statement:
            File f = new File( path );
            // Bind the id instead of concatenating it; the table name cannot be bound.
            try ( Reader in = new FileReader( f );
                  PreparedStatement pstmt = CON.prepareStatement( "insert into " + table + " values (?, XMLTYPE(?) )" ) ) {
                pstmt.setString( 1, String.valueOf( id ) );
                // JDBC 4.0 overload without a length, avoiding the (int) cast that overflows above 2 GB
                pstmt.setClob( 2, in );
                pstmt.executeUpdate();
            }
    The DB version is 11.2.0.1.0, as mentioned in the initial post.
    But this isn't my main problem; the one above is. I prefer using binary XMLType anyway, since it is much easier to index. Does anyone have an idea how to get the large document from the view into an XMLType table?

  • How can I define an outer join in the where clause of a FLWOR statement?

    Hi- In the sample below I'm joining two views based on username but in this case what I really want to use is an outer join instead. What is the syntax for that? I tried the (+) notation but that didn't seem to work..
    CREATE OR REPLACE PROCEDURE proc_ctsi_all is
    XMLdoc XMLType;
    BEGIN
    DBMS_XDB.deleteResource('/public/CTSI/ctsi_all_rpt1.xml',1);
    SELECT XMLQuery(
    '<Progress_Report>
    <Personnel_Roster>
    {for $c in ora:view("CTSI_INVEST_NONPHS_SOURCE_V"),
    $cphs in ora:view("CTSI_INVEST_PHS_SOURCE_V")
      let $username  := $c/ROW/COMMONS_USERNAME/text(),
    $expertise  := $c/ROW/AREA_OF_EXPERTISE/text(),
    $phsorg  := $cphs/ROW/PHS_ORGANIZATION/text(),
    $activitycode  := $cphs/ROW/ACTIVITY_CODE/text(),
    $username2  := $cphs/ROW/COMMONS_USERNAME/text()
    where $username eq $username2
         return
      <Investigator>
       <Commons_Username>{$username}</Commons_Username>
    <Area_of_Expertise>{$expertise}</Area_of_Expertise>
    <Federal_PHS_Funding>
    <Organization>{$phsorg}</Organization>
    <Activity_Code>{$activitycode}</Activity_Code>
    <Six_Digit_Grant_Number>{$grantnumber}</Six_Digit_Grant_Number>
    </Federal_PHS_Funding>
    </Investigator>}
    </Personnel_Roster>
    </Progress_Report>'
    RETURNING CONTENT) INTO XMLdoc FROM DUAL;
    IF(DBMS_XDB.CREATERESOURCE('/public/CTSI/ctsi_all_rpt1.xml', XMLdoc)) THEN
    DBMS_OUTPUT.PUT_LINE('Resource is created');
    ELSE
    DBMS_OUTPUT.PUT_LINE('Cannot create resource');
    END IF;
    COMMIT;
    END;
    /

    What you could do is use the query inside an XMLTable call. Via the COLUMNS clause you then pass a "column" as XMLType to a second XMLTable call,
    a little bit like the following:
    select hdjfdf
    from xmltable
       ({xquery}
        PASSING
        COLUMNS xmlfrag xmltype path 'xxx'
       ) a
    ,  xmltable
       ({the other stuff you need}
        PASSING a.xmlfrag
       ...etc
      ...etc
    I guess something similar can be done directly in XQuery as well.

  • Term comparison for InPath

    I have a CLOB column with XML data.
    <attrs><attr name="ESB_Availability_Status"><string>D</string></attr><attr name="ESB_Available_Stock"><int>0</int></attr><attr name="ESB_IsTaxable"><boolean>true</boolean></attr><attr name="ESB_isLeaseAvailable"><boolean>true</boolean></attr></attrs>
    When I use the following query, it does not match any rows.
    SELECT
      extractValue(
        XmlType(attributes),
        '/attrs/attr[@name="ESB_Availability_Status"]/string'
      ) AS ESB_Availability_Status
    FROM
      MyTable
      WHERE
      CONTAINS(
        attributes,
        '{D} INPATH (/attrs/attr[@name="ESB_Availability_Status"]/string)'
      ) > 0
    But when I update the column so that the value is P (or, for that matter, any other value such as N, DQ, etc.), it retrieves data.
    <attrs><attr name="ESB_Availability_Status"><string>P</string></attr><attr name="ESB_Available_Stock"><int>0</int></attr><attr name="ESB_IsTaxable"><boolean>true</boolean></attr><attr name="ESB_isLeaseAvailable"><boolean>true</boolean></attr></attrs>
    SELECT
      extractValue(
        XmlType(attributes),
        '/attrs/attr[@name="ESB_Availability_Status"]/string'
      ) AS ESB_Availability_Status
    FROM
      MyTable
      WHERE
      CONTAINS(
        attributes,
        '{P} INPATH (/attrs/attr[@name="ESB_Availability_Status"]/string)'
      ) > 0
    What is happening with the comparison term?

    As this question has nothing to do with the XML DB, you have lowered your chance of getting the answer you seek.
    I think you might be looking for
    https://forums.oracle.com/community/developer/english/oracle_database/text
    Without knowing your version, or having an index set up like yours, what about something like
    SELECT
      extractValue(
        XmlType(attributes),
        '/attrs/attr[@name="ESB_Availability_Status"]/string[text()="D"]'
      ) AS ESB_Availability_Status
    FROM
      MyTable
    which does return empty rows if the condition is not met, or
    SELECT *
      FROM (SELECT
            extractValue(
              XmlType(attributes),
              '/attrs/attr[@name="ESB_Availability_Status"]/string'
            ) AS ESB_Availability_Status
          FROM
            MyTable)
      WHERE ESB_Availability_Status = 'D';
    Of course there are also XMLTable/XQuery based approaches as well if you so desire.

  • Re: weird behaviour xml query

    From the XML DB FAQ (#5 in the announcement list)
    How do I use namespaces with XMLQuery()?
    How do I declare namespace prefix mapping with XMLTable()?
    Your XML has a default namespace associated with it, so you will need to supply the default namespace to XMLTable/XQuery as well.  Also the XPaths in the COLUMNS clause are relative to the XPath for the XMLTable itself, so I adjusted that as well. The below returns rows, but I did not verify it is what you desired.  This should give you a start at least.
    select a.*, b.*
       from cas_nummers,  -- changed
            xmltable(
                     XMLNamespaces(default 'http://echa.europa.eu/schemas/ecInventory'),  -- added
                     '/ECSubstanceInventory/ecSubstances/ECSubstance'
                     passing cas_nummers.object_value  -- changed
                     columns
                     creationDate varchar2(30) path '@creationDate',
                     status varchar2(20) path '@status',
                     ecnumber  varchar2(20) path 'ecNumber',  -- changed
                     casnumber varchar2(20) path 'casNumber',  -- changed
                     molecularFormula varchar2(20) path 'molecularFormula',  -- changed
                     namelist xmltype path 'ecNames'  -- changed
            ) a,
            xmltable(
                     XMLNamespaces(default 'http://echa.europa.eu/schemas/ecInventory'),  -- added
                     'ecNames'  -- changed
                     passing a.namelist
                     columns
                     ecName      varchar2(5) path '.'  -- changed
            ) b

    I have replaced the file test1.xml with a file of about 28 MB containing about 100000 records (say, casNumbers).
    I have created a table with the columns
    casnumber,ecnumber, molecularformula, cname, status,creationdate
    I use the following PL/SQL anonymous block:
    declare
    cursor c0
    is
    select to_date(substr(a.creationdate,1,10),'yyyy-mm-dd') as creationdate
         , decode(a.status,'active',1,0) as status
         , a.ecnumber
         , a.casnumber
         , a.molecularformula
         , b.ecName
       from cas_nummers,
            xmltable(
                     XMLNamespaces(default 'http://echa.europa.eu/schemas/ecInventory'),
                     '/ECSubstanceInventory/ecSubstances/ECSubstance'
                     passing cas_nummers.object_value
                     columns
                     creationDate varchar2(30) path '@creationDate',
                     status varchar2(20) path '@status',
                     ecnumber  varchar2(20) path 'ecNumber',
                     casnumber varchar2(20) path 'casNumber',
                     molecularFormula varchar2(20) path 'molecularFormula',
                     namelist xmltype path 'ecNames'
            ) a,
            xmltable(
                     XMLNamespaces(default 'http://echa.europa.eu/schemas/ecInventory'),  -- added
                     'ecNames'  -- changed
                     passing a.namelist
                     columns
                     ecname      varchar2(50) path '.'  -- changed
            ) b;
    v_errm varchar2(200);
    i pls_integer;
    begin
       for r_casnr in c0 loop
       begin
          insert into csa_cas_nummers
          ( casnummer_id
          , cas_nummer
          , ec_nummer
          , stofnaam
          , moleculair_formule
          , dd_creatie
          , status
          , volgnr
          )
          values
          (csa_cas_nummers_seq.nextval
          ,r_casnr.casnumber
          ,r_casnr.ecnumber
          , trim(r_casnr.stofnaam)
          , r_casnr.molecularformula
          , r_casnr.creationdate
          , r_casnr.status
          , 0
          );
        if mod(i,1000) = 0 then
           commit;
        end if;
        exception
           when others then
              v_errm :=substr(sqlerrm,1,200);
              insert into foute_casnrs
              values
              (r_casnr.casnumber, v_errm);
              dbms_output.put_line( r_casnr.casnumber||' -> '||sqlerrm);       
       end;
       end loop;
    end;
    The strange thing is that it stops without error at +/- 38000 records (when I do select count(1) from cas_nummers), and there are no records in the table foute_casnrs.
    I have checked some cas_nummers which are in the XML file but not in the tcas_nummers table.
    Any idea what is wrong with my code?
    Thanks in advance,
    Henk

  • Very slow queries

    I have a query that I precompile and invoke after setting a variable. I do this to precompile the queries in stored modules as precompiling the queries would seem like a good performance enhancement.
    When I invoke the query, it returns in 122168.68 (ms)! Obviously this is not ideal so in an attempt to diagnose the problem, I run the same query (same code) but I hard code the value instead of using a variable. Much to my pleasure, the query returned in 17.039 (ms). Now this is great! The only problem is that we can’t go into production with a hard coded variable. Are there any ideas how I can work around this (or better, if there is a patch)?
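    One possible stopgap, assuming the slowdown really is tied to the variable binding rather than to the data, is to build the query text with the value inlined as a properly escaped XQuery string literal, which reproduces the hard-coded form without hard-coding the value. The helper below is only a sketch using plain JDK string handling; the class and method names are made up, and it gives up reusing one prepared expression for different values.

```java
public class XQueryLiteral {

    // Renders a Java string as a double-quoted XQuery string literal.
    // Inside such a literal, '&' starts an entity reference and '"' ends the
    // literal, so they are written as "&amp;" and a doubled quote respectively.
    static String toLiteral(String value) {
        String escaped = value.replace("&", "&amp;").replace("\"", "\"\"");
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        String key = "AlexUserhttp://mydomain.org/ns/1.0/some/test/value/HERE";
        // Same query shape as in the post, with the literal inlined instead of $contentKey.
        String xquery = "count(collection('db/test')/Record[ContentKey="
                + toLiteral(key) + "])";
        System.out.println(xquery);
    }
}
```

    Escaping matters here because the value comes from outside the query text; concatenating it unescaped would break on quotes and ampersands.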
    The code is as follows (the element names have been changed to protect the innocent):
    import java.io.File;
    import com.sleepycat.db.Environment;
    import com.sleepycat.db.EnvironmentConfig;
    import com.sleepycat.dbxml.XmlContainer;
    import com.sleepycat.dbxml.XmlManager;
    import com.sleepycat.dbxml.XmlManagerConfig;
    import com.sleepycat.dbxml.XmlQueryContext;
    import com.sleepycat.dbxml.XmlQueryExpression;
    import com.sleepycat.dbxml.XmlResults;
    import com.sleepycat.dbxml.XmlValue;
    class BDBTest
        public BDBTest()
            XmlContainer container;
            Environment dbEnv;
            XmlManager dbManager;
            XmlQueryContext queryContext;
            EnvironmentConfig envConf;
            XmlManagerConfig managerConfig;
            String xquery;
            XmlQueryExpression  xmlQueryExpression;
            container = null;
            dbEnv = null;
            dbManager = null;
            try
                envConf = new EnvironmentConfig();
                envConf.setAllowCreate(true);
                envConf.setInitializeCache(true);
                envConf.setInitializeLocking(true);
                envConf.setInitializeLogging(true);
                envConf.setTransactional(true);
                dbEnv = new Environment(new File("/opt/db"), envConf);
                managerConfig = new XmlManagerConfig();
                managerConfig.setAdoptEnvironment(true);
                managerConfig.setAllowAutoOpen(true);
                dbManager = new XmlManager(dbEnv, managerConfig);
                dbManager.setDefaultContainerType(XmlContainer.NodeContainer);
                container = dbManager.openContainer("db/test");
                queryContext = dbManager.createQueryContext();
                queryContext.setEvaluationType(XmlQueryContext.Eager);
                queryContext.setVariableValue("contentKey", new XmlValue("AlexUserhttp://mydomain.org/ns/1.0/some/test/value/HERE"));
                // This query is very slow.
                xquery = "declare namespace tf = \"http://mydomain.org/ns/0.1/test/format\"; " +
                    "count (collection('db/test')/Record[ContentKey=$contentKey]/tf:TestDocument)";
                // This query is very fast.
                // xquery = "declare namespace tf = \"http://mydomain.org/ns/0.1/test/format\"; " +
                   // "count (collection('db/test')/Record[ContentKey=\"AlexUserhttp://mydomain.org/ns/1.0/some/test/value/HERE\"]/tf:TestDocument)";
                xmlQueryExpression = dbManager.prepare(xquery, queryContext);
                String qPlan = xmlQueryExpression.getQueryPlan();
                System.out.println("--------------------------------------------------");
                System.out.println(qPlan);
                System.out.println("--------------------------------------------------");
                long ns0 = System.nanoTime();
                XmlResults results = xmlQueryExpression.execute(queryContext);
                long ns1 = System.nanoTime() - ns0;
                double ms1 = (double) ns1 / 1000000;
                String message = "Found ";
                message += results.size() + " documents for query: '";
                message += xquery + " Time to execute: " + ms1 + " (ms)\n";
                System.out.println(message);
                System.out.println(results.next().asNumber());
            } catch (Exception e) {
                e.printStackTrace(System.err);
            }
        }

        public static void main(String args[]) throws Throwable {
            new BDBTest();
        }
    }

    The query plans are as follows:

    SLOW:
    <XQuery>
      <Function name="{http://www.w3.org/2005/xpath-functions}:count">
        <DocumentOrder>
          <DbXmlNav>
            <LookupIndex container="db/test">
              <ValueQP index="edge-element-equality-string" operation="eq" parent="Record" child="ContentKey">
                <Variable name="contentKey"/>
              </ValueQP>
            </LookupIndex>
            <Join type="parent-of-child" return="argument">
              <DbXmlNav>
                <QueryPlanFunction result="collection" container="db/test">
                  <OQPlan>P(edge-element-equality-string,prefix,Record.ContentKey)</OQPlan>
                </QueryPlanFunction>
                <DbXmlStep axis="child" name="Record" nodeType="element"/>
              </DbXmlNav>
            </Join>
            <DbXmlStep axis="child" prefix="tf" uri="http://mydomain.org/ns/0.1/test/format" name="TestDocument" nodeType="element"/>
          </DbXmlNav>
        </DocumentOrder>
      </Function>
    </XQuery>
    Found 1 documents for query: 'declare namespace tf = "http://mydomain.org/ns/0.1/test/format"; count (collection('db/test')/Record[ContentKey=$contentKey]/tf:TestDocument) Time to execute: 122168.68 (ms)

    FAST:
    <XQuery>
      <Function name="{http://www.w3.org/2005/xpath-functions}:count">
        <DocumentOrder>
          <DbXmlNav>
            <LookupIndex container="db/test">
              <ValueQP index="edge-element-equality-string" operation="eq" parent="Record" child="ContentKey" value="AlexUserhttp://mydomain.org/ns/1.0/some/test/value/HERE"/>
            </LookupIndex>
            <Join type="parent-of-child" return="argument">
              <DbXmlNav>
                <QueryPlanFunction result="collection" container="db/test">
                  <OQPlan>V(edge-element-equality-string,Record.ContentKey,=,'AlexUserhttp://mydomain.org/ns/1.0/some/test/value/HERE')</OQPlan>
                </QueryPlanFunction>
                <DbXmlStep axis="child" name="Record" nodeType="element"/>
              </DbXmlNav>
            </Join>
            <DbXmlStep axis="child" prefix="tf" uri="http://mydomain.org/ns/0.1/test/format" name="TestDocument" nodeType="element"/>
          </DbXmlNav>
        </DocumentOrder>
      </Function>
    </XQuery>
    Found 1 documents for query: 'declare namespace tf = "http://mydomain.org/ns/0.1/test/format"; count (collection('db/test')/Record[ContentKey="AlexUserhttp://mydomain.org/ns/1.0/some/test/value/HERE"]/tf:TestDocument) Time to execute: 17.039 (ms)

    We're using Java with BDB 2.3.10 on 64-bit CentOS, but we see the same behavior in other environments as well. I'm happy to provide any more information.
    Thank you for your help,
    Alex
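
The two plans above differ mainly in the OQPlan line: the slow plan's begins with P( and the fast plan's with V(. Reading these as a presence lookup versus a value lookup is an interpretation, not something this thread confirms. To check which form a prepared query ends up with, the string returned by getQueryPlan() can be parsed with the JDK's own XML parser. The sketch below is only that; the class name PlanInspector and its method names are hypothetical, and it assumes the plan text is well-formed XML like the dumps above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the BDB XML API.
final class PlanInspector {
    private PlanInspector() {}

    /** Returns the text of the first OQPlan element, or null if the plan has none. */
    static String firstOqPlan(String planXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(planXml)));
            NodeList nodes = doc.getElementsByTagName("OQPlan");
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers need no throws clause.
            throw new IllegalArgumentException("Query plan is not well-formed XML", e);
        }
    }

    /** True when the first OQPlan starts with "V(", as in the fast plan above. */
    static boolean usesValueLookup(String planXml) {
        String plan = firstOqPlan(planXml);
        return plan != null && plan.startsWith("V(");
    }
}
```

Calling PlanInspector.usesValueLookup(xmlQueryExpression.getQueryPlan()) right after prepare() would flag the slow form before the query is executed.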

    Hi Alex,
    I've just answered this question here:
    Re: XQuery Performance in BerkeleyDB
    John
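
Since the timings in the question show the literal form of the query running in about 17 ms, one stopgap is to build the query text with the key inlined as an XQuery string literal instead of binding $contentKey. The sketch below is an assumption-laden workaround, not the fix given in the linked answer. The class XQueryLiterals and its methods are hypothetical names; only the query text itself comes from the post. In an XQuery string literal, a double quote is escaped by doubling it and an ampersand must be written as &amp;.

```java
// Hypothetical helper, not part of the BDB XML API.
final class XQueryLiterals {
    private XQueryLiterals() {}

    /** Quotes a Java string as a double-quoted XQuery string literal. */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 2);
        sb.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                sb.append("\"\"");      // escape a quote by doubling it
            } else if (c == '&') {
                sb.append("&amp;");     // '&' would otherwise start an entity reference
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds the count query from the post with the key inlined as a literal. */
    static String countQuery(String contentKey) {
        return "declare namespace tf = \"http://mydomain.org/ns/0.1/test/format\"; "
             + "count(collection('db/test')/Record[ContentKey=" + quote(contentKey)
             + "]/tf:TestDocument)";
    }

    public static void main(String[] args) {
        System.out.println(countQuery("AlexUserhttp://mydomain.org/ns/1.0/some/test/value/HERE"));
    }
}
```

The resulting string would be passed to dbManager.prepare(...) exactly as in the original code. The trade-off is that each distinct key produces a distinct query string, so the prepared expression cannot be reused across keys.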
