Oracle text performance

Hi all,
I am using Oracle Text for indexing.
The query below does a wildcard search, and its performance is very poor:
/* Formatted on 2009/08/11 16:06 (Formatter Plus v4.8.5) */
SELECT *
  FROM (SELECT z.*, ROWNUM r
          FROM (SELECT   *
                    FROM (SELECT *
                            FROM (SELECT score (1) score,
                                         SUBSTR (note, 1, 200) note, tmplt_id,
                                         appl_name, appl_use, appl_reference,
                                         appl_entity, effctv_start_date,
                                         effctv_end_date, eq_name, note_id,
                                         Bfcommon.getvaluebasedontemplate
                                                             (tmplt_id,
                                                              note_id
                                                             ) VALUE,
                                         (SELECT em_name
                                            FROM ip_eq
                                           WHERE eq_name = a.eq_name) em_name,
                                         Bfcommon.is_edit_allowed_note1
                                                   (a.note_id,
                                                    'PACRIM1\E317329',
                                                    SYSDATE,
                                                    SYSDATE
                                                   ) LOCKED
                                    FROM bf_note a
                                   WHERE tmplt_id NOT IN (19, 14, 16)
                                     AND effctv_start_date >=
                                            TO_DATE ('08/12/2008 00:00:00',
                                                     'MM/DD/YYYY HH24:MI:SS'
                                                    )
                                     AND (   effctv_end_date <=
                                                TO_DATE
                                                      ('08/12/2009 00:00:00',
                                                       'MM/DD/YYYY HH24:MI:SS'
                                                      )
                                          OR effctv_end_date IS NULL
                                         )
                                     AND contains (note, '%test%', 1) > 0) r
                          UNION
                          SELECT score (1) score, SUBSTR (note, 1, 200) note,
                                 tmplt_id, appl_name, appl_use,
                                 appl_reference, appl_entity,
                                 effctv_start_date, effctv_end_date, eq_name,
                                 a.note_id, b.trgt_name VALUE,
                                 (SELECT em_name
                                    FROM ip_eq
                                   WHERE eq_name = a.eq_name) em_name,
                                 Bfcommon.is_edit_allowed_note1
                                                   (a.note_id,
                                                    'PACRIM1\E317329',
                                                    SYSDATE,
                                                    SYSDATE
                                                   ) LOCKED
                            FROM bf_note a, om_limit_hist b
                           WHERE (   date_changed =
                                        (SELECT MAX (date_changed)
                                           FROM om_limit_hist h1
                                          WHERE date_changed <
                                                   (SELECT MAX (date_changed)
                                                      FROM om_limit_hist h2
                                                     WHERE h2.trgt_name =
                                                                   b.trgt_name)
                                            AND h1.trgt_name = b.trgt_name)
                                  OR date_changed =
                                           (SELECT MAX (date_changed)
                                              FROM om_limit_hist h1
                                             WHERE h1.trgt_name = b.trgt_name)
                                 )
                             AND b.note_id = a.note_id(+)
                             AND tmplt_id = 19
                             AND effctv_start_date >=
                                    TO_DATE ('08/12/2008 00:00:00',
                                             'MM/DD/YYYY HH24:MI:SS'
                                            )
                             AND (   effctv_end_date <=
                                        TO_DATE ('08/12/2009 00:00:00',
                                                 'MM/DD/YYYY HH24:MI:SS'
                                                )
                                  OR effctv_end_date IS NULL
                                 )
                             AND contains (note, '%test%', 1) > 0)
                ORDER BY 1 DESC) z
         WHERE ROWNUM <= 50)
WHERE r >= 1
It does a wildcard search for the string "test". Please tell me how to index the column for this case.
The execution plan is:
Execution Plan
Plan hash value: 3535478881
| Id  | Operation                            | Name                | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT                     |                     |    50 |   128K|       |   632   (1)| 00:00:08 |
|*  1 |  VIEW                                |                     |    50 |   128K|       |   632   (1)| 00:00:08 |
|*  2 |   COUNT STOPKEY                      |                     |       |       |       |         |     |
|   3 |    VIEW                              |                     |    69 |   177K|       |   632   (1)| 00:00:08 |
|*  4 |     SORT ORDER BY STOPKEY            |                     |    69 |   177K|   376K|   632   (1)| 00:00:08 |
|   5 |      VIEW                            |                     |    69 |   177K|       |   590   (1)| 00:00:08 |
|   6 |       SORT UNIQUE                    |                     |    69 |  7281 |       |   590   (2)| 00:00:08 |
|   7 |        UNION-ALL                     |                     |       |       |       |         |     |
|*  8 |         TABLE ACCESS BY INDEX ROWID  | BF_NOTE             |    68 |  7140 |       |   585   (1)| 00:00:08 |
|*  9 |          DOMAIN INDEX                | BF_NOTE_TEXT_SEARCH |       |       |       |   221   (0)| 00:00:03 |
|* 10 |         FILTER                       |                     |       |       |       |         |     |
|  11 |          NESTED LOOPS                |                     |     1 |   141 |       |     3   (0)| 00:00:01 |
|  12 |           TABLE ACCESS FULL          | OM_LIMIT_HIST       |     1 |    36 |       |     2   (0)| 00:00:01 |
|* 13 |           TABLE ACCESS BY INDEX ROWID| BF_NOTE             |     1 |   105 |       |     1   (0)| 00:00:01 |
|* 14 |            INDEX UNIQUE SCAN         | BF_NOTE_PK          |     1 |       |       |     1   (0)| 00:00:01 |
|  15 |          SORT AGGREGATE              |                     |     1 |    23 |       |         |     |
|* 16 |           INDEX RANGE SCAN           | OM_LIMIT_HIST_IDX1  |     1 |    23 |       |     0   (0)| 00:00:01 |
|  17 |            SORT AGGREGATE            |                     |     1 |    23 |       |         |     |
|* 18 |             INDEX RANGE SCAN         | OM_LIMIT_HIST_IDX1  |     1 |    23 |       |     0   (0)| 00:00:01 |
|  19 |            SORT AGGREGATE            |                     |     1 |    23 |       |         |     |
|* 20 |             INDEX RANGE SCAN         | OM_LIMIT_HIST_IDX1  |     1 |    23 |       |     0   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   1 - filter("R">=1)
   2 - filter(ROWNUM<=50)
   4 - filter(ROWNUM<=50)
   8 - filter("EFFCTV_START_DATE">=TO_DATE(' 2008-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
              "TMPLT_ID"<>19 AND "TMPLT_ID"<>14 AND "TMPLT_ID"<>16 AND ("EFFCTV_END_DATE" IS NULL OR
              "EFFCTV_END_DATE"<=TO_DATE(' 2009-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
   9 - access("CTXSYS"."CONTAINS"("NOTE",'%test%',1)>0)
  10 - filter("DATE_CHANGED"= (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H1" WHERE
              "H1"."TRGT_NAME"=:B1 AND "DATE_CHANGED"< (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H2" WHERE
              "H2"."TRGT_NAME"=:B2)) OR "DATE_CHANGED"= (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H1"
              WHERE "H1"."TRGT_NAME"=:B3))
  13 - filter("TMPLT_ID"=19 AND "EFFCTV_START_DATE">=TO_DATE(' 2008-08-12 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "CTXSYS"."CONTAINS"("NOTE",'%test%',1)>0 AND ("EFFCTV_END_DATE" IS NULL OR
              "EFFCTV_END_DATE"<=TO_DATE(' 2009-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
  14 - access("B"."NOTE_ID"="A"."NOTE_ID")
  16 - access("H1"."TRGT_NAME"=:B1 AND "DATE_CHANGED"< (SELECT /*+ */ MAX("DATE_CHANGED") FROM
              "OM_LIMIT_HIST" "H2" WHERE "H2"."TRGT_NAME"=:B2))
       filter("DATE_CHANGED"< (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H2" WHERE
              "H2"."TRGT_NAME"=:B1))
  18 - access("H2"."TRGT_NAME"=:B1)
  20 - access("H1"."TRGT_NAME"=:B1)

What version of Oracle are you using?
Your sample query is a good example of the benefits and costs of 'mixed' queries, and the challenges of mixed query performance. Oracle 11g has some very helpful new features (search for SDATA) that can really improve performance. Specifically, your query does some useful date-range bounding, and you need to get that into the full-text index.
In the end, it's not going to be easy to look at your reasonably complex query, understand the data and relationships, and wave the magic wand to make the thing go fast.
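As a rough sketch of the SDATA suggestion, assuming Oracle 11g or later: the FILTER BY clause of CREATE INDEX stores the named columns in the text index, so the date-range predicates can be evaluated together with CONTAINS. The SUBSTRING_INDEX wordlist attribute is a separate, optional step aimed at double-truncated patterns such as '%test%'. The names note_wordlist and bf_note_text_idx are hypothetical, and whether either change helps this particular query would have to be measured.

```sql
-- Sketch only: preference and index names are hypothetical.
BEGIN
  ctx_ddl.create_preference('note_wordlist', 'BASIC_WORDLIST');
  -- SUBSTRING_INDEX targets left- and double-truncated wildcards such as %test%
  ctx_ddl.set_attribute('note_wordlist', 'SUBSTRING_INDEX', 'TRUE');
END;
/

-- Assumption: the existing BF_NOTE_TEXT_SEARCH index on NOTE is dropped first.
-- FILTER BY (11g and later) keeps the date columns inside the text index.
CREATE INDEX bf_note_text_idx ON bf_note (note)
  INDEXTYPE IS CTXSYS.CONTEXT
  FILTER BY effctv_start_date, effctv_end_date
  PARAMETERS ('WORDLIST note_wordlist');
```

The query text itself would stay the same; the optimizer can then decide whether to push the date predicates into the domain index scan.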

Similar Messages

  • Oracle text performance with context search indexes

    Search performance using a context index.
    We intend to move our search engine to a new one based on Oracle Text, but we are running into
    performance problems with searches.
    Our application allows the user to search stored documents by name, object identifier and annotations (formerly set on objects).
    For example, suppose I want to find a document named ImportSax2.c. Depending on the parameters the user sets, our search engine builds one of the following
    search queries:
    1) If the user explicitly asks for a search by document name, the query is:
         select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c WITHIN objname' , 1 ) > 0;
    2) If the user doesn't specify any extra parameters, the query is:
         select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c' , 1 ) > 0;
    Oracle Text needs only about 7 seconds to answer the second query, whereas it needs about 50 seconds to answer the first.
    Here is a part of the sql script used for creating the Oracle Text index on the column OBJFIELDURL
    (this column stores a path to an xml file containing properties that have to be indexed for each object) :
    begin
    Ctx_Ddl.Create_Preference('wildcard_pref', 'BASIC_WORDLIST');
    ctx_ddl.set_attribute('wildcard_pref', 'wildcard_maxterms', 200) ;
    ctx_ddl.set_attribute('wildcard_pref','prefix_min_length',3);
    ctx_ddl.set_attribute('wildcard_pref','prefix_max_length',6);
    ctx_ddl.set_attribute('wildcard_pref','STEMMER','AUTO');
    ctx_ddl.set_attribute('wildcard_pref','fuzzy_match','AUTO');
    ctx_ddl.set_attribute('wildcard_pref','prefix_index','TRUE');
    ctx_ddl.set_attribute('wildcard_pref','substring_index','TRUE');
    end;
    begin
    ctx_ddl.create_preference('doc_lexer_perigee', 'BASIC_LEXER');
    ctx_ddl.set_attribute('doc_lexer_perigee', 'printjoins', '_-');
    ctx_ddl.set_attribute('doc_lexer_perigee', 'BASE_LETTER', 'YES');
    ctx_ddl.set_attribute('doc_lexer_perigee','index_themes','yes');
    ctx_ddl.create_preference('english_lexer','basic_lexer');
    ctx_ddl.set_attribute('english_lexer','index_themes','yes');
    ctx_ddl.set_attribute('english_lexer','theme_language','english');
    ctx_ddl.set_attribute('english_lexer', 'printjoins', '_-');
    ctx_ddl.set_attribute('english_lexer', 'BASE_LETTER', 'YES');
    ctx_ddl.create_preference('german_lexer','basic_lexer');
    ctx_ddl.set_attribute('german_lexer','composite','german');
    ctx_ddl.set_attribute('german_lexer','alternate_spelling','GERMAN');
    ctx_ddl.set_attribute('german_lexer','printjoins', '_-');
    ctx_ddl.set_attribute('german_lexer', 'BASE_LETTER', 'YES');
    ctx_ddl.set_attribute('german_lexer','NEW_GERMAN_SPELLING','YES');
    ctx_ddl.set_attribute('german_lexer','OVERRIDE_BASE_LETTER','TRUE');
    ctx_ddl.create_preference('japanese_lexer','JAPANESE_LEXER');
    ctx_ddl.create_preference('global_lexer', 'multi_lexer');
    ctx_ddl.add_sub_lexer('global_lexer','default','doc_lexer_perigee');
    ctx_ddl.add_sub_lexer('global_lexer','german','german_lexer','ger');
    ctx_ddl.add_sub_lexer('global_lexer','japanese','japanese_lexer','jpn');
    ctx_ddl.add_sub_lexer('global_lexer','english','english_lexer','en');
    end;
    begin
         ctx_ddl.create_section_group('axmlgroup', 'AUTO_SECTION_GROUP');
    end;
    drop index ADSOBJ_XOBJFIELDURL force;
    create index ADSOBJ_XOBJFIELDURL on ADSOBJ(OBJFIELDURL) indextype is ctxsys.context
    parameters
    ('datastore ctxsys.file_datastore
    filter ctxsys.inso_filter
    sync (on commit)
    lexer global_lexer
    language column OBJFIELDURLLANG
    charset column OBJFIELDURLCHARSET
    format column OBJFIELDURLFORMAT
    section group axmlgroup
    Wordlist wildcard_pref');
    Oracle created a table named DR$ADSOBJ_XOBJFIELDURL$I, which now contains around 25 million records.
    ADSOBJ is the table containing information about our documents; OBJFIELDURL is the column that contains the path to the XML file holding
    the data to index. That file looks like this:
    <?xml version="1.0" encoding="UTF-8" ?>
    <fields>
    <OBJNAME><![CDATA[NomLnk_177527o.jpgp]]></OBJNAME>
    <OBJREM><![CDATA[Z_CARACT_141]]></OBJREM>
    <OBJID>295926o.jpgp</OBJID>
    </fields>
    Can someone tell me how I can make that kind of request
    "select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c WITHIN objname' , 1 ) > 0;"
    run faster ?

    Below are the execution plans for both queries:
    select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c WITHIN objname' , 1 ) > 0
    PLAN_TABLE_OUTPUT
    | Id | Operation                    | Name                | Rows | Bytes | Cost (%CPU) |
    |  0 | SELECT STATEMENT             |                     | 1272 | 119K  |     4   (0) |
    |  1 |  TABLE ACCESS BY INDEX ROWID | ADSOBJ              | 1272 | 119K  |     4   (0) |
    |  2 |   DOMAIN INDEX               | ADSOBJ_XOBJFIELDURL |      |       |     4   (0) |
    Note
    - 'PLAN_TABLE' is old version
    Executed in 2 seconds
    select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c' , 1 ) > 0
    PLAN_TABLE_OUTPUT
    | Id | Operation                    | Name                | Rows | Bytes | Cost (%CPU) |
    |  0 | SELECT STATEMENT             |                     | 1272 | 119K  |     4   (0) |
    |  1 |  TABLE ACCESS BY INDEX ROWID | ADSOBJ              | 1272 | 119K  |     4   (0) |
    |  2 |   DOMAIN INDEX               | ADSOBJ_XOBJFIELDURL |      |       |     4   (0) |
    Sorry for the result formatting, I can't get it "easily" readable :(
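    One thing that might be worth testing for the WITHIN query, offered only as a sketch: AUTO_SECTION_GROUP indexes every tag as a zone section, whereas an explicit XML_SECTION_GROUP with field sections is generally described as cheaper to search for short, non-nested tags like the ones in the sample file. The group name axml_field_group is hypothetical, and the effect on the 50-second query is an assumption that would need to be measured.

    ```sql
    -- Sketch only: the section group name is hypothetical.
    BEGIN
      ctx_ddl.create_section_group('axml_field_group', 'XML_SECTION_GROUP');
      -- One field section per tag in the sample file; TRUE keeps the text
      -- visible to plain (non-WITHIN) queries as well.
      ctx_ddl.add_field_section('axml_field_group', 'objname', 'OBJNAME', TRUE);
      ctx_ddl.add_field_section('axml_field_group', 'objrem',  'OBJREM',  TRUE);
      ctx_ddl.add_field_section('axml_field_group', 'objid',   'OBJID',   TRUE);
    END;
    /
    ```

    The index would then be recreated with "section group axml_field_group" in place of "section group axmlgroup"; the CONTAINS queries themselves would not change.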

  • Regular expression vs oracle text performance

    Does anyone have experience comparing the performance of regular expressions and Oracle Text?
    We need to implement a text search on a large table, 100K-500K rows.
    The select statement will select from a _VL view, which joins 2 tables, _B and _TL.
    We need to search 2 text columns from this _VL view.
    Using regex seems less complex, but the deciding factor is of course performance.
    Would an Oracle Text search perform better than a regular expression in general?
    Thanks,
    Margaret

    Hi Dominc,
    Thanks, we'll try both...
    Would you be able to validate our code to create the multi-table index:
    CREATE OR REPLACE PACKAGE requirements_util AS
    PROCEDURE concat_columns(i_rowid IN ROWID, io_text IN OUT NOCOPY VARCHAR2);
    END requirements_util;
    CREATE OR REPLACE PACKAGE BODY requirements_util AS
    PROCEDURE concat_columns(i_rowid IN ROWID, io_text IN OUT NOCOPY VARCHAR2)
    AS
    tl_req pjt_requirements_tl%ROWTYPE;
    b_req pjt_requirements_b%ROWTYPE;
    CURSOR cur_req_name (i_rqmt_id IN pjt_requirements_tl.rqmt_id%TYPE) IS
    SELECT rqmt_name FROM pjt_requirements_tl
    WHERE rqmt_id = i_rqmt_id;
    PROCEDURE add_piece(i_add_str IN VARCHAR2) IS
    lx_too_big EXCEPTION;
    PRAGMA EXCEPTION_INIT(lx_too_big, -6502);
    BEGIN
    io_text := io_text||' '||i_add_str;
    EXCEPTION WHEN lx_too_big THEN NULL; -- silently don't add the string.
    END add_piece;
    BEGIN
         BEGIN
              SELECT * INTO b_req FROM pjt_requirements_b WHERE ROWID = i_rowid;
              EXCEPTION
              WHEN NO_DATA_FOUND THEN
              RETURN;
         END;
         add_piece(b_req.req_code);
         FOR tl_req IN cur_req_name(b_req.rqmt_id) LOOP
         add_piece(tl_req.rqmt_name);
         END LOOP;
    END concat_columns;
    END requirements_util;
    EXEC ctx_ddl.drop_section_group('rqmt_sectioner');
    EXEC ctx_ddl.drop_preference('rqmt_user_ds');
    BEGIN
    ctx_ddl.create_preference('rqmt_user_ds', 'USER_DATASTORE');
    ctx_ddl.set_attribute('rqmt_user_ds', 'procedure', sys_context('userenv','current_schema')||'.'||'requirements_util.concat_columns');
    ctx_ddl.set_attribute('rqmt_user_ds', 'output_type', 'VARCHAR2');
    END;
    CREATE INDEX rqmt_cidx ON pjt_requirements_b(req_code)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS ('DATASTORE rqmt_user_ds
    SYNC (ON COMMIT)');

  • Oracle Text performance -- failed attempts

    We are trying to implement a simple search of text data stored in a heavily used table (inserts/updates). There are 3 columns to index --
    Headline (varchar2(255))
    Subheadline (varchar2(255))
    Teaser (varchar2(4000))
    The first attempt to implement Oracle text w/ CATSEARCH
    begin
    ctx_ddl.create_index_set('cms_iset');
    ctx_ddl.add_index('cms_iset','poolid_cp, mediaid_cp'); /* sub-index A */
    end;
    ---- We knew we were going to filter on poolid_cp and mediaid_cp ---
    CREATE INDEX cms_headlineidx ON con_properties (headline)
    INDEXTYPE IS ctxsys.CTXCAT
    PARAMETERS ('index set cms_iset');
    CREATE INDEX cms_subheadlineidx ON con_properties (subheadline)
    INDEXTYPE IS ctxsys.CTXCAT
    PARAMETERS ('index set cms_iset');
    CREATE INDEX cms_teaseridx ON con_properties (teaser)
    INDEXTYPE IS ctxsys.CTXCAT
    PARAMETERS ('index set cms_iset');
    *********THE RESULTS*************
    Our application server would spin up threads that would appear to be hanging. The load on the DB servers (RAC) was higher than normal. This implementation would have saved us from having to resync manually.
    The next attempt was implementing w/ CONTEXT:
    alter table con_properties add (dummy varchar2(1));
    begin
    ctx_ddl.create_preference('con_propsearch', 'MULTI_COLUMN_DATASTORE');
    ctx_ddl.set_attribute('con_propsearch', 'columns', 'headline,subheadline,teaser');
    end;
    CREATE INDEX con_properties_searchidx
    ON con_properties(dummy)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS ('datastore CTXSYS.con_propsearch')
    Records were being put into the ctx_user_pending view at a rate of a few hundred per hour.
    ********THE RESULTS*************
    Same issue, with the application servers spinning off threads that seem to be hung. Spiky load on the DB servers (RAC).
    NOTE: In both implementations, search queries ran OK. However, dropping the text index in BOTH cases caused the application servers to behave normally.
    Can anyone tell me what's going on internally with Oracle Text when a table is heavily inserted and updated? What is going on in the background? Is there some sort of lock that the app servers are waiting on? I know there is "overhead" with inserts on a normal b-tree index. Is it "exponential" with Oracle Text?
    Thank you!

    When documents in the base table are inserted, updated, or deleted, their ROWIDs are held in a DML queue until you synchronize the index. You can view this queue with the CTX_USER_PENDING view. Apparently you are not synchronizing your context index, so the queue keeps growing. You need to establish some method of synchronizing your index. You can use parameters('sync(on commit)') in your index creation. You can create an after insert or update statement-level trigger (not a row trigger) that uses dbms_job.submit to schedule ctx_ddl.sync_index, so that the index is synchronized upon commit of the DML. You can run ctx_ddl.sync_index manually from time to time, or schedule it. Or you can alter and rebuild your index periodically, or drop and recreate it periodically. Which method you choose depends on how current the information that you query needs to be. If your data needs to be current up to the moment, then you should sync on commit. Otherwise it may be better to do it in periodic batches.
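    A minimal sketch of two of these options, using the index name from the question (con_properties_searchidx); the statements are illustrative and should be adapted before use.

    ```sql
    -- How many rows are waiting to be synchronized for this index?
    SELECT COUNT(*)
      FROM ctx_user_pending
     WHERE pnd_index_name = 'CON_PROPERTIES_SEARCHIDX';

    -- Option 1: synchronize manually, or call this from a scheduled job.
    BEGIN
      ctx_ddl.sync_index('con_properties_searchidx');
    END;
    /

    -- Option 2: have the index synchronize itself at commit time.
    ALTER INDEX con_properties_searchidx
      REBUILD PARAMETERS ('REPLACE METADATA SYNC (ON COMMIT)');
    ```

    Option 1 keeps the sync cost out of the application's transactions at the price of some staleness; option 2 keeps the index current but moves the cost into every commit.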

  • "Oracle text" performance Problem

    Architecture for Performance on a web site Search!
    I want to use the text service of Oracle, but I am worried about the performance.
    How should I design the system if I want the best performance and scalability?
    1. Should I build a separate column in every table, merge all the information into that one column, and full-text index that column?
    2. Put a full-text index on every column in the table, use an OR clause and reverse-rank it for the AND clause, using the CONTAINSTABLE function?
    3. Make a different table with ID, TYPE and _VALUE fields and search in that table, which has fewer columns?
    4. Separate the full-text database and search in a separate DB so that I can scale better?
    Did anybody have a similar problem? Any books on full-text search?

    The number of indexes is irrelevant as such. If you really need 100 tables and you really need full text search on all of them, you need 100 indexes. When you are inserting data in any given table, the fact that there are 99 other tables with 99 other Text indexes is irrelevant.
    That being said, I would seriously question whether a data model that involves doing full-text searches on 100 separate tables was actually a proper data model. That strikes me as highly unlikely.
    Justin

  • Performance issues and options to reduce load with Oracle text implementation

    Hi Experts,
    My database is on Oracle 11.2.0.2 on Linux. We have Oracle Text implemented for fuzzy search. Our Oracle Text indexes are defined as sync on commit because we cannot afford stale data. Our application does literally thousands of inserts/updates/deletes on the columns that have these Oracle Text indexes. As a result, we are seeing a significant performance impact from the Oracle Text sync routines being called on each commit. We run a full index optimization every night at 3 am. The Oracle Text internal operations show up as top SQL in our AWR report, and there are concerns that they are putting a lot of load on the DB. Since we do the full index optimization only once a night, I am wondering whether I should change that and, if I do, whether it will help.
    For example here are some data from my one day's AWR report:
    Elapsed Time (s) | Executions | Elapsed Time per Exec (s) | %Total | %CPU  | %IO   | SQL Id        | SQL Module | SQL Text
    27,386.25        | 305,441    | 0.09                      | 16.50  | 15.82 | 9.98  | ddr8uck5s5kp3 |            | begin ctxsys.drvdml.com_sync_i...
    14,618.81        | 213,980    | 0.07                      | 8.81   | 8.39  | 27.79 | 02yb6k216ntqf |            | begin ctxsys.syncrn(:idxownid,...
    Full Text of above top sql:
    ddr8uck5s5kp3
    begin ctxsys.drvdml.com_sync_index(:idxname, :idxmem, :partname);
    end
    02yb6k216ntqf
    begin ctxsys.syncrn(:idxownid, :idxoname, :idxid, :ixpid, :rtabnm, :flg); end;
    Now if I do the full index optimization more often, and not just once a night at 3 am, will the load on the DB due to sync on commit decrease? If yes, how often should I optimize, and doesn't the optimization itself add some load? Can someone suggest?
    Thanks,
    OrauserN

    You can query the ctx_parameters view to see what your default and maximum memory values are:
    SCOTT@orcl12c> COLUMN bytes    FORMAT 9,999,999,999
    SCOTT@orcl12c> COLUMN megabytes FORMAT 9,999,999,999
    SCOTT@orcl12c> SELECT par_name AS parameter,
      2          TO_NUMBER (par_value) AS bytes,
      3          par_value / 1048576 AS megabytes
      4  FROM   ctx_parameters
      5  WHERE  par_name IN ('DEFAULT_INDEX_MEMORY', 'MAX_INDEX_MEMORY')
      6  ORDER  BY par_name
      7  /
    PARAMETER                               BYTES      MEGABYTES
    DEFAULT_INDEX_MEMORY               67,108,864             64
    MAX_INDEX_MEMORY                1,073,741,824          1,024
    2 rows selected.
    You can set the memory value in your index parameters:
    SCOTT@orcl12c> CREATE INDEX EMPLOYEE_IDX01
      2  ON EMPLOYEES (EMP_NAME)
      3  INDEXTYPE IS CTXSYS.CONTEXT
      4  PARAMETERS ('SYNC (ON COMMIT) MEMORY 1024M')
      5  /
    Index created.
    You can also modify the default and maximum values using CTX_ADM.SET_PARAMETER:
    http://docs.oracle.com/cd/E11882_01/text.112/e24436/cadmpkg.htm#CCREF2096
    The following contains general guidelines for what to set the max_index_memory parameter and others to:
    http://docs.oracle.com/cd/E11882_01/text.112/e24435/aoptim.htm#CCAPP9274
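    For illustration, a CTX_ADM.SET_PARAMETER call might look like the sketch below. The values are given in bytes to match the ctx_parameters output above and are examples only, not recommendations; the call is assumed to be run by a user with the privileges to execute CTX_ADM (typically CTXSYS).

    ```sql
    -- Example values only; choose sizes that fit the memory available.
    BEGIN
      ctx_adm.set_parameter('MAX_INDEX_MEMORY', '1073741824');     -- 1 GB
      ctx_adm.set_parameter('DEFAULT_INDEX_MEMORY', '134217728');  -- 128 MB
    END;
    /
    ```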

  • Performance issue with Oracle Text index

    Hi Experts,
    We are on Oracle 11.2.0.3 on Solaris 10. I have implemented Oracle Text in our environment and I am facing a strange performance issue.
    One SQL statement with a CONTAINS clause is taking forever: more than 20 minutes, and it still does not complete. This statement has a CONTAINS clause, an EXISTS clause and a NOT EXISTS clause.
    If I remove the EXISTS and NOT EXISTS clauses, it completes fast, but with those two clauses it takes forever. It is late at night, so I am not able to post the table and query details and will do so tomorrow, but based on this general description, are there any pointers for me to review?
    sql query doing fine:
    SELECT
        U.CLNT_OID, U.USR_OID, S.MAILADDR
    FROM
        access_usr U
        INNER JOIN access_sia S
            ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
        WHERE U.CLNT_OID = 'ABCX32S'
        AND CONTAINS(LAST_NAME , 'TO%' ) >0
    --sql query that hangs forever:
    SELECT
        U.CLNT_OID, U.USR_OID, S.MAILADDR
    FROM
        access_usr U
        INNER JOIN access_sia S
            ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
        WHERE U.CLNT_OID = 'ABCX32S'
        AND CONTAINS(LAST_NAME , 'TO%' ) >0
    and exists (--one clause here with a few table joins)
    and not exists (--one clause here with a few table joins);
    --Another strange thing I found is that if I use 'ZZ%' or 'L1%' instead of 'TO%' in this query, it runs fast, but with 'TO%' it is slow with those two EXISTS and NOT EXISTS clauses!
    I will be most thankful for the inputs.
    OrauserN

    Hi Barbara,
    First of all, thanks a lot for reviewing the issue.
    Unluckily making the change to empty_stoplist did not work out. I am today copying the entire sql here that has this issue and will be most thankful for more insights/pointers on what can be done.
    Here is the entire sql:
    SELECT U.CLNT_OID,
           U.USR_OID,
           S.EMAILADDRESS,
           U.FIRST_NAME,
           U.LAST_NAME,
           S.JOBCODE,
           S.LOCATION,
           S.DEPARTMENT,
           S.ASSOCIATEID,
           S.ENTERPRISECOMPANYCODE,
           S.EMPLOYEEID,
           S.PAYGROUP,
           S.PRODUCTLOCALE
      FROM    ACCESS_USR U
           INNER JOIN
              ACCESS_SIA S
           ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
    WHERE     U.CLNT_OID = 'G39NY3D25942TXDA'
           AND EXISTS
                  (SELECT 1
                     FROM ACCESS_USR_GROUP_XREF UGX
                          INNER JOIN ACCESS_GROUP RELG
                             ON     RELG.CLNT_OID = UGX.CLNT_OID
                                AND RELG.GROUP_OID = UGX.GROUP_OID
                          INNER JOIN ACCESS_GROUP G
                             ON     G.CLNT_OID = RELG.CLNT_OID
                                AND G.GROUP_TYPE_OID = RELG.GROUP_TYPE_OID
                    WHERE     UGX.CLNT_OID = U.CLNT_OID
                          AND UGX.USR_OID = U.USR_OID
                          AND G.GROUP_OID = 920512943
                          AND UGX.INCLUDED = 1)
           AND NOT EXISTS
                      (SELECT 1
                         FROM    ACCESS_USR_GROUP_XREF UGX
                              INNER JOIN
                                 ACCESS_GROUP G
                              ON     G.CLNT_OID = UGX.CLNT_OID
                                 AND G.GROUP_OID = UGX.GROUP_OID
                        WHERE     UGX.CLNT_OID = U.CLNT_OID
                              AND UGX.USR_OID = U.USR_OID
                              AND G.GROUP_OID = 920512943
                              AND UGX.INCLUDED = 1)
           AND CONTAINS (U.LAST_NAME, 'Bon%') > 0;
    As I said before, if the EXISTS and NOT EXISTS clauses are removed, it runs in under a second. But with those EXISTS and NOT EXISTS clauses it takes anywhere from 25 minutes to more than one hour.
    Note also that it was not TO% but Bon% in the CONTAINS clause that is giving the issue; sorry, that was my mistake.
    Also, please see below the Oracle Text indexes defined on the table ACCESS_USR:
    --definition of preferences used in the index:
    SET SERVEROUTPUT ON size unlimited
    WHENEVER SQLERROR EXIT SQL.SQLCODE
    DECLARE
       v_err       VARCHAR2 (1000);
       v_sqlcode   NUMBER;
       v_count     NUMBER;
    BEGIN
       ctxsys.ctx_ddl.create_preference ('cust_lexer', 'BASIC_LEXER');
       ctxsys.ctx_ddl.set_attribute ('cust_lexer', 'base_letter', 'YES'); -- removes diacritics
    EXCEPTION
       WHEN OTHERS
       THEN
          v_err := SQLERRM;
          v_sqlcode := SQLCODE;
          v_count := INSTR (v_err, 'DRG-10701');
          IF v_count > 0
          THEN
             DBMS_OUTPUT.put_line (
                'The required preference named CUST_LEXER with BASIC LEXER is already set up');
          ELSE
             RAISE;
          END IF;
    END;
    DECLARE
       v_err       VARCHAR2 (1000);
       v_sqlcode   NUMBER;
       v_count     NUMBER;
    BEGIN
       ctxsys.ctx_ddl.create_preference ('cust_wl', 'BASIC_WORDLIST');
       ctxsys.ctx_ddl.set_attribute ('cust_wl', 'SUBSTRING_INDEX', 'true'); -- to improve performance
    EXCEPTION
       WHEN OTHERS
       THEN
          v_err := SQLERRM;
          v_sqlcode := SQLCODE;
          v_count := INSTR (v_err, 'DRG-10701');
          IF v_count > 0
          THEN
             DBMS_OUTPUT.put_line (
                'The required preference named CUST_WL with BASIC WORDLIST is already set up');
          ELSE
             RAISE;
          END IF;
    END;
    --now below is the code of the index:
    CREATE INDEX ACCESS_USR_IDX3 ON ACCESS_USR
    (FIRST_NAME)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
    CREATE INDEX ACCESS_USR_IDX4 ON ACCESS_USR
    (LAST_NAME)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
    The strange thing is that, like I said, If I remove the exists clause the query returns very fast. Also if I modify the query to use only one NOT EXISTS clause and remove the other EXISTS clause it returns in less than one second.  Also if I remove the EXISTS clause and use only the NOT EXISTS  clause it returns in less than 4 seconds. But with both clauses it runs forever!
    When I tried to get dbms_xplan.display_cursor to get the query plan (for the case of both exists and not exists clause in the query), it said that previous statement's sql id was 0 or something like that so that I was not able to see the query plan. I will keep trying to get this plan (it takes 25 minutes to one hour each time but will get this info soon). Again any pointers are most helpful.
    Regards
    OrauserN
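
    On the "sql id was 0" message: one likely cause, assuming the session still has SERVEROUTPUT ON as in the script above, is that SQL*Plus issues a hidden DBMS_OUTPUT call after every statement, so DISPLAY_CURSOR without arguments no longer sees the query. A minimal sketch of two ways around it follows; the LIKE filter is an assumption about how to recognise the statement, and access to V$SQL is required.

    ```sql
    -- Option 1: switch server output off, run the query, then ask for
    -- the plan of the last statement executed in this session.
    SET SERVEROUTPUT OFF
    -- ... run the slow query here ...
    SELECT * FROM TABLE (DBMS_XPLAN.display_cursor);

    -- Option 2: from a second session, look up the SQL_ID while the
    -- query is still running, then display its plan explicitly.
    SELECT sql_id, child_number, SUBSTR (sql_text, 1, 80) AS sql_text
      FROM v$sql
     WHERE UPPER (sql_text) LIKE '%ACCESS_USR%'
       AND sql_text NOT LIKE '%v$sql%';

    SELECT *
      FROM TABLE (DBMS_XPLAN.display_cursor ('&sql_id', NULL, 'TYPICAL'));
    ```

    Option 2 avoids waiting 25 minutes or more for the statement to finish before the plan can be inspected.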

  • Oracle Full text Performance

    Hi,
    I created the indexes using the following commands. I have a few questions:
    1. Is the way I have created the indexes correct? Must I use a datastore? If yes, how do I use it, given that multiple tables are involved?
    CREATE INDEX TITLESEARCH ON SCS_PRODUCT(title_printed) INDEXTYPE IS ctxsys.context;
    CREATE INDEX SUBTITLESEARCH ON SCS_PRODUCT(subtitle_printed) INDEXTYPE IS ctxsys.context;
    CREATE INDEX FIRSTNAMESEARCH ON SCS_CONTRIBUTOR(first_name) INDEXTYPE IS ctxsys.context;
    CREATE INDEX LASTNAMESEARCH ON SCS_CONTRIBUTOR(LAST_name) INDEXTYPE IS ctxsys.context;
    CREATE INDEX PRINTEDNAMESEARCH ON SCS_CONTRIBUTOR(PRINTED_name) INDEXTYPE IS ctxsys.context;
    CREATE INDEX ESSNSEARCH ON SCS_JOURNAL(ESSN) INDEXTYPE IS ctxsys.context;
    CREATE INDEX ISSNSEARCH ON SCS_JOURNAL(ISSN) INDEXTYPE IS ctxsys.context;
    CREATE INDEX ISBN10SEARCH ON scs_sku(isbn_old) INDEXTYPE IS ctxsys.context;
    CREATE INDEX ISBN13SEARCH ON scs_sku(isbn) INDEXTYPE IS ctxsys.context;
    CREATE INDEX AVAILABLEUSSEARCH ON SCS_PRODUCT(AVAILABLE_SAGE) INDEXTYPE IS ctxsys.context;
    CREATE INDEX AVAILABLEUKSEARCH ON SCS_PRODUCT(AVAILABLE_SAGE_UK) INDEXTYPE IS ctxsys.context;
    CREATE INDEX KEYWORDSEARCH ON DCS_PRD_KEYWRDS(KEYWORD) INDEXTYPE IS ctxsys.context;
    2. The performance is very slow.
    3. I sometimes get the following error when I search through my web application:
    ORA-29861: domain index is marked LOADING/FAILED/UNUSABLE
    Any help would be really appreciated!
    Thanks
    Gautam

    1) It is correct, but it may not perform well (you are using the default options).
    I would not use Oracle Text for the isbn and isbn_old columns. A normal index is better there.
    2) Could you post the SQL, the execution plan, and more information about the machine?
    3) This can occur while the index is still being built, for example on a slow machine (status LOADING):
    oerr ora 29861
    29861, 00000, "domain index is marked LOADING/FAILED/UNUSABLE"
    // *Cause: An attempt has been made to access a domain index that is
    // being built or is marked failed by an unsuccessful DDL
    // or is marked unusable by a DDL operation.
    // *Action: Wait if the specified index is marked LOADING
    // Drop the specified index if it is marked FAILED
    // Drop or rebuild the specified index if it is marked UNUSABLE.
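
    To see which of the three states applies, the domain index status can be read from the data dictionary. The queries below are a sketch, assuming they are run as the schema that owns the indexes. The plain B-tree index at the end only illustrates the "normal index" suggestion for the ISBN column, and its name is hypothetical.

    ```sql
    -- Status of all CONTEXT indexes in the current schema.
    SELECT index_name, status, domidx_status, domidx_opstatus
      FROM user_indexes
     WHERE ityp_name = 'CONTEXT';

    -- Errors recorded by Oracle Text while indexing.
    SELECT err_index_name, err_timestamp, err_text
      FROM ctx_user_index_errors
     ORDER BY err_timestamp DESC;

    -- Rebuild an index that is reported as UNUSABLE.
    ALTER INDEX TITLESEARCH REBUILD;

    -- Hypothetical replacement of the Text index on isbn by a B-tree index.
    DROP INDEX ISBN13SEARCH;
    CREATE INDEX ISBN13_BTREE ON scs_sku (isbn);
    ```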

  • Oracle Text ALTER INDEX Performance

    Greetings,
    We have encountered some enhancement issues with Oracle Text and really need assistance.
    We are using Oracle 9i (Release 9.0.1) Standard Edition.
    We are using a very simple Oracle Text environment, with the CTXSYS.CONTEXT indextype on domain indexes.
    We have indexed two text columns in one table, one of these columns is CLOB.
    Currently if one of these columns is modified, we are using a trigger to automatically ALTER the index.
    This is very slow, it is just like dropping the index and creating it again.
    Is this right? should it be this slow?
    We are also trying to use the ONLINE parameter for ALTER INDEX and CREATE INDEX, but it gives an error saying this feature is not enabled.
    How can we enable it?
    Is there any way in improving the performance of this automatic update of the indexes?
    Would using a trigger be the best way to do this?
    How can we optimize it to a more satisfactory performance level?
    Also, are we able to use the language lexers for indexes with the Standard Edition? If so, how do you enable CTX_DDL?
    Many thanks for any assistance.
    Chi-Shyan Wang

    If you are going to sync your index on every update, you need to make sure that you optimize it on a regular basis to remove index fragmentation and deleted rows.
    You can set up a dbms_job that calls ctx_ddl.optimize_index and runs a full optimize periodically.
    Also, depending on the number of rows you have, and also the size of the data, you might want to look at using a CTXCAT index, which is transactional, stays in sync automatically and does not need to be optimized. CTXCAT indexes do not work well on large text objects (they are good for a couple lines of text at most) so they may not suit your dataset.
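
    The sync and optimize steps described above can be sketched as follows. The index name MY_TEXT_IDX and the 02:00 schedule are hypothetical, and the job owner is assumed to have execute rights on CTX_DDL.

    ```sql
    -- Synchronise pending changes explicitly instead of altering the
    -- index from a trigger (MY_TEXT_IDX is a hypothetical index name).
    EXEC ctx_ddl.sync_index ('MY_TEXT_IDX')

    -- Schedule a nightly full optimize at 02:00 through DBMS_JOB.
    VARIABLE jobno NUMBER
    BEGIN
       DBMS_JOB.submit (
          job         => :jobno,
          what        => 'ctx_ddl.optimize_index (''MY_TEXT_IDX'', ''FULL'');',
          next_date   => SYSDATE,
          interval    => 'TRUNC (SYSDATE) + 1 + 2/24');
       COMMIT;
    END;
    /
    ```

    Syncing in batches rather than on every single update tends to fragment the index less, at the cost of a short delay before new rows become searchable.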

  • Bad Query Performance in Oracle Text

    Hello everyone, I have the following problem:
    I have a table, TABLE_A from now on, a table of more or less 1,000.000 rows, with a CONTEXT index, using FILE_DATASTORE, CTXSYS.DEFAULT_STORAGE, CTXSYS.NULL_FILTER, CTXSYS.BASIC_LEXER and querying the index in the following way:
    SELECT /*+FIRST_ROWS*/ A.ID, B.ID2, SCORE(1) FROM TABLE_A A, TABLE_B B WHERE A.ID = B.ID AND CONTAINS(A.PATH, '<SOME KW>', 1) > 0 ORDER BY SCORE(1) DESC
    where TABLE_B has another 1,000.000 rows.
    The problem is that the query response time is much higher after a period of inactivity on those tables. How can I avoid this behavior? These idle periods (no more than 20 minutes) are central to how my application works, so I always get very long response times for my queries.
    Is there any cache or cache time parameter that affects this behavior? I have checked the Oracle Text documentation without finding anything about that...
    More data: I am using Oracle 9.2.0.1, but I have tested with the latest patches and the behavior is the same...
    Thank you very much in advance.

    Pablo,
    This appears to be a generic database or OS issue, not a Text specific issue. It really depends on what your application is doing.
    If your application is doing some other database activity such as queries or DMLs on other non-text tables, chances are Oracle Text related data blocks are being aged out of cache. You can either increase the db_cache_size init parameter or try to keep the text tables and index tables blocks in cache using ALTER TABLE commands.
    If your app is doing NON-database activity, then chances are your application is taking up much of the machine's physical memory such that OS is swapping ORACLE out of the memory. In which case, you may want to consider to add more memory to the machine or have ORACLE run on a separate machine by itself.
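
    The ALTER TABLE approach mentioned above might look like the sketch below. The index name MY_TEXT_IDX, and therefore the DR$ object names, are hypothetical, and the keep pool size is an arbitrary example that has to fit within the available memory.

    ```sql
    -- Reserve a keep pool (example size only).
    ALTER SYSTEM SET db_keep_cache_size = 256M;

    -- Assign the token table and its B-tree index to the keep pool.
    ALTER TABLE dr$my_text_idx$i STORAGE (BUFFER_POOL KEEP);
    ALTER INDEX dr$my_text_idx$x STORAGE (BUFFER_POOL KEEP);

    -- Read the token table once so that its blocks are loaded again
    -- after a period of inactivity.
    SELECT /*+ FULL(i) */ SUM (token_count), SUM (LENGTH (token_info))
      FROM dr$my_text_idx$i i;
    ```

    With FILE_DATASTORE the documents themselves are read only at indexing time, so queries depend on the DR$ tables rather than on the files.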

  • Performance of Oracle Text

    Hi,
    I'm tasked to help design an application that will have Oracle Text powering the searching logic. The application will have millions of records (in the 30 million to 50 million range), but there's a restriction that 95% of all searches must be able to complete in 1 second or less (!)
    So, my question is, is it possible for Oracle Text to meet this criteria? Assuming we have the best hardware, etc. Or should I look for another solution/approach?
    Regards,
    Roy

    Hi Roy,
    It's pretty hard to give a yes/no answer based on the limited information. I will say that Oracle's method of indexing is fairly standard (dictionary/postings list) so you are not likely to find a better solution if your records are stored in Oracle. The Oracle Text advantage - tight integration with the database already. You have storage and query optimization features of Oracle when using Oracle Text. < 1 sec response time is pretty tight for any search, but I don't think you'd have any better chance at hitting it with another solution.
    Thanks,
    Ron

  • How do I get Oracle Text to index files on a file server?

    I am new to Oracle (I'm a MS-SQL DBA looking for a Full-Text Search solution that is better than linking to a MS index server.)
    So - Here's the objective:
    I have Oracle Server(Express) installed on a Windows server.
    I would like for Oracle to build a Full-Text Catalog of the files on a separate file server based on file paths in a table in the database.
    (No desire to store terabytes of images and documents inside the database)
    I can get Oracle text up and running, using the URL_Datastore:
    CREATE TABLE files (id NUMBER PRIMARY KEY, issue_id NUMBER, path VARCHAR(255) UNIQUE, ot_format VARCHAR(6), ot_version VARCHAR(10));
    The Compaq server is a remote windows server on my local workgroup, so the fully qualified path is just "compaq" and the URL is valid:
    INSERT INTO files VALUES (9,9,'file://Compaq/FTQ/00000003.pdf',NULL,NULL);
    INSERT INTO files VALUES (13,13,'file://Compaq/FTQ/01.txt',NULL,NULL);
    CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
    PARAMETERS ('datastore ctxsys.URL_DATASTORE format column ot_format');
    but when I enter:
    Select * from CTX_User_Index_errors, I see the following errors:
    DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/00000003.pdf
    DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/01.txt
    Did I miss something?
    Do I need to install anything on the file server?
    I would like to convince my company that Oracle can be much quicker than Microsoft's Indexing Service because it can avoid joining two large result sets (one result set from Full_text (indexing service) and one for specific data contained in fields in the MS-SQL database.) Full Text Searches commonly take 40 - 60 seconds where there are 1.5 million multi-page PDF files for a particular set that I sample search on. Without this massive join, I believe I can get the search to run in under 10 seconds.

    Thank you!
    File_Datastore worked fine.
    I was staying away from File_Datastore because the information I gathered from googling suggested that file_datastore would only work locally.
    Now I just have to get Oracle to pull data out of tables in a MS-SQL database on the local network (don't have a clue yet), and then have it index compiled file paths.
    Then MS-SQL can query Oracle with index and full-text criteria and Oracle can send back a result set
    It may sound like a bad way of performing Full-Text Queries, but anything will be better than the way things are currently running. We are currently performing Full Text Searches on a table that is rebuilt nightly, so the table containing millions of file paths is not live..
    It would be so much better if we just migrated to Oracle, but we currently do not have the resources.
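
    For reference, a FILE_DATASTORE version of the index above might look like the following sketch. It assumes that the path column holds file names the database server's operating system can open directly (the UNC form shown is a guess, not confirmed in this thread) and that the account running the Oracle service can read the share.

    ```sql
    -- Assumption: store OS-level paths instead of file:// URLs.
    UPDATE files SET path = '\\Compaq\FTQ\00000003.pdf' WHERE id = 9;
    UPDATE files SET path = '\\Compaq\FTQ\01.txt' WHERE id = 13;
    COMMIT;

    -- Recreate the index with the predefined file datastore preference.
    DROP INDEX file_index;
    CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
    PARAMETERS ('datastore ctxsys.FILE_DATASTORE format column ot_format');

    -- Any file that could not be opened is reported here.
    SELECT err_textkey, err_text
      FROM ctx_user_index_errors
     WHERE err_index_name = 'FILE_INDEX';
    ```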

  • Pre-loading Oracle text in memory with Oracle 12c

    There is a white paper from Roger Ford that explains how to load the Oracle index in memory : http://www.oracle.com/technetwork/database/enterprise-edition/mem-load-082296.html
    In our application, Oracle 12c, we are indexing a big XML field (which is stored as XMLType with storage secure file) with the PATH_SECTION_GROUP. If I don't load the I table (DR$..$I) into memory using the technique explained in the white paper then I cannot have decent performance (and especially not predictable performance, it looks like if the blocks from the TOKEN_INFO columns are not memory then performance can fall sharply)
    But after migrating to oracle 12c, I got a different problem, which I can reproduce: when I create the index it is relatively small (as seen with ctx_report.index_size) and by applying the technique from the whitepaper, I can pin the DR$ I table into memory. But as soon as I do a ctx_ddl.optimize_index('Index','REBUILD') the size becomes much bigger and I can't pin the index in memory. Not sure if it is bug or not.
    What I found as work-around is to build the index with the following storage options:
    ctx_ddl.create_preference('TEST_STO','BASIC_STORAGE');
    ctx_ddl.set_attribute ('TEST_STO', 'BIG_IO', 'YES' );
    ctx_ddl.set_attribute ('TEST_STO', 'SEPARATE_OFFSETS', 'NO' );
    so that the token_info column will be stored in a secure file. Then I can change the storage of that column to put it in the keep buffer cache, and write a procedure to read the LOB so that it will be loaded in the keep cache. The size of the LOB column is more or less the same as when creating the index without the BIG_IO option but it remains constant even after a ctx_ddl.optimize_index. The procedure to read the LOB and to load it into the cache is very similar to the loaddollarR procedure from the white paper.
    Because of the SDATA section, there is a new DR table (S table) and an IOT on top of it. This is not documented in the white paper (the white paper was written for Oracle 10g). In my case this DR$ S table is much used, and the IOT also, but putting it in the keep cache is not as important as the token_info column of the DR I table. A final note: doing SEPARATE_OFFSETS = 'YES' was very bad in my case, the combined size of the two columns is much bigger than having only the TOKEN_INFO column and both columns are read.
    Here is an example of how to reproduce the problem of the size increasing after ctx_ddl.optimize_index:
    1. create the table
    drop table test;
    CREATE TABLE test
    (ID NUMBER(9,0) NOT NULL ENABLE,
    XML_DATA XMLTYPE)
    XMLTYPE COLUMN XML_DATA STORE AS SECUREFILE BINARY XML (tablespace users disable storage in row);
    2. insert a few records
    insert into test values(1,'<Book><TITLE>Tale of Two Cities</TITLE>It was the best of times.<Author NAME="Charles Dickens"> Born in England in the town, Stratford_Upon_Avon </Author></Book>');
    insert into test values(2,'<BOOK><TITLE>The House of Mirth</TITLE>Written in 1905<Author NAME="Edith Wharton"> Wharton was born to George Frederic Jones and Lucretia Stevens Rhinelander in New York City.</Author></BOOK>');
    insert into test values(3,'<BOOK><TITLE>Age of innocence</TITLE>She got a prize for it.<Author NAME="Edith Wharton"> Wharton was born to George Frederic Jones and Lucretia Stevens Rhinelander in New York City.</Author></BOOK>');
    3. create the text index
    drop index i_test;
    exec ctx_ddl.create_section_group('TEST_SGP','PATH_SECTION_GROUP');
    begin
      CTX_DDL.ADD_SDATA_SECTION(group_name => 'TEST_SGP',
                                section_name => 'SData_02',
                                tag => 'SData_02',
                                datatype => 'varchar2');
    end;
    /
    exec ctx_ddl.create_preference('TEST_STO','BASIC_STORAGE');
    exec  ctx_ddl.set_attribute('TEST_STO','I_TABLE_CLAUSE','tablespace USERS storage (initial 64K)');
    exec  ctx_ddl.set_attribute('TEST_STO','I_INDEX_CLAUSE','tablespace USERS storage (initial 64K) compress 2');
    exec  ctx_ddl.set_attribute ('TEST_STO', 'BIG_IO', 'NO' );
    exec  ctx_ddl.set_attribute ('TEST_STO', 'SEPARATE_OFFSETS', 'NO' );
    create index I_TEST
      on TEST (XML_DATA)
      indextype is ctxsys.context
      parameters('
        section group   "TEST_SGP"
        storage         "TEST_STO"
      ') parallel 2;
    4. check the index size
    select ctx_report.index_size('I_TEST') from dual;
    it says :
    TOTALS FOR INDEX TEST.I_TEST
    TOTAL BLOCKS ALLOCATED:                                                104
    TOTAL BLOCKS USED:                                                      72
    TOTAL BYTES ALLOCATED:                                 851,968 (832.00 KB)
    TOTAL BYTES USED:                                      589,824 (576.00 KB)
    5. optimize the index
    exec ctx_ddl.optimize_index('I_TEST','REBUILD');
    and now recompute the size, it says
    TOTALS FOR INDEX TEST.I_TEST
    TOTAL BLOCKS ALLOCATED:                                               1112
    TOTAL BLOCKS USED:                                                    1080
    TOTAL BYTES ALLOCATED:                                 9,109,504 (8.69 MB)
    TOTAL BYTES USED:                                      8,847,360 (8.44 MB)
    which shows that it went from 576KB to 8.44MB. With a big index the difference is not so big, but still from 14G to 19G.
    6. Workaround: use the BIG_IO option, so that the token_info column of the DR$ I table will be stored in a secure file and the size will stay relatively small. Then you can load this column in the cache using a procedure similar to
    alter table DR$I_TEST$I storage (buffer_pool keep);
    alter table dr$i_test$i modify lob(token_info) (cache storage (buffer_pool keep));
    rem: now we must read the lob so that it will be loaded in the keep buffer pool, use the procedure below
    create or replace procedure loadTokenInfo is
      type c_type is ref cursor;
      c2 c_type;
      s varchar2(2000);
      b blob;
      buff varchar2(100);
      siz number;
      off number;
      cntr number;
    begin
        s := 'select token_info from  DR$i_test$I';
        open c2 for s;
        loop
           fetch c2 into b;
           exit when c2%notfound;
           siz := 10;
           off := 1;
           cntr := 0;
           if dbms_lob.getlength(b) > 0 then
             begin
               loop
                 dbms_lob.read(b, siz, off, buff);
                 cntr := cntr + 1;
                 off := off + 4096;
               end loop;
             exception when no_data_found then
               if cntr > 0 then
                 dbms_output.put_line('4K chunks fetched: '||cntr);
               end if;
             end;
           end if;
        end loop;
    end;
    Rgds, Pierre
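
    A possible way to run the procedure above and check its effect is sketched below. The V$BH query is an assumption about how to verify caching, not something from the white paper. It needs access to V$BH, and the LOB segment name has to be looked up first because it is system-generated.

    ```sql
    -- Load the TOKEN_INFO LOBs and print the chunk counts.
    SET SERVEROUTPUT ON
    EXEC loadTokenInfo

    -- Find the system-generated name of the TOKEN_INFO LOB segment.
    SELECT segment_name
      FROM user_lobs
     WHERE table_name = 'DR$I_TEST$I'
       AND column_name = 'TOKEN_INFO';

    -- Count the blocks currently held in the buffer cache for the
    -- token table and for the LOB segment found above.
    SELECT o.object_name, COUNT (*) AS cached_blocks
      FROM v$bh b JOIN user_objects o ON o.data_object_id = b.objd
     WHERE b.status <> 'free'
       AND o.object_name IN ('DR$I_TEST$I', '&lob_segment_name')
     GROUP BY o.object_name;
    ```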

    I have been working a lot on that issue recently, I can give some more info.
    First I totally agree with you, I don't like to use the keep_pool and I would love to avoid it. On the other hand, we have a specific use case : 90% of the activity in the DB is done by queuing and dbms_scheduler jobs where response time does not matter. All those processes are probably filling the buffer cache. We have a customer facing application that uses the text index to search the database : performance is critical for them.
    What kind of performance do you have with your application ?
    In my case, I have learned the hard way that having the index in memory (the DR$I table in fact) is the key : if it is not, then performance is poor. I find it reasonable to pin the DR$I table in memory and if you look at competitors this is what they do. With MongoDB they explicitly says that the index must be in memory. With elasticsearch, they use JVM's that are also in memory. And effectively, if you look at the awr report, you will see that Oracle is continuously accessing the DR$I table, there is a SQL similar to
    SELECT /*+ DYNAMIC_SAMPLING(0) INDEX(i) */    
    TOKEN_FIRST, TOKEN_LAST, TOKEN_COUNT, ROWID    
    FROM DR$idxname$I
    WHERE TOKEN_TEXT = :word AND TOKEN_TYPE = :wtype    
    ORDER BY TOKEN_TEXT,  TOKEN_TYPE,  TOKEN_FIRST
    which is continuously done.
    I think that the algorithm used by Oracle to keep blocks in cache is too complex. I just realized that 12.1.0.2 (released last week) finally has a "killer" feature, the in-memory parameters, with which you can pin tables or columns in memory with compression, etc. This looks ideal for the text index; I hope that R. Ford will finally update his white paper :-)
    But my other problem was that optimize_index in REBUILD mode caused the DR$I table to double in size: it seems crazy that this was closed as not a bug, but it was, and I can't do anything about it. It is a bug in my opinion, because the create index command and the "alter index rebuild" command both result in a much smaller index, so why would the people who developed the optimize function (is it another team, using another algorithm?) make the index two times bigger?
    The track I have been following for that is to put the index in a 16K tablespace: in this case the space used by the index remains more or less flat (it increases, but much more reasonably). The difficulty here is to pin the index in memory, because the trick from R. Ford's white paper no longer worked.
    What worked:
    First set the keep pool size to zero and give that memory to db_16k_cache_size instead. Then change the storage preference to make sure that everything you want to cache (mostly the DR$I table) goes into the tablespace with the non-standard block size of 16K.
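    As a rough sketch of that setup: the cache size, the tablespace name TEXT_16K, the datafile path and the preference name TEXT_16K_STO are all hypothetical placeholders, not the values used in our system.

    ```sql
    -- Move the memory from the keep pool to a 16K buffer cache
    -- (2G is an example size only).
    ALTER SYSTEM SET db_keep_cache_size = 0 SCOPE = BOTH;
    ALTER SYSTEM SET db_16k_cache_size = 2G SCOPE = BOTH;

    -- Tablespace with the non-standard 16K block size
    -- (hypothetical name and datafile path).
    CREATE TABLESPACE text_16k
       DATAFILE '/u01/oradata/text_16k01.dbf' SIZE 1G AUTOEXTEND ON
       BLOCKSIZE 16K;

    -- Storage preference that places the $I table and its index there.
    BEGIN
       ctx_ddl.create_preference ('TEXT_16K_STO', 'BASIC_STORAGE');
       ctx_ddl.set_attribute ('TEXT_16K_STO', 'I_TABLE_CLAUSE', 'tablespace TEXT_16K');
       ctx_ddl.set_attribute ('TEXT_16K_STO', 'I_INDEX_CLAUSE', 'tablespace TEXT_16K compress 2');
    END;
    /
    ```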
    Then comes the tricky part: pre-loading the data into the buffer cache. The problem is that with Oracle 12c, Oracle will use direct path reads for full table scans (FTS), which basically means that it bypasses the cache and reads directly from the file into the PGA! There is an event to avoid that; I was lucky to find it on a blog (I can't remember which, so sorry for the missing credit).
    I ended up doing the following. Event 10949 is set to avoid the direct path read issue.
    alter session set events '10949 trace name context forever, level 1';
    alter table DR#idxname0001$I cache;
    alter table DR#idxname0002$I cache;
    alter table DR#idxname0003$I cache;
    SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT),  SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0001$I ITAB;
    SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT),  SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0002$I ITAB;
    SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT),  SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0003$I ITAB;
    SELECT /*+ INDEX(ITAB) CACHE(ITAB) */  SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0001$I ITAB;
    SELECT /*+ INDEX(ITAB) CACHE(ITAB) */  SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0002$I ITAB;
    SELECT /*+ INDEX(ITAB) CACHE(ITAB) */  SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0003$I ITAB;
    It worked. With a big relief I expected to take some time out, but there was a last surprise. The command
    exec ctx_ddl.optimize_index(idx_name=>'idxname',part_name=>'partname',optlevel=>'REBUILD');
    gave the following error:
    ERROR at line 1:
    ORA-20000: Oracle Text error:
    DRG-50857: oracle error in drftoptrebxch
    ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
    ORA-06512: at "CTXSYS.DRUE", line 160
    ORA-06512: at "CTXSYS.CTX_DDL", line 1141
    ORA-06512: at line 1
    This is exactly what metalink note 1645634.1 describes, but for a non-partitioned index. The workaround given there seemed very logical, but it did not work in the case of a partitioned index. After experimenting, I found out that the bug occurs when the partitioned index is created with the dbms_pclxutil.build_part_index procedure (which enables intra-partition parallelism in the index creation process). This is a very annoying bug; maybe there is a workaround, but I did not find one on metalink.
    Other points of attention with the text index creation (things that surprised me at first!):
    - if you use the dbms_pclxutil package, then the ctx_output logging does not work, because the index is created immediately and then populated in the background via dbms_jobs.
    - combined with that, on a RAC you may not see any activity on the box, which can be very frightening: this is because Oracle can choose to start the workers on the other node.
    I now understand much better how text indexing works, and I think it is a great technology which can scale via partitioning. But as always the design of the application is crucial; most of our problems come from the fact that we did not choose the right sectioning (we chose PATH_SECTION_GROUP while XML_SECTION_GROUP is so much better IMO). Maybe later I can convince the devs to change the sectioning, especially because SDATA and MDATA sections are not supported with PATH_SECTION_GROUP (although it seems to work, we had one occurrence of a bad result linked to the existence of SDATA in the index definition). Also the whole problem of mixed structured/unstructured searches is completely solved if one uses XML_SECTION_GROUP with MDATA/SDATA (but of course the app was written for Oracle 10...)
    Regards, Pierre
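
    To illustrate the XML_SECTION_GROUP with MDATA/SDATA idea, here is a minimal sketch. The table BOOKS, its column XML_DOC and the tags TITLE, GENRE and PUBYEAR are all hypothetical; this is not the sectioning of our application.

    ```sql
    -- Section group with one zone, one MDATA and one numeric SDATA section.
    BEGIN
       ctx_ddl.create_section_group ('BOOK_SGP', 'XML_SECTION_GROUP');
       ctx_ddl.add_zone_section ('BOOK_SGP', 'title', 'TITLE');
       ctx_ddl.add_mdata_section ('BOOK_SGP', 'genre', 'GENRE');
       ctx_ddl.add_sdata_section ('BOOK_SGP', 'pubyear', 'PUBYEAR', 'NUMBER');
    END;
    /

    -- BOOKS (id, xml_doc) is a hypothetical table holding the XML text.
    CREATE INDEX i_books ON books (xml_doc)
       INDEXTYPE IS ctxsys.context
       PARAMETERS ('section group BOOK_SGP');

    -- Mixed query: a text search inside a zone combined with structured
    -- filters that are evaluated by the text index itself.
    SELECT id
      FROM books
     WHERE CONTAINS (xml_doc,
              'mirth WITHIN title AND SDATA(pubyear >= 1900) AND MDATA(genre, novel)') > 0;
    ```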

  • Suggestion: Oracle text CONTEXT index on one or more columns ?

    Hi,
    I'm implementing Oracle text using CONTEXT ..... and would like to ask you for performance suggestion ...
    I have a table of Articles .... with columns .. TITLE, SUBTITLE , BODY ...
    Now is it better from performance point of view to move all three columns into one dummy column ... with name like FULLTEXT ... and put index on this single column,
    and then use CONTAINS(FULLTEXT,'...')>0
    Or is it almost the same for oracle if i put indexes on all three columns and then call:
    CONTAINS(TITLE,'...')>0 OR CONTAINS(SUBTITLE,'...')>0 OR CONTAINS(BODY,'...')>0
    I actually don't care if the result is a match in TITLE OR SUBTITLE OR BODY ....
    So if I move everything into a FULLTEXT column, I have duplicate data in each article row; but if I create an index for each column, then Oracle has 2x more to index, optimize and search. Am I right?
    Table has 1.8mil records ...
    Thank you.
    Kris

    mackrispi wrote:
    Now is it better from performance point of view to move all three columns into one dummy column ... with name like FULLTEXT ... and put index on this single column,
    and then use CONTAINS(FULLTEXT,'...')>0
    What version of Oracle are you on? If 11 then you could use a virtual column to do this, otherwise you'd have to write code to maintain the column which can get messy.
    mackrispi wrote:
    Or is it almost the same for oracle if i put indexes on all three columns and then call:
    CONTAINS(TITLE,'...')>0 OR CONTAINS(SUBTITLE,'...')>0 OR CONTAINS(BODY,'...')>0
    Benchmark it and find out :)
    Another option would be something like this.
    http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9455353124561
    If I were you, I would try out those three approaches, see which ones meet your performance requirements, and weigh that against the ease of implementation and administration.
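    One further way to get a single index over several columns without a duplicate column is a MULTI_COLUMN_DATASTORE. This is a sketch under assumptions: the table name ARTICLES and the preference and index names are hypothetical, and depending on the version the preference may have to be created by CTXSYS.

    ```sql
    -- One datastore that concatenates the three columns at indexing time.
    BEGIN
       ctx_ddl.create_preference ('ARTICLE_DS', 'MULTI_COLUMN_DATASTORE');
       ctx_ddl.set_attribute ('ARTICLE_DS', 'COLUMNS', 'title, subtitle, body');
    END;
    /

    -- The index is declared on one column but covers all three.
    CREATE INDEX articles_ft_idx ON articles (title)
       INDEXTYPE IS ctxsys.context
       PARAMETERS ('datastore ARTICLE_DS');

    -- A single CONTAINS then matches text from any of the columns.
    SELECT COUNT (*)
      FROM articles
     WHERE CONTAINS (title, 'oracle') > 0;
    ```

    A row is re-indexed only when the indexed column (TITLE here) is updated, so a change to SUBTITLE or BODY alone needs a trigger or a dummy update of TITLE to be picked up.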

  • Oracle iRecruitment: Keyword Search within Resumes using Oracle Text

    Dear All,
    As per my understanding (and Note: 247064.1) simple Keyword searches can be performed in iRecruitment if oracle Text is installed. However searching for Keywords within resumes is not possible using Oracle Text and is possible ONLY if Resume Parsing is enabled via a third party (non-oracle) service provider.
    Can you please let me know if my understanding is correct and if not provide further inputs on this.
    Thanks,
    Subrat

    Got this confirmation from Oracle via SR:
    Resume searching is independent of resume parsing and not required to search resumes.
    Oracle Text is the text engine that allows you to search documents using content-based queries. Oracle Text allows you to upload documents, search documents, parse resumes, etc.
    Hence to conclude - Installation of Oracle Text will allow Keyword Searches on resumes.
    Thanks,
    Subrat
