Oracle Text: improve performance

What is the best way to improve performance in Oracle Text?
I read some data from a CSV file and check it against a table (CUSTOMER) to
decide whether an INSERT is needed.
It takes 10 seconds to process only two rows.
How can I improve performance?

What exactly are you doing? Do you have an Oracle Text index on that table, or are you just manipulating text in the table?
Post the code you are running and we can help you further.
Herald ten Dam

Similar Messages

ODF support in Oracle Text 10g R2 version ??

Currently, we are using Oracle Text 10g Release 2 for HTML section searching in our application. We don't have any issues with Microsoft Office 2003 documents.
But when we use OpenOffice documents (ODF), it does not work. It throws the following exception:
java.sql.SQLException: ORA-20000: Oracle Text error:
DRG-11207: user filter command exited with status 1
DRG-11222: Third-party filter does not support this known document format.
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DOC", line 825
ORA-06512: at line 1
We are using the "AUTO_FILTER" filter technology.
Any ideas for solving this issue?

You have to start thinking outside the box at this point -- AUTO_FILTER isn't going to be able to support you natively.
You could file an SR, and let Oracle tell you whether they'd be willing to integrate changes (like new Verity libraries as they are developed) to 10.2.
That assumes that Autonomy (owner of Verity) has improved their support for ODF.
The OpenOffice formats are all xml-based; you could write something custom to extract the text from your openoffice files and submit them to Oracle as straight XML. I've done something similar to support Office 2007 formats.
You could write a custom USER_FILTER (which is essentially the same as custom extraction, but may be an easier place to hook in your custom code).
That's the main reason I suggested moving up to 11g -- none of the other choices have any easy, short-term fix or workaround.

Need Help on Oracle text

Hi Masters,
I am working on Oracle Text. I executed the steps/commands below, and all of them completed successfully, but I have not seen any improvement in my task. I also have one doubt, which I explain below.
create table ent_dnt as select * from entitlement_dnt;
BEGIN
    CTX_DDL.CREATE_PREFERENCE ('oracletext_datastore', 'MULTI_COLUMN_DATASTORE');
    CTX_DDL.SET_ATTRIBUTE
      ('oracletext_datastore', 'COLUMNS',
        'ORDER_NUMBER, GENERIC_PRODUCT_NAME_EXT, ENTITLEMENT_REF_ID, DEVICE_ASSET_ID, DEVICE_UNIQUE_ID, SWSERVICETAG, PRODUCT_DESC');
END;
/
CREATE INDEX idx_oracle_text
   ON Ent_dnt (search_cols)
    INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('DATASTORE oracletext_datastore');
BEGIN
DBMS_STATS.GATHER_TABLE_STATS('EE', 'ent_DNT', cascade=>TRUE);
DBMS_STATS.GATHER_TABLE_STATS('EE', 'upd_DNT', cascade=>TRUE);
dbms_stats.gather_index_stats('EE', 'idx_oracle_text1');
dbms_stats.gather_index_stats('EE', 'idx_oracle_text');
END;
/
The above steps were created/executed successfully. But when I execute my SELECT query, I do not see the Oracle Text index name anywhere in my explain plan.
Plan
SELECT STATEMENT ALL_ROWS Cost: 28,393 Bytes: 49,675 Cardinality: 5
3 SORT AGGREGATE Bytes: 8 Cardinality: 1
2 TABLE ACCESS BY INDEX ROWID TABLE EE.EE_PROD_GRP_ENTITLEMENT Cost: 4 Bytes: 8 Cardinality: 1
1 INDEX RANGE SCAN INDEX EE.IDX_PGE_ENT_ID Cost: 3 Cardinality: 1
5 SORT AGGREGATE Bytes: 8 Cardinality: 1
4 TABLE ACCESS FULL TABLE EE.ENT_DNT Cost: 26,781 Bytes: 8 Cardinality: 1
20 VIEW EE. Cost: 28,393 Bytes: 49,675 Cardinality: 5
19 COUNT STOPKEY
18 VIEW EE. Cost: 28,393 Bytes: 49,610 Cardinality: 5
17 SORT GROUP BY STOPKEY Cost: 28,393 Bytes: 2,295 Cardinality: 5
16 HASH JOIN OUTER Cost: 28,392 Bytes: 2,295 Cardinality: 5
14 NESTED LOOPS OUTER Cost: 28,388 Bytes: 1,808 Cardinality: 4
11 NESTED LOOPS OUTER Cost: 28,384 Bytes: 1,600 Cardinality: 4
8 HASH JOIN Cost: 28,383 Bytes: 1,552 Cardinality: 4
6 TABLE ACCESS FULL TABLE EE.UPD_DNT Cost: 1,089 Bytes: 174 Cardinality: 6
7 TABLE ACCESS FULL TABLE EE.ENT_DNT Cost: 27,292 Bytes: 110,648,108 Cardinality: 308,212
10 TABLE ACCESS BY INDEX ROWID TABLE EE.PRODUCT_LICENSE_PART Cost: 1 Bytes: 12 Cardinality: 1
9 INDEX RANGE SCAN INDEX EE.IDX_PLP_PD_DATA_ID Cost: 0 Cardinality: 1
13 TABLE ACCESS BY INDEX ROWID TABLE EE.PD_KT_DETAILS Cost: 1 Bytes: 52 Cardinality: 1
12 INDEX RANGE SCAN INDEX EE.IDX_PKD_PART_NUM Cost: 0 Cardinality: 1
15 TABLE ACCESS FULL TABLE EE.LEGACY_CONFIG Cost: 3 Bytes: 35 Cardinality: 5
The cost is also very high. And when I ran the query below, I did not see any $ tables.
TEST@orcl_11gR2> SELECT object_name, object_type
FROM   user_objects
WHERE object_name LIKE '%oracle%'
/
The usual DR$ tables ($I, $K, $N, $R, $X) are not created. Where is the problem? Please help me. I have to complete this task.
Regards
AR

Hi Roger,
Thanks a lot for your reply. This is my query. Yes, I did not use a CONTAINS clause in my query, but I don't know how to use it.
SELECT B.*,
       CASE WHEN ISBOUND = 'Y' AND ALLOWRESEND = 'Y' THEN 'Y' ELSE 'N' END
          AS Allowunbind,
       CASE
          WHEN ISBOUND = 'Y' AND IsThisAnUpgrade = 'N' AND Allowresend = 'N'
          THEN
             'Y'
          ELSE
             CASE
                WHEN     ISBOUND = 'N'
                     AND BINDING_TYPE = INITCAP ('TRUSTED')
                     AND ALLOWRESEND = 'N'
                THEN
                   'Y'
                ELSE
                   'N'
             END
       END
          AS AllowBind,
       FNC_GET_GROUPNAME_V3 (B.ENTITLEMENT_ID) GROUP_NAME,
       FNC_GET_USERGROUPNAME_V3 (B.ENTITLEMENT_ID, '[email protected]')
          USER_GROUP_NAME,
       FNC_GET_ROLE_V3 (B.ENTITLEMENT_ID, '[email protected]') ROLE_NAME,
       (SELECT MAX (PGE_IS_ASSIGNED)
          FROM ENT_DNT
         WHERE ENTITLEMENT_ID = B.ENTITLEMENT_ID)
          AS IS_ASSIGNED
FROM (SELECT *
          FROM (SELECT A.*, ROWNUM RNUM
                  FROM (SELECT *
                          FROM (SELECT *
                                  FROM (SELECT DISTINCT
                                               ENTDNT.ORDER_DATE,
                                               ENTDNT.ORDER_NUMBER,
                                               ENTDNT.ENTITLEMENT_ID,
                                               ENTDNT.ENTITLEMENT_REF_ID,
                                               ENTDNT.CUSTOMER_NUM,
                                               ENTDNT.ENTITLEMENT_STATUS_ID,
                                               ENTDNT.ENT_QTY,
                                               ENTDNT.ENTITLEMENTNAME,
                                               ENTDNT.ACT_KEY_LOB_ID,
                                               ENTDNT.LIC_KEY_LOB_ID,
                                               ENTDNT.LICENSE_KEY,
                                               ENTDNT.ENT_TYPE_ID,
                                               ENTDNT.PRODUCT_DATA_ID,
                                               ENTDNT.PRODUCT_NAME,
                                               ENTDNT.TYPE_DIMENSION_EXT,
                                               ENTDNT.BINDING_TYPE,
                                               DECODE (
                                                  ENTDNT.ENT_TYPE_ID,
                                                  1, ENTDNT.PRODUCT_DESC,
                                                  3, ENTDNT.GENERIC_PRODUCT_NAME_EXT)
                                                  AS PRODUCT_DESC,
                                               DECODE (
                                                  ENTDNT.ENT_TYPE_ID,
                                                  3, PKD.PRIMARY_LICENSE_IDENTIFIER,
                                                  2, 'SOFTWARE_SERVICETAG',
                                                  1, 'ENTITLEMENTID',
                                                  NULL)
                                                  AS PRIMARYLICENSEIDENTIFIER,
                                               CASE
                                                  WHEN     DECODE (
                                                             ENTDNT.ENT_TYPE_ID,
                                                              3, DECODE (
                                                                    PKD.KEY_SOURCE_TYPE,
                                                                    'SOURCE_NO_KEY', 'N',
                                                                    'Y'),
                                                              1, 'Y',
                                                              LC.IS_KEY_REQUIRED) =
                                                              'Y'
                                                       AND ENTDNT.ENTITLEMENT_STATUS_ID =
                                                              '0'
                                                       AND (   ENTDNT.LIC_KEY_LOB_ID
                                                                  IS NOT NULL
                                                            OR ENTDNT.LICENSE_KEY
                                                                  IS NOT NULL
                                                            OR ENTDNT.ACT_KEY_LOB_ID
                                                                  IS NOT NULL)
                                                  THEN
                                                     'Y'
                                                  WHEN     ENTDNT.ENTITLEMENT_STATUS_ID =
                                                              '0'
                                                       AND (   ENTDNT.LIC_KEY_LOB_ID
                                                                  IS NOT NULL
                                                            OR ENTDNT.LICENSE_KEY
                                                                  IS NOT NULL)
                                                  THEN
                                                     'Y'
                                                  ELSE
                                                     'N'
                                               END
                                                  AS KEYREQUIRED,
                                               ENTDNT.ISTHISANUPGRADE,
                                               ENTDNT.DEVICE_ASSET_ID,
                                               ENTDNT.SWSERVICETAG,
                                               PKD.PHVALUE,
                                               CASE
                                                  WHEN -- ENTDNT.BINDING_TYPE = 'Trusted'
                                                      ENTDNT.BINDING_TYPE =
                                                          INITCAP ('TRUSTED')
                                                  THEN
                                                     'N'
                                                  WHEN    ENTDNT.BINDING_TYPE =
                                                             INITCAP (
                                                                'COMPONENT')
                                                       -- OR ENTDNT.BINDING_TYPE = 'DeviceID'
                                                       OR ENTDNT.BINDING_TYPE =
                                                             INITCAP (
                                                                'DEVICEID')
                                                       --OR ENTDNT.BINDING_TYPE = 'ServiceTag'
                                                       OR ENTDNT.BINDING_TYPE =
                                                             INITCAP (
                                                                'SERVICETAG')
                                                  THEN
                                                     'Y'
                                                  ELSE
                                                     'N'
                                               END
                                                  AS ISBOUND,
                                               CASE
                                                  WHEN     ENTDNT.ENT_TYPE_ID =
                                                              3
                                                       AND PKD.ALLOW_RESEND =
                                                              'Y'
                                                       AND ENTDNT.ENTITLEMENT_STATUS_ID =
                                                              '0'
                                                       AND (   ENTDNT.LIC_KEY_LOB_ID
                                                                  IS NOT NULL
                                                            OR ENTDNT.LICENSE_KEY
                                                                  IS NOT NULL
                                                            OR ENTDNT.ACT_KEY_LOB_ID
                                                                  IS NOT NULL)
                                                  THEN
                                                     'Y'
                                                  WHEN     ENTDNT.ENTITLEMENT_STATUS_ID =
                                                              '0'
                                                       AND (   ENTDNT.LIC_KEY_LOB_ID
                                                                  IS NOT NULL
                                                            OR ENTDNT.LICENSE_KEY
                                                                  IS NOT NULL)
                                                  THEN
                                                     'Y'
                                                  ELSE
                                                     'N'
                                               END
                                                  AS ALLOWRESEND,
                                               ENTDNT.GENERIC_PRODUCT_NAME_EXT,
                                               PLP.LICENSE_PART_NUMBER
                                                  AS SRVPARTNUMBER,
                                               ENTDNT.DEVICE_UNIQUE_ID,
                                               (SELECT MAX (IS_ASSIGNED)
                                                  FROM EE_PROD_GRP_ENTITLEMENT PGE
                                                 WHERE ENTITLEMENT_ID =
                                                          ENTDNT.ENTITLEMENT_ID)
                                                  AS IS_ASSIGNED,
                                               ENTDNT.SINGLEFILEPERID
                                          FROM ent_dnt ENTDNT,
                                               PD_KT_DETAILS PKD,
                                               PRODUCT_LICENSE_PART PLP,
                                               Legacy_Config LC,
                                               upd_dnt UPDNT
                                         WHERE     ENTDNT.PRODUCT_GROUP_ID =
                                                      UPDNT.PRODUCT_GROUP_ID
                                               AND UPDNT.EMAIL_ADDRESS =
                                                      '[email protected]'
                                               AND ENTDNT.ENT_TYPE_ID =
                                                      LC.ENTITLEMENT_TYPE_ID(+)
                                               AND PLP.PRODUCT_DATA_ID(+) =
                                                      ENTDNT.PRODUCT_DATA_ID
                                               AND PKD.PART_NUMBER(+) =
                                                      ENTDNT.LIC_PART_NUM
                                               AND UPDNT.IS_DELETED = 'N'
                                               AND ENTDNT.ENTITLEMENT_STATUS_ID IN
                                                      (0, 4)
                                               AND ENTDNT.IS_DELETED = 'N')
                                 WHERE    (UPPER (GENERIC_PRODUCT_NAME_EXT) LIKE
                                              '%IDRAC%')
                                       OR (ORDER_NUMBER LIKE '%251608469%')
                                       OR (ENTITLEMENT_REF_ID LIKE '%162523200%')
                                       OR (DEVICE_ASSET_ID LIKE '%162523200%')
                                       OR (DEVICE_UNIQUE_ID LIKE '%162523200%')
                                       OR (SWSERVICETAG LIKE '%162523200%')
                                       OR (UPPER (PRODUCT_DESC) LIKE
                                              '%162523200%'))
                         WHERE    (UPPER (GENERIC_PRODUCT_NAME_EXT) LIKE
                                      '%575757%')
                               OR (ORDER_NUMBER LIKE '%251608469%')
                               OR (ENTITLEMENT_REF_ID LIKE '%162523200%')
                               OR (DEVICE_ASSET_ID LIKE '%162523200%')
                               OR (DEVICE_UNIQUE_ID LIKE '%162523200%')
                               OR (SWSERVICETAG LIKE '%162523200%')
                               OR (UPPER (PRODUCT_DESC) LIKE '%162523200%')) A
                 WHERE ROWNUM <= 100)
         WHERE RNUM >= 1) B;
Yes, you are 100% correct. I made a mistake with the LIKE operator. I have now executed the query below, and it displays all the Oracle Text related tables and indexes. Thank you.
SELECT object_name, object_type
FROM   user_objects
WHERE object_name LIKE '%ORACLE%'
But here I have a problem. How can I run the SELECT below for my query above?
SELECT *
FROM Entitlement_dnt
WHERE CONTAINS (search_cols, REPLACE (:i_OpenSearchText, ',', ' AND ')) > 0
What value should I supply for :i_OpenSearchText? I am confused.
Please help me!
Regards
AR
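
A minimal SQL*Plus sketch of how the bind variable in the query above might be supplied. It assumes the CONTEXT index idx_oracle_text created earlier in this thread exists on ent_dnt (search_cols), which is the table the index was built on rather than Entitlement_dnt. The search terms are simply the values from the LIKE predicates, and joining them with OR instead of AND is an assumption made to mirror the OR logic of the original query.

```sql
-- Declare and populate the bind variable in SQL*Plus.
VARIABLE i_OpenSearchText VARCHAR2(200)
EXEC :i_OpenSearchText := 'IDRAC,251608469,162523200'

-- Commas become OR so that a row matching any term qualifies,
-- as in the original chain of LIKE ... OR ... LIKE predicates.
SELECT order_number, entitlement_ref_id, generic_product_name_ext
FROM   ent_dnt
WHERE  CONTAINS (search_cols, REPLACE (:i_OpenSearchText, ',', ' OR ')) > 0;
```

Note that CONTAINS matches whole tokens, so unlike LIKE '%IDRAC%' it will not find IDRAC embedded inside a longer word unless a wildcard such as %IDRAC% is used in the query term. If the Text index is used, the explain plan should show a DOMAIN INDEX step.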

Performance issue with Oracle Text index

Hi Experts,
We are on Oracle 11.2.0.3 on Solaris 10. I have implemented Oracle Text in our environment and I am facing a strange performance issue.
One SQL statement with a CONTAINS clause takes forever: more than 20 minutes, and it still does not complete. This SQL has a CONTAINS clause, an EXISTS clause and a NOT EXISTS clause.
If I remove the EXISTS and NOT EXISTS clauses, it completes fast, but with those two clauses it just takes forever. It is late at night, so I am not able to post the table and SQL query details now and will do so tomorrow, but based on this general description, are there any pointers for me to review?
sql query doing fine:
SELECT
    U.CLNT_OID, U.USR_OID, S.MAILADDR
FROM
    access_usr U
    INNER JOIN access_sia S
        ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
    WHERE U.CLNT_OID = 'ABCX32S'
    AND CONTAINS(LAST_NAME , 'TO%' ) >0
--sql query that hangs forever:
SELECT
    U.CLNT_OID, U.USR_OID, S.MAILADDR
FROM
    access_usr U
    INNER JOIN access_sia S
        ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
    WHERE U.CLNT_OID = 'ABCX32S'
    AND CONTAINS(LAST_NAME , 'TO%' ) >0
AND EXISTS (/* one clause here with a few table joins */)
AND NOT EXISTS (/* one clause here with a few table joins */);
--Another strange thing I found: if I use 'ZZ%' or 'L1%' instead of 'TO%' in this SQL, it runs fast, but with 'TO%' it is slow with those two EXISTS / NOT EXISTS clauses!
I will be most thankful for the inputs.
OrauserN

Hi Barbara,
First of all, thanks a lot for reviewing the issue.
Unfortunately, changing to empty_stoplist did not work. Today I am posting the entire SQL that has this issue, and I will be most thankful for more insights/pointers on what can be done.
Here is the entire sql:
SELECT U.CLNT_OID,
       U.USR_OID,
       S.EMAILADDRESS,
       U.FIRST_NAME,
       U.LAST_NAME,
       S.JOBCODE,
       S.LOCATION,
       S.DEPARTMENT,
       S.ASSOCIATEID,
       S.ENTERPRISECOMPANYCODE,
       S.EMPLOYEEID,
       S.PAYGROUP,
       S.PRODUCTLOCALE
FROM    ACCESS_USR U
       INNER JOIN
          ACCESS_SIA S
       ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
WHERE     U.CLNT_OID = 'G39NY3D25942TXDA'
       AND EXISTS
              (SELECT 1
                 FROM ACCESS_USR_GROUP_XREF UGX
                      INNER JOIN ACCESS_GROUP RELG
                         ON     RELG.CLNT_OID = UGX.CLNT_OID
                            AND RELG.GROUP_OID = UGX.GROUP_OID
                      INNER JOIN ACCESS_GROUP G
                         ON     G.CLNT_OID = RELG.CLNT_OID
                            AND G.GROUP_TYPE_OID = RELG.GROUP_TYPE_OID
                WHERE     UGX.CLNT_OID = U.CLNT_OID
                      AND UGX.USR_OID = U.USR_OID
                      AND G.GROUP_OID = 920512943
                      AND UGX.INCLUDED = 1)
       AND NOT EXISTS
                  (SELECT 1
                     FROM    ACCESS_USR_GROUP_XREF UGX
                          INNER JOIN
                             ACCESS_GROUP G
                          ON     G.CLNT_OID = UGX.CLNT_OID
                             AND G.GROUP_OID = UGX.GROUP_OID
                    WHERE     UGX.CLNT_OID = U.CLNT_OID
                          AND UGX.USR_OID = U.USR_OID
                          AND G.GROUP_OID = 920512943
                          AND UGX.INCLUDED = 1)
       AND CONTAINS (U.LAST_NAME, 'Bon%') > 0;
Like I said before, if the EXISTS and NOT EXISTS clauses are removed, it runs in under a second. But with those EXISTS and NOT EXISTS clauses it takes anywhere from 25 minutes to more than one hour.
Note also that it was not TO% but Bon% in the CONTAINS clause that is giving the issue; sorry, that was wrong on my part.
Also, please see below the Oracle Text indexes defined on the table ACCESS_USR:
--definition of preferences used in the index:
SET SERVEROUTPUT ON size unlimited
WHENEVER SQLERROR EXIT SQL.SQLCODE
DECLARE
   v_err       VARCHAR2 (1000);
   v_sqlcode   NUMBER;
   v_count     NUMBER;
BEGIN
   ctxsys.ctx_ddl.create_preference ('cust_lexer', 'BASIC_LEXER');
   ctxsys.ctx_ddl.set_attribute ('cust_lexer', 'base_letter', 'YES'); -- removes diacritics
EXCEPTION
   WHEN OTHERS
   THEN
      v_err := SQLERRM;
      v_sqlcode := SQLCODE;
      v_count := INSTR (v_err, 'DRG-10701');
      IF v_count > 0
      THEN
         DBMS_OUTPUT.put_line (
            'The required preference named CUST_LEXER with BASIC LEXER is already set up');
      ELSE
         RAISE;
      END IF;
END;
/
DECLARE
   v_err       VARCHAR2 (1000);
   v_sqlcode   NUMBER;
   v_count     NUMBER;
BEGIN
   ctxsys.ctx_ddl.create_preference ('cust_wl', 'BASIC_WORDLIST');
   ctxsys.ctx_ddl.set_attribute ('cust_wl', 'SUBSTRING_INDEX', 'true'); -- to improve performance
EXCEPTION
   WHEN OTHERS
   THEN
      v_err := SQLERRM;
      v_sqlcode := SQLCODE;
      v_count := INSTR (v_err, 'DRG-10701');
      IF v_count > 0
      THEN
         DBMS_OUTPUT.put_line (
            'The required preference named CUST_WL with BASIC WORDLIST is already set up');
      ELSE
         RAISE;
      END IF;
END;
/
--now below is the code of the index:
CREATE INDEX ACCESS_USR_IDX3 ON ACCESS_USR
(FIRST_NAME)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
CREATE INDEX ACCESS_USR_IDX4 ON ACCESS_USR
(LAST_NAME)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
The strange thing is that, like I said, if I remove the EXISTS clause the query returns very fast. Also, if I modify the query to use only one NOT EXISTS clause and remove the other EXISTS clause, it returns in less than one second. Also, if I remove the EXISTS clause and use only the NOT EXISTS clause, it returns in less than 4 seconds. But with both clauses it runs forever!
When I tried to use dbms_xplan.display_cursor to get the query plan (for the case of both EXISTS and NOT EXISTS clauses in the query), it said that the previous statement's SQL ID was 0 or something like that, so I was not able to see the query plan. I will keep trying to get this plan (it takes 25 minutes to one hour each time, but I will get this info soon). Again, any pointers are most helpful.
Regards
OrauserN
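
One way to see the optimizer's plan without waiting for the statement to finish is EXPLAIN PLAN, which only parses the query. The sketch below uses a shortened form of the query above (the EXISTS / NOT EXISTS subqueries would be pasted in unchanged). It shows an estimated plan, which may differ from the plan actually chosen at run time.

```sql
-- Parse the statement and store its estimated plan; nothing is executed.
EXPLAIN PLAN FOR
SELECT U.CLNT_OID, U.USR_OID, S.EMAILADDRESS
FROM   ACCESS_USR U
       INNER JOIN ACCESS_SIA S
          ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
WHERE  U.CLNT_OID = 'G39NY3D25942TXDA'
       AND CONTAINS (U.LAST_NAME, 'Bon%') > 0;

-- Display the plan that was just stored in PLAN_TABLE.
SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY);
```

In the output, a DOMAIN INDEX step on ACCESS_USR_IDX4 indicates that the Text index is being used. Where that step sits relative to the subquery filters is the thing to compare between the fast and slow variants.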

Should Oracle Text be used here?

Hi,
We are developing a search feature for a bank that has thousands of documents. Each document has a set of free-form comments written by multiple bank officials. The comments are in a table in an Oracle 9i database. The comments can be 10 to 10000 characters long. The actual documents are not available in the database. Only a document identifier is kept in the table containing the comments.
The search engine will support single-word as well as phrase searches. Do you think using Oracle Text is the best approach for such a search facility?
Thanks
Yash
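
For reference, a minimal sketch of what word and phrase searches over such a comments table could look like with a CONTEXT index. The table, column and index names and the search terms are hypothetical, chosen only for illustration.

```sql
-- Hypothetical table: one row per comment, keyed by the document identifier.
CREATE TABLE doc_comments (
   doc_id        NUMBER,
   comment_text  CLOB      -- comments can run to 10000 characters
);

CREATE INDEX doc_comments_txt_idx ON doc_comments (comment_text)
   INDEXTYPE IS CTXSYS.CONTEXT;

-- Single-word search.
SELECT doc_id
FROM   doc_comments
WHERE  CONTAINS (comment_text, 'overdraft') > 0;

-- Phrase search: words with no operator between them are matched as a phrase.
-- The label 1 ties SCORE(1) to this CONTAINS call.
SELECT doc_id, SCORE (1) AS relevance
FROM   doc_comments
WHERE  CONTAINS (comment_text, 'credit limit', 1) > 0
ORDER  BY SCORE (1) DESC;
```

A CONTEXT index is not updated as part of the transaction, so new or changed comments become searchable only after the index has been synchronized.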

Sameer,
I guess I was right about it being personal. I do feel like we're getting somewhere though! DETAILS!!!
This isn't about me, you, or Text. The posts are to help those that need assistance, and for some of us (like me) to gain some exposure to problems I have not run into. You'll see a fair amount of research goes into many people's posts here.
Your earlier write-up telling the person to stay away provided no specifics about your situation. It was alarmist. They might have totally different requirements than you, so it pays to ask follow-up questions to find out if they are going to hit what you did.
I don't mind 'talking crap' about something so long as it is specific and can be addressed, or people can find situations where it should/should not be used by comparing their system to yours. If I tell you that cars suck because they break down, it isn't terribly useful to anyone. If I tell you that the 1982 Pontiac Bonneville's transmission needed to be replaced 13 times since I bought it...that kind of detail might be of some use to someone with that car. This is what I was soliciting from you.
What do we have from your last post:
* 9.2.0.5
* Millions of users per day
* Peak time the box was only 5-10% idle
* The problem was with queries on a CONTEXT index
(rather than the indexing process itself)
* You couldn't use CTXCAT because of application requirements
Two things you said that I will agree with.
1) CONTAINS queries can be costly. There are some ways to improve the performance if you post more detail about your requirements. This is the first time you mentioned that you were using the CONTEXT index instead of CTXCAT...good information. It would indicate that your 'this totally sucks' gut response from earlier is from the perspective of CONTEXT and doesn't extend to the other index types (or am I mistaken?).
2) 10g has improved upon some things. Later patchsets on Release 1, and the newly available Release 2 have a different filter as well. 9i made some major changes from prior releases though, so if 9i is the only option, I'd still take it.
If you are interested in continuing, I'd like to find out how many indexes, the number of documents indexed, the size of the index tables and data tables, and some more about the application. Give us your best shot.
As for hurting the ego...I'll still sleep well tonight. I would just like to see this remain a constructive place for people to post questions and try out solutions. That can't happen if the replies are a blanket 'stay away' without justification or a matching of requirements to problems.
Finally, feel free to post to anything I participate in...just remember that I'm not shy (and neither are you it seems), so if a debate happens it will likely be lively.
-Ron

Oracle Text ALTER INDEX Performance

Greetings,
We have encountered some enhancement issues with Oracle Text and really need assistance.
We are using Oracle 9i (Release 9.0.1) Standard Edition.
We are using a very simple Oracle Text environment, with the CTXSYS.CONTEXT indextype on domain indexes.
We have indexed two text columns in one table; one of these columns is a CLOB.
Currently, if one of these columns is modified, we use a trigger to automatically ALTER the index.
This is very slow; it is just like dropping the index and creating it again.
Is this right? Should it be this slow?
We are also trying to use the ONLINE parameter for ALTER INDEX and CREATE INDEX, but it gives an error saying this feature is not enabled.
How can we enable it?
Is there any way to improve the performance of this automatic update of the indexes?
Would using a trigger be the best way to do this?
How can we optimize it to a more satisfactory performance level?
Also, are we able to use the language lexers for indexes with the Standard Edition? If so, how do you enable CTX_DDL?
Many thanks for any assistance.
Chi-Shyan Wang

If you are going to sync your index on every update, you need to make sure that you are optimizing it on a regular basis to remove index fragmentation and deleted rows.
You can set up a dbms_job to call ctx_ddl.optimize_index and run a full optimize periodically.
Also, depending on the number of rows you have and the size of the data, you might want to look at using a CTXCAT index, which is transactional, stays in sync automatically and does not need to be optimized. CTXCAT indexes do not work well on large text objects (they are good for a couple of lines of text at most), so they may not suit your dataset.
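
The sync-and-optimize advice above can be sketched roughly as follows. The index name MY_TEXT_IDX and the nightly 02:00 schedule are assumptions for illustration only. A sync processes just the pending changes, which is the usual alternative to altering the whole index from a trigger.

```sql
-- Process pending inserts/updates/deletes for the index.
-- This is incremental, unlike dropping and recreating the index.
BEGIN
   CTX_DDL.SYNC_INDEX ('MY_TEXT_IDX');
END;
/

-- Schedule a full optimize every night at 02:00 to defragment the index
-- and purge entries that belong to deleted rows.
DECLARE
   l_job   BINARY_INTEGER;
BEGIN
   DBMS_JOB.SUBMIT (
      job         => l_job,
      what        => 'CTX_DDL.OPTIMIZE_INDEX(''MY_TEXT_IDX'', ''FULL'');',
      next_date   => TRUNC (SYSDATE) + 1 + 2 / 24,
      interval    => 'TRUNC(SYSDATE) + 1 + 2/24');
   COMMIT;
END;
/
```

The job owner needs execute rights on CTX_DDL granted directly (not only through a role) for the scheduled call to work.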

Oracle Text Storage Issue

Hi Everyone,
My name is John and I have 3 small queries for which your expertise and assistance are greatly needed and appreciated.
I'm currently using Oracle Text on an Oracle 10g Enterprise Edition Release 10.2.0.2.0 database and experiencing a storage space problem. I have a table with 2 BLOB columns. One of the columns stores the TIF image file and the other stores the TIF's OCR version in PDF format. We are indexing the PDF column for rapid text retrieval. As we load documents into the table, the index and table tablespaces are used up very rapidly. I created my context index storage using the statements below:
ctx_ddl.create_preference('OCR_DOC_OCR_CONTENT_I_STORAGE','BASIC_STORAGE');
ctx_ddl.set_attribute('OCR_DOC_OCR_CONTENT_I_STORAGE','I_INDEX_CLAUSE',
'tablespace TS_OCR_IDX_LGE compress 2');
ctx_ddl.set_attribute('OCR_DOC_OCR_CONTENT_I_STORAGE','I_TABLE_CLAUSE',
'tablespace TS_OCR_IDX_LGE');
ctx_ddl.set_attribute('OCR_DOC_OCR_CONTENT_I_STORAGE','K_TABLE_CLAUSE',
'tablespace TS_OCR_IDX_LGE');
ctx_ddl.set_attribute('OCR_DOC_OCR_CONTENT_I_STORAGE','N_TABLE_CLAUSE',
'tablespace TS_OCR_IDX_LGE');
ctx_ddl.set_attribute('OCR_DOC_OCR_CONTENT_I_STORAGE','P_TABLE_CLAUSE',
'tablespace TS_OCR_IDX_LGE');
ctx_ddl.set_attribute('OCR_DOC_OCR_CONTENT_I_STORAGE','R_TABLE_CLAUSE',
'tablespace TS_OCR_IDX_LGE lob (data) store as (cache)');
I created my table using the following command:
create table OCR_DOCUMENT (
DOC_ID number
,DOC_NAME varchar2(255)
,DOC_DIRECTORY varchar2(255)
,DOC_EXTENSION varchar2(10)
,DOC_CONTENT blob
,OCR_EXTENSION varchar2(10)
,OCR_CONTENT blob
,HAS_BLOB varchar2(1)
,CREATED_DATETIME date
,FILE_NAME VARCHAR2(2000)
,DW_DOC_ID NUMBER
,PAGE_NO NUMBER
,DOC_TYPE VARCHAR2(100)
,DOC_CLASS VARCHAR2(100)
,DOC_DESCRIPTION VARCHAR2(2000)
,PAGES NUMBER(10)
,CLT_NUMBER NUMBER(10)
,TAXENT_NUMBER NUMBER(10)
,REG_DATE DATE
,TAX_YEAR VARCHAR2(20)
,ORIG_FILE_NAME VARCHAR2(2000)
)
tablespace TS_OCR_TBL_LGE
pctfree 5 initrans 2 maxtrans 255
nologging noparallel;
My first question: is there anything wrong with my storage clauses, and can I improve them to save some additional space?
My second question: is there a way to compress the table BLOB columns, i.e. DOC_CONTENT and OCR_CONTENT, to save space without affecting document service retrieval?
At the beginning of the project I used utl_compress.lz_compress to compress the BLOB content before storing it in the table, but I soon ditched the idea after finding that when I retrieved the compressed BLOB content using ctx_doc.markup for the highlight document service (to highlight the text I searched for), it displayed garbage text, and I could not find any workaround.
Also, if we are prepared NOT to use the THEME and GIST features of Oracle Text, can I remove them to save some additional space? Any feedback on how I can save space would be welcome and appreciated. Have a nice day.
Thanks and Regards,
John

The BEST solution to your problem is to move to 11g Release 1.
I am not sure how feasible that is on your part, but 11gR1 has exactly the capabilities you are looking for.
You can compress and deduplicate all the LOB columns (with the SECUREFILE clause) in all the tables, including the internal index tables ($R etc.) and the base table (OCR_DOCUMENT).
This is just for your information.
I don't really have any other information to share with you to resolve your problem :(
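
As a rough illustration of the SECUREFILE suggestion above, the base-table BLOB columns could be moved to compressed SecureFile storage along the following lines. This is a sketch under assumptions, not a tested migration: it assumes 11g or later, an ASSM tablespace, and a licence for the Advanced Compression option (which COMPRESS and DEDUPLICATE require). The COMPRESS MEDIUM level is an arbitrary choice; the table, column and tablespace names come from the question.

```sql
-- Sketch only: assumes Oracle 11g+, an ASSM tablespace and the
-- Advanced Compression option. COMPRESS MEDIUM is an illustrative choice.
ALTER TABLE ocr_document
  MOVE LOB (doc_content)
  STORE AS SECUREFILE (TABLESPACE ts_ocr_tbl_lge COMPRESS MEDIUM DEDUPLICATE);

ALTER TABLE ocr_document
  MOVE LOB (ocr_content)
  STORE AS SECUREFILE (TABLESPACE ts_ocr_tbl_lge COMPRESS MEDIUM DEDUPLICATE);
```

SecureFile compression is transparent to readers of the LOB, so document services such as ctx_doc.markup see the uncompressed content, unlike data compressed by hand with utl_compress. Moving a LOB rewrites the table segment, so expect to rebuild the indexes on the table afterwards.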

Installing Oracle Text on a running Oracle 9.2 DB on AIX

A third party set up and configured our Oracle installation some years ago. I have a requirement for using Oracle text to speed up searches (although of course I need to trial it to see if speed improvements justify additional storage, etc).
My problem is that I just don't know where to begin the install. We are patched to 9.2.0.6.0 which was done by another third party on our behalf.
I have test databases but only one, live Oracle installation.
Any help or advice would be appreciated.
Thanks

The problem is, it looks like this was not installed:
SQL> select comp_name, version, status from dba_registry;
COMP_NAME                      VERSION                        STATUS
Oracle9i Catalog Views         9.2.0.6.0                      VALID
Oracle9i Packages and Types    9.2.0.6.0                      VALID
JServer JAVA Virtual Machine   9.2.0.6.0                      VALID
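
A narrower check for the same question, offered as a sketch: Oracle Text registers itself in dba_registry under the component id CONTEXT and is owned by the CTXSYS schema, so the two queries below (run as a DBA user) should each return a row if the component is installed.

```sql
-- No row returned here suggests Oracle Text is not installed.
select comp_id, comp_name, version, status
from dba_registry
where comp_id = 'CONTEXT';

-- The CTXSYS schema owns Oracle Text; its absence points the same way.
select username, account_status
from dba_users
where username = 'CTXSYS';
```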

About index memory parameter for Oracle text indexes

Hi Experts,
I am on Oracle 11.2.0.3 on Linux and have implemented Oracle Text. I am not an expert in this subject and need help about one issue. I created Oracle Text indexes with default setting. However in an oracle white paper I read that the default setting may not be right. Here is the excerpt from the white paper by Roger Ford:
URL:http://www.oracle.com/technetwork/database/enterprise-edition/index-maintenance-089308.html
"(Part of this white paper below....)
Index Memory
As mentioned above, cached $I entries are flushed to disk each time the indexing memory is exhausted. The default index memory at installation is a mere 12MB, which is very low. Users can specify up to 50MB at index creation time, but this is still pretty low.
This would be done by a CREATE INDEX statement something like:
CREATE INDEX myindex ON mytable(mycol) INDEXTYPE IS ctxsys.context PARAMETERS ('index memory 50M');
To allow index memory settings above 50MB, the CTXSYS user must first increase the value of the MAX_INDEX_MEMORY parameter, like this:
begin ctx_adm.set_parameter('max_index_memory', '500M'); end;
The setting for index memory should never be so high as to cause paging, as this will have a serious effect on indexing speed. On smaller dedicated systems, it is sometimes advantageous to temporarily decrease the amount of memory consumed by the Oracle SGA (for example by decreasing DB_CACHE_SIZE and/or SHARED_POOL_SIZE) during the index creation process. Once the index has been created, the SGA size can be increased again to improve query performance."
(End here from the white paper excerpt)
My question is:
1) Applying this procedure (ctx_adm.set_parameter) required me to log in as the CTXSYS user. Is that right, or can it be avoided and done from the application schema? The CTXSYS user is locked by default and I had to unlock it. Is that OK to do in production?
2) What value should I use for max_index_memory? Should it be 500MB? My SGA is 2GB in Dev/QA and 3GB in production. Also, what value should I set for the index memory parameter at index creation? I left it at the default, so how should I change it now? Should it be 50MB as shown in the example above?
3) The white paper also refers to rebuilding an index at some interval, like once a month:   ALTER INDEX DR$index_name$X REBUILD ONLINE;
--Is this correct advice? I would like to ask the experts before doing that. We are on Oracle 11g and the white paper was written in 2003.
Basically while I read the paper, I am still not very clear on several aspects and need help to understand this.
Thanks,
OrauserN

Perhaps it's time I updated that paper
1. To change max_index_memory you must be a DBA user OR ctxsys. As you say, the ctxsys account is locked by default. It's usually easiest to log in as a DBA and run something like
exec ctxsys.ctx_adm.set_parameter('MAX_INDEX_MEMORY', '10G')
2. Index memory is allocated from PGA memory, not SGA memory, so the size of the SGA is not relevant. If you use too high a setting your index build may fail with an error saying you have exceeded PGA_AGGREGATE_LIMIT. Of course, you can increase that parameter if necessary. Also be aware that when indexing in parallel, each parallel process will allocate up to the index memory setting.
What should it be set to? It's really a "safety" setting to prevent users grabbing too much machine memory when creating indexes. If you don't have ad-hoc users, then just set it as high as you need. In 10.1 it was limited to just under 500M, in 10.2 you can set it to any value.
The actual amount of memory used is not governed by this parameter, but by the MEMORY setting in the parameters clause of the CREATE INDEX statement. eg:
create index fooindex on foo(bar) indextype is ctxsys.context parameters ('memory 1G')
What's a good number to use for memory? Somewhere in the region of 100M to 200M is usually good.
3. No - that's out of date. To optimize your index use CTX_DDL.OPTIMIZE_INDEX. You can do that in FULL mode daily or weekly, and REBUILD mode perhaps once a month.
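
Put together, the three points above might look like the following in SQL*Plus. This is a sketch only: the 1G ceiling and 200M build memory are illustrative values rather than recommendations, and fooindex, foo and bar are the placeholder names from the reply.

```sql
-- Run as a DBA user. All sizes below are illustrative assumptions.

-- 1. Inspect the current ceiling and default.
select par_name, par_value
from ctx_parameters
where par_name in ('MAX_INDEX_MEMORY', 'DEFAULT_INDEX_MEMORY');

-- 2. Raise the ceiling that index builds are allowed to request.
exec ctxsys.ctx_adm.set_parameter('MAX_INDEX_MEMORY', '1G')

-- 3. Request memory for one specific build (taken from PGA, per parallel process).
create index fooindex on foo(bar)
  indextype is ctxsys.context
  parameters ('memory 200M');

-- 4. Routine maintenance: FULL daily or weekly, REBUILD perhaps monthly.
exec ctx_ddl.optimize_index('FOOINDEX', 'FULL')
exec ctx_ddl.optimize_index('FOOINDEX', 'REBUILD')
```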

How do I get Oracle Text to index files on a file server?

I am new to Oracle (I'm a MS-SQL DBA looking for a Full-Text Search solution that is better than linking to a MS index server.)
So - Here's the objective:
I have Oracle Server(Express) installed on a Windows server.
I would like for Oracle to build a Full-Text Catalog of the files on a separate file server based on file paths in a table in the database.
(No desire to store terabytes of images and documents inside the database)
I can get Oracle text up and running, using the URL_Datastore:
CREATE TABLE files (id NUMBER PRIMARY KEY, issue_id NUMBER, path VARCHAR(255) UNIQUE, ot_format VARCHAR(6), ot_version VARCHAR(10));
The Compaq server is a remote windows server on my local workgroup, so the fully qualified path is just "compaq" and the URL is valid:
INSERT INTO files VALUES (9,9,'file://Compaq/FTQ/00000003.pdf',NULL,NULL);
INSERT INTO files VALUES (13,13,'file://Compaq/FTQ/01.txt',NULL,NULL);
CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
PARAMETERS ('datastore ctxsys.URL_DATASTORE format column ot_format');
but when I enter:
Select * from CTX_User_Index_errors, I see the following errors:
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/00000003.pdf
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/01.txt
Did I miss something?
Do I need to install anything on the file server?
I would like to convince my company that Oracle can be much quicker than Microsoft's Indexing Service because it can avoid joining two large result sets (one result set from Full_text (indexing service) and one for specific data contained in fields in the MS-SQL database.) Full Text Searches commonly take 40 - 60 seconds where there are 1.5 million multi-page PDF files for a particular set that I sample search on. Without this massive join, I believe I can get the search to run in under 10 seconds.

Thank you!
File_Datastore worked fine.
I was staying away from File_Datastore because the information I gathered from googling suggested that file_datastore would only work locally.
Now I just have to get Oracle to pull data out of tables in a MS-SQL database on the local network (don't have a clue yet), and then have it index compiled file paths.
Then MS-SQL can query Oracle with index and full-text criteria and Oracle can send back a result set
It may sound like a bad way of performing Full-Text Queries, but anything will be better than the way things are currently running. We are currently performing Full Text Searches on a table that is rebuilt nightly, so the table containing millions of file paths is not live..
It would be so much better if we just migrated to Oracle, but we currently do not have the resources.
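
For readers landing on this thread, the FILE_DATASTORE variant that worked would look roughly like the sketch below. It reuses the files table and file_index name from the question. The path column is assumed to hold operating system file paths (rather than file:// URLs) that the database server's OS account can read; the exact path format used by the poster is not given in the thread.

```sql
-- Sketch only: assumes files.path now holds OS file paths readable
-- by the account the Oracle database service runs under.
DROP INDEX file_index;

CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
  PARAMETERS ('datastore ctxsys.file_datastore format column ot_format');

-- Any documents that could not be opened or filtered are listed here.
SELECT err_timestamp, err_textkey, err_text
FROM ctx_user_index_errors
WHERE err_index_name = 'FILE_INDEX';
```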

Error while running the Oracle Text optimize index procedure (even as a dba user too)

Hi Experts,
I am on Oracle 11.2.0.2 on Linux and have implemented Oracle Text. My Oracle Text indexes are fragmented, but I get an error when running the optimize_index procedure. The error is as follows:
begin
ctx_ddl.optimize_index(idx_name=>'ACCESS_T1',optlevel=>'FULL');
end;
ERROR at line 1:
ORA-20000: Oracle Text error:
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DDL", line 941
ORA-06512: at line 1
I then tried to run this as a DBA user too, and it failed the same way!
begin
ctx_ddl.optimize_index(idx_name=>'BVSCH1.ACCESS_T1',optlevel=>'FULL');
end;
ERROR at line 1:
ORA-20000: Oracle Text error:
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DDL", line 941
ORA-06512: at line 1
The CTXAPP role is granted to my schema and I still get this error. I would be thankful for any suggestions.
Also one other important observation: We have this issue ONLY in one database and in the other two databases, I don't see any problem at all.
I am unable to figure out what the issue is with this one database!
Thanks,
OrauserN

How about checking the following?
Bug 10626728 - CTX_DDL.optimize_index "full" fails with an empty ORA-20000 since 11.2.0.2 upgrade (DOCID 10626728.8)

Getting error while importing schema with ORACLE TEXT

IMP-00003: ORACLE error 20000 encountered
ORA-20000: Oracle Text error:
DRG-52204: error while registering index
DRG-10507: duplicate index name: WORKORDER_Q, owner: SYS
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.DRIIMP", line 115
ORA-06512: at line 2
IMP-00088: Problem importing metadata for index WORKORDER_Q. Index creation will be skipped
Database version - Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
Os version - Linux nlxs1012.slb.atosorigin-asp.com 2.6.18-308.el5 #1 SMP Fri Jan 27 17:17:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
We have taken an export of the schema from the production DB and are now importing the data into the QA environment.
The import fails with the above error.

I am importing objects from P20_MAXIMO to Q25_MAXIMO in another database.
Below is the import par file:
USERID='/ as sysdba'
FILE=exp_P20_MAXIMO_C2364781.dmp
LOG=imp_P20_MAXIMO__Q25_MAXIMO_C2364781_1.log
FROMUSER=P20_MAXIMO
TOUSER=Q25_MAXIMO
buffer=1000000
feedback=100000
Export parfile
userid='/ as sysdba'
owner=P20_MAXIMO
FILE=exp_P20_MAXIMO_C2364781.dmp
LOG=exp_P20_MAXIMO_C2364781.log
buffer=10000000
feedback=100000
statistics=none

Pre-loading Oracle text in memory with Oracle 12c

There is a white paper from Roger Ford that explains how to load the Oracle index in memory : http://www.oracle.com/technetwork/database/enterprise-edition/mem-load-082296.html
In our application, on Oracle 12c, we are indexing a big XML field (stored as XMLType with SecureFile storage) with the PATH_SECTION_GROUP. If I don't load the $I table (DR$..$I) into memory using the technique explained in the white paper, I cannot get decent performance (and especially not predictable performance: it looks like performance can fall sharply if the blocks of the TOKEN_INFO column are not in memory).
But after migrating to Oracle 12c, I got a different problem, which I can reproduce: when I create the index it is relatively small (as seen with ctx_report.index_size), and by applying the technique from the white paper I can pin the DR$ I table in memory. But as soon as I run ctx_ddl.optimize_index('Index','REBUILD') the index becomes much bigger and I can't pin it in memory. I am not sure whether it is a bug or not.
The workaround I found is to build the index with the following storage options:
ctx_ddl.create_preference('TEST_STO','BASIC_STORAGE');
ctx_ddl.set_attribute ('TEST_STO', 'BIG_IO', 'YES' );
ctx_ddl.set_attribute ('TEST_STO', 'SEPARATE_OFFSETS', 'NO' );
so that the token_info column is stored in a SecureFile. Then I can change the storage of that column to put it in the keep buffer cache, and write a procedure to read the LOB so that it is loaded into the keep cache. The size of the LOB column is more or less the same as when creating the index without the BIG_IO option, but it remains constant even after a ctx_ddl.optimize_index. The procedure to read the LOB and load it into the cache is very similar to the loaddollarR procedure from the white paper.
Because of the SDATA section, there is a new DR table (the $S table) and an IOT on top of it. This is not documented in the white paper (which was written for Oracle 10g). In my case this DR$ S table is heavily used, and so is the IOT, but putting it in the keep cache is not as important as the token_info column of the DR$ I table. A final note: setting SEPARATE_OFFSETS = 'YES' was very bad in my case; the combined size of the two columns is much bigger than having only the TOKEN_INFO column, and both columns are read.
Here is an example of how to reproduce the problem of the size increasing after ctx_ddl.optimize_index:
1. create the table
drop table test;
CREATE TABLE test
(ID NUMBER(9,0) NOT NULL ENABLE,
XML_DATA XMLTYPE
)
XMLTYPE COLUMN XML_DATA STORE AS SECUREFILE BINARY XML (tablespace users disable storage in row);
2. insert a few records
insert into test values(1,'<Book><TITLE>Tale of Two Cities</TITLE>It was the best of times.<Author NAME="Charles Dickens"> Born in England in the town, Stratford_Upon_Avon </Author></Book>');
insert into test values(2,'<BOOK><TITLE>The House of Mirth</TITLE>Written in 1905<Author NAME="Edith Wharton"> Wharton was born to George Frederic Jones and Lucretia Stevens Rhinelander in New York City.</Author></BOOK>');
insert into test values(3,'<BOOK><TITLE>Age of innocence</TITLE>She got a prize for it.<Author NAME="Edith Wharton"> Wharton was born to George Frederic Jones and Lucretia Stevens Rhinelander in New York City.</Author></BOOK>');
3. create the text index
drop index i_test;
exec ctx_ddl.create_section_group('TEST_SGP','PATH_SECTION_GROUP');
begin
CTX_DDL.ADD_SDATA_SECTION(group_name => 'TEST_SGP',
                            section_name => 'SData_02',
                            tag => 'SData_02',
                            datatype => 'varchar2');
end;
/
exec ctx_ddl.create_preference('TEST_STO','BASIC_STORAGE');
exec ctx_ddl.set_attribute('TEST_STO','I_TABLE_CLAUSE','tablespace USERS storage (initial 64K)');
exec ctx_ddl.set_attribute('TEST_STO','I_INDEX_CLAUSE','tablespace USERS storage (initial 64K) compress 2');
exec ctx_ddl.set_attribute ('TEST_STO', 'BIG_IO', 'NO' );
exec ctx_ddl.set_attribute ('TEST_STO', 'SEPARATE_OFFSETS', 'NO' );
create index I_TEST
on TEST (XML_DATA)
indextype is ctxsys.context
parameters('
    section group   "TEST_SGP"
    storage         "TEST_STO"
') parallel 2;
4. check the index size
select ctx_report.index_size('I_TEST') from dual;
it says :
TOTALS FOR INDEX TEST.I_TEST
TOTAL BLOCKS ALLOCATED:                                                104
TOTAL BLOCKS USED:                                                      72
TOTAL BYTES ALLOCATED:                                 851,968 (832.00 KB)
TOTAL BYTES USED:                                      589,824 (576.00 KB)
5. optimize the index
exec ctx_ddl.optimize_index('I_TEST','REBUILD');
and now recompute the size; it says
TOTALS FOR INDEX TEST.I_TEST
TOTAL BLOCKS ALLOCATED:                                               1112
TOTAL BLOCKS USED:                                                    1080
TOTAL BYTES ALLOCATED:                                 9,109,504 (8.69 MB)
TOTAL BYTES USED:                                      8,847,360 (8.44 MB)
which shows that it went from 576KB to 8.44MB. With a big index the relative difference is not as large, but it still went from 14G to 19G.
6. Workaround: use the BIG_IO option, so that the token_info column of the DR$ I table is stored in a SecureFile and the size stays relatively small. Then you can load this column into the cache using something similar to the following:
alter table DR$I_TEST$I storage (buffer_pool keep);
alter table dr$i_test$i modify lob(token_info) (cache storage (buffer_pool keep));
rem: now we must read the lob so that it is loaded into the keep buffer pool; use the procedure below
create or replace procedure loadTokenInfo is
  type c_type is ref cursor;
  c2   c_type;
  s    varchar2(2000);
  b    blob;
  buff raw(100);   -- a BLOB is read into a RAW buffer
  siz  number;
  off  number;
  cntr number;
begin
    s := 'select token_info from DR$i_test$I';
    open c2 for s;
    loop
       fetch c2 into b;
       exit when c2%notfound;
       siz := 10;    -- a few bytes per chunk are enough to touch the block
       off := 1;
       cntr := 0;
       if dbms_lob.getlength(b) > 0 then
         begin
           -- step through the LOB in 4K strides until read raises no_data_found
           loop
             dbms_lob.read(b, siz, off, buff);
             cntr := cntr + 1;
             off := off + 4096;
           end loop;
         exception when no_data_found then
           if cntr > 0 then
             dbms_output.put_line('4K chunks fetched: '||cntr);
           end if;
         end;
       end if;
    end loop;
    close c2;
end;
/
Rgds, Pierre

I have been working a lot on that issue recently, I can give some more info.
First I totally agree with you, I don't like to use the keep_pool and I would love to avoid it. On the other hand, we have a specific use case : 90% of the activity in the DB is done by queuing and dbms_scheduler jobs where response time does not matter. All those processes are probably filling the buffer cache. We have a customer facing application that uses the text index to search the database : performance is critical for them.
What kind of performance do you have with your application ?
In my case, I have learned the hard way that having the index in memory (the DR$I table, in fact) is the key: if it is not, then performance is poor. I find it reasonable to pin the DR$I table in memory, and if you look at competitors this is what they do. MongoDB explicitly says that the index must be in memory. Elasticsearch uses JVMs that are also in memory. And effectively, if you look at the AWR report, you will see that Oracle is continuously accessing the DR$I table; there is a SQL statement similar to
SELECT /*+ DYNAMIC_SAMPLING(0) INDEX(i) */
TOKEN_FIRST, TOKEN_LAST, TOKEN_COUNT, ROWID
FROM DR$idxname$I
WHERE TOKEN_TEXT = :word AND TOKEN_TYPE = :wtype
ORDER BY TOKEN_TEXT, TOKEN_TYPE, TOKEN_FIRST
which is continuously done.
I think that the algorithm used by Oracle to keep blocks in cache is too complex. I just realized that in 12.1.0.2 (released last week) there is finally a "killer" functionality, the in-memory parameters, with which you can pin tables or columns in memory with compression, etc. This looks ideal for the text index; I hope that R. Ford will finally update his white paper :-)
But my other problem was that optimize_index in REBUILD mode caused the DR$I table to double in size. It seems crazy that this was closed as not a bug, but it was and I can't do anything about it. It is a bug in my opinion, because the create index command and the "alter index rebuild" command both result in a much smaller index, so why would the people who developed the optimize function (is it another team, using another algorithm?) make the index twice as big?
For that, the track I have been following is to put the index in a 16K tablespace: in this case the space used by the index remains more or less flat (it increases, but much more reasonably). The difficulty here is pinning the index in memory, because R. Ford's trick no longer worked.
What worked:
First set the keep pool to zero and set db_16k_cache_size instead. Then change the storage preference to make sure that everything you want to cache (mostly the DR$I table) goes into the tablespace with the non-standard block size of 16K.
Then comes the tricky part: pre-loading the data into the buffer cache. The problem is that in Oracle 12c, Oracle uses direct path reads for full table scans, which basically means that it bypasses the cache and reads directly from file into the PGA!!! There is an event to avoid that; I was lucky to find it on a blog (I can't remember which, sorry for the missing credit).
I ended up doing the following. Event 10949 is set to avoid the direct path read issue.
alter session set events '10949 trace name context forever, level 1';
alter table DR#idxname0001$I cache;
alter table DR#idxname0002$I cache;
alter table DR#idxname0003$I cache;
SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0001$I ITAB;
SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0002$I ITAB;
SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0003$I ITAB;
SELECT /*+ INDEX(ITAB) CACHE(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0001$I ITAB;
SELECT /*+ INDEX(ITAB) CACHE(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0002$I ITAB;
SELECT /*+ INDEX(ITAB) CACHE(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0003$I ITAB;
It worked. With a big relief I expected to take some time out, but there was a last surprise. The command
exec ctx_ddl.optimize_index(idx_name=>'idxname',part_name=>'partname',optlevel=>'REBUILD');
gave the following
ERROR at line 1:
ORA-20000: Oracle Text error:
DRG-50857: oracle error in drftoptrebxch
ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DDL", line 1141
ORA-06512: at line 1
This is almost exactly what is described in Metalink note 1645634.1, but there for a non-partitioned index. The workaround given seemed very logical, but it did not work for a partitioned index. After experimenting, I found that the bug occurs when the partitioned index is created with the dbms_pclxutil.build_part_index procedure (which enables intra-partition parallelism in the index creation process). This is a very annoying and stupid bug; maybe there is a workaround, but I did not find one on Metalink.
Other points of attention with text index creation (things that surprised me at first!):
- if you use the dbms_pclxutil package, then the ctx_output logging does not work, because the index is created immediately and then populated in the background via dbms_jobs.
- this in combination with the fact that if you are on a RAC, you won't see any activity on the box can be very frightening : this is because oracle can choose to start the workers on the other node.
I now understand much better how text indexing works, and I think it is a great technology which can scale via partitioning. But as always the design of the application is crucial; most of our problems come from the fact that we did not choose the right sectioning (we chose PATH_SECTION_GROUP while XML_SECTION_GROUP is so much better IMO). Maybe later I can convince the devs to change the sectioning, especially because SDATA and MDATA sections are not supported with PATH_SECTION_GROUP (although it seems to work, even though we had one occurrence of a bad result linked to the existence of SDATA in the index definition). Also, the whole problem of mixed structured/unstructured searches is completely tackled if one uses XML_SECTION_GROUP with MDATA/SDATA (but of course the app was written for Oracle 10...)
Regards, Pierre

Suggestion: Oracle text CONTEXT index on one or more columns ?

Hi,
I'm implementing Oracle text using CONTEXT ..... and would like to ask you for performance suggestion ...
I have a table of Articles .... with columns .. TITLE, SUBTITLE , BODY ...
Now is it better from performance point of view to move all three columns into one dummy column ... with name like FULLTEXT ... and put index on this single column,
and then use CONTAINS(FULLTEXT,'...')>0
Or is it almost the same for oracle if i put indexes on all three columns and then call:
CONTAINS(TITLE,'...')>0 OR CONTAINS(SUBTITLE,'...')>0 OR CONTAINS(BODY,'...')>0
I actually don't care if the result is a match in TITLE OR SUBTITLE OR BODY ....
So if I move everything into a FULLTEXT column, I have duplicate data in each article row ... but if I create an index on each column, then Oracle has 2x more to index, optimize and search ... am I right?
Table has 1.8mil records ...
Thank you.
Kris

mackrispi wrote:
Now is it better from performance point of view to move all three columns into one dummy column ... with name like FULLTEXT ... and put index on this single column,
and then use CONTAINS(FULLTEXT,'...')>0
What version of Oracle are you on? If 11, then you could use a virtual column to do this; otherwise you'd have to write code to maintain the column, which can get messy.
mackrispi wrote:
Or is it almost the same for oracle if i put indexes on all three columns and then call:
CONTAINS(TITLE,'...')>0 OR CONTAINS(SUBTITLE,'...')>0 OR CONTAINS(BODY,'...')>0
Benchmark it and find out :)
Another option would be something like this.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9455353124561
Were I you, I would try out those 3 approaches, see which meets your performance requirements, and weigh that against the ease of implementation and administration.
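
One more candidate worth including in such a benchmark, not spelled out in the replies above, is Oracle Text's MULTI_COLUMN_DATASTORE, which builds a single index over several columns without a duplicated FULLTEXT column. The sketch below is a minimal example under assumptions: the table name ARTICLES, its ID column, and the preference and index names are all hypothetical.

```sql
-- Sketch only: ARTICLES, its ID column, ARTICLES_DS and ARTICLES_FT_IDX
-- are hypothetical names chosen for illustration.
begin
  ctx_ddl.create_preference('ARTICLES_DS', 'MULTI_COLUMN_DATASTORE');
  ctx_ddl.set_attribute('ARTICLES_DS', 'COLUMNS', 'title, subtitle, body');
end;
/

-- The index is declared on one column but covers all three.
create index articles_ft_idx on articles(title)
  indextype is ctxsys.context
  parameters ('datastore ARTICLES_DS');

-- A single CONTAINS then matches text from any of the three columns.
select id from articles where contains(title, 'oracle') > 0;
```

One design point to keep in mind: a row is only queued for re-indexing when the column the index is declared on (TITLE here) is updated, so changes to SUBTITLE or BODY alone need that column touched as well.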

ERROR at line 1: ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine ORA-20000: Oracle Text error: DRG-10700: preference does not exist: global_lexer ORA-06512: at "CTXSYS.DRUE", line 160 ORA-06512: at "CTXSYS.TEXTINDEXMETHODS", line 366

database version 11.2.0.4
rac two node
CREATE INDEX MAXIMO.ACTCI_NDX3 ON MAXIMO.ACTCI
(DESCRIPTION)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('lexer global_lexer language column LANGCODE')
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-10700: preference does not exist: global_lexer
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.TEXTINDEXMETHODS", line 366

Like the error message says, you don't have a global_lexer. So, you need to create a global_lexer and that lexer must have at least a default sub_lexer, then you can use that global_lexer in your index parameters. Please see the demonstration below, including reproduction of the error and solution.
SCOTT@orcl12c> -- reproduction of problem:
SCOTT@orcl12c> CREATE TABLE actci
2    (description VARCHAR2(60),
3      langcode     VARCHAR2(30))
4 /
Table created.
SCOTT@orcl12c> CREATE INDEX ACTCI_NDX3 ON ACTCI (DESCRIPTION)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS('lexer global_lexer language column LANGCODE')
4 /
CREATE INDEX ACTCI_NDX3 ON ACTCI (DESCRIPTION)
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-10700: preference does not exist: global_lexer
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.TEXTINDEXMETHODS", line 366
SCOTT@orcl12c> -- solution:
SCOTT@orcl12c> DROP INDEX actci_ndx3
2 /
Index dropped.
SCOTT@orcl12c> BEGIN
2    CTX_DDL.CREATE_PREFERENCE ('global_lexer', 'multi_lexer');
3    CTX_DDL.CREATE_PREFERENCE ('english_lexer', 'basic_lexer');
4    CTX_DDL.ADD_SUB_LEXER ('global_lexer', 'default', 'english_lexer');
5 END;
6 /
PL/SQL procedure successfully completed.
SCOTT@orcl12c> CREATE INDEX ACTCI_NDX3 ON ACTCI (DESCRIPTION)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS('lexer global_lexer language column LANGCODE')
4 /
Index created.
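
Before creating the preference, it can help to check what lexer preferences already exist and who owns them, since DRG-10700 also appears when the preference lives in a different schema from the one the index is resolved in. A small sketch using the CTX_PREFERENCES dictionary view:

```sql
-- List the lexer preferences visible to the current user, with their owners.
select pre_owner, pre_name, pre_object
from ctx_preferences
where pre_class = 'LEXER'
order by pre_owner, pre_name;
```

If the preference turns out to be owned by another schema, it can be referenced in the PARAMETERS string as owner.preference_name.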
