Substring search with Oracle context indexes

Hi,
i would like to know if it is possibile to do a substring search with one of the obtion offer with the context indexes.
(ctxcat,ctxrule,context)
example:
i would like to search the word 'berub' in a column A in table_example.
the value in the column a are :
The betther
berube
A.berube
berub
Berub
BERUB
R berube
S tartif
Y Thibeault
the rows return should be :
berube
A.berube
berub
Berub
BERUB
R berube
A simple sql could be
select * from table_example where upper(a) like upper('%berub%' );
How i can do this same action with the context indexes and a select (catsearch, contains, matches), if it is possible?
A example will be welcome
Thanks

I know how to do explain plan.
my point is not the query i post, it's just a example.
I have many query on my production we optimize many times (they past from 3min to 15 sec with optimisation, but we want to have better result). At this point we are looking to implant the context indexes to make them more efficient.
Do make this sql more efficient we have to deal with like '%xxxxxx%' and the context indexes like to be a option, but we have to be able to do some substring search with context option.
Is it possible to do it and how?
This is my question and why i post it here. The query is just a simple example to illsutrate what i want.
Thanks to anyone who can answer my question.

Similar Messages

  • Oracle 10g  – Performance with BIG CONTEXT indexes

    I would like to use Oracle XE 10.2.0.1.0 only for the full-text searching of the files residing outside the database on the FTP server.
    Recently I have found out that size of the files to be indexed is 5GB.
    As I have read somewhere on this forum before size of the index should be 30-40% of the indexed text files (so with formatted documents like PDF or DOC even less).
    Lets say that the CONTEXT index size over these files will be 1.5-2GB.
    Number of the concurrent user will be max. 5.
    I can not easily test it my self yet.
    Does anybody have any experience with Oracle XE or other Oracle Database edition performance with the CONTEXT index this BIG?
    Will Oracle XE hardware resources license limitation be sufficient to handle one CONTEXT indexe this BIG?
    (Oracle XE license limitations: 1 GB RAM and 1 CPU)
    Regards.

    That depends on at least three things:
    (1) what is the range of words that will appear in the document set (wide range of documents = smaller resultsets = better performance)
    (2) how precise are the user's queries likely to be (more precise = smaller resultsets = better performance)
    (3) how many milliseconds are your users willing to wait for results
    So, unfortunately, you'll probably have to experiment a bit before you'll know...

  • Problem with oracle text indexes during import

    We have a 9.2.0.6 database using oracle text features on a server with windows 2000 5.00.2195 SP4.
    We need to export its data ( user ARIANE only ) and then import the result into another 9.2.0.6 database.
    The import never comes to an end.
    The only way to make it work is to use the "indexes=n" clause.
    Then ( without the indexes ), we tried to create manually the oracle text indexes.
    We get this error :
    CREATE INDEX ARIANE.DOSTEXTE_DTTEXTE_CTXIDX ON ARIANE.DOSTEXTE (DTTEXTE)
    INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS('lexer ariane_lexer stoplist ctxsys.default_stoplist storage ariane_storage');
    ORA-29855: erreur d'exécution de la routine ODCIINDEXCREATE
    ORA-20000: Erreur Oracle Text :
    DRG-10700: préférence inexistante : ariane_lexer
    ORA-06512: à "CTXSYS.DRUE", ligne 157
    ORA-06512: à "CTXSYS.TEXTINDEXMETHODS", ligne 219
    We then tried to uninstall Oracle text and install it ( My Oracle Support [ID 275689.1] ). The index creation above still fails.
    We also checked our Text installation and setup through My Oracle Support FAQ ( ID 153264.1 ) and everything seems ok.
    Do we have to create some ARIANE* lexer preferences through specific pl/sql ( ctx_report* ? ) before importing anything from the ARIANE user ?
    What do we need to do exactly when exporting data with oracle text features from one database to another given we used to restore the database through a copy of the entire windows files ?
    Is there a specific order to follow to succeed an import ?
    Thank you for your help.
    Jean-michel, Nemours, FRANCE

    Hi
    index preferences are not exported, ie ariane_lexer + ariane_storage, only the Text index metada, thus the DRG-10700 from index DDL on target/import DB.
    I recommend to use ctx_report.create_index_script on source/export DB, see Doc ID 189819.1 for details, export with indexes=N and then create text indexes manually after data import.
    -Edwin

  • Performance issue with Oracle Text index

    Hi Experts,
    We are on Oracle 11.2..0.3 on Solaris 10. I have implemented Oracle Text in our environment and I am facing a strange performance issue that is happening in our environment.
    One sql having CONTAINS clause is taking forever - more than 20 minutes and still does not complete. This sql has a contains clause and an exists clause and a not exists clause.
    Now if I remove the exists clause and a not exists clause , it completes fast. but with those two clauses it is just taking forever. It is late night so i am not able to post the table and sql query details and will do so tomorrow but based on this general description, are there any pointers for me to review?
    sql query doing fine:
    SELECT
        U.CLNT_OID, U.USR_OID, S.MAILADDR
    FROM
        access_usr U
        INNER JOIN access_sia S
            ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
        WHERE U.CLNT_OID = 'ABCX32S'
        AND CONTAINS(LAST_NAME , 'TO%' ) >0
    --sql query that hangs forever:
    SELECT
        U.CLNT_OID, U.USR_OID, S.MAILADDR
    FROM
        access_usr U
        INNER JOIN access_sia S
            ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
        WHERE U.CLNT_OID = 'ABCX32S'
        AND CONTAINS(LAST_NAME , 'TO%' ) >0
    and exists (--one clause here wiht a few table joins)
    and not exists (--one clause here wiht a few table joins);
    --Now another strange thing I found is if instead of 'TO%' in this sql, if I were to use 'ZZ%' or 'L1%' it works fast but for 'TO%' it goes slow with those two exists not exists clauses!
    I will be most thankful for the inputs.
    OrauserN

    Hi Barbara,
    First of all, thanks a lot for reviewing the issue.
    Unluckily making the change to empty_stoplist did not work out. I am today copying the entire sql here that has this issue and will be most thankful for more insights/pointers on what can be done.
    Here is the entire sql:
    SELECT U.CLNT_OID,
           U.USR_OID,
           S.EMAILADDRESS,
           U.FIRST_NAME,
           U.LAST_NAME,
           S.JOBCODE,
           S.LOCATION,
           S.DEPARTMENT,
           S.ASSOCIATEID,
           S.ENTERPRISECOMPANYCODE,
           S.EMPLOYEEID,
           S.PAYGROUP,
           S.PRODUCTLOCALE
      FROM    ACCESS_USR U
           INNER JOIN
              ACCESS_SIA S
           ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
    WHERE     U.CLNT_OID = 'G39NY3D25942TXDA'
           AND EXISTS
                  (SELECT 1
                     FROM ACCESS_USR_GROUP_XREF UGX
                          INNER JOIN ACCESS_GROUP RELG
                             ON     RELG.CLNT_OID = UGX.CLNT_OID
                                AND RELG.GROUP_OID = UGX.GROUP_OID
                          INNER JOIN ACCESS_GROUP G
                             ON     G.CLNT_OID = RELG.CLNT_OID
                                AND G.GROUP_TYPE_OID = RELG.GROUP_TYPE_OID
                    WHERE     UGX.CLNT_OID = U.CLNT_OID
                          AND UGX.USR_OID = U.USR_OID
                          AND G.GROUP_OID = 920512943
                          AND UGX.INCLUDED = 1)
           AND NOT EXISTS
                      (SELECT 1
                         FROM    ACCESS_USR_GROUP_XREF UGX
                              INNER JOIN
                                 ACCESS_GROUP G
                              ON     G.CLNT_OID = UGX.CLNT_OID
                                 AND G.GROUP_OID = UGX.GROUP_OID
                        WHERE     UGX.CLNT_OID = U.CLNT_OID
                              AND UGX.USR_OID = U.USR_OID
                              AND G.GROUP_OID = 920512943
                              AND UGX.INCLUDED = 1)
           AND CONTAINS (U.LAST_NAME, 'Bon%') > 0;
    Like I said before if the EXISTS and NOT EXISTS clause are removed it works in sub-second. But with those EXISTS and NOT EXISTS CLAUSE IT TAKES ANY WHERE FROM 25 minutes to more than one hour.
    NOte also that it was not TO% but Bon% in the CONTAINS clause that is giving the issue - sorry that was wrong on my part.
    Also please see below the ORACLE TEXT index defined on the table ACCESS_USER:
    --definition of preferences used in the index:
    SET SERVEROUTPUT ON size unlimited
    WHENEVER SQLERROR EXIT SQL.SQLCODE
    DECLARE
       v_err       VARCHAR2 (1000);
       v_sqlcode   NUMBER;
       v_count     NUMBER;
    BEGIN
       ctxsys.ctx_ddl.create_preference ('cust_lexer', 'BASIC_LEXER');
       ctxsys.ctx_ddl.set_attribute ('cust_lexer', 'base_letter', 'YES'); -- removes diacritics
    EXCEPTION
       WHEN OTHERS
       THEN
          v_err := SQLERRM;
          v_sqlcode := SQLCODE;
          v_count := INSTR (v_err, 'DRG-10701');
          IF v_count > 0
          THEN
             DBMS_OUTPUT.put_line (
                'The required preference named CUST_LEXER with BASIC LEXER is already set up');
          ELSE
             RAISE;
          END IF;
    END;
    DECLARE
       v_err       VARCHAR2 (1000);
       v_sqlcode   NUMBER;
       v_count     NUMBER;
    BEGIN
       ctxsys.ctx_ddl.create_preference ('cust_wl', 'BASIC_WORDLIST');
       ctxsys.ctx_ddl.set_attribute ('cust_wl', 'SUBSTRING_INDEX', 'true'); -- to improve performance
    EXCEPTION
       WHEN OTHERS
       THEN
          v_err := SQLERRM;
          v_sqlcode := SQLCODE;
          v_count := INSTR (v_err, 'DRG-10701');
          IF v_count > 0
          THEN
             DBMS_OUTPUT.put_line (
                'The required preference named CUST_WL with BASIC WORDLIST is already set up');
          ELSE
             RAISE;
          END IF;
    END;
    --now below is the code of the index:
    CREATE INDEX ACCESS_USR_IDX3 ON ACCESS_USR
    (FIRST_NAME)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
    CREATE INDEX ACCESS_USR_IDX4 ON ACCESS_USR
    (LAST_NAME)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
    The strange thing is that, like I said, If I remove the exists clause the query returns very fast. Also if I modify the query to use only one NOT EXISTS clause and remove the other EXISTS clause it returns in less than one second.  Also if I remove the EXISTS clause and use only the NOT EXISTS  clause it returns in less than 4 seconds. But with both clauses it runs forever!
    When I tried to get dbms_xplan.display_cursor to get the query plan (for the case of both exists and not exists clause in the query), it said that previous statement's sql id was 0 or something like that so that I was not able to see the query plan. I will keep trying to get this plan (it takes 25 minutes to one hour each time but will get this info soon). Again any pointers are most helpful.
    Regards
    OrauserN

  • Smart search with oracle SES

    Hi,
    I'm working on a demo and I'd like to set up smart search. I'm trying to make it work with Oracle Secure Enterprise Search (google onebox isn't free), but I'm having a hard time finding some documentation on-line, and the configuration process is unclear to me.
    Has someone already done this? Could you please give me a link, or guide me through Oracle SES Search Mode Configuration ?
    Thanks,
    Cyril

    Hi,
    I found a good tutorial: http://st-curriculum.oracle.com/tutorial/SESAdminTutorial/index.htm
    Thanks for reading ;) !
    Cyril

  • Offline Search with Oracle Documentation Library?

    ..I have download Oracle Database Documentation Library 10g for reading at home..
    It's all fine! But i can not use the Search function offline.
    (see Quick Search or Tab Search. Enter a word or phrase. Button Search)
    It must offline, because i have not inernet connection at home for online..
    Question: Is it posible or somehow can I config for Search, but offline!

    maybe you can use google desktop ...

  • Oracle 9.2 ConText index alternate_spelling problem

    Hello everybody!
    I'm having problems with a ConText index in Oracle 9.2, using the alternative_spelling parameter...
    Here is my code
    CREATE TABLE U2000P.TEST_FICHIER_INT
    (ID NUMBER(6) NOT NULL,
    NOM_FICHIER VARCHAR2(90) NULL,
    MIME VARCHAR2(90) NULL,
    FICHIER BLOB DEFAULT empty_blob(),
    LNG VARCHAR2(3) NULL,
    KEY_WORDS VARCHAR2(500) NULL,
    CONSTRAINT PK_TEST_FICHIER_INT PRIMARY KEY (ID)
    EXECUTE CTX_DDL.CREATE_PREFERENCE('ENGLISH_LEXER','BASIC_LEXER');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('ENGLISH_LEXER', 'INDEX_THEMES', 'YES');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('ENGLISH_LEXER', 'THEME_LANGUAGE', 'ENGLISH');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('ENGLISH_LEXER', 'BASE_LETTER', 'NO');
    EXECUTE CTX_DDL.CREATE_PREFERENCE('FRENCH_LEXER','BASIC_LEXER');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('FRENCH_LEXER', 'INDEX_THEMES', 'NO');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('FRENCH_LEXER', 'BASE_LETTER', 'NO');
    EXECUTE CTX_DDL.CREATE_PREFERENCE('GERMAN_LEXER','BASIC_LEXER');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('GERMAN_LEXER', 'INDEX_THEMES', 'NO');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('GERMAN_LEXER', 'BASE_LETTER', 'NO');
    EXECUTE CTX_DDL.SET_ATTRIBUTE('GERMAN_LEXER', 'ALTERNATE_SPELLING', 'GERMAN');
    EXECUTE CTX_DDL.CREATE_PREFERENCE('GLOBAL_LEXER','MULTI_LEXER');
    EXECUTE CTX_DDL.ADD_SUB_LEXER('GLOBAL_LEXER', 'FRENCH', 'FRENCH_LEXER', '1');
    EXECUTE CTX_DDL.ADD_SUB_LEXER('GLOBAL_LEXER', 'DEFAULT', 'GERMAN_LEXER');
    EXECUTE CTX_DDL.ADD_SUB_LEXER('GLOBAL_LEXER', 'ENGLISH', 'ENGLISH_LEXER', '5');
    CREATE INDEX IDX_F_TEST_FICHIER_INT
    ON TEST_FICHIER_INT(FICHIER)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS('DATASTORE CTXSYS.DIRECT_DATASTORE
    FILTER CTXSYS.INSO_FILTER
    LEXER GLOBAL_LEXER LANGUAGE COLUMN LNG');
    In one of the files that I load, I have the word 'paläontologie'
    Here are my searches
    select nom_fichier, score(1) from test_fichier_int where contains(fichier, 'paläontologie', 1) > 0;
    -> no rows selected
    select nom_fichier, score(1) from test_fichier_int where contains(fichier, 'palaontologie', 1) > 0;
    -> Finds my document
    Why does the first search not work?
    If I don't use the 'alternate_spelling' parameter, both searches don't work, why is that???
    Thanks in advance for your help
    Best regards
    Neil.

    I found my error!!!! Thanks Neil... lol
    In fact, it's my SQL*Plus that must be badly configured, and I am having problems with accentuated characters... If I search through a browser, it works!!!
    Sorry about that...
    Best regards
    Neil.

  • Searching using Oracle Text instead of LIKE '%'

    Hello all,
    I hope you help me in this:
    I have a table looks like this
    create table subscribers (
    id numer(10),
    first_name varchar2(30),
    father_name varchar2(30),
    grandfather_name varchar2(30),
    last_name varchar2(30))
    The application is built using Oracle Forms. Many times, the end users are not so sure of the spelling of the name, therefore they use the "%" wildcard with name fields. This will be reflected to the queries the application will send them to the Oracle Server.
    We have the following queries
    1) select *
    from subscribers
    where last_name like '%family_name%';
    2) select *
    from subscribers
    where last_name like 'family_name%';
    3) select *
    from subscribers
    where last_name like '%family_name%' and first_name like '%first_name%';
    4) select *
    from subscribers
    where last_name like 'family_name%' and first_name like 'first_name%';
    As well as searching on the father_name and grandfather_name fields. But most of the search are on the first_name and the last_name.
    These queries are killing the server since we have millions of records. BTree indexes will not help here because of the LIKE and the "%"
    I am thinking to use Oracle Text here, but I am not sure whether I have to go for a CONTEXT index on each individual column, or I can use the MULTI_COLUMN_DATASTORE indexing.
    Any idea will be appreciated

    The ctxcat index and catsearch operator are generally intended for usage with one text column and one or more columns of structured data. You would have to pick just one of your columns as the text column and the others as structured columns. I would be more inclined to use the multi_column_datastore with a context index and contains operator, so that you can search all of your columns as text columns.

  • Slow performance for context index

    Hi, I'm just a newbie here in forum and I would like ask for your expertise about oracle context index. I have my sql and I'm using wild character for searching '%%' .
    I used the sql below with a context index (ctxsys.context) in order to avoid full table scan for wild character searching.
    SELECT BODY_ID
                        TITLE, trim(upper(title)) as title_sort,
                        SUM(JAN) as JAN,
                        SUM(FEB) as FEB,
                        SUM(MAR) as MAR,
                        SUM(APR) as APR,
                        SUM(MAY) as MAY,
                        SUM(JUN) as JUN,
                        SUM(JUL) as JUL,
                        SUM(AUG) as AUG,
                        SUM(SEP) as SEP,
                        SUM(OCT) as OCT,
                        SUM(NOV) as NOV,
                        SUM(DEC) AS DEC
                        FROM APP_REPCBO.CBO_TURNAWAY_REPORT
                        WHERE contains (BODY_ID,'%240103%') >0 and
    PERIOD BETWEEN '1201' AND '1212'
                        GROUP BY BODY_ID, trim(upper(title))
    But i was surprised that performance was very slow, and when I try this on explain plan time of performance almost consume 2 hours.
    plan FOR succeeded.
    PLAN_TABLE_OUTPUT
    Plan hash value: 814472363
    | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
    | 0 | SELECT STATEMENT | | 1052K| 97M| | 805K (1)| 02:41:12 |
    | 1 | HASH GROUP BY | | 1052K| 97M| 137M| 805K (1)| 02:41:12 |
    |* 2 | TABLE ACCESS BY INDEX ROWID| CBO_TURNAWAY_REPORT | 1052K| 97M| | 782K (1)| 02:36:32 |
    |* 3 | DOMAIN INDEX | CBO_REPORT_BID_IDX | | | | 663K (0)| 02:12:41 |
    Predicate Information (identified by operation id):
    2 - filter("PERIOD">='1201' AND "PERIOD"<='1212')
    3 - access("CTXSYS"."CONTAINS"("BODY_ID",'%240103%')>0)
    16 rows selected
    oracle version: Oracle Database 11g Release 11.1.0.7.0 - 64bit Production
    Thanks,
    Zack

    Hi Rod,
    Thanks for the reply, yes I already made gather stats on that table, including rebuild index.
    but its so strange when I use another body_id the performance will vary.
    SQL> EXPLAIN PLAN FOR
    2 SELECT BODY_ID
    3 TITLE, trim(upper(title)) as title_sort,
    4 SUM(JAN) as JAN,
    5 SUM(FEB) as FEB,
    6 SUM(MAR) as MAR,
    7 SUM(APR) as APR,
    8 SUM(MAY) as MAY,
    9 SUM(JUN) as JUN,
    10 SUM(JUL) as JUL,
    11 SUM(AUG) as AUG,
    12 SUM(SEP) as SEP,
    13 SUM(OCT) as OCT,
    14 SUM(NOV) as NOV,
    15 SUM(DEC) as DEC
    16 FROM WEB_REPCBO.CBO_TURNAWAY_REPORT
    17 WHERE contains (BODY_ID,'%119915311%')> 0 and
    18 PERIOD BETWEEN '1201' AND '1212'
    19 GROUP BY BODY_ID, trim(upper(title));
    SELECT * FROM TABLE (dbms_xplan.display);
    Explained.
    SQL>
    Explained.
    SQL>
    PLAN_TABLE_OUTPUT
    Plan hash value: 814472363
    | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
    | 0 | SELECT STATEMENT | | 990 | 96030 | 1477 (1)| 00:00:18 |
    | 1 | HASH GROUP BY | | 990 | 96030 | 1477 (1)| 00:00:18 |
    |* 2 | TABLE ACCESS BY INDEX ROWID| CBO_TURNAWAY_REPORT | 990 | 96030 | 1475 (0)| 00:00:18 |
    |* 3 | DOMAIN INDEX | CBO_REPORT_BID_IDX | | | 647 (0)| 00:00:08 |
    Predicate Information (identified by operation id):
    2 - filter("PERIOD">='1201' AND "PERIOD"<='1212')
    3 - access("CTXSYS"."CONTAINS"("BODY_ID",'%119915311%')>0)
    16 rows selected.

  • Searching for words with dot  in the context index.

    Hi ,
    We have a context index that uses a following BASIC_LEXER
    drop_preference (c_lexer_pref);
    ctx_ddl.create_preference(c_lexer_pref, 'basic_lexer');
    ctx_ddl.set_attribute(c_lexer_pref, 'base_letter', 'YES');
    ctx_ddl.set_attribute(c_lexer_pref, 'whitespace', '()"`/');
    ctx_ddl.set_attribute (c_lexer_pref, 'index_stems', 'english');
    When user searches for "OAR" , oracle does not find "O.A.R'"
    Can someone help me understand why oracle is not able to find the "O.A.R" when OAR is being searched.
    Thanks for your help.

    You need to use skipjoins to tell it to skip the periods, as shown below.
    scott@ORA92> drop table your_table
    2 /
    Table dropped.
    scott@ORA92> exec ctx_ddl.drop_preference ('c_lexer_pref')
    PL/SQL procedure successfully completed.
    scott@ORA92> begin
    2 ctx_ddl.create_preference ('c_lexer_pref', 'basic_lexer');
    3 ctx_ddl.set_attribute ('c_lexer_pref', 'base_letter', 'YES');
    4 ctx_ddl.set_attribute ('c_lexer_pref', 'whitespace', '()"`/');
    5 ctx_ddl.set_attribute ('c_lexer_pref', 'index_stems', 'english');
    6 ctx_ddl.set_attribute ('c_lexer_pref', 'skipjoins', '.');
    7 end;
    8 /
    PL/SQL procedure successfully completed.
    scott@ORA92> create table your_table
    2 (your_column varchar2(30))
    3 /
    Table created.
    scott@ORA92> insert into your_table values ('OAR')
    2 /
    1 row created.
    scott@ORA92> insert into your_table values ('O.A.R')
    2 /
    1 row created.
    scott@ORA92> create index your_index
    2 on your_table (your_column)
    3 indextype is ctxsys.context
    4 parameters ('lexer c_lexer_pref')
    5 /
    Index created.
    scott@ORA92> select * from your_table
    2 where contains (your_column, 'OAR') > 0
    3 /
    YOUR_COLUMN
    O.A.R
    OAR
    scott@ORA92>

  • Problem with blob column index created using Oracle Text.

    Hi,
    I'm running Oracle Database 10g 10.2.0.1.0 standard edition one, on windows server 2003 R2 x64.
    I have a table with a blob column which contains pdf document.
    Then, I create an index using the following script so that I can do fulltext search using Oracle Text.
    CREATE INDEX DMCS.T_DMCS_FILE_DF_FILE_IDX ON DMCS.T_DMCS_FILE
    (DF_FILE)
    INDEXTYPE IS CTXSYS.CONTEXT
    PARAMETERS('DATASTORE CTXSYS.DEFAULT_DATASTORE');
    However, the index is not searchable and I check the following tables created by database for my index and found them to be empty as well !!
    DR$T_DMCS_FILE_DF_FILE_IDX$I
    DR$T_DMCS_FILE_DF_FILE_IDX$K
    DR$T_DMCS_FILE_DF_FILE_IDX$N
    DR$T_DMCS_FILE_DF_FILE_IDX$R
    I wonder what's wrong with it.
    My user has been granted the ctx_app role and I have other tables that store plain text which I use Oracle Text are fine. I even output the blob column and save as pdf file and they are fine.
    However the database seems like not indexing my blob column although the index can be created without error.
    Please advise.
    Really appreciate anyone who can help.
    Thank you.

    The situation is I have already loaded a few pdf document into the table's blob column.
    After I create the Oracle text index on this blob column, I find the system generated index tables listed in my earlier posting are empty, except for the 4th table.
    Normally we'll see words inside the table where those are the words indexed by oracle text on my document.
    As a result, no matter how i search for the index using select statement with contains operator, it will not give me any result.
    I feel weird why the blob is not indexed. The content of the blob are actually valid because I tested this by export the content back to pdf and I can still view and search within the pdf.
    Regards,
    Jap.

  • Wildcard search with catsearch on Oracle 10g

    Wildcard search problem is being discussed many times, however the solutions provided did not solve the problem.
    I am using catsearch to take its advantages and return results at a faster rate. Aim is to simulate a like '%abc%' using catsearch in 10g.
    Following are the steps to reproduce the problem.
    CREATE TABLE test
       (name VARCHAR2(60))
    INSERT ALL
    INTO test VALUES ('VCL Master')
      INTO test VALUES ('VCL Master S.')
      INTO test VALUES ('VCL Master S.A. Compartment 1')
      INTO test VALUES ('VCL MasterS.A. Compartment 2')
      INTO test VALUES ('VCL Master., S.A.')
      INTO test VALUES ('KCL Master Corp.')
      SELECT * FROM DUAL
    begin
    ctx_ddl.create_preference('Jylex', 'basic_lexer');
    ctx_ddl.set_attribute('Jylex','SKIPJOINS','.-=[];\,/~!@#$%^&*+{}:"|<>?`§´¨½¼¾¤£€©®''');
    end;
    begin
        Ctx_Ddl.Create_Preference('wildcard_Jylex', 'BASIC_WORDLIST');
        ctx_ddl.set_attribute('wildcard_Jylex', 'wildcard_maxterms', 15000) ;
    end;
    CREATE INDEX test_inx ON test(NAME)
    INDEXTYPE IS CTXSYS.CTXCAT
    PARAMETERS('STOPLIST CTXSYS.EMPTY_STOPLIST
    LEXER     Jylex
    WORDLIST wildcard_Jylex')
    problem1:
    select * from test where catsearch(name, 'CL Mast*', NULL)>0 --- no results returned
    problem 2:
    when I run the following query on the actual column of my table with 3 million records,
    select * from XXXX where catsearch (NAME, 'VCL Master S*', NULL) > 0
    I get the following error.
    DRG-51030: wildcard query expansion resulted in too many terms
    I have used () and "". Did a lot of R&D and still I am not able to a solution that resolves both of the problems.
    Suggestions will be much appreciated.
    Thanks.

    This post has nothing to do with Oracle Objects.  Perhaps some moderator will move it to the Oracle Text sub-forum/space, where it belongs.
    Your search for 'CL Mast*' did not find any rows because there aren't any rows that match that criteria.  If you want to return rows that have that string, then you need to add a leading wildcard.  Technically CTXCAT indexes and CATSEARCH do not support leading wildcards, but you can use two asterisks as a workaround, so you can search for '**CL Mast*'.
    Your wildcard_maxterms is set to 15000, so if there are more than 15000 words that begin with 's' in your 3 million records, then it will result in an error.  If you upgrade to Oracle 11g, then you can set the wildcard_maxterms higher or to unlimited by setting it to 0, but that may cause your system to run out of memory.  Most applications trap these errors and return a simple message to the user, indicating that the search for s* is too broad and to narrow the search.
    I would use a CONTEXT index with CONTAINS instead of CTXCAT and CATSEARCH.  It supports leading wildcards and you can index substrings for faster searches.  If you want it to be like the CTXCAT index then you can make it transactional.

  • Stop words handling with CONTEXT index - weird behavior

    I have a context index with the following output from the report (describe index report).
    CTX_REPORT.DESCRIBE_INDEX('KWTI10569_20121010115054')
    ===========================================================================
    INDEX DESCRIPTION
    ===========================================================================
    index name: "METCALF_T"."KWTI10569_20121010115054"
    index id: 1524
    index type: context
    base table: "METCALF_T"."KWTD10569_20121010115054"
    primary key column:
    text column: MESSAGE_CONTENT
    text column type: RAW(2000)
    language column:
    format column: FMT
    charset column: CSET
    ===========================================================================
    INDEX OBJECTS
    ===========================================================================
    datastore: DIRECT_DATASTORE
    filter: CHARSET_FILTER
    charset: UTF8
    section group: NULL_SECTION_GROUP
    lexer: BASIC_LEXER
    punctuations: .?!
    skipjoins: _-"'`~!@#$%^&*()+=|}{[]\:;<>?/,
    continuation: \-
    index_stems: NONE
    wordlist: BASIC_WORDLIST
    stemmer: ENGLISH
    fuzzy_match: GENERIC
    stoplist: BASIC_STOPLIST
    stop_word: how
    stop_word: however
    stop_word: i
    stop_word: if
    <trimmed for brevity of message......but all default stop words provided by Oracle has been added here>
    storage: BASIC_STORAGE
    i_table_clause: tablespace TEXT_INDEX storage (initial 10M next 10M)
    k_table_clause: tablespace TEXT_INDEX storage (initial 10M next 10M)
    r_table_clause: tablespace TEXT_INDEX storage (initial 1M) lob (data) store as (cache)
    n_table_clause: tablespace TEXT_INDEX storage (initial 1M)
    i_index_clause: tablespace TEXT_INDEX storage (initial 1M) compress 2
    DB: 10g (10.2.0.4)
    DB characterset: UTF8
    Distinct tokens from index:
    SQL> select distinct token_text from dr$KWTI10569_20121010115054$i;
    TOKEN_TEXT
    BLAH
    EXPIRE
    OFFER
    My text content:
    SQL>
    SQL> select distinct utl_raw.cast_to_varchar2(message_content) from KWTD10569_20121010115054;
    UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT)
    blah blah offer will expire blah blah
    offer expire
    this offer shall expire
    offer to expire
    offer expire
    blah blah offer expire blah blah
    blah blah offer to expire blah blah
    blah blah offer expire blah blah
    offer will expire
    blah blah this offer shall expire blah blah
    10 rows selected.
    Now, when i perform some contain queries i get some behavior that i cant understand.
    When i search for "this offer will expire" i dont get every row (10 rows) - why is that?
    SQL> select UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT) from KWTD10569_20121010115054 where contains(message_content,'this offer will expire')>0;
    UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT)
    blah blah offer will expire blah blah
    this offer shall expire
    blah blah offer to expire blah blah
    blah blah this offer shall expire blah blah
    Also, when i search for "offer expire" i get the following
    SQL> select UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT) from KWTD10569_20121010115054 where contains(message_content,'offer expire')>0;
    UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT)
    offer expire
    blah blah offer expire blah blah
    blah blah offer expire blah blah
    offer expire
    I was thinking that the stop words will be ignored while searching in context grammar, so i would get all my rows back? Isnt that correct?
    What i really want to achieve is that all these stop words are stripped from the content AND the keywords when i run the query and i get 100% matches. Any pointers on how that can be accomplished?

    Roger-
    Thanks again. Is there any place in Oracle doc that documents these two facts?
    Please see the example below, does the number of words also matter? My search phrase was "the offer will expire" but why is that i didnt get rows like "offer to expire" back?
    SQL> select distinct utl_raw.cast_to_varchar2(message_content) from KWTD10569_20121010115054;
    UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT)
    offer expire
    blah blah offer expire blah blah
    blah blah offer will expire blah blah
    this offer shall expire
    offer expire
    offer to expire
    blah blah offer to expire blah blah
    blah blah offer expire blah blah
    offer will expire
    blah blah this offer shall expire blah blah
    10 rows selected.
    SQL> select UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT) from KWTD10569_20121010115054 where contains(message_content,'the offer will expire')>0;
    UTL_RAW.CAST_TO_VARCHAR2(MESSAGE_CONTENT)
    blah blah offer will expire blah blah
    this offer shall expire
    blah blah offer to expire blah blah
    blah blah this offer shall expire blah blah

  • Oracle Text - CTX Context Index Soundex Problem

    Hi,
    I'm running into a problem with Oracle Text when searching using the ! (soundex) option. I've created a simple test example to highlight the issue.
    Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit
    Windows 2008 Server 64-bit
    create table test_tab (test_col  varchar2(200));
    insert all
      into test_tab (test_col) values ('ab-tönes')
      into test_tab (test_col) values ('ab-tones')
      into test_tab (test_col) values ('abtones')
      into test_tab (test_col) values ('ab tones')
      into test_tab (test_col) values ('ab-tanes')
      select * from dual
    select * from test_tab
    begin
          ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
          ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+&''');
          ctx_ddl.set_attribute('test_lex1','base_letter','YES');
          -- ctx_ddl.set_attribute('test_lex1','skipjoins','-');
    end;
    create index test_idx on test_tab (test_col)
      indextype is ctxsys.context
        parameters
          ('lexer        test_lex1'     
    select token_text from dr$test_idx$i;
    TOKEN_TEXT
    AB
    ABTONES
    TANES
    TONES
    select * from test_tab where contains (test_col, '!ab tones') > 0;
    TEST_COL
    ab-tönes
    ab-tones
    ab tones
    select * from test_tab where soundex(test_col) = soundex('ab tones');
    TEST_COL
    ab-tönes
    ab-tones
    abtones
    ab tones
    ab-tanes
    So my question is, can anyone suggest an approach whereby I can get the Oracle Text Context index (or CTXCAT index if it's more appropriate) to return all 5 rows like the simple Soundex is doing?
    I can't really use soundex as this search query will form part of a search screen for a multi-language application. Soundex is limited to English sounding words, so I need the solution to be able to compare strings that may not "sound" English.
    It must be an attribute of the BASIC_LEXER, and I've tried skipjoins, start/end-joins, stop lists, but I just cannot get the Soundex feature of Oracle Text to function like the SOUNDEX() function!
    Looking at how the tokens are stored dr$test_idx$i I need Oracle Text to almost concat 'AB' and 'TONES' to search as a single string.
    Any help greatly appreciated.
    Thanks,

    I am not getting the same problem that you are getting with the umlat, but I don't see what is different.  Please post the result of:
    select ctx_report.create_index_script ('test_idx') from dual;
    Here are the results on my system.  Perhaps you can spot the difference.  I added an empty_stoplist, so that it won't print out a long list of stopwords.
    SCOTT@orcl12c> create table test_tab (test_col    varchar2(200))
      2  /
    Table created.
    SCOTT@orcl12c> insert all
      2    into test_tab (test_col) values ('ab-tönes')
      3    into test_tab (test_col) values ('ab-tones')
      4    into test_tab (test_col) values ('abtones')
      5    into test_tab (test_col) values ('ab tones')
      6    into test_tab (test_col) values ('ab-tanes')
      7  select * from dual
      8  /
    5 rows created.
    SCOTT@orcl12c> select * from test_tab
      2  /
    TEST_COL
    ab-tönes
    ab-tones
    abtones
    ab tones
    ab-tanes
    5 rows selected.
    SCOTT@orcl12c> begin
      2    ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
      3    ctx_ddl.set_attribute('test_lex1','base_letter','YES');
      4  end;
      5  /
    PL/SQL procedure successfully completed.
    SCOTT@orcl12c> create or replace procedure test_proc
      2    (p_rowid in          rowid,
      3      p_clob    in out nocopy clob)
      4  as
      5  begin
      6    select replace (translate (test_col, '/\|-_+&''', '      '), ' ', '')
      7    into   p_clob
      8    from   test_tab
      9    where  rowid = p_rowid;
    10  end test_proc;
    11  /
    Procedure created.
    SCOTT@orcl12c> show errors
    No errors.
    SCOTT@orcl12c> begin
      2    ctx_ddl.create_preference ('test_ds', 'user_datastore');
      3    ctx_ddl.set_attribute ('test_ds', 'procedure', 'test_proc');
      4  end;
      5  /
    PL/SQL procedure successfully completed.
    SCOTT@orcl12c> create index test_idx on test_tab (test_col)
      2    indextype is ctxsys.context
      3    parameters
      4       ('lexer    test_lex1
      5         datastore    test_ds
      6         stoplist    ctxsys.empty_stoplist')
      7  /
    Index created.
    SCOTT@orcl12c> select token_text from dr$test_idx$i
      2  /
    TOKEN_TEXT
    ABTANES
    ABTONES
    2 rows selected.
    SCOTT@orcl12c> variable search_string varchar2(100)
    SCOTT@orcl12c> exec :search_string := 'ab tones'
    PL/SQL procedure successfully completed.
    SCOTT@orcl12c> select * from test_tab
      2  where  contains
      3            (test_col,
      4             '!' || replace (:search_string, ' ', ' !') ||
      5             ' or !' || replace (:search_string, ' ', '')) > 0
      6  /
    TEST_COL
    ab-tönes
    ab-tones
    abtones
    ab tones
    ab-tanes
    5 rows selected.
    SCOTT@orcl12c> exec :search_string := 'abtones'
    PL/SQL procedure successfully completed.
    SCOTT@orcl12c> /
    TEST_COL
    ab-tönes
    ab-tones
    abtones
    ab tones
    ab-tanes
    5 rows selected.
    SCOTT@orcl12c> exec :search_string := 'ab tönes'
    PL/SQL procedure successfully completed.
    SCOTT@orcl12c> /
    TEST_COL
    ab-tönes
    ab-tones
    abtones
    ab tones
    ab-tanes
    5 rows selected.
    SCOTT@orcl12c> select ctx_report.create_index_script ('test_idx') from dual
      2  /
    CTX_REPORT.CREATE_INDEX_SCRIPT('TEST_IDX')
    begin
      ctx_ddl.create_preference('"TEST_IDX_DST"','USER_DATASTORE');
      ctx_ddl.set_attribute('"TEST_IDX_DST"','PROCEDURE','"SCOTT"."TEST_PROC"');
    end;
    begin
      ctx_ddl.create_preference('"TEST_IDX_FIL"','NULL_FILTER');
    end;
    begin
      ctx_ddl.create_section_group('"TEST_IDX_SGP"','NULL_SECTION_GROUP');
    end;
    begin
      ctx_ddl.create_preference('"TEST_IDX_LEX"','BASIC_LEXER');
      ctx_ddl.set_attribute('"TEST_IDX_LEX"','BASE_LETTER','YES');
    end;
    begin
      ctx_ddl.create_preference('"TEST_IDX_WDL"','BASIC_WORDLIST');
      ctx_ddl.set_attribute('"TEST_IDX_WDL"','STEMMER','ENGLISH');
      ctx_ddl.set_attribute('"TEST_IDX_WDL"','FUZZY_MATCH','GENERIC');
    end;
    begin
      ctx_ddl.create_stoplist('"TEST_IDX_SPL"','BASIC_STOPLIST');
    end;
    begin
      ctx_ddl.create_preference('"TEST_IDX_STO"','BASIC_STORAGE');
      ctx_ddl.set_attribute('"TEST_IDX_STO"','R_TABLE_CLAUSE','lob (data) store as (
    cache)');
      ctx_ddl.set_attribute('"TEST_IDX_STO"','I_INDEX_CLAUSE','compress 2');
    end;
    begin
      ctx_output.start_log('TEST_IDX_LOG');
    end;
    create index "SCOTT"."TEST_IDX"
      on "SCOTT"."TEST_TAB"
          ("TEST_COL")
      indextype is ctxsys.context
      parameters('
        datastore       "TEST_IDX_DST"
        filter          "TEST_IDX_FIL"
        section group   "TEST_IDX_SGP"
        lexer           "TEST_IDX_LEX"
        wordlist        "TEST_IDX_WDL"
        stoplist        "TEST_IDX_SPL"
        storage         "TEST_IDX_STO"
    begin
      ctx_output.end_log;
    end;
    1 row selected.

  • Help with creating oracle text index on 2 columns with partial html data

    Hi,
    I need to create an oracle text index on 2 columns.
    TITLE - varchar(255) = contains plain text data
    DESCRIPTION - CLOB = contains partial HTML data
    This is what I created.
    begin
    ctx_ddl.create_preference ('Title_Description_Pref', 'MULTI_COLUMN_DATASTORE');
    ctx_ddl.set_attribute('Title_Description_Pref', 'columns', 'TITLE, DESCRIPTION');
    end;
    begin
    ctx_ddl.create_preference ('bid_lexer', 'BASIC_LEXER');
    ctx_ddl.set_attribute('bid_lexer', 'index_stems', 'ENGLISH');
    ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
    end;
    create index Bid_Title_Index on Bid(title) indextype is ctxsys.context parameters ('LEXER bid_lexer sync (every "sysdate+(1/24)")');
    create index Bid_Title_Desc_Index on Bid(description) indextype is ctxsys.context parameters ('LEXER bid_lexer DATASTORE Title_Description_Pref sync (every "sysdate+(1/24)") filter ctxsys.null_filter section group htmgroup');
    The problem is when I do a CONTAINS(description, '$(auction)')>0. I get results where the descriptions have the "auction" word (which is correct). But, the results also returned rows where the search word is inside an IMG tag. e.g. <img src="http://auction.de/120483" alt="Auction Logo"/>.
    What I would like is to exclude rows where the search word is inside HTML tag attributes, results expected are rows having <a>Auction</a> or <p>For Auction</p> ... etc. Basically stripping the html tags and leave the text contents.
    I'd appreciate some input.
    Thanks,
    Amiel

    Hi,
    I need to create an oracle text index on 2 columns.
    TITLE - varchar(255) = contains plain text data
    DESCRIPTION - CLOB = contains partial HTML data
    This is what I created.
    begin
    ctx_ddl.create_preference ('Title_Description_Pref', 'MULTI_COLUMN_DATASTORE');
    ctx_ddl.set_attribute('Title_Description_Pref', 'columns', 'TITLE, DESCRIPTION');
    end;
    begin
    ctx_ddl.create_preference ('bid_lexer', 'BASIC_LEXER');
    ctx_ddl.set_attribute('bid_lexer', 'index_stems', 'ENGLISH');
    ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
    end;
    create index Bid_Title_Index on Bid(title) indextype is ctxsys.context parameters ('LEXER bid_lexer sync (every "sysdate+(1/24)")');
    create index Bid_Title_Desc_Index on Bid(description) indextype is ctxsys.context parameters ('LEXER bid_lexer DATASTORE Title_Description_Pref sync (every "sysdate+(1/24)") filter ctxsys.null_filter section group htmgroup');
    The problem is when I do a CONTAINS(description, '$(auction)')>0. I get results where the descriptions have the "auction" word (which is correct). But, the results also returned rows where the search word is inside an IMG tag. e.g. <img src="http://auction.de/120483" alt="Auction Logo"/>.
    What I would like is to exclude rows where the search word is inside HTML tag attributes, results expected are rows having <a>Auction</a> or <p>For Auction</p> ... etc. Basically stripping the html tags and leave the text contents.
    I'd appreciate some input.
    Thanks,
    Amiel

Maybe you are looking for

  • Auto Email generation in multiple language in Access Enforcer 5.2

    Hi All, We have configured workflow in Access Enforcer 5.2 for autoprovisioning of users in the system. Requestor gets an email in english with the userid and password once the user is provisioned in the system. Now the requirment is to send these em

  • 101 & 541 in one step, SC vendor indicator

    we have 2 plants in same company code, and using STO (stock transport order, order type UB) to transfer goods from plant A to plant B, now there is one case, plant B ask for a material from plant A, but need it shipped directly to a external vendor,

  • Why should we activate an dso..

    Hi gurus , i have a doubt regarding activation of DSO .While we are loading data using process chains,after the DTP process chain we will have a process chain for activation of DSO,please could any one clear my doubt ,why exactly we are activating th

  • Accessing COM/OLE2 Objects using Jdeveloper3.1

    We have created JSP files using JDEVELOPER 3.1. Now we need to open Microsoft Word 2000 document from JAVA(JSP) and enable the WORD macros. How do we open the word document and use the macros ? null

  • LED not lighting

    My LED no longer lights as red when I receive a new e-mail message.had issues rec'ing messages recently, lost signal... but appears ok after taking battery out, rebooting. Why no LED now tho?