Oracle Text performance -- failed attempts

We are trying to implement a simple search of text data stored in a heavily used table (inserts/updates). There are 3 columns to index --
Headline (varchar2(255))
Subheadline (varchar2(255))
Teaser (varchar2(4000))
Our first attempt used Oracle Text with a CTXCAT index (CATSEARCH):
begin
ctx_ddl.create_index_set('cms_iset');
ctx_ddl.add_index('cms_iset','poolid_cp, mediaid_cp'); /* sub-index A */
end;
---- We knew we were going to filter on poolid_cp and mediaid_cp ---
CREATE INDEX cms_headlineidx ON con_properties (headline)
INDEXTYPE IS ctxsys.CTXCAT
PARAMETERS ('index set cms_iset');
CREATE INDEX cms_subheadlineidx ON con_properties (subheadline)
INDEXTYPE IS ctxsys.CTXCAT
PARAMETERS ('index set cms_iset');
CREATE INDEX cms_teaseridx ON con_properties (teaser)
INDEXTYPE IS ctxsys.CTXCAT
PARAMETERS ('index set cms_iset');
*********THE RESULTS*************
Our application servers would spin up threads that appeared to hang, and the load on the DB servers (RAC) was higher than normal. This implementation would have saved us from having to run resyncs manually.
The next attempt used a CONTEXT index:
alter table con_properties add (dummy varchar2(1));
begin
ctx_ddl.create_preference('con_propsearch', 'MULTI_COLUMN_DATASTORE');
ctx_ddl.set_attribute('con_propsearch', 'columns', 'headline,subheadline,teaser');
end;
CREATE INDEX con_properties_searchidx
ON con_properties(dummy)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('datastore CTXSYS.con_propsearch');
Records were being added to the ctx_user_pending view at a rate of a few hundred per hour.
********THE RESULTS*************
Same issue: the application servers spun off threads that appeared to hang, and the load on the DB servers (RAC) was spiky.
NOTE: In both implementations, the search queries themselves ran OK. However, dropping the text index in BOTH cases caused the application servers to behave normally again.
Can anyone tell me what goes on internally with Oracle Text when a table is heavily inserted into and updated? What is happening in the background? Is there some sort of lock that the app servers are waiting on? I know there is "overhead" with inserts on a normal b-tree index. Is it "exponential" with Oracle Text?
Thank you!

When documents in the base table are inserted, updated, or deleted, their ROWIDs are held in a DML queue until you synchronize the index. You can view this queue through the CTX_USER_PENDING view. Apparently you are not synchronizing your CONTEXT index, so the queue keeps growing. You need to establish some method of synchronizing it. The options are:
1. Use parameters('sync (on commit)') when you create the index.
2. Create an after insert or update statement-level trigger (not a row-level trigger) that uses dbms_job.submit to schedule ctx_ddl.sync_index, so that the index is synchronized when the DML commits.
3. Run ctx_ddl.sync_index manually from time to time, or schedule it.
4. Alter and rebuild the index periodically.
5. Drop and recreate the index periodically.
Which method you choose depends on how current the information that you query needs to be. If your data needs to be current up to the moment, then you should sync on commit. Otherwise it may be better to synchronize in periodic batches.
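As a rough sketch of the periodic-batch approach, the statements below check the pending queue, run one manual sync, and schedule a recurring sync. The index name con_properties_searchidx comes from the question. The job name, the 50M memory setting and the 5-minute interval are only illustrative assumptions, and DBMS_SCHEDULER assumes Oracle 10g or later (on older releases dbms_job.submit plays the same role).

```sql
-- How many rows are waiting to be synchronized, per index
SELECT pnd_index_name, COUNT(*) AS pending_rows
FROM   ctx_user_pending
GROUP  BY pnd_index_name;

-- One-off manual synchronization (memory value is an assumption)
BEGIN
  ctx_ddl.sync_index('con_properties_searchidx', '50M');
END;
/

-- Recurring synchronization; job name and interval are hypothetical
BEGIN
  DBMS_SCHEDULER.create_job(
    job_name        => 'SYNC_CON_PROPERTIES_SEARCHIDX',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN ctx_ddl.sync_index(''con_properties_searchidx'', ''50M''); END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=MINUTELY; INTERVAL=5',
    enabled         => TRUE);
END;
/
```

If the first query shows the queue shrinking after each run, the sync is keeping up. A longer interval means fewer, larger batches, which usually fragments the index less than syncing on every commit.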

Similar Messages

Oracle text performance with context search indexes

Search performance using context index.
We intend to move our search engine to a new one based on Oracle Text, but we are running into
poor search performance.
Our application allows the user to search stored documents by name, object identifier and annotations (previously set on objects).
For example, suppose I want to find a document named ImportSax2.c. Depending on the parameters set by the user, our search engine builds one of the following
search queries:
1) If the user explicitly asks for a search by document name, the query is =>
 select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c WITHIN objname' , 1 ) > 0;
2) If the user doesn't specify any extra parameters, the query is =>
 select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c' , 1 ) > 0;
Oracle Text needs only around 7 seconds to answer the second query, whereas it needs around 50 seconds to answer the first one.
Here is part of the SQL script used to create the Oracle Text index on the column OBJFIELDURL
(this column stores the path to an XML file containing the properties that have to be indexed for each object):
begin
Ctx_Ddl.Create_Preference('wildcard_pref', 'BASIC_WORDLIST');
ctx_ddl.set_attribute('wildcard_pref', 'wildcard_maxterms', 200) ;
ctx_ddl.set_attribute('wildcard_pref','prefix_min_length',3);
ctx_ddl.set_attribute('wildcard_pref','prefix_max_length',6);
ctx_ddl.set_attribute('wildcard_pref','STEMMER','AUTO');
ctx_ddl.set_attribute('wildcard_pref','fuzzy_match','AUTO');
ctx_ddl.set_attribute('wildcard_pref','prefix_index','TRUE');
ctx_ddl.set_attribute('wildcard_pref','substring_index','TRUE');
end;
begin
ctx_ddl.create_preference('doc_lexer_perigee', 'BASIC_LEXER');
ctx_ddl.set_attribute('doc_lexer_perigee', 'printjoins', '_-');
ctx_ddl.set_attribute('doc_lexer_perigee', 'BASE_LETTER', 'YES');
ctx_ddl.set_attribute('doc_lexer_perigee','index_themes','yes');
ctx_ddl.create_preference('english_lexer','basic_lexer');
ctx_ddl.set_attribute('english_lexer','index_themes','yes');
ctx_ddl.set_attribute('english_lexer','theme_language','english');
ctx_ddl.set_attribute('english_lexer', 'printjoins', '_-');
ctx_ddl.set_attribute('english_lexer', 'BASE_LETTER', 'YES');
ctx_ddl.create_preference('german_lexer','basic_lexer');
ctx_ddl.set_attribute('german_lexer','composite','german');
ctx_ddl.set_attribute('german_lexer','alternate_spelling','GERMAN');
ctx_ddl.set_attribute('german_lexer','printjoins', '_-');
ctx_ddl.set_attribute('german_lexer', 'BASE_LETTER', 'YES');
ctx_ddl.set_attribute('german_lexer','NEW_GERMAN_SPELLING','YES');
ctx_ddl.set_attribute('german_lexer','OVERRIDE_BASE_LETTER','TRUE');
ctx_ddl.create_preference('japanese_lexer','JAPANESE_LEXER');
ctx_ddl.create_preference('global_lexer', 'multi_lexer');
ctx_ddl.add_sub_lexer('global_lexer','default','doc_lexer_perigee');
ctx_ddl.add_sub_lexer('global_lexer','german','german_lexer','ger');
ctx_ddl.add_sub_lexer('global_lexer','japanese','japanese_lexer','jpn');
ctx_ddl.add_sub_lexer('global_lexer','english','english_lexer','en');
end;
begin
 ctx_ddl.create_section_group('axmlgroup', 'AUTO_SECTION_GROUP');
end;
drop index ADSOBJ_XOBJFIELDURL force;
create index ADSOBJ_XOBJFIELDURL on ADSOBJ(OBJFIELDURL) indextype is ctxsys.context
parameters
('datastore ctxsys.file_datastore
filter ctxsys.inso_filter
sync (on commit)
lexer global_lexer
language column OBJFIELDURLLANG
charset column OBJFIELDURLCHARSET
format column OBJFIELDURLFORMAT
section group axmlgroup
Wordlist wildcard_pref');
Oracle created a table named DR$ADSOBJ_XOBJFIELDURL$I, which now contains around 25 million records.
ADSOBJ is the table containing information about our documents, and OBJFIELDURL is the column that holds the path to the XML file containing
the data to index. That file looks like this:
<?xml version="1.0" encoding="UTF-8" ?>
<fields>
<OBJNAME><![CDATA[NomLnk_177527o.jpgp]]></OBJNAME>
<OBJREM><![CDATA[Z_CARACT_141]]></OBJREM>
<OBJID>295926o.jpgp</OBJID>
</fields>
Can someone tell me how I can make that kind of request
"select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c WITHIN objname' , 1 ) > 0;"
run faster ?

Below are the execution plans for the two queries:
select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c WITHIN objname' , 1 ) > 0
PLAN_TABLE_OUTPUT
| Id | Operation |Name |Rows |Bytes |Cost (%CPU)|
| 0 | SELECT STATEMENT | |1272 |119K | 4 (0) |
| 1 | TABLE ACCESS BY INDEX ROWID |ADSOBJ |1272 |119K | 4 (0) |
| 2 | DOMAIN INDEX |ADSOBJ_XOBJFIELDURL | | | 4 (0) |
Note
- 'PLAN_TABLE' is old version
Executed in 2 seconds
select objid FROM ADSOBJ WHERE CONTAINS( OBJFIELDURL , 'ImportSax2.c' , 1 ) > 0
PLAN_TABLE_OUTPUT
| Id |Operation |Name |Rows |Bytes |Cost (%CPU)|
| 0 | SELECT STATEMENT | |1272 |119K | 4 (0) |
| 1 | TABLE ACCESS BY INDEX ROWID |ADSOBJ |1272 |119K | 4 (0) |
| 2 | DOMAIN INDEX |ADSOBJ_XOBJFIELDURL | | | 4 (0) |
Sorry for the result formatting, I can't get it "easily" readable :(

Regular expression vs oracle text performance

Does anyone have experience comparing the performance of regular expressions vs Oracle Text?
We need to implement a text search on a large-volume table, 100K-500K rows.
The select statement will select from a _VL view, which joins 2 tables, _B and _TL.
We need to search 2 text columns from this _VL view.
Using regex seems less complex, but the deciding factor is of course performance.
Would an Oracle Text search perform better than regular expressions in general?
Thanks,
Margaret

Hi Dominc,
Thanks, we'll try both...
Would you be able to validate our code to create the multi-table index:
CREATE OR REPLACE PACKAGE requirements_util AS
  PROCEDURE concat_columns(i_rowid IN ROWID, io_text IN OUT NOCOPY VARCHAR2);
END requirements_util;
/
CREATE OR REPLACE PACKAGE BODY requirements_util AS
  PROCEDURE concat_columns(i_rowid IN ROWID, io_text IN OUT NOCOPY VARCHAR2)
  AS
    tl_req pjt_requirements_tl%ROWTYPE;
    b_req pjt_requirements_b%ROWTYPE;
    CURSOR cur_req_name (i_rqmt_id IN pjt_requirements_tl.rqmt_id%TYPE) IS
      SELECT rqmt_name FROM pjt_requirements_tl
      WHERE rqmt_id = i_rqmt_id;
    -- Appends one string to the output, separated by a space
    PROCEDURE add_piece(i_add_str IN VARCHAR2) IS
      lx_too_big EXCEPTION;
      PRAGMA EXCEPTION_INIT(lx_too_big, -6502);
    BEGIN
      io_text := io_text||' '||i_add_str;
    EXCEPTION WHEN lx_too_big THEN NULL; -- silently don't add the string.
    END add_piece;
  BEGIN
    -- Fetch the base row being indexed; nothing to index if it is gone
    BEGIN
      SELECT * INTO b_req FROM pjt_requirements_b WHERE ROWID = i_rowid;
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        RETURN;
    END;
    add_piece(b_req.req_code);
    -- Append every translated name that belongs to this requirement
    FOR tl_req IN cur_req_name(b_req.rqmt_id) LOOP
      add_piece(tl_req.rqmt_name);
    END LOOP;
  END concat_columns;
END requirements_util;
/
EXEC ctx_ddl.drop_section_group('rqmt_sectioner');
EXEC ctx_ddl.drop_preference('rqmt_user_ds');
BEGIN
ctx_ddl.create_preference('rqmt_user_ds', 'USER_DATASTORE');
ctx_ddl.set_attribute('rqmt_user_ds', 'procedure', sys_context('userenv','current_schema')||'.'||'requirements_util.concat_columns');
ctx_ddl.set_attribute('rqmt_user_ds', 'output_type', 'VARCHAR2');
END;
CREATE INDEX rqmt_cidx ON pjt_requirements_b(req_code)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('DATASTORE rqmt_user_ds
SYNC (ON COMMIT)');

Oracle text performance

Hi all,
I am using Oracle Text for indexing purposes.
The query below does a wildcard search and its performance is very poor:
/* Formatted on 2009/08/11 16:06 (Formatter Plus v4.8.5) */
SELECT *
FROM (SELECT z.*, ROWNUM r
 FROM (SELECT *
 FROM (SELECT *
 FROM (SELECT score (1) score,
 SUBSTR (note, 1, 200) note, tmplt_id,
 appl_name, appl_use, appl_reference,
 appl_entity, effctv_start_date,
 effctv_end_date, eq_name, note_id,
 Bfcommon.getvaluebasedontemplate
 (tmplt_id,
 note_id
 ) VALUE,
 (SELECT em_name
 FROM ip_eq
 WHERE eq_name = a.eq_name) em_name,
 Bfcommon.is_edit_allowed_note1
 (a.note_id,
 'PACRIM1\E317329',
 SYSDATE,
 SYSDATE
 ) LOCKED
 FROM bf_note a
 WHERE tmplt_id NOT IN (19, 14, 16)
 AND effctv_start_date >=
 TO_DATE ('08/12/2008 00:00:00',
 'MM/DD/YYYY HH24:MI:SS')
 AND ( effctv_end_date <=
 TO_DATE
 ('08/12/2009 00:00:00',
 'MM/DD/YYYY HH24:MI:SS')
 OR effctv_end_date IS NULL)
 AND contains (note, '%test%', 1) > 0) r
 UNION
 SELECT score (1) score, SUBSTR (note, 1, 200) note,
 tmplt_id, appl_name, appl_use,
 appl_reference, appl_entity,
 effctv_start_date, effctv_end_date, eq_name,
 a.note_id, b.trgt_name VALUE,
 (SELECT em_name
 FROM ip_eq
 WHERE eq_name = a.eq_name) em_name,
 Bfcommon.is_edit_allowed_note1
 (a.note_id,
 'PACRIM1\E317329',
 SYSDATE,
 SYSDATE
 ) LOCKED
 FROM bf_note a, om_limit_hist b
 WHERE ( date_changed =
 (SELECT MAX (date_changed)
 FROM om_limit_hist h1
 WHERE date_changed <
 (SELECT MAX (date_changed)
 FROM om_limit_hist h2
 WHERE h2.trgt_name =
 b.trgt_name)
 AND h1.trgt_name = b.trgt_name)
 OR date_changed =
 (SELECT MAX (date_changed)
 FROM om_limit_hist h1
 WHERE h1.trgt_name = b.trgt_name))
 AND b.note_id = a.note_id(+)
 AND tmplt_id = 19
 AND effctv_start_date >=
 TO_DATE ('08/12/2008 00:00:00',
 'MM/DD/YYYY HH24:MI:SS')
 AND ( effctv_end_date <=
 TO_DATE ('08/12/2009 00:00:00',
 'MM/DD/YYYY HH24:MI:SS')
 OR effctv_end_date IS NULL)
 AND contains (note, '%test%', 1) > 0)
 ORDER BY 1 DESC) z
 WHERE ROWNUM <= 50)
WHERE r >= 1
Here it does a wildcard search for the string "test". Please tell me how to index the column for this case.
Its plan is:
Execution Plan
Plan hash value: 3535478881
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 50 | 128K| | 632 (1)| 00:00:08 |
|* 1 | VIEW | | 50 | 128K| | 632 (1)| 00:00:08 |
|* 2 | COUNT STOPKEY | | | | | | |
| 3 | VIEW | | 69 | 177K| | 632 (1)| 00:00:08 |
|* 4 | SORT ORDER BY STOPKEY | | 69 | 177K| 376K| 632 (1)| 00:00:08 |
| 5 | VIEW | | 69 | 177K| | 590 (1)| 00:00:08 |
| 6 | SORT UNIQUE | | 69 | 7281 | | 590 (2)| 00:00:08 |
| 7 | UNION-ALL | | | | | | |
|* 8 | TABLE ACCESS BY INDEX ROWID | BF_NOTE | 68 | 7140 | | 585 (1)| 00:00:08 |
|* 9 | DOMAIN INDEX | BF_NOTE_TEXT_SEARCH | | | | 221 (0)| 00:00:03 |
|* 10 | FILTER | | | | | | |
| 11 | NESTED LOOPS | | 1 | 141 | | 3 (0)| 00:00:01 |
| 12 | TABLE ACCESS FULL | OM_LIMIT_HIST | 1 | 36 | | 2 (0)| 00:00:01 |
|* 13 | TABLE ACCESS BY INDEX ROWID| BF_NOTE | 1 | 105 | | 1 (0)| 00:00:01 |
|* 14 | INDEX UNIQUE SCAN | BF_NOTE_PK | 1 | | | 1 (0)| 00:00:01 |
| 15 | SORT AGGREGATE | | 1 | 23 | | | |
|* 16 | INDEX RANGE SCAN | OM_LIMIT_HIST_IDX1 | 1 | 23 | | 0 (0)| 00:00:01 |
| 17 | SORT AGGREGATE | | 1 | 23 | | | |
|* 18 | INDEX RANGE SCAN | OM_LIMIT_HIST_IDX1 | 1 | 23 | | 0 (0)| 00:00:01 |
| 19 | SORT AGGREGATE | | 1 | 23 | | | |
|* 20 | INDEX RANGE SCAN | OM_LIMIT_HIST_IDX1 | 1 | 23 | | 0 (0)| 00:00:01 |
Predicate Information (identified by operation id):
 1 - filter("R">=1)
 2 - filter(ROWNUM<=50)
 4 - filter(ROWNUM<=50)
 8 - filter("EFFCTV_START_DATE">=TO_DATE(' 2008-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
 "TMPLT_ID"<>19 AND "TMPLT_ID"<>14 AND "TMPLT_ID"<>16 AND ("EFFCTV_END_DATE" IS NULL OR
 "EFFCTV_END_DATE"<=TO_DATE(' 2009-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
 9 - access("CTXSYS"."CONTAINS"("NOTE",'%test%',1)>0)
10 - filter("DATE_CHANGED"= (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H1" WHERE
 "H1"."TRGT_NAME"=:B1 AND "DATE_CHANGED"< (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H2" WHERE
 "H2"."TRGT_NAME"=:B2)) OR "DATE_CHANGED"= (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H1"
 WHERE "H1"."TRGT_NAME"=:B3))
13 - filter("TMPLT_ID"=19 AND "EFFCTV_START_DATE">=TO_DATE(' 2008-08-12 00:00:00', 'syyyy-mm-dd
 hh24:mi:ss') AND "CTXSYS"."CONTAINS"("NOTE",'%test%',1)>0 AND ("EFFCTV_END_DATE" IS NULL OR
 "EFFCTV_END_DATE"<=TO_DATE(' 2009-08-12 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
14 - access("B"."NOTE_ID"="A"."NOTE_ID")
16 - access("H1"."TRGT_NAME"=:B1 AND "DATE_CHANGED"< (SELECT /*+ */ MAX("DATE_CHANGED") FROM
 "OM_LIMIT_HIST" "H2" WHERE "H2"."TRGT_NAME"=:B2))
 filter("DATE_CHANGED"< (SELECT /*+ */ MAX("DATE_CHANGED") FROM "OM_LIMIT_HIST" "H2" WHERE
 "H2"."TRGT_NAME"=:B1))
18 - access("H2"."TRGT_NAME"=:B1)
20 - access("H1"."TRGT_NAME"=:B1)

What version of Oracle are you using?
Your sample query is a good example of the benefits and costs of 'mixed' queries, and the challenges of mixed query performance. Oracle 11g has some very helpful new features (search for SDATA) that can really improve performance. (Specifically, it looks like your query does some useful date-range bounding. You need to get that into the FT index).
In the end, it's not going to be easy to look at your reasonably complex query, understand the data and relationships, and wave the magic wand to make the thing go fast.
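One way to get the date-range bounding into the text index on 11g is a composite domain index with a FILTER BY clause, which stores the listed columns as SDATA alongside the text. The following is only a sketch: the index name BF_NOTE_TEXT_SEARCH is taken from the posted plan, but the existing index parameters are not shown in the thread, so any lexer, wordlist or sync settings it already has would need to be carried over into PARAMETERS.

```sql
-- Recreate the text index so the filter columns live inside it (11g and later).
-- Any PARAMETERS used by the current index are unknown here and must be re-added.
DROP INDEX bf_note_text_search;

CREATE INDEX bf_note_text_search ON bf_note (note)
  INDEXTYPE IS CTXSYS.CONTEXT
  FILTER BY tmplt_id, effctv_start_date, effctv_end_date;

-- The query text itself does not change: ordinary predicates on the
-- FILTER BY columns can now be evaluated by the domain index.
SELECT note_id
FROM   bf_note
WHERE  CONTAINS(note, '%test%', 1) > 0
AND    tmplt_id NOT IN (19, 14, 16)
AND    effctv_start_date >= TO_DATE('08/12/2008', 'MM/DD/YYYY');
```

Whether the optimizer actually pushes the predicates into the index depends on statistics and selectivity, so compare the plan before and after.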

"Oracle text" performance Problem

Architecture for Performance on a web site Search!
I want to use the Oracle Text service, but I am worried about performance.
How should I design the system if I want the best performance and scalability?
1. Should I build a separate column in every table, merge all the information into that one column, and full-text index it?
2. Put a full-text index on every column in the table, use an OR clause, and reverse-rank it for the AND clause using the CONTAINSTABLE function?
3. Make a different table with ID, TYPE and _VALUE fields and search in that table, which has fewer columns?
4. Separate out the full-text database and search in a separate DB so that I can scale better?
Has anybody had a similar problem? Are there any books on full-text search?

The number of indexes is irrelevant as such. If you really need 100 tables and you really need full text search on all of them, you need 100 indexes. When you are inserting data in any given table, the fact that there are 99 other tables with 99 other Text indexes is irrelevant.
That being said, I would seriously question whether a data model that involves doing full-text searches on 100 separate tables was actually a proper data model. That strikes me as highly unlikely.
Justin

Oracle Full text Performance

Hi,
I created the indexes using the following commands. I have a few questions:
1. Is the way I have created the indexes correct? Must I use a datastore? If yes, how do I use one given that there are multiple tables involved?
 CREATE INDEX TITLESEARCH ON SCS_PRODUCT(title_printed) INDEXTYPE IS ctxsys.context; 
CREATE INDEX SUBTITLESEARCH ON SCS_PRODUCT(subtitle_printed) INDEXTYPE IS ctxsys.context; 
CREATE INDEX FIRSTNAMESEARCH ON SCS_CONTRIBUTOR(first_name) INDEXTYPE IS ctxsys.context; 
CREATE INDEX LASTNAMESEARCH ON SCS_CONTRIBUTOR(LAST_name) INDEXTYPE IS ctxsys.context; 
CREATE INDEX PRINTEDNAMESEARCH ON SCS_CONTRIBUTOR(PRINTED_name) INDEXTYPE IS ctxsys.context; 
CREATE INDEX ESSNSEARCH ON SCS_JOURNAL(ESSN) INDEXTYPE IS ctxsys.context; 
CREATE INDEX ISSNSEARCH ON SCS_JOURNAL(ISSN) INDEXTYPE IS ctxsys.context; 
CREATE INDEX ISBN10SEARCH ON scs_sku(isbn_old) INDEXTYPE IS ctxsys.context; 
CREATE INDEX ISBN13SEARCH ON scs_sku(isbn) INDEXTYPE IS ctxsys.context; 
CREATE INDEX AVAILABLEUSSEARCH ON SCS_PRODUCT(AVAILABLE_SAGE) INDEXTYPE IS ctxsys.context; 
CREATE INDEX AVAILABLEUKSEARCH ON SCS_PRODUCT(AVAILABLE_SAGE_UK) INDEXTYPE IS ctxsys.context; 
CREATE INDEX KEYWORDSEARCH ON DCS_PRD_KEYWRDS(KEYWORD) INDEXTYPE IS ctxsys.context; 
2. The performance is very slow.
3. I sometimes get the following error when I try to search through my web application:
ORA-29861: domain index is marked LOADING/FAILED/UNUSABLE
Any help would be really appreciated!
Thanks
Gautam

1) It is correct, but it may not perform well (you are using the default options).
I would not use Oracle Text for the isbn and isbn_old columns. A normal index is better there.
2) Could you post the SQL, the execution plan, and more information about the machine?
3) This can occur when the machine is too slow for Oracle, so the index is still being built (LOADING).
oerr ora 29861
29861, 00000, "domain index is marked LOADING/FAILED/UNUSABLE"
// *Cause: An attempt has been made to access a domain index that is
// being built or is marked failed by an unsuccessful DDL
// or is marked unusable by a DDL operation.
// *Action: Wait if the specified index is marked LOADING
// Drop the specified index if it is marked FAILED
// Drop or rebuild the specified index if it is marked UNUSABLE.
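To see which of the three states from the error message applies, the data dictionary can be queried directly. This is a minimal sketch that assumes the indexes live in the current schema. The last two statements illustrate the suggestion to use a normal index for ISBN lookups, reusing the index name from the question.

```sql
-- Status of the domain indexes as the database sees them
SELECT index_name, status, domidx_status, domidx_opstatus
FROM   user_indexes
WHERE  index_type = 'DOMAIN';

-- Errors recorded by Oracle Text while indexing individual rows
SELECT err_index_name, err_timestamp, err_text
FROM   ctx_user_index_errors
ORDER  BY err_timestamp DESC;

-- ISBN searches are exact matches, so a regular B-tree index is enough
DROP INDEX isbn13search;
CREATE INDEX isbn13search ON scs_sku (isbn);
```

An index whose DOMIDX_OPSTATUS is FAILED has to be dropped and recreated, as the error text says. One that is still loading only needs time.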

Performance issues and options to reduce load with Oracle text implementation

Hi Experts,
My database is on Oracle 11.2.0.2 on Linux. We have Oracle Text implemented for fuzzy search. Our Oracle Text indexes are defined as sync on commit because we cannot afford to have stale data. Our application does literally thousands of inserts/updates/deletes on the columns that have these Oracle Text indexes. As a result, we are seeing a significant performance impact from the Oracle Text sync routines being called on each commit. We run a full index optimization every night at 3 am. The internal operations related to the Oracle Text indexes show up as top SQL in our AWR report, and there are concerns that they are causing a lot of load on the DB. Since we do the full index optimization only once a night, I am wondering whether I should change that, and if I do, whether it will help us.
For example here are some data from my one day's AWR report:
| Elapsed Time (s) | Executions | Elapsed Time per Exec (s) | %Total | %CPU | %IO | SQL Id | SQL Module | SQL Text |
| 27,386.25 | 305,441 | 0.09 | 16.50 | 15.82 | 9.98 | ddr8uck5s5kp3 | | begin ctxsys.drvdml.com_sync_i... |
| 14,618.81 | 213,980 | 0.07 | 8.81 | 8.39 | 27.79 | 02yb6k216ntqf | | begin ctxsys.syncrn(:idxownid,... |
Full Text of above top sql:
ddr8uck5s5kp3
begin ctxsys.drvdml.com_sync_index(:idxname, :idxmem, :partname);
end
02yb6k216ntqf
begin ctxsys.syncrn(:idxownid, :idxoname, :idxid, :ixpid, :rtabnm, :flg); end;
Now if I run the full index optimization more often, and not just once a night at 3 am, will the load on the DB due to sync on commit decrease? If yes, how often should I optimize, and doesn't the optimization itself add some load? Can someone advise?
Thanks,
OrauserN

You can query the ctx_parameters view to see what your default and maximum memory values are:
SCOTT@orcl12c> COLUMN bytes    FORMAT 9,999,999,999
SCOTT@orcl12c> COLUMN megabytes FORMAT 9,999,999,999
SCOTT@orcl12c> SELECT par_name AS parameter,
2          TO_NUMBER (par_value) AS bytes,
3          par_value / 1048576 AS megabytes
4 FROM   ctx_parameters
5 WHERE par_name IN ('DEFAULT_INDEX_MEMORY', 'MAX_INDEX_MEMORY')
6 ORDER BY par_name
7 /
PARAMETER                               BYTES      MEGABYTES
DEFAULT_INDEX_MEMORY               67,108,864             64
MAX_INDEX_MEMORY                1,073,741,824          1,024
2 rows selected.
You can set the memory value in your index parameters:
SCOTT@orcl12c> CREATE INDEX EMPLOYEE_IDX01
2 ON EMPLOYEES (EMP_NAME)
3 INDEXTYPE IS CTXSYS.CONTEXT
4 PARAMETERS ('SYNC (ON COMMIT) MEMORY 1024M')
5 /
Index created.
You can also modify the default and maximum values using CTX_ADM.SET_PARAMETER:
http://docs.oracle.com/cd/E11882_01/text.112/e24436/cadmpkg.htm#CCREF2096
The following contains general guidelines for what to set the max_index_memory parameter and others to:
http://docs.oracle.com/cd/E11882_01/text.112/e24435/aoptim.htm#CCAPP9274

Performance issue with Oracle Text index

Hi Experts,
We are on Oracle 11.2.0.3 on Solaris 10. I have implemented Oracle Text in our environment and I am facing a strange performance issue.
One SQL statement with a CONTAINS clause is taking forever: it runs for more than 20 minutes and still does not complete. This statement has a CONTAINS clause, an EXISTS clause and a NOT EXISTS clause.
If I remove the EXISTS clause and the NOT EXISTS clause, it completes fast, but with those two clauses it just takes forever. It is late at night, so I am not able to post the table and query details now and will do so tomorrow, but based on this general description, are there any pointers for me to review?
sql query doing fine:
SELECT
    U.CLNT_OID, U.USR_OID, S.MAILADDR
FROM
    access_usr U
    INNER JOIN access_sia S
        ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
    WHERE U.CLNT_OID = 'ABCX32S'
    AND CONTAINS(LAST_NAME , 'TO%' ) >0
--sql query that hangs forever:
SELECT
    U.CLNT_OID, U.USR_OID, S.MAILADDR
FROM
    access_usr U
    INNER JOIN access_sia S
        ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
    WHERE U.CLNT_OID = 'ABCX32S'
    AND CONTAINS(LAST_NAME , 'TO%' ) >0
and exists (--one clause here with a few table joins)
and not exists (--one clause here with a few table joins);
--Now another strange thing I found is if instead of 'TO%' in this sql, if I were to use 'ZZ%' or 'L1%' it works fast but for 'TO%' it goes slow with those two exists not exists clauses!
I will be most thankful for the inputs.
OrauserN

Hi Barbara,
First of all, thanks a lot for reviewing the issue.
Unfortunately, making the change to empty_stoplist did not work out. Today I am copying the entire SQL that has this issue and will be most thankful for more insights/pointers on what can be done.
Here is the entire sql:
SELECT U.CLNT_OID,
       U.USR_OID,
       S.EMAILADDRESS,
       U.FIRST_NAME,
       U.LAST_NAME,
       S.JOBCODE,
       S.LOCATION,
       S.DEPARTMENT,
       S.ASSOCIATEID,
       S.ENTERPRISECOMPANYCODE,
       S.EMPLOYEEID,
       S.PAYGROUP,
       S.PRODUCTLOCALE
FROM    ACCESS_USR U
       INNER JOIN
          ACCESS_SIA S
       ON S.USR_OID = U.USR_OID AND S.CLNT_OID = U.CLNT_OID
WHERE     U.CLNT_OID = 'G39NY3D25942TXDA'
       AND EXISTS
              (SELECT 1
                 FROM ACCESS_USR_GROUP_XREF UGX
                      INNER JOIN ACCESS_GROUP RELG
                         ON     RELG.CLNT_OID = UGX.CLNT_OID
                            AND RELG.GROUP_OID = UGX.GROUP_OID
                      INNER JOIN ACCESS_GROUP G
                         ON     G.CLNT_OID = RELG.CLNT_OID
                            AND G.GROUP_TYPE_OID = RELG.GROUP_TYPE_OID
                WHERE     UGX.CLNT_OID = U.CLNT_OID
                      AND UGX.USR_OID = U.USR_OID
                      AND G.GROUP_OID = 920512943
                      AND UGX.INCLUDED = 1)
       AND NOT EXISTS
                  (SELECT 1
                     FROM    ACCESS_USR_GROUP_XREF UGX
                          INNER JOIN
                             ACCESS_GROUP G
                          ON     G.CLNT_OID = UGX.CLNT_OID
                             AND G.GROUP_OID = UGX.GROUP_OID
                    WHERE     UGX.CLNT_OID = U.CLNT_OID
                          AND UGX.USR_OID = U.USR_OID
                          AND G.GROUP_OID = 920512943
                          AND UGX.INCLUDED = 1)
       AND CONTAINS (U.LAST_NAME, 'Bon%') > 0;
Like I said before, if the EXISTS and NOT EXISTS clauses are removed it runs in under a second. But with those EXISTS and NOT EXISTS clauses it takes anywhere from 25 minutes to more than one hour.
Note also that it was not TO% but Bon% in the CONTAINS clause that is giving the issue. Sorry, that was wrong on my part.
Also please see below the Oracle Text indexes defined on the table ACCESS_USR:
--definition of preferences used in the index:
SET SERVEROUTPUT ON size unlimited
WHENEVER SQLERROR EXIT SQL.SQLCODE
DECLARE
   v_err       VARCHAR2 (1000);
   v_sqlcode   NUMBER;
   v_count     NUMBER;
BEGIN
   ctxsys.ctx_ddl.create_preference ('cust_lexer', 'BASIC_LEXER');
   ctxsys.ctx_ddl.set_attribute ('cust_lexer', 'base_letter', 'YES'); -- removes diacritics
EXCEPTION
   WHEN OTHERS
   THEN
      v_err := SQLERRM;
      v_sqlcode := SQLCODE;
      v_count := INSTR (v_err, 'DRG-10701');
      IF v_count > 0
      THEN
         DBMS_OUTPUT.put_line (
            'The required preference named CUST_LEXER with BASIC LEXER is already set up');
      ELSE
         RAISE;
      END IF;
END;
DECLARE
   v_err       VARCHAR2 (1000);
   v_sqlcode   NUMBER;
   v_count     NUMBER;
BEGIN
   ctxsys.ctx_ddl.create_preference ('cust_wl', 'BASIC_WORDLIST');
   ctxsys.ctx_ddl.set_attribute ('cust_wl', 'SUBSTRING_INDEX', 'true'); -- to improve performance
EXCEPTION
   WHEN OTHERS
   THEN
      v_err := SQLERRM;
      v_sqlcode := SQLCODE;
      v_count := INSTR (v_err, 'DRG-10701');
      IF v_count > 0
      THEN
         DBMS_OUTPUT.put_line (
            'The required preference named CUST_WL with BASIC WORDLIST is already set up');
      ELSE
         RAISE;
      END IF;
END;
--now below is the code of the index:
CREATE INDEX ACCESS_USR_IDX3 ON ACCESS_USR
(FIRST_NAME)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
CREATE INDEX ACCESS_USR_IDX4 ON ACCESS_USR
(LAST_NAME)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('LEXER cust_lexer WORDLIST cust_wl SYNC (ON COMMIT)');
The strange thing is that, like I said, If I remove the exists clause the query returns very fast. Also if I modify the query to use only one NOT EXISTS clause and remove the other EXISTS clause it returns in less than one second. Also if I remove the EXISTS clause and use only the NOT EXISTS clause it returns in less than 4 seconds. But with both clauses it runs forever!
When I tried to get dbms_xplan.display_cursor to get the query plan (for the case of both exists and not exists clause in the query), it said that previous statement's sql id was 0 or something like that so that I was not able to see the query plan. I will keep trying to get this plan (it takes 25 minutes to one hour each time but will get this info soon). Again any pointers are most helpful.
Regards
OrauserN

Error when attempting to add Oracle Text using DBCA

Hi all -
I'm trying to add Oracle Text to my 10gR1 database using DBCA. The Oracle Text option is available, but when I attempt to add it I get the error message: "A tablespace for the database option Oracle Text is not found. Cannot install this option in the database." I have a SYSAUX tablespace, which I believe is the default tablespace for Text, so I'm not sure what the problem could be. I also tried manually adding Oracle Text by running catctx.sql but received the error message:
ERROR at line 1:
ORA-39705: component 'CONTEXT' not found in registry
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 86
ORA-06512: at "SYS.DBMS_REGISTRY", line 757
ORA-06512: at line 2
Any ideas or suggestions would be greatly appreciated. Thanks!

Hi,
This looks like a half-finished install. I think you should try to deinstall Oracle Text and then reinstall it. Have a look at note 280713.1 on Oracle Support.
Herald ten Dam
http://htendam.wordpress.com

Oracle Text ALTER INDEX Performance

Greetings,
We have encountered some enhancement issues with Oracle Text and really need assistance.
We are using Oracle 9i (Release 9.0.1) Standard Edition
We are using a very simple Oracle Text environment, with the CTXSYS.CONTEXT indextype on domain indexes.
We have indexed two text columns in one table; one of these columns is a CLOB.
Currently, if one of these columns is modified, we use a trigger to automatically ALTER the index.
This is very slow; it is just like dropping the index and creating it again.
Is this right? Should it be this slow?
We are also trying to use the ONLINE parameter for ALTER INDEX and CREATE INDEX, but it gives an error saying this feature is not enabled.
How can we enable it?
Is there any way to improve the performance of this automatic update of the indexes?
Would using a trigger be the best way to do this?
How can we optimize it to a more satisfactory performance level?
Also, are we able to use the language lexers for indexes with the Standard Edition? If so, how do you enable CTX_DDL?
Many thanks for any assistance.
Chi-Shyan Wang

If you are going to sync your index on every update, you need to make sure that you optimize it on a regular basis to reduce index fragmentation and remove deleted rows.
You can set up a dbms_job that calls ctx_ddl.optimize_index and runs a full optimize periodically.
Also, depending on the number of rows you have, and also the size of the data, you might want to look at using a CTXCAT index, which is transactional, stays in sync automatically and does not need to be optimized. CTXCAT indexes do not work well on large text objects (they are good for a couple of lines of text at most), so they may not suit your dataset.
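A minimal sketch of that setup follows, assuming a CONTEXT index named my_text_idx (a hypothetical name) and a nightly 02:00 window chosen only for illustration. The first block shows an incremental sync, which processes only the pending rows instead of rebuilding the index. The second submits the periodic full optimize.

```sql
-- Incremental sync: indexes only the rows queued since the last sync
BEGIN
  ctx_ddl.sync_index('my_text_idx');
END;
/

-- Nightly full optimize, limited to 60 minutes per run
DECLARE
  l_job BINARY_INTEGER;
BEGIN
  DBMS_JOB.submit(
    job       => l_job,
    what      => 'ctx_ddl.optimize_index(''my_text_idx'', ''FULL'', 60);',
    next_date => TRUNC(SYSDATE) + 1 + 2/24,      -- tomorrow at 02:00
    interval  => 'TRUNC(SYSDATE) + 1 + 2/24');   -- re-evaluated after each run
  COMMIT;  -- the job is picked up by the job queue only after commit
END;
/
```

The job queue only runs if job_queue_processes is greater than zero. With a time limit, a FULL optimize stops when the limit is reached and continues from that point on the next run.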

Bad Query Performance in Oracle Text

Hello everyone, I have the following problem:
I have a table, TABLE_A from now on, with roughly 1,000,000 rows and a CONTEXT index using FILE_DATASTORE, CTXSYS.DEFAULT_STORAGE, CTXSYS.NULL_FILTER and CTXSYS.BASIC_LEXER. I query the index in the following way:
SELECT /*+FIRST_ROWS*/ A.ID, B.ID2, SCORE(1) FROM TABLE_A A, TABLE_B B WHERE A.ID = B.ID AND CONTAINS(A.PATH, '<SOME KW>', 1) > 0 ORDER BY SCORE(1) DESC
where TABLE_B has another 1,000,000 rows.
The problem is that the query response time is much higher after a period of inactivity on those tables. How can I avoid this behavior? Those periods of inactivity (no more than 20 min) are central to my application, so I always get very long response times for my queries.
Is there any cache or cache-time parameter that affects this behavior? I have checked the Oracle Text documentation without finding anything about it...
More data: I am using Oracle 9.2.0.1, but I have tested with the latest patches and the behavior is the same...
Thank you very much in advance.

Pablo,
This appears to be a generic database or OS issue, not a Text-specific issue. It really depends on what your application is doing.
If your application is doing other database activity, such as queries or DML on other non-text tables, chances are the Oracle Text related data blocks are being aged out of the buffer cache. You can either increase the db_cache_size init parameter or try to keep the blocks of the text tables and index tables in cache using ALTER TABLE commands.
If your app is doing NON-database activity, then chances are your application is taking up so much of the machine's physical memory that the OS is swapping ORACLE out of memory. In that case, you may want to consider adding more memory to the machine or running ORACLE on a separate machine by itself.
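One possible way to keep the Text index blocks cached, sketched under assumptions: the CONTEXT index on TABLE_A.PATH is taken to be named TABLE_A_PATH_IDX (a made-up name), so its token table would be DR$TABLE_A_PATH_IDX$I with the index DR$TABLE_A_PATH_IDX$X on it. The 256M pool size is a placeholder and has to fit within the memory available to the instance.

```sql
-- Sketch only: index name and pool size are assumptions, not from the thread.

-- Reserve a KEEP buffer pool (requires ALTER SYSTEM privilege).
ALTER SYSTEM SET db_keep_cache_size = 256M;

-- Assign the Text token table and its b-tree index to the KEEP pool,
-- so their blocks compete less with unrelated tables for cache space.
ALTER TABLE dr$table_a_path_idx$i STORAGE (BUFFER_POOL KEEP);
ALTER INDEX dr$table_a_path_idx$x STORAGE (BUFFER_POOL KEEP);
```

These settings are lost if the Text index is dropped and recreated, since the DR$ objects are recreated with it; a BASIC_STORAGE preference can carry the same storage clauses at index creation time.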

Performance of Oracle Text

Hi,
I'm tasked to help design an application that will have Oracle Text powering the searching logic. The application will have millions of records (in the 30 million to 50 million range), but there's a restriction that 95% of all searches must be able to complete in 1 second or less (!)
So, my question is: is it possible for Oracle Text to meet this criterion, assuming we have the best hardware, etc.? Or should I look for another solution/approach?
Regards,
Roy

Hi Roy,
It's pretty hard to give a yes/no answer based on the limited information. I will say that Oracle's method of indexing is fairly standard (dictionary/postings list), so you are not likely to find a better solution if your records are stored in Oracle. The Oracle Text advantage is its tight integration with the database: you get the storage and query optimization features of Oracle when using Oracle Text. A response time under 1 second is pretty tight for any search, but I don't think you'd have any better chance of hitting it with another solution.
Thanks,
Ron

Using oracle text on a non-materialized view

I'm having trouble tracking down an error when using oracle text on a non-materialized view (indexes are on the referenced columns). My database has a users table and a user history table which saves the old values when a user profile changes. My view performs a "union all" so I can select from both at once.
I would like to use oracle text to perform a "contains" on the view whenever someone signs up to see if any current users or historical entries contain the desired username.
The following works fine:
contains(user_history_view, 'bill')
but when I reference anything in the contains clause, I get a "column is not indexed" error:
contains(user_history_view, signup.user_name) --signup.username is 'bill'
Here is a stripped-down demonstration (I am using version 10.2.0.4.0)
create table signup (
signup_id   number(19,0) not null,
signup_name varchar2(255),
primary key (signup_id)
);
create table users (
user_id   number(19,0) not null,
user_name varchar2(255),
primary key (user_id)
);
create table user_history (
history_id number(19,0) not null,
user_id    number(19,0) not null,
user_name varchar2(255),
primary key (history_id),
foreign key (user_id) references users on delete set null
);
create index user_name_index on users(user_name)
indextype is ctxsys.context parameters ('sync (on commit)');
create index user_hist_name_index on user_history(user_name)
indextype is ctxsys.context parameters ('sync (on commit)');
create index signup_name_index on signup(signup_name)
indextype is ctxsys.context parameters ('sync (on commit)');
create or replace force view user_history_view
(user_id, user_name, flag_history) as
select user_id, user_name, 'N' from users
union all
select user_id, user_name, 'Y' from user_history;
--user bill changed his name to bob, and there is a pending signup for another bill
insert into users(user_id, user_name) values (1, 'bob');
insert into user_history(history_id, user_id, user_name) values (1, 1, 'bill');
insert into signup(signup_id, signup_name) values(1, 'bill');
commit;
--works
select * from user_history_view users, signup new_user
where new_user.signup_id = 1
and contains(users.user_name, 'bill')>0;
--fails
select * from user_history_view users, signup new_user
where new_user.signup_id = 1
and contains(users.user_name, new_user.signup_name)>0;
I could move everything into a materialized view, but querying against real-time data like this would be ideal. Any help would be greatly appreciated.

Hi,
To my knowledge this is not possible. It is hard for Oracle to do: think of a table with many rows, where every row with that column would have to be checked. So I think only a single varchar2 value is possible. Maybe a function will work for you; it is possible to pass a function as the second parameter.
create or replace function return_signup
return varchar2
is
l_signup_name signup.signup_name%type;
begin
-- look up the pending signup name (signup_id is hard-coded to 1 here)
select signup_name
into l_signup_name
from signup
where signup_id = 1
and rownum = 1;
return l_signup_name;
exception
when no_data_found
then
    l_signup_name := 'abracadabra'; -- hope does not exist
    return l_signup_name;
end;
/
Now you can use the above function in the contains.
select * from user_history_view users --, signup new_user
--where new_user.signup_id = 1
where contains(users.user_name, return_signup)>0;
I didn't test the code! Maybe you have to adjust the function for your needs, but it is an idea of how this can be done.
Otherwise you have to do the check with a normal comparison of the columns, simply using a join:
select * from user_history_view users, signup new_user
where new_user.signup_id = 1
and users.user_name = new_user.signup_name;
Herald ten Dam
htendam.wordpress.com

About index memory parameter for Oracle text indexes

Hi Experts,
I am on Oracle 11.2.0.3 on Linux and have implemented Oracle Text. I am not an expert in this subject and need help about one issue. I created Oracle Text indexes with default setting. However in an oracle white paper I read that the default setting may not be right. Here is the excerpt from the white paper by Roger Ford:
URL:http://www.oracle.com/technetwork/database/enterprise-edition/index-maintenance-089308.html
"(Part of this white paper below....)
Index Memory
As mentioned above, cached $I entries are flushed to disk each time the indexing memory is exhausted. The default index memory at installation is a mere 12MB, which is very low. Users can specify up to 50MB at index creation time, but this is still pretty low.
This would be done by a CREATE INDEX statement something like:
CREATE INDEX myindex ON mytable(mycol) INDEXTYPE IS ctxsys.context PARAMETERS ('index memory 50M');
To allow index memory settings above 50MB, the CTXSYS user must first increase the value of the MAX_INDEX_MEMORY parameter, like this:
begin ctx_adm.set_parameter('max_index_memory', '500M'); end;
The setting for index memory should never be so high as to cause paging, as this will have a serious effect on indexing speed. On smaller dedicated systems, it is sometimes advantageous to temporarily decrease the amount of memory consumed by the Oracle SGA (for example by decreasing DB_CACHE_SIZE and/or SHARED_POOL_SIZE) during the index creation process. Once the index has been created, the SGA size can be increased again to improve query performance."
(End here from the white paper excerpt)
My question is:
1) Applying this procedure (ctx_adm.set_parameter) required me to log in as the CTXSYS user. Is that right, or can it be avoided and done from the application schema? The CTXSYS user is locked by default and I had to unlock it. Is that OK to do in production?
2) What value should I use for max_index_memory? Should it be 500 MB? My SGA is 2 GB in Dev/QA and 3 GB in production. Also, what value should I set for the index memory parameter at index creation? I had left that at the default, but how should I change it now? Should it be 50MB as shown in the example above?
3) The white paper also refers to rebuilding an index at some interval, such as once a month: ALTER INDEX DR$index_name$X REBUILD ONLINE;
Is this correct advice? I would like to ask the experts before doing that. We are on Oracle 11g and the white paper was written in 2003.
Basically while I read the paper, I am still not very clear on several aspects and need help to understand this.
Thanks,
OrauserN

Perhaps it's time I updated that paper.
1. To change max_index_memory you must be a DBA user OR ctxsys. As you say, the ctxsys account is locked by default. It's usually easiest to log in as a DBA and run something like
exec ctxsys.ctx_adm.set_parameter('MAX_INDEX_MEMORY', '10G')
2. Index memory is allocated from PGA memory, not SGA memory, so the size of the SGA is not relevant. If you use too high a setting, your index build may fail with an error saying you have exceeded PGA_AGGREGATE_LIMIT. Of course, you can increase that parameter if necessary. Also be aware that when indexing in parallel, each parallel process will allocate up to the index memory setting.
What should it be set to? It's really a "safety" setting to prevent users grabbing too much machine memory when creating indexes. If you don't have ad-hoc users, then just set it as high as you need. In 10.1 it was limited to just under 500M, in 10.2 you can set it to any value.
The actual amount of memory used is not governed by this parameter, but by the MEMORY setting in the parameters clause of the CREATE INDEX statement. eg:
create index fooindex on foo(bar) indextype is ctxsys.context parameters ('memory 1G')
What's a good number to use for memory? Somewhere in the region of 100M to 200M is usually good.
3. No - that's out of date. To optimize your index use CTX_DDL.OPTIMIZE_INDEX. You can do that in FULL mode daily or weekly, and REBUILD mode perhaps once a month.
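A short sketch of what the points above might look like in practice, reusing the fooindex name from the example. The 60-minute time limit is an arbitrary placeholder, and the session is assumed to have access to the CTX_PARAMETERS view and execute privilege on CTX_DDL.

```sql
-- Check the current default and maximum index memory settings.
SELECT par_name, par_value
FROM   ctx_parameters
WHERE  par_name IN ('DEFAULT_INDEX_MEMORY', 'MAX_INDEX_MEMORY');

-- Daily or weekly: FULL optimization, capped at 60 minutes per run
-- (the cap is an assumption; a later run picks up where this one stopped).
BEGIN
  ctx_ddl.optimize_index('FOOINDEX', 'FULL', maxtime => 60);
END;
/

-- Perhaps monthly: REBUILD mode, which rebuilds the token table
-- instead of optimizing it in place.
BEGIN
  ctx_ddl.optimize_index('FOOINDEX', 'REBUILD');
END;
/
```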

How do I get Oracle Text to index files on a file server?

I am new to Oracle (I'm a MS-SQL DBA looking for a Full-Text Search solution that is better than linking to a MS index server.)
So - Here's the objective:
I have Oracle Server(Express) installed on a Windows server.
I would like for Oracle to build a Full-Text Catalog of the files on a separate file server based on file paths in a table in the database.
(No desire to store terabytes of images and documents inside the database)
I can get Oracle text up and running, using the URL_Datastore:
CREATE TABLE files (id NUMBER PRIMARY KEY, issue_id NUMBER, path VARCHAR(255) UNIQUE, ot_format VARCHAR(6), ot_version VARCHAR(10));
The Compaq server is a remote windows server on my local workgroup, so the fully qualified path is just "compaq" and the URL is valid:
INSERT INTO files VALUES (9,9,'file://Compaq/FTQ/00000003.pdf',NULL,NULL);
INSERT INTO files VALUES (13,13,'file://Compaq/FTQ/01.txt',NULL,NULL);
CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
PARAMETERS ('datastore ctxsys.URL_DATASTORE format column ot_format');
but when I enter:
Select * from CTX_User_Index_errors, I see the following errors:
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/00000003.pdf
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/01.txt
Did I miss something?
Do I need to install anything on the file server?
I would like to convince my company that Oracle can be much quicker than Microsoft's Indexing Service because it can avoid joining two large result sets (one result set from Full_text (indexing service) and one for specific data contained in fields in the MS-SQL database.) Full Text Searches commonly take 40 - 60 seconds where there are 1.5 million multi-page PDF files for a particular set that I sample search on. Without this massive join, I believe I can get the search to run in under 10 seconds.

Thank you!
File_Datastore worked fine.
I was staying away from File_Datastore because the information I gathered from googling suggested that file_datastore would only work locally.
Now I just have to get Oracle to pull data out of tables in a MS-SQL database on the local network (don't have a clue yet), and then have it index compiled file paths.
Then MS-SQL can query Oracle with index and full-text criteria, and Oracle can send back a result set.
It may sound like a bad way of performing Full-Text Queries, but anything will be better than the way things are currently running. We are currently performing Full-Text Searches on a table that is rebuilt nightly, so the table containing millions of file paths is not live.
It would be so much better if we just migrated to Oracle, but we currently do not have the resources.
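For readers hitting the same DRG-11609 error, the switch to the file datastore mentioned above might look roughly like this. The post does not show the exact statements used, so this is a guess at the shape: it assumes the path column now holds file system paths that the database server process can open (rather than file:// URLs) and uses the predefined CTXSYS.FILE_DATASTORE preference.

```sql
-- Sketch only: recreate the index from the question with the file datastore.
DROP INDEX file_index;

CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
PARAMETERS ('datastore ctxsys.file_datastore format column ot_format');

-- Check whether any documents failed to load during indexing.
SELECT err_textkey, err_text
FROM   ctx_user_index_errors
WHERE  err_index_name = 'FILE_INDEX';
```

Whether a remote share is readable depends on the operating system account the Oracle service runs under, which is worth checking first if the errors persist.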
