Doubt in oracle text
Hi,
I am getting a parse error when I try to execute the following using CATSEARCH.
>>> CATSEARCH (item_title,'{SIGN PEN SIGN R50 (M)), 12 PCS/BOX, RED}', null) > 0;
whereas when I use the CONTAINS operator, there is no problem.
>>> contains (item_title,'{SIN PEN SIGN R50 (M)), 12 PCS/BOX, RED}') > 0;
Can anyone please let me know the reason for this behaviour and also the solution to this problem?
regards
S.Karthikeyan
Are you searching on a CTXCAT index? Presumably not.
Cheers, APC
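One thing that may be worth testing for the parse error above: CATSEARCH parses its query string with the CTXCAT grammar, which supports a smaller operator set than the CONTEXT grammar used by CONTAINS, so a string that CONTAINS accepts can still fail to parse in CATSEARCH. A minimal sketch, assuming a CTXCAT index exists on item_title and using a placeholder table name ITEMS (the table name is not given in the question), passes the same string through a query template that asks for the CONTEXT grammar:

```sql
-- Assumption: ITEMS is a placeholder table with a CTXCAT index on ITEM_TITLE.
-- The query template asks CATSEARCH to parse the text with the CONTEXT grammar,
-- where the braces escape the parentheses, commas and slash.
SELECT item_title
FROM   items
WHERE  CATSEARCH (
         item_title,
         '<query>
            <textquery grammar="CONTEXT">{SIGN PEN SIGN R50 (M)), 12 PCS/BOX, RED}</textquery>
          </query>',
         NULL) > 0;
```

Whether this removes the error depends on the index and database version, so it is a suggestion to try rather than a confirmed fix.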
Similar Messages
-
Hi,
I am new to Oracle Text. I have a table with text data in one column, which has a CONTEXT index, and I want to find the words that appear most frequently in that column, so I am using the ctx_doc.themes() procedure.
I have created a result table 'ctx_themes' with the columns query_id, themes and weight, which should be populated by ctx_doc.themes(), but I get the error below when I execute the procedure. Please help me fix the issue.
Error starting at line 1 in command:
begin
ctx_doc.themes('text_data_index','e_id','CTX_THEMES',1,full_themes => FALSE,num_themes=>20);
end;
Error report:
ORA-20000: Oracle Text error:
DRG-11445: rowid value is invalid: e_id
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DOC", line 210
ORA-06512: at line 2
20000. 00000 - "%s"
*Cause: The stored procedure 'raise_application_error'
was called which causes this error to be generated.
*Action: Correct the problem as described in the error message or contact
the application administrator or DBA for more information.
Thank You.
This is meant well: I wouldn't bother posting Oracle Text questions here. It's a specialised area, and not a lot of standard, general DBAs will know much about it. If you click the 'Database' link above, however, and scroll down quite a long way, you'll find a specific Forum for such queries, called 'Text'. Some geniuses in there (Barbara, especially) are extremely knowledgeable on this sort of topic.
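On the DRG-11445 error itself: the second argument of ctx_doc.themes is a text key that identifies one document (a rowid or a primary key value), so passing the column name 'e_id' is read as an invalid rowid. A minimal sketch, assuming the indexed table is called TEXT_DATA_TAB (a placeholder, the real name is not given in the question), calls the procedure once per row with that row's rowid:

```sql
-- Assumption: TEXT_DATA_TAB is a placeholder for the table behind text_data_index.
-- The result table is expected to have the columns QUERY_ID, THEME and WEIGHT.
BEGIN
  -- Tell CTX_DOC that the text key passed below is a rowid.
  CTX_DOC.SET_KEY_TYPE ('ROWID');

  FOR r IN (SELECT ROWIDTOCHAR (ROWID) AS rid
            FROM   text_data_tab
            WHERE  ROWNUM <= 10)          -- small sample while testing
  LOOP
    CTX_DOC.THEMES (
      index_name  => 'text_data_index',
      textkey     => r.rid,
      restab      => 'CTX_THEMES',
      query_id    => 1,
      full_themes => FALSE,
      num_themes  => 20);
  END LOOP;
END;
/
```

Each call adds the themes for one document to the result table, so aggregating over the result table afterwards gives the themes that occur most often.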
-
SQL Injection with Oracle Text
I did a search here for any posts about SQL Injection on Oracle Text indexes, but returned no hits.
Can anyone give their opinion about whether SQL Injection is a concern when using Oracle Text or what steps can be taken ahead of time to prevent (or at least reduce the attack surface) on Oracle Text queries.
We're running a web app. that will use Oracle Text and our users can enter any search string as well as select pre-defined items from a drop down box.
Thanks in advance for any opinions
LJ
quote:
Originally posted by:
Dan Bracuk
What others can do is more relevant than what we think. When
in doubt, test.
very true, although my final solution went more like, "When
in doubt, manually add about 600 cfqueryparams in 406 cfquery
tags". -
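For the SQL injection question above, two separate things are usually worth handling: the SQL statement itself (use a bind variable, as the cfqueryparam remark implies) and the Oracle Text query string, which has its own operators that user input can trigger. A minimal sketch, with a hypothetical helper function and placeholder table and column names:

```sql
-- Hypothetical helper: neutralises Oracle Text operators in user input by
-- removing any braces and wrapping the remainder in { }, so the whole string
-- is searched as literal text rather than parsed as a query expression.
CREATE OR REPLACE FUNCTION escape_text_query (p_input IN VARCHAR2)
  RETURN VARCHAR2
IS
BEGIN
  RETURN '{' || REPLACE (REPLACE (p_input, '{', ' '), '}', ' ') || '}';
END escape_text_query;
/

-- The search string is passed as a bind variable, never concatenated into the SQL text.
-- DOCS, ID and TEXT_COL are placeholder names.
SELECT id
FROM   docs
WHERE  CONTAINS (text_col, escape_text_query (:user_input)) > 0;
```

This treats the input as one literal phrase, which is the most restrictive choice; an application that wants to allow some operators would need a more selective escaping routine, and empty input should be rejected before the query runs.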
Hi Masters,
I am working on Oracle Text. I have executed the steps below, and all of them completed successfully, but I have not seen any improvement in my task. I have one doubt, which I explain below.
create table ent_dnt as select * from entitlement_dnt;
BEGIN
CTX_DDL.CREATE_PREFERENCE ('oracletext_datastore', 'MULTI_COLUMN_DATASTORE');
CTX_DDL.SET_ATTRIBUTE
('oracletext_datastore', 'COLUMNS',
'ORDER_NUMBER, GENERIC_PRODUCT_NAME_EXT, ENTITLEMENT_REF_ID, DEVICE_ASSET_ID, DEVICE_UNIQUE_ID, SWSERVICETAG, PRODUCT_DESC');
END;
CREATE INDEX idx_oracle_text
ON Ent_dnt (search_cols)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('DATASTORE oracletext_datastore');
BEGIN
DBMS_STATS.GATHER_TABLE_STATS('EE', 'ent_DNT', cascade=>TRUE);
DBMS_STATS.GATHER_TABLE_STATS('EE', 'upd_DNT', cascade=>TRUE);
dbms_stats.gather_index_stats('EE', 'idx_oracle_text1');
dbms_stats.gather_index_stats('EE', 'idx_oracle_text');
END;
The above steps were executed successfully, but when I run my select query, I do not see the Oracle Text index name anywhere in my explain plan.
Plan
SELECT STATEMENT ALL_ROWSCost: 28,393 Bytes: 49,675 Cardinality: 5
3 SORT AGGREGATE Bytes: 8 Cardinality: 1
2 TABLE ACCESS BY INDEX ROWID TABLE EE.EE_PROD_GRP_ENTITLEMENT Cost: 4 Bytes: 8 Cardinality: 1
1 INDEX RANGE SCAN INDEX EE.IDX_PGE_ENT_ID Cost: 3 Cardinality: 1
5 SORT AGGREGATE Bytes: 8 Cardinality: 1
4 TABLE ACCESS FULL TABLE EE.ENT_DNT Cost: 26,781 Bytes: 8 Cardinality: 1
20 VIEW EE. Cost: 28,393 Bytes: 49,675 Cardinality: 5
19 COUNT STOPKEY
18 VIEW EE. Cost: 28,393 Bytes: 49,610 Cardinality: 5
17 SORT GROUP BY STOPKEY Cost: 28,393 Bytes: 2,295 Cardinality: 5
16 HASH JOIN OUTER Cost: 28,392 Bytes: 2,295 Cardinality: 5
14 NESTED LOOPS OUTER Cost: 28,388 Bytes: 1,808 Cardinality: 4
11 NESTED LOOPS OUTER Cost: 28,384 Bytes: 1,600 Cardinality: 4
8 HASH JOIN Cost: 28,383 Bytes: 1,552 Cardinality: 4
6 TABLE ACCESS FULL TABLE EE.UPD_DNT Cost: 1,089 Bytes: 174 Cardinality: 6
7 TABLE ACCESS FULL TABLE EE.ENT_DNT Cost: 27,292 Bytes: 110,648,108 Cardinality: 308,212
10 TABLE ACCESS BY INDEX ROWID TABLE EE.PRODUCT_LICENSE_PART Cost: 1 Bytes: 12 Cardinality: 1
9 INDEX RANGE SCAN INDEX EE.IDX_PLP_PD_DATA_ID Cost: 0 Cardinality: 1
13 TABLE ACCESS BY INDEX ROWID TABLE EE.PD_KT_DETAILS Cost: 1 Bytes: 52 Cardinality: 1
12 INDEX RANGE SCAN INDEX EE.IDX_PKD_PART_NUM Cost: 0 Cardinality: 1
15 TABLE ACCESS FULL TABLE EE.LEGACY_CONFIG Cost: 3 Bytes: 35 Cardinality: 5
and the cost is also very high. When I ran the query below, I didn't see any DR$ tables.
TEST@orcl_11gR2> SELECT object_name, object_type
2 FROM user_objects
3 WHERE object_name LIKE '%oracle%'
4 /
The usual DR$ tables ($I, $K, $N, $R, $X) are not created. Where is the problem? Please help me, I have to complete this task.
Regards
AR
Hi Roger,
Thanks a lot for your reply. This is my query. Yes, I didn't use a CONTAINS clause in my query, because I don't know how to use it.
SELECT B.*,
CASE WHEN ISBOUND = 'Y' AND ALLOWRESEND = 'Y' THEN 'Y' ELSE 'N' END
AS Allowunbind,
CASE
WHEN ISBOUND = 'Y' AND IsThisAnUpgrade = 'N' AND Allowresend = 'N'
THEN
'Y'
ELSE
CASE
WHEN ISBOUND = 'N'
AND BINDING_TYPE = INITCAP ('TRUSTED')
AND ALLOWRESEND = 'N'
THEN
'Y'
ELSE
'N'
END
END
AS AllowBind,
FNC_GET_GROUPNAME_V3 (B.ENTITLEMENT_ID) GROUP_NAME,
FNC_GET_USERGROUPNAME_V3 (B.ENTITLEMENT_ID, '[email protected]')
USER_GROUP_NAME,
FNC_GET_ROLE_V3 (B.ENTITLEMENT_ID, '[email protected]') ROLE_NAME,
(SELECT MAX (PGE_IS_ASSIGNED)
FROM ENT_DNT
WHERE ENTITLEMENT_ID = B.ENTITLEMENT_ID)
AS IS_ASSIGNED
FROM (SELECT *
FROM (SELECT A.*, ROWNUM RNUM
FROM (SELECT *
FROM (SELECT *
FROM (SELECT DISTINCT
ENTDNT.ORDER_DATE,
ENTDNT.ORDER_NUMBER,
ENTDNT.ENTITLEMENT_ID,
ENTDNT.ENTITLEMENT_REF_ID,
ENTDNT.CUSTOMER_NUM,
ENTDNT.ENTITLEMENT_STATUS_ID,
ENTDNT.ENT_QTY,
ENTDNT.ENTITLEMENTNAME,
ENTDNT.ACT_KEY_LOB_ID,
ENTDNT.LIC_KEY_LOB_ID,
ENTDNT.LICENSE_KEY,
ENTDNT.ENT_TYPE_ID,
ENTDNT.PRODUCT_DATA_ID,
ENTDNT.PRODUCT_NAME,
ENTDNT.TYPE_DIMENSION_EXT,
ENTDNT.BINDING_TYPE,
DECODE (
ENTDNT.ENT_TYPE_ID,
1, ENTDNT.PRODUCT_DESC,
3, ENTDNT.GENERIC_PRODUCT_NAME_EXT)
AS PRODUCT_DESC,
DECODE (
ENTDNT.ENT_TYPE_ID,
3, PKD.PRIMARY_LICENSE_IDENTIFIER,
2, 'SOFTWARE_SERVICETAG',
1, 'ENTITLEMENTID',
NULL)
AS PRIMARYLICENSEIDENTIFIER,
CASE
WHEN DECODE (
ENTDNT.ENT_TYPE_ID,
3, DECODE (
PKD.KEY_SOURCE_TYPE,
'SOURCE_NO_KEY', 'N',
'Y'),
1, 'Y',
LC.IS_KEY_REQUIRED) =
'Y'
AND ENTDNT.ENTITLEMENT_STATUS_ID =
'0'
AND ( ENTDNT.LIC_KEY_LOB_ID
IS NOT NULL
OR ENTDNT.LICENSE_KEY
IS NOT NULL
OR ENTDNT.ACT_KEY_LOB_ID
IS NOT NULL)
THEN
'Y'
WHEN ENTDNT.ENTITLEMENT_STATUS_ID =
'0'
AND ( ENTDNT.LIC_KEY_LOB_ID
IS NOT NULL
OR ENTDNT.LICENSE_KEY
IS NOT NULL)
THEN
'Y'
ELSE
'N'
END
AS KEYREQUIRED,
ENTDNT.ISTHISANUPGRADE,
ENTDNT.DEVICE_ASSET_ID,
ENTDNT.SWSERVICETAG,
PKD.PHVALUE,
CASE
WHEN -- ENTDNT.BINDING_TYPE = 'Trusted'
ENTDNT.BINDING_TYPE =
INITCAP ('TRUSTED')
THEN
'N'
WHEN ENTDNT.BINDING_TYPE =
INITCAP (
'COMPONENT')
-- OR ENTDNT.BINDING_TYPE = 'DeviceID'
OR ENTDNT.BINDING_TYPE =
INITCAP (
'DEVICEID')
--OR ENTDNT.BINDING_TYPE = 'ServiceTag'
OR ENTDNT.BINDING_TYPE =
INITCAP (
'SERVICETAG')
THEN
'Y'
ELSE
'N'
END
AS ISBOUND,
CASE
WHEN ENTDNT.ENT_TYPE_ID =
3
AND PKD.ALLOW_RESEND =
'Y'
AND ENTDNT.ENTITLEMENT_STATUS_ID =
'0'
AND ( ENTDNT.LIC_KEY_LOB_ID
IS NOT NULL
OR ENTDNT.LICENSE_KEY
IS NOT NULL
OR ENTDNT.ACT_KEY_LOB_ID
IS NOT NULL)
THEN
'Y'
WHEN ENTDNT.ENTITLEMENT_STATUS_ID =
'0'
AND ( ENTDNT.LIC_KEY_LOB_ID
IS NOT NULL
OR ENTDNT.LICENSE_KEY
IS NOT NULL)
THEN
'Y'
ELSE
'N'
END
AS ALLOWRESEND,
ENTDNT.GENERIC_PRODUCT_NAME_EXT,
PLP.LICENSE_PART_NUMBER
AS SRVPARTNUMBER,
ENTDNT.DEVICE_UNIQUE_ID,
(SELECT MAX (IS_ASSIGNED)
FROM EE_PROD_GRP_ENTITLEMENT PGE
WHERE ENTITLEMENT_ID =
ENTDNT.ENTITLEMENT_ID)
AS IS_ASSIGNED,
ENTDNT.SINGLEFILEPERID
FROM ent_dnt ENTDNT,
PD_KT_DETAILS PKD,
PRODUCT_LICENSE_PART PLP,
Legacy_Config LC,
upd_dnt UPDNT
WHERE ENTDNT.PRODUCT_GROUP_ID =
UPDNT.PRODUCT_GROUP_ID
AND UPDNT.EMAIL_ADDRESS =
'[email protected]'
AND ENTDNT.ENT_TYPE_ID =
LC.ENTITLEMENT_TYPE_ID(+)
AND PLP.PRODUCT_DATA_ID(+) =
ENTDNT.PRODUCT_DATA_ID
AND PKD.PART_NUMBER(+) =
ENTDNT.LIC_PART_NUM
AND UPDNT.IS_DELETED = 'N'
AND ENTDNT.ENTITLEMENT_STATUS_ID IN
(0, 4)
AND ENTDNT.IS_DELETED = 'N')
WHERE (UPPER (GENERIC_PRODUCT_NAME_EXT) LIKE
'%IDRAC%')
OR (ORDER_NUMBER LIKE '%251608469%')
OR (ENTITLEMENT_REF_ID LIKE '%162523200%')
OR (DEVICE_ASSET_ID LIKE '%162523200%')
OR (DEVICE_UNIQUE_ID LIKE '%162523200%')
OR (SWSERVICETAG LIKE '%162523200%')
OR (UPPER (PRODUCT_DESC) LIKE
'%162523200%'))
WHERE (UPPER (GENERIC_PRODUCT_NAME_EXT) LIKE
'%575757%')
OR (ORDER_NUMBER LIKE '%251608469%')
OR (ENTITLEMENT_REF_ID LIKE '%162523200%')
OR (DEVICE_ASSET_ID LIKE '%162523200%')
OR (DEVICE_UNIQUE_ID LIKE '%162523200%')
OR (SWSERVICETAG LIKE '%162523200%')
OR (UPPER (PRODUCT_DESC) LIKE '%162523200%')) A
WHERE ROWNUM <= 100)
WHERE RNUM >= 1) B;
Yes, you are 100% correct. I made a mistake with the LIKE operator. Now I have executed the query below, and it displays all the Oracle Text related tables and indexes. Thank you.
SELECT object_name, object_type
FROM user_objects
WHERE object_name LIKE '%ORACLE%'
But here I have a problem. How can I use the select below for my query above?
SELECT *
FROM Entitlement_dnt
WHERE CONTAINS (search_cols, REPLACE (:i_OpenSearchText, ',', ' AND ')) > 0
What value should I supply for :i_OpenSearchText? I am confused.
Please help me!
Regards
AR -
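On the :i_OpenSearchText question above: it is a bind variable that holds the user's search terms as a comma-separated string, and the REPLACE call turns the commas into the AND operator before CONTAINS parses it. A minimal SQL*Plus sketch, assuming the search terms shown (borrowed from the LIKE filters in the long query) and selecting from ENT_DNT because that is the table the index was created on:

```sql
-- SQL*Plus sketch; the search terms are illustrative values only.
VARIABLE i_OpenSearchText VARCHAR2(200)
EXEC :i_OpenSearchText := 'IDRAC,251608469'

-- CONTAINS must reference the indexed column of the table that owns the index
-- (ENT_DNT.SEARCH_COLS in the CREATE INDEX statement above).
-- After REPLACE, the query string is 'IDRAC AND 251608469'.
SELECT entitlement_id, order_number
FROM   ent_dnt
WHERE  CONTAINS (search_cols, REPLACE (:i_OpenSearchText, ',', ' AND ')) > 0;
```

Two differences from the LIKE version are worth checking before swapping it in: the original filters are combined with OR, so ' OR ' may be the closer replacement string, and CONTAINS matches whole tokens whereas LIKE '%...%' matches substrings.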
Deciding between Oracle Text v/s PL/SQL
The Oracle Text technical document mentions that the Standard (CONTEXT) and Catalog (CTXCAT) index types are used to index larger, coherent documents and to perform mixed queries, respectively.
As I read further, I understand that if the requirement is not heavily document-centric, then Oracle Text may not be an ideal candidate. If most of the data is going to reside primarily in tables, then standard SQL queries and joins are the way to go. On the other hand, using standard SQL for name matches with the LIKE operator, for example, may not be guaranteed to work, or may be complex to implement, when trying wildcard or theme matches.
So the question is: do we use Oracle Text irrespective of the type of content being indexed, i.e. table data vs. documents? How do we make that judgement?
What type of data do you have and what types of searches do you want to be able to do? If you need features that are only available in Text, then you need Text. For example, if you will be searching documents that are stored in operating system files or in blob columns and you want to do stem searches or fuzzy searches or use a thesaurus, then you will need Oracle Text. If, on the other hand, the data that you have and the searches that you want can be done with or without Text, then you have a choice to make, with the major issue being which is more efficient. When in doubt, a little testing can help you decide. Set up a realistic test environment, test some queries both ways, and see which is fastest. If you are just doing standard searches on varchar2 columns, you may get better performance without Text.
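The "test it both ways" advice can be sketched roughly as follows in SQL*Plus. CUSTOMERS, LAST_NAME and the index name are placeholders, not names from this thread, and the row counts need not match because LIKE finds substrings while CONTAINS finds tokens:

```sql
-- SQL*Plus sketch; CUSTOMERS, LAST_NAME and CUSTOMERS_NAME_CTX are placeholders.
CREATE INDEX customers_name_ctx ON customers (last_name)
  INDEXTYPE IS CTXSYS.CONTEXT;

SET TIMING ON

-- Plain SQL: substring match; the leading wildcard usually rules out a B-tree index.
SELECT COUNT(*) FROM customers WHERE UPPER (last_name) LIKE '%SMITH%';

-- Oracle Text: token match through the domain index.
SELECT COUNT(*) FROM customers WHERE CONTAINS (last_name, 'smith') > 0;

-- Oracle Text only: fuzzy match for similarly spelled names.
SELECT COUNT(*) FROM customers WHERE CONTAINS (last_name, 'fuzzy(smith)') > 0;
```

Running each statement several times on realistic data volumes gives a fairer comparison than a single timing.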
-
Oracle Text Index on Materialized View
Hello,
I have designed a search engine for an internet application.
We have different tables for our main business objects, the search is based on the content of all these dependent entities (Product, Company etc...)
So I have created a materialized view to embody this aggregation.
Then I have created a Multi column datastore index on top of the snapshot.
The search engine has to be refreshed automatically 3 times a day, and manually anytime.
This is achieved by executing a complete refresh on the view and rebuild the index, programmatically via Toplink (SqlCall).
The index rebuild looks like this:
alter index usr_batiprod.fullTextMulticolIdx rebuild
and the MV refresh:
begin
DBMS_MVIEW.REFRESH('FT_TEST','C');
end;
Everything was fine until now, we have had lots of tuning on the index side, the refresh process was working fine...
We have let the users access the engine since Thursday (the index had been created on the production environment a few weeks ago), and since yesterday (or maybe before) we have been experiencing inconsistent data in the index...
I've tracked the problem down to the full-text search's refresh process (manual and automatic share the same code), which was crashing on the materialized view refresh:
java.sql.SQLException: ORA-20000: Oracle Text error:
DRG-50610: internal error: drexdsync
DRG-50857: oracle error in drekrtd (lob erase)
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 794
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 851
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 832
ORA-06512: at line 2
Again, we had tested the functionality thoroughly and never experienced such behaviour before... even while refreshing the materialized view and the index 3 times (or more) a day..
Once I dropped the Oracle Text Index, I was able to refresh the MV again..
so it looks like the index was in some incoherent state and was holding a lock on the materialized view...
Maybe my index refresh call is wrong, and a heavier load on the functionality quickly leads to this problem, I don't know..
I had always been a bit doubtful about my index rebuild call, so I'm thinking about using a more complete call:
alter index usr_batiprod.fullTextMulticolIdx rebuild parameters ('sync')
is it enough, or do I have to switch to a 'Oracle Text' specific call ?
Is there another possible reason for the MV lock ?
Thank you for your support
Best Regards
Olivier Cuzacq
MVs are constructed in different ways and have lots of different uses.
Why not just use the MV as a temp table for the OT (Oracle Text) index?
Refresh MV OT_TEMP.
Delete all non-matching rows from OT.
Insert all missing rows from OT_TEMP into OT.
sync OT index (online).
Query table OT. -
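The staging approach suggested in the reply to the materialized view question could be sketched as below. This is only an outline under stated assumptions: OT is a regular table carrying the Oracle Text index, OT_TEMP is the materialized view, and the columns ID and TEXT_COL and the index name OT_TEXT_IDX are placeholders. Rows whose content changed but whose key did not are not handled here and would need an extra UPDATE or a wider DELETE condition.

```sql
-- Assumptions: OT (indexed table), OT_TEMP (materialized view), ID, TEXT_COL
-- and OT_TEXT_IDX are placeholder names.
BEGIN
  -- 1. Complete refresh of the staging materialized view (no text index on it).
  DBMS_MVIEW.REFRESH ('OT_TEMP', 'C');

  -- 2. Remove rows that no longer exist in the source.
  DELETE FROM ot t
  WHERE  NOT EXISTS (SELECT 1 FROM ot_temp s WHERE s.id = t.id);

  -- 3. Add rows that are new in the source.
  INSERT INTO ot (id, text_col)
  SELECT s.id, s.text_col
  FROM   ot_temp s
  WHERE  NOT EXISTS (SELECT 1 FROM ot t WHERE t.id = s.id);

  COMMIT;

  -- 4. Apply the pending changes to the CONTEXT index while it stays queryable.
  CTX_DDL.SYNC_INDEX ('OT_TEXT_IDX');
END;
/
```

Because the text index lives on OT rather than on the materialized view, the complete refresh no longer has to maintain the index inside the refresh transaction.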
Does anybody know whether it is possible to create two Oracle Text indexes on one column, for example a CTXCAT index and a CTXRULE index, and what happens when that column is queried? Is it good practice?
Thanks in advance.
When in doubt, test and see. Yes, you can create two different types of Oracle Text indexes on the same column. If you create a CTXCAT index and a CTXRULE index, then queries using CATSEARCH will use the CTXCAT index and queries using MATCHES will use the CTXRULE index. When querying with CATSEARCH, it will find all rows where the terms searched for are found within the column value. When querying with MATCHES, it does the opposite, and finds all rows where the column values are found within the terms searched for. Please see the demonstration below. As to whether it is a good practice, it depends on what you need. If you need both types of searches, then yes. If not, then no, it would be unnecessary overhead.
SCOTT@orcl_11gR2> create table test_tab (test_col varchar2(60))
2 /
Table created.
SCOTT@orcl_11gR2> insert all
2 into test_tab values ('test')
3 into test_tab values ('data')
4 into test_tab values ('test data')
5 into test_tab values ('other stuff')
6 select * from dual
7 /
4 rows created.
SCOTT@orcl_11gR2> create index ctxcat_idx on test_tab (test_col)
2 indextype is ctxsys.ctxcat
3 /
Index created.
SCOTT@orcl_11gR2> create index ctxrule_idx on test_tab (test_col)
2 indextype is ctxsys.ctxrule
3 /
Index created.
SCOTT@orcl_11gR2> set autotrace on explain
SCOTT@orcl_11gR2> select * from test_tab
2 where catsearch (test_col, 'test data', null) > 0
3 /
TEST_COL
test data
1 row selected.
Execution Plan
Plan hash value: 399706479
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 44 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_TAB | 1 | 44 | 3 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | CTXCAT_IDX | | | | |
Predicate Information (identified by operation id):
2 - access("CTXSYS"."CATSEARCH"("TEST_COL",'test data',NULL)>0)
Note
- dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2> select * from test_tab
2 where matches (test_col, 'test data') > 0
3 /
TEST_COL
test
data
test data
3 rows selected.
Execution Plan
Plan hash value: 1476734355
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 44 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_TAB | 1 | 44 | 1 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | CTXRULE_IDX | | | 0 (0)| 00:00:01 |
Predicate Information (identified by operation id):
2 - access("CTXSYS"."MATCHES"("TEST_COL",'test data')>0)
Note
- dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2>
Message was edited by: BarbaraBoehmer -
Hello
I have a question about Oracle Text.
I am looking for a Russian stop list for Oracle Text. Does anybody have one?
Thanks roger
I could not find one in the documentation up to 11g, so I doubt that a stop list for Russian is supplied.
You can probably create one using CTX_DDL package.
http://download.oracle.com/docs/cd/B19306_01/text.102/b14218/cddlpkg.htm#CCREF0600 -
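For the Russian stop list question, a custom stoplist can be built with the CTX_DDL package mentioned in the reply. A minimal sketch, assuming a database character set that can store Cyrillic; the stoplist name, the handful of stopwords, and the table, column and index names are illustrative only, not a complete or official Russian list:

```sql
-- Illustrative names and words only; extend the list to suit the data.
BEGIN
  CTX_DDL.CREATE_STOPLIST ('russian_stoplist', 'BASIC_STOPLIST');
  CTX_DDL.ADD_STOPWORD ('russian_stoplist', 'и');
  CTX_DDL.ADD_STOPWORD ('russian_stoplist', 'в');
  CTX_DDL.ADD_STOPWORD ('russian_stoplist', 'не');
  CTX_DDL.ADD_STOPWORD ('russian_stoplist', 'на');
END;
/

-- DOCS, TEXT_COL and DOCS_TEXT_IDX are placeholder names.
CREATE INDEX docs_text_idx ON docs (text_col)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('STOPLIST russian_stoplist');
```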
Doubt in Oracle Streams
Can you please help me understand these terms?
1. Message
2. User-defined Event
3. Event
4. Rules
5. Oracle-supplied PL/SQL packages
6. Subscriber, Consumer
Hi
Message
A message is the smallest unit of information that is inserted into and retrieved from a queue.
Queue
A queue is repository for messages. Queues are stored in queue tables
Enqueue
To place a message in queue
Dequeue
To consume a message
Agent
An agent is an end user or an application that uses a queue
Thanks
Venkat -
How do I get Oracle Text to index files on a file server?
I am new to Oracle (I'm a MS-SQL DBA looking for a Full-Text Search solution that is better than linking to a MS index server.)
So - Here's the objective:
I have Oracle Server(Express) installed on a Windows server.
I would like for Oracle to build a Full-Text Catalog of the files on a separate file server based on file paths in a table in the database.
(No desire to store terabytes of images and documents inside the database)
I can get Oracle text up and running, using the URL_Datastore:
CREATE TABLE files (id NUMBER PRIMARY KEY, issue_id NUMBER, path VARCHAR(255) UNIQUE, ot_format VARCHAR(6), ot_version VARCHAR(10));
The Compaq server is a remote windows server on my local workgroup, so the fully qualified path is just "compaq" and the URL is valid:
INSERT INTO files VALUES (9,9,'file://Compaq/FTQ/00000003.pdf',NULL,NULL);
INSERT INTO files VALUES (13,13,'file://Compaq/FTQ/01.txt',NULL,NULL);
CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
PARAMETERS ('datastore ctxsys.URL_DATASTORE format column ot_format');
but when I enter:
Select * from CTX_User_Index_errors, I see the following errors:
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/00000003.pdf
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/01.txt
Did I miss something?
Do I need to install anything on the file server?
I would like to convince my company that Oracle can be much quicker than Microsoft's Indexing Service because it can avoid joining two large result sets (one result set from Full_text (indexing service) and one for specific data contained in fields in the MS-SQL database.) Full Text Searches commonly take 40 - 60 seconds where there are 1.5 million multi-page PDF files for a particular set that I sample search on. Without this massive join, I believe I can get the search to run in under 10 seconds.
Thank you!
File_Datastore worked fine.
I was staying away from File_Datastore because the information I gathered from googling suggested that file_datastore would only work locally.
Now I just have to get Oracle to pull data out of tables in a MS-SQL database on the local network (don't have a clue yet), and then have it index compiled file paths.
Then MS-SQL can query Oracle with index and full-text criteria and Oracle can send back a result set
It may sound like a bad way of performing Full-Text Queries, but anything will be better than the way things are currently running. We are currently performing Full Text Searches on a table that is rebuilt nightly, so the table containing millions of file paths is not live..
It would be so much better if we just migrated to Oracle, but we currently do not have the resources. -
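The file server thread above does not show how the File_Datastore setup was done, so the following is only a guess at its shape. It assumes the share is reachable as \\Compaq\FTQ by the account the Oracle service runs under, and the preference name ftq_files is hypothetical:

```sql
-- Assumption: \\Compaq\FTQ is readable by the Oracle service account.
BEGIN
  CTX_DDL.CREATE_PREFERENCE ('ftq_files', 'FILE_DATASTORE');
  CTX_DDL.SET_ATTRIBUTE ('ftq_files', 'PATH', '\\Compaq\FTQ');
END;
/

-- With FILE_DATASTORE the indexed column holds file names, not file:// URLs.
UPDATE files SET path = REPLACE (path, 'file://Compaq/FTQ/', '');
COMMIT;

-- Replace the URL_DATASTORE index created earlier.
DROP INDEX file_index;
CREATE INDEX file_index ON files (path) INDEXTYPE IS ctxsys.context
  PARAMETERS ('datastore ftq_files format column ot_format');

-- Check for per-document indexing errors afterwards.
SELECT err_textkey, err_text FROM ctx_user_index_errors;
```

Depending on the database version, extra privileges may be needed before a user can index operating system files, so an empty error view is the thing to confirm first.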
Error while running the Oracle Text optimize index procedure (even as a dba user too)
Hi Experts,
I am on Oracle 11.2.0.2 on Linux. I have implemented Oracle Text. My Oracle Text indexes are fragmented, but I am getting an error while running the optimize_index procedure. Following is the error:
begin
ctx_ddl.optimize_index(idx_name=>'ACCESS_T1',optlevel=>'FULL');
end;
ERROR at line 1:
ORA-20000: Oracle Text error:
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DDL", line 941
ORA-06512: at line 1
Now I tried then to run this as DBA user too and it failed the same way!
begin
ctx_ddl.optimize_index(idx_name=>'BVSCH1.ACCESS_T1',optlevel=>'FULL');
end;
ERROR at line 1:
ORA-20000: Oracle Text error:
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DDL", line 941
ORA-06512: at line 1
Now CTXAPP role is granted to my schema and still I am getting this error. I will be thankful for the suggestions.
Also one other important observation: We have this issue ONLY in one database and in the other two databases, I don't see any problem at all.
I am unable to figure out what the issue is with this one database!
Thanks,
OrauserN
How about checking the following?
Bug 10626728 - CTX_DDL.optimize_index "full" fails with an empty ORA-20000 since 11.2.0.2 upgrade (DOCID 10626728.8) -
Getting error while importing schema with ORACLE TEXT
IMP-00003: ORACLE error 20000 encountered
ORA-20000: Oracle Text error:
DRG-52204: error while registering index
DRG-10507: duplicate index name: WORKORDER_Q, owner: SYS
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.DRIIMP", line 115
ORA-06512: at line 2
IMP-00088: Problem importing metadata for index WORKORDER_Q. Index creation will be skipped
Database version - Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
Os version - Linux nlxs1012.slb.atosorigin-asp.com 2.6.18-308.el5 #1 SMP Fri Jan 27 17:17:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
We have taken an export of the schema from the production db and are now importing the data into the QA environment.
During the import we get the above error. I am importing objects from P20_MAXIMO into Q25_MAXIMO in another database.
Below is the import par file:
USERID='/ as sysdba'
FILE=exp_P20_MAXIMO_C2364781.dmp
LOG=imp_P20_MAXIMO__Q25_MAXIMO_C2364781_1.log
FROMUSER=P20_MAXIMO
TOUSER=Q25_MAXIMO
buffer=1000000
feedback=100000
Export parfile
userid='/ as sysdba'
owner=P20_MAXIMO
FILE=exp_P20_MAXIMO_C2364781.dmp
LOG=exp_P20_MAXIMO_C2364781.log
buffer=10000000
feedback=100000
statistics=none -
Pre-loading Oracle text in memory with Oracle 12c
There is a white paper from Roger Ford that explains how to load the Oracle index in memory : http://www.oracle.com/technetwork/database/enterprise-edition/mem-load-082296.html
In our application, on Oracle 12c, we are indexing a big XML field (which is stored as XMLType with securefile storage) with the PATH_SECTION_GROUP. If I don't load the I table (DR$..$I) into memory using the technique explained in the white paper, then I cannot get decent performance (and especially not predictable performance; it looks like performance can fall sharply if the blocks from the TOKEN_INFO column are not in memory).
But after migrating to oracle 12c, I got a different problem, which I can reproduce: when I create the index it is relatively small (as seen with ctx_report.index_size) and by applying the technique from the whitepaper, I can pin the DR$ I table into memory. But as soon as I do a ctx_ddl.optimize_index('Index','REBUILD') the size becomes much bigger and I can't pin the index in memory. Not sure if it is bug or not.
What I found as work-around is to build the index with the following storage options:
ctx_ddl.create_preference('TEST_STO','BASIC_STORAGE');
ctx_ddl.set_attribute ('TEST_STO', 'BIG_IO', 'YES' );
ctx_ddl.set_attribute ('TEST_STO', 'SEPARATE_OFFSETS', 'NO' );
so that the token_info column will be stored in a secure file. Then I can change the storage of that column to put it in the keep buffer cache, and write a procedure to read the LOB so that it will be loaded in the keep cache. The size of the LOB column is more or less the same as when creating the index without the BIG_IO option, but it remains constant even after a ctx_ddl.optimize_index. The procedure to read the LOB and to load it into the cache is very similar to the loaddollarR procedure from the white paper.
Because of the SDATA section, there is a new DR table (S table) and an IOT on top of it. This is not documented in the white paper (the white paper was written for Oracle 10g). In my case this DR$ S table is much used, and the IOT also, but putting it in the keep cache is not as important as the token_info column of the DR I table. A final note: doing SEPARATE_OFFSETS = 'YES' was very bad in my case, the combined size of the two columns is much bigger than having only the TOKEN_INFO column and both columns are read.
Here is an example of how to reproduce the problem with the size increasing when running ctx_ddl.optimize_index:
1. create the table
drop table test;
CREATE TABLE test
(ID NUMBER(9,0) NOT NULL ENABLE,
XML_DATA XMLTYPE
)
XMLTYPE COLUMN XML_DATA STORE AS SECUREFILE BINARY XML (tablespace users disable storage in row);
2. insert a few records
insert into test values(1,'<Book><TITLE>Tale of Two Cities</TITLE>It was the best of times.<Author NAME="Charles Dickens"> Born in England in the town, Stratford_Upon_Avon </Author></Book>');
insert into test values(2,'<BOOK><TITLE>The House of Mirth</TITLE>Written in 1905<Author NAME="Edith Wharton"> Wharton was born to George Frederic Jones and Lucretia Stevens Rhinelander in New York City.</Author></BOOK>');
insert into test values(3,'<BOOK><TITLE>Age of innocence</TITLE>She got a prize for it.<Author NAME="Edith Wharton"> Wharton was born to George Frederic Jones and Lucretia Stevens Rhinelander in New York City.</Author></BOOK>');
3. create the text index
drop index i_test;
exec ctx_ddl.create_section_group('TEST_SGP','PATH_SECTION_GROUP');
begin
CTX_DDL.ADD_SDATA_SECTION(group_name => 'TEST_SGP',
section_name => 'SData_02',
tag => 'SData_02',
datatype => 'varchar2');
end;
exec ctx_ddl.create_preference('TEST_STO','BASIC_STORAGE');
exec ctx_ddl.set_attribute('TEST_STO','I_TABLE_CLAUSE','tablespace USERS storage (initial 64K)');
exec ctx_ddl.set_attribute('TEST_STO','I_INDEX_CLAUSE','tablespace USERS storage (initial 64K) compress 2');
exec ctx_ddl.set_attribute ('TEST_STO', 'BIG_IO', 'NO' );
exec ctx_ddl.set_attribute ('TEST_STO', 'SEPARATE_OFFSETS', 'NO' );
create index I_TEST
on TEST (XML_DATA)
indextype is ctxsys.context
parameters('
section group "TEST_SGP"
storage "TEST_STO"
') parallel 2;
4. check the index size
select ctx_report.index_size('I_TEST') from dual;
it says :
TOTALS FOR INDEX TEST.I_TEST
TOTAL BLOCKS ALLOCATED: 104
TOTAL BLOCKS USED: 72
TOTAL BYTES ALLOCATED: 851,968 (832.00 KB)
TOTAL BYTES USED: 589,824 (576.00 KB)
5. optimize the index
exec ctx_ddl.optimize_index('I_TEST','REBUILD');
and now recompute the size, it says
TOTALS FOR INDEX TEST.I_TEST
TOTAL BLOCKS ALLOCATED: 1112
TOTAL BLOCKS USED: 1080
TOTAL BYTES ALLOCATED: 9,109,504 (8.69 MB)
TOTAL BYTES USED: 8,847,360 (8.44 MB)
which shows that it went from 576KB to 8.44MB. With a big index the difference is not so big, but still from 14G to 19G.
6. Workaround: use the BIG_IO option, so that the token_info column of the DR$ I table will be stored in a secure file and the size will stay relatively small. Then you can load this column in the cache using a procedure similar to
alter table DR$I_TEST$I storage (buffer_pool keep);
alter table dr$i_test$i modify lob(token_info) (cache storage (buffer_pool keep));
rem: now we must read the lob so that it will be loaded in the keep buffer pool, use the procedure below
create or replace procedure loadTokenInfo is
type c_type is ref cursor;
c2 c_type;
s varchar2(2000);
b blob;
buff varchar2(100);
siz number;
off number;
cntr number;
begin
s := 'select token_info from DR$i_test$I';
open c2 for s;
loop
fetch c2 into b;
exit when c2%notfound;
siz := 10;
off := 1;
cntr := 0;
if dbms_lob.getlength(b) > 0 then
begin
loop
dbms_lob.read(b, siz, off, buff);
cntr := cntr + 1;
off := off + 4096;
end loop;
exception when no_data_found then
if cntr > 0 then
dbms_output.put_line('4K chunks fetched: '||cntr);
end if;
end;
end if;
end loop;
end;
Rgds, Pierre
I have been working a lot on that issue recently, I can give some more info.
First I totally agree with you, I don't like to use the keep_pool and I would love to avoid it. On the other hand, we have a specific use case : 90% of the activity in the DB is done by queuing and dbms_scheduler jobs where response time does not matter. All those processes are probably filling the buffer cache. We have a customer facing application that uses the text index to search the database : performance is critical for them.
What kind of performance do you have with your application ?
In my case, I have learned the hard way that having the index in memory (the DR$I table in fact) is the key: if it is not, then performance is poor. I find it reasonable to pin the DR$I table in memory, and if you look at competitors this is what they do. MongoDB explicitly says that the index must be in memory. Elasticsearch uses JVMs that are also in memory. And effectively, if you look at the AWR report, you will see that Oracle is continuously accessing the DR$I table; there is a SQL statement similar to
SELECT /*+ DYNAMIC_SAMPLING(0) INDEX(i) */
TOKEN_FIRST, TOKEN_LAST, TOKEN_COUNT, ROWID
FROM DR$idxname$I
WHERE TOKEN_TEXT = :word AND TOKEN_TYPE = :wtype
ORDER BY TOKEN_TEXT, TOKEN_TYPE, TOKEN_FIRST
which is continuously done.
I think that the algorithm used by Oracle to keep blocks in cache is too complex. I just realized that in 12.1.0.2 (released last week) there is finally a "killer" functionality, the in-memory parameters, with which you can pin tables or columns in memory with compression, etc. This looks ideal for the text index; I hope that R. Ford will finally update his white paper :-)
But my other problem was that optimize_index in REBUILD mode caused the DR$I table to double in size: it seems crazy that this was closed as not a bug, but it was, and I can't do anything about it. It is a bug in my opinion, because the create index command and the "alter index rebuild" command both result in a much smaller index, so why would the people who developed the optimize function (is it another team, using another algorithm?) make the index two times bigger?
And for that the track I have been following is to put the index in a 16K tablespace : in this case the space used by the index remains more or less flat (increases but much more reasonably). The difficulty here is to pin the index in memory because the trick of R. Ford was not working anymore.
What worked:
first set the keep_pool to zero and set the db_16k_cache_size to instead. Then change the storage preference to make sure that everything you want to cache (mostly the DR$I) table come in the tablespace with the non-standard block size of 16k.
Then comes the tricky part: pre-loading the data into the buffer cache. The problem is that with Oracle 12c, Oracle will use direct path reads for full table scans (FTS), which basically means that it bypasses the cache and reads directly from file into the PGA! There is an event to avoid that; I was lucky to find it on a blog (I can't remember which, sorry for the missing credit).
I ended up doing the following. Event 10949 is there to avoid the direct path read issue.
alter session set events '10949 trace name context forever, level 1';
alter table DR#idxname0001$I cache;
alter table DR#idxname0002$I cache;
alter table DR#idxname0003$I cache;
SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0001$I ITAB;
SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0002$I ITAB;
SELECT /*+ FULL(ITAB) CACHE(ITAB) */ SUM(TOKEN_COUNT), SUM(LENGTH(TOKEN_INFO)) FROM DR#idxname0003$I ITAB;
SELECT /*+ INDEX(ITAB) CACHE(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0001$I ITAB;
SELECT /*+ INDEX(ITAB) CACHE(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0002$I ITAB;
SELECT /*+ INDEX(ITAB) CACHE(ITAB) */ SUM(LENGTH(TOKEN_TEXT)) FROM DR#idxname0003$I ITAB;
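To check that the pre-load really ended up in the buffer cache, something like the query below can help. It is only a sketch: it assumes access to V$BH and DBA_OBJECTS and assumes the partition tables are stored under the upper-case names DR#IDXNAME0001$I and so on.

```sql
-- Count buffer cache blocks currently held for each $I partition table.
-- Assumption: the LIKE pattern matches the (upper-case) table names used above.
SELECT o.object_name,
       COUNT(*) AS cached_blocks
FROM   v$bh b
JOIN   dba_objects o
       ON o.data_object_id = b.objd
WHERE  o.object_name LIKE 'DR#IDXNAME%$I'
AND    b.status <> 'free'
GROUP  BY o.object_name
ORDER  BY o.object_name;
```

Comparing cached_blocks with the BLOCKS column of DBA_SEGMENTS for the same tables gives a rough idea of how much of each table is in memory. On RAC, GV$BH shows the picture per instance.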
It worked. With great relief I expected to take some time out, but there was one last surprise. The command
exec ctx_ddl.optimize_index(idx_name=>'idxname',part_name=>'partname',optlevel=>'REBUILD');
gave the following
ERROR at line 1:
ORA-20000: Oracle Text error:
DRG-50857: oracle error in drftoptrebxch
ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DDL", line 1141
ORA-06512: at line 1
This is described almost exactly in Metalink note 1645634.1, but for the case of a non-partitioned index. The work-around given there seemed very logical, but it did not work in the case of a partitioned index. After experimenting, I found out that the bug occurs when the partitioned index is created with the dbms_pclxutil.build_part_index procedure (which enables intra-partition parallelism in the index creation process). This is a very annoying and stupid bug; maybe there is a work-around, but I did not find it on Metalink.
Other points of attention with text index creation (stuff that surprised me at first!):
- if you use the dbms_pclxutil package, then the ctx_output logging does not work, because the index is created immediately and then populated in the background via dbms_job.
- this, combined with the fact that on RAC you may not see any activity on the box, can be very frightening: Oracle can choose to start the workers on the other node.
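When the box looks idle, the standard job views at least show whether the background jobs exist and where they run. This is a rough sketch, not something from the original post; the filter on the session program name is an assumption about how job slave processes are labelled.

```sql
-- Jobs submitted for the background index build (DBMS_JOB based).
SELECT job, what, this_date, failures, broken
FROM   dba_jobs
ORDER  BY job;

-- Jobs that are executing right now, with the instance recorded for each.
SELECT job, instance, sid, this_date
FROM   dba_jobs_running;

-- Cluster-wide view of job slave sessions.
-- Assumption: job slaves appear with a program name like '... (J000)'.
SELECT inst_id, sid, serial#, program, event
FROM   gv$session
WHERE  program LIKE '%(J0%'
ORDER  BY inst_id, sid;
```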
I now understand much better how text indexing works, and I think it is a great technology which can scale via partitioning. But as always the design of the application is crucial: most of our problems come from the fact that we did not choose the right sectioning (we chose PATH_SECTION_GROUP, while XML_SECTION_GROUP is so much better IMO). Maybe later I can convince the developers to change the sectioning, especially because SDATA and MDATA sections are not supported with PATH_SECTION_GROUP (although it seems to work, we had one occurrence of a bad result linked to the existence of SDATA in the index definition). Also, the whole problem of mixed structured/unstructured searches is completely tackled if one uses XML_SECTION_GROUP with MDATA/SDATA (but of course the app was written for Oracle 10...)
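To illustrate the mixed structured/unstructured search with XML_SECTION_GROUP, here is a small sketch. The table docs, its column xml_doc, the section group name doc_sg and the tags status and price are all made up for the example; only the ctx_ddl calls and the MDATA/SDATA query operators are standard.

```sql
-- Assumption: docs(id, xml_doc) and the <status>/<price> tags are hypothetical.
BEGIN
  ctx_ddl.create_section_group('doc_sg', 'XML_SECTION_GROUP');
  -- MDATA: exact-match metadata taken from the <status> tag
  ctx_ddl.add_mdata_section('doc_sg', 'status', 'status');
  -- SDATA: typed structured data taken from the <price> tag
  ctx_ddl.add_sdata_section('doc_sg', 'price', 'price', 'NUMBER');
END;
/

CREATE INDEX docs_idx ON docs (xml_doc)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('section group doc_sg');

-- Text condition and structured conditions resolved inside one CONTAINS call.
SELECT id
FROM   docs
WHERE  CONTAINS(xml_doc,
                'oracle AND MDATA(status, active) AND SDATA(price < 100)') > 0;
```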
Regards, Pierre -
Suggestion: Oracle text CONTEXT index on one or more columns ?
Hi,
I'm implementing Oracle text using CONTEXT ..... and would like to ask you for performance suggestion ...
I have a table of Articles .... with columns .. TITLE, SUBTITLE , BODY ...
Now is it better from performance point of view to move all three columns into one dummy column ... with name like FULLTEXT ... and put index on this single column,
and then use CONTAINS(FULLTEXT,'...')>0
Or is it almost the same for oracle if i put indexes on all three columns and then call:
CONTAINS(TITLE,'...')>0 OR CONTAINS(SUBTITLE,'...')>0 OR CONTAINS(BODY,'...')>0
I actually don't care if the result is a match in TITLE OR SUBTITLE OR BODY ....
So if I move everything into a FULLTEXT column, then I have duplicate data in each article row ... but if I create an index for each column, then Oracle has more to index, optimize and search ... am I right?
Table has 1.8mil records ...
Thank you.
Kris
mackrispi wrote:
Now is it better from performance point of view to move all three columns into one dummy column ... with name like FULLTEXT ... and put index on this single column,
and then use CONTAINS(FULLTEXT,'...')>0
What version of Oracle are you on? If 11, then you could use a virtual column to do this; otherwise you'd have to write code to maintain the column, which can get messy.
mackrispi wrote:
Or is it almost the same for oracle if i put indexes on all three columns and then call:
CONTAINS(TITLE,'...')>0 OR CONTAINS(SUBTITLE,'...')>0 OR CONTAINS(BODY,'...')>0
Benchmark it and find out :)
Another option would be something like this.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:9455353124561
Were I you, I would try out those 3 approaches, see which one meets your performance requirements, and weigh that against the ease of implementation and administration. -
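On the single-index question above (TITLE, SUBTITLE, BODY), one way to get a single CONTEXT index without a duplicate FULLTEXT column is a MULTI_COLUMN_DATASTORE. This is a sketch only: the table name articles, the preference name articles_ds and the index name are assumptions, and it should be benchmarked like the other options.

```sql
-- Assumption: table articles(title, subtitle, body) as described in the question.
BEGIN
  ctx_ddl.create_preference('articles_ds', 'MULTI_COLUMN_DATASTORE');
  ctx_ddl.set_attribute('articles_ds', 'COLUMNS', 'title, subtitle, body');
END;
/

-- The index is created on one column, but the datastore feeds all three into it.
CREATE INDEX articles_text_idx ON articles (title)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('datastore articles_ds section group ctxsys.auto_section_group');

-- One CONTAINS call now searches title, subtitle and body together.
SELECT *
FROM   articles
WHERE  CONTAINS(title, 'oracle') > 0;
```

One thing to keep in mind: the index is only marked for re-synchronization when the indexed column (title here) changes, so an update that touches only subtitle or body also needs to touch title, for example through a trigger.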
database version 11.2.0.4
rac two node
CREATE INDEX MAXIMO.ACTCI_NDX3 ON MAXIMO.ACTCI
(DESCRIPTION)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('lexer global_lexer language column LANGCODE')
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-10700: preference does not exist: global_lexer
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.TEXTINDEXMETHODS", line 366
Like the error message says, you don't have a global_lexer. So, you need to create a global_lexer and that lexer must have at least a default sub_lexer, then you can use that global_lexer in your index parameters. Please see the demonstration below, including reproduction of the error and solution.
SCOTT@orcl12c> -- reproduction of problem:
SCOTT@orcl12c> CREATE TABLE actci
2 (description VARCHAR2(60),
3 langcode VARCHAR2(30))
4 /
Table created.
SCOTT@orcl12c> CREATE INDEX ACTCI_NDX3 ON ACTCI (DESCRIPTION)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS('lexer global_lexer language column LANGCODE')
4 /
CREATE INDEX ACTCI_NDX3 ON ACTCI (DESCRIPTION)
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-10700: preference does not exist: global_lexer
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.TEXTINDEXMETHODS", line 366
SCOTT@orcl12c> -- solution:
SCOTT@orcl12c> DROP INDEX actci_ndx3
2 /
Index dropped.
SCOTT@orcl12c> BEGIN
2 CTX_DDL.CREATE_PREFERENCE ('global_lexer', 'multi_lexer');
3 CTX_DDL.CREATE_PREFERENCE ('english_lexer', 'basic_lexer');
4 CTX_DDL.ADD_SUB_LEXER ('global_lexer', 'default', 'english_lexer');
5 END;
6 /
PL/SQL procedure successfully completed.
SCOTT@orcl12c> CREATE INDEX ACTCI_NDX3 ON ACTCI (DESCRIPTION)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS('lexer global_lexer language column LANGCODE')
4 /
Index created.
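If LANGCODE holds more than one language, further sub-lexers can be added to the same MULTI_LEXER preference before the index is created. The sketch below is an assumption about what such a setup might look like: the german_lexer name, the choice of German and the alternate value 'de' are illustrative only.

```sql
-- Assumption: run before CREATE INDEX (or drop and recreate the index afterwards).
BEGIN
  CTX_DDL.CREATE_PREFERENCE ('german_lexer', 'basic_lexer');
  CTX_DDL.SET_ATTRIBUTE ('german_lexer', 'composite', 'german');
  -- Rows whose LANGCODE is 'german' (or the alternate value 'de') use this lexer;
  -- every other row falls back to the default sub-lexer.
  CTX_DDL.ADD_SUB_LEXER ('global_lexer', 'german', 'german_lexer', 'de');
END;
/

-- List the preferences that now exist for the current user.
SELECT pre_name, pre_class, pre_object
FROM   ctx_user_preferences
ORDER  BY pre_name;
```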