Oracle Text, multi_column_datastore describe index

Referring to Re: Oracle Text, ctxsys.context problem with number column
In TOAD or SQL Navigator you can click on an object (such as a table or index) and open the "Script" tab, which shows the generated code to create it.
Assuming I am seeing the index for the first time, where and how can I find the multi-column datastore columns and parameters with which the index was created?
For example:
exec ctx_ddl.drop_preference( 'myds' );
exec ctx_ddl.create_preference( 'myds', 'MULTI_COLUMN_DATASTORE' );
exec ctx_ddl.set_attribute( 'myds', 'COLUMNS', 'item_barcode, item_title, item_subtitle' );
CREATE INDEX i_index_test
   ON item (item_title)
   INDEXTYPE IS ctxsys.context
   PARAMETERS ( 'datastore myds' );
And the script says:
CREATE INDEX PLSQL.I_INDEX_TEST ON PLSQL.ITEM
(ITEM_TITLE)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('datastore myds')
NOPARALLEL;
Where can I find the definition of the datastore "myds"?

You can use ctx_report to get all the information about an index.
I use it from SQL*Plus like this:
set pagesize 0
set heading off
set trimspool on
set long 500000
spool index.sql
select ctx_report.create_index_script('MyIndexName') from dual;
spool off
The only thing this won't give you is the source for any procedures used in a user_datastore. You can get that from user_source.
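As a complement to the generated script, the attributes of a single preference such as "myds" can also be read from the Oracle Text dictionary views. This is a minimal sketch, assuming the names from the question (preference MYDS, index I_INDEX_TEST) and a session connected as the owning user:

```sql
-- Attributes stored for the preference itself (here, the COLUMNS list)
select prv_attribute, prv_value
from   ctx_user_preference_values
where  prv_preference = 'MYDS';

-- Values that were copied into the index when it was created
select ixv_class, ixv_object, ixv_attribute, ixv_value
from   ctx_user_index_values
where  ixv_index_name = 'I_INDEX_TEST';

-- Readable report of the same settings
set long 500000
select ctx_report.describe_index('I_INDEX_TEST') from dual;
```

Preference settings are copied into the index at creation time, so the index-level views show what the index was actually built with, even if the preference has since been changed or dropped.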

Similar Messages

Oracle Text location of Indexes

I am creating an Oracle Text index as per example given in documentation. I have set my storage preferences to point all the created tables / indexes to a given tablespace. This works fine, however I still get one index (domain) being created in the SYSTEM tablespace. Is this normal ? Can it be moved ?

The command used is
create index abstract_text_idx on abstracts(merge_text)
indextype is ctxsys.context
parameters('lexer abstract_lexer storage abstract_store memory 52428800')
The index left in SYSTEM tablespace is ABSTRACT_TEXT_IDX. All the DR$$ indexes have used the storage parameters in the abstract_store preference O.K.
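For reference, a storage preference of the kind described might look like the sketch below. The tablespace name text_ts is hypothetical, and the sketch only shows how the internal tables and the $X index are placed; it does not by itself explain where the domain index entry is reported.

```sql
begin
  ctx_ddl.create_preference('abstract_store', 'BASIC_STORAGE');
  -- One clause per internal object; text_ts is an assumed tablespace name
  ctx_ddl.set_attribute('abstract_store', 'I_TABLE_CLAUSE', 'tablespace text_ts');
  ctx_ddl.set_attribute('abstract_store', 'K_TABLE_CLAUSE', 'tablespace text_ts');
  ctx_ddl.set_attribute('abstract_store', 'R_TABLE_CLAUSE',
                        'tablespace text_ts lob (data) store as (cache)');
  ctx_ddl.set_attribute('abstract_store', 'N_TABLE_CLAUSE', 'tablespace text_ts');
  ctx_ddl.set_attribute('abstract_store', 'I_INDEX_CLAUSE', 'tablespace text_ts compress 2');
end;
/
```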

Oracle Text - CTX Context Index Soundex Problem

Hi,
I'm running into a problem with Oracle Text when searching using the ! (soundex) option. I've created a simple test example to highlight the issue.
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit
Windows 2008 Server 64-bit
create table test_tab (test_col varchar2(200));
insert all
into test_tab (test_col) values ('ab-tönes')
into test_tab (test_col) values ('ab-tones')
into test_tab (test_col) values ('abtones')
into test_tab (test_col) values ('ab tones')
into test_tab (test_col) values ('ab-tanes')
select * from dual;
select * from test_tab;
begin
      ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
      ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+&''');
      ctx_ddl.set_attribute('test_lex1','base_letter','YES');
      -- ctx_ddl.set_attribute('test_lex1','skipjoins','-');
end;
/
create index test_idx on test_tab (test_col)
indextype is ctxsys.context
    parameters
      ('lexer        test_lex1');
select token_text from dr$test_idx$i;
TOKEN_TEXT
AB
ABTONES
TANES
TONES
select * from test_tab where contains (test_col, '!ab tones') > 0;
TEST_COL
ab-tönes
ab-tones
ab tones
select * from test_tab where soundex(test_col) = soundex('ab tones');
TEST_COL
ab-tönes
ab-tones
abtones
ab tones
ab-tanes
So my question is, can anyone suggest an approach whereby I can get the Oracle Text Context index (or CTXCAT index if it's more appropriate) to return all 5 rows like the simple Soundex is doing?
I can't really use soundex as this search query will form part of a search screen for a multi-language application. Soundex is limited to English sounding words, so I need the solution to be able to compare strings that may not "sound" English.
It must be an attribute of the BASIC_LEXER, and I've tried skipjoins, start/end-joins, stop lists, but I just cannot get the Soundex feature of Oracle Text to function like the SOUNDEX() function!
Looking at how the tokens are stored in dr$test_idx$i, I need Oracle Text to almost concatenate 'AB' and 'TONES' so that they are searched as a single string.
Any help greatly appreciated.
Thanks,

I am not getting the same problem that you are getting with the umlaut, but I don't see what is different. Please post the result of:
select ctx_report.create_index_script ('test_idx') from dual;
Here are the results on my system. Perhaps you can spot the difference. I added an empty_stoplist, so that it won't print out a long list of stopwords.
SCOTT@orcl12c> create table test_tab (test_col    varchar2(200))
2 /
Table created.
SCOTT@orcl12c> insert all
2    into test_tab (test_col) values ('ab-tönes')
3    into test_tab (test_col) values ('ab-tones')
4    into test_tab (test_col) values ('abtones')
5    into test_tab (test_col) values ('ab tones')
6    into test_tab (test_col) values ('ab-tanes')
7 select * from dual
8 /
5 rows created.
SCOTT@orcl12c> select * from test_tab
2 /
TEST_COL
ab-tönes
ab-tones
abtones
ab tones
ab-tanes
5 rows selected.
SCOTT@orcl12c> begin
2    ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
3    ctx_ddl.set_attribute('test_lex1','base_letter','YES');
4 end;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl12c> create or replace procedure test_proc
2    (p_rowid in          rowid,
3      p_clob    in out nocopy clob)
4 as
5 begin
6    select replace (translate (test_col, '/\|-_+&''', '      '), ' ', '')
7    into   p_clob
8    from   test_tab
9    where rowid = p_rowid;
10 end test_proc;
11 /
Procedure created.
SCOTT@orcl12c> show errors
No errors.
SCOTT@orcl12c> begin
2    ctx_ddl.create_preference ('test_ds', 'user_datastore');
3    ctx_ddl.set_attribute ('test_ds', 'procedure', 'test_proc');
4 end;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl12c> create index test_idx on test_tab (test_col)
2    indextype is ctxsys.context
3    parameters
4       ('lexer    test_lex1
5         datastore    test_ds
6         stoplist    ctxsys.empty_stoplist')
7 /
Index created.
SCOTT@orcl12c> select token_text from dr$test_idx$i
2 /
TOKEN_TEXT
ABTANES
ABTONES
2 rows selected.
SCOTT@orcl12c> variable search_string varchar2(100)
SCOTT@orcl12c> exec :search_string := 'ab tones'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> select * from test_tab
2 where contains
3            (test_col,
4             '!' || replace (:search_string, ' ', ' !') ||
5             ' or !' || replace (:search_string, ' ', '')) > 0
6 /
TEST_COL
ab-tönes
ab-tones
abtones
ab tones
ab-tanes
5 rows selected.
SCOTT@orcl12c> exec :search_string := 'abtones'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
TEST_COL
ab-tönes
ab-tones
abtones
ab tones
ab-tanes
5 rows selected.
SCOTT@orcl12c> exec :search_string := 'ab tönes'
PL/SQL procedure successfully completed.
SCOTT@orcl12c> /
TEST_COL
ab-tönes
ab-tones
abtones
ab tones
ab-tanes
5 rows selected.
SCOTT@orcl12c> select ctx_report.create_index_script ('test_idx') from dual
2 /
CTX_REPORT.CREATE_INDEX_SCRIPT('TEST_IDX')
begin
ctx_ddl.create_preference('"TEST_IDX_DST"','USER_DATASTORE');
ctx_ddl.set_attribute('"TEST_IDX_DST"','PROCEDURE','"SCOTT"."TEST_PROC"');
end;
begin
ctx_ddl.create_preference('"TEST_IDX_FIL"','NULL_FILTER');
end;
begin
ctx_ddl.create_section_group('"TEST_IDX_SGP"','NULL_SECTION_GROUP');
end;
begin
ctx_ddl.create_preference('"TEST_IDX_LEX"','BASIC_LEXER');
ctx_ddl.set_attribute('"TEST_IDX_LEX"','BASE_LETTER','YES');
end;
begin
ctx_ddl.create_preference('"TEST_IDX_WDL"','BASIC_WORDLIST');
ctx_ddl.set_attribute('"TEST_IDX_WDL"','STEMMER','ENGLISH');
ctx_ddl.set_attribute('"TEST_IDX_WDL"','FUZZY_MATCH','GENERIC');
end;
begin
ctx_ddl.create_stoplist('"TEST_IDX_SPL"','BASIC_STOPLIST');
end;
begin
ctx_ddl.create_preference('"TEST_IDX_STO"','BASIC_STORAGE');
ctx_ddl.set_attribute('"TEST_IDX_STO"','R_TABLE_CLAUSE','lob (data) store as (
cache)');
ctx_ddl.set_attribute('"TEST_IDX_STO"','I_INDEX_CLAUSE','compress 2');
end;
begin
ctx_output.start_log('TEST_IDX_LOG');
end;
create index "SCOTT"."TEST_IDX"
on "SCOTT"."TEST_TAB"
      ("TEST_COL")
indextype is ctxsys.context
parameters('
    datastore       "TEST_IDX_DST"
    filter          "TEST_IDX_FIL"
    section group   "TEST_IDX_SGP"
    lexer           "TEST_IDX_LEX"
    wordlist        "TEST_IDX_WDL"
    stoplist        "TEST_IDX_SPL"
    storage         "TEST_IDX_STO"
')
begin
ctx_output.end_log;
end;
1 row selected.

Oracle text catsearch sub index query

Hello,
I wonder if you can help me with a query about Oracle Text Catsearch.
I have a database which has 10Gb of data.
There is a text column in the database on which I have to find a partial match on the data contained in it
I have indexed this column with a CTXSYS.CTXCAT index.
In addition I have added a sub index to the index set for a Date Field and ran EXEC DBMS_STATS.GATHER_TABLE_STATS to make sure the query execution path is optimised
Here's my Question:
How can I make sure that the Date sub-query always runs before the partial match on the text column is evaluated?
Caveat I am a programmer not a DBA, but I've ended up doing some databasey type stuff, apologies if question is thick.
Cheers
Mark
p.s. Performance is good, but I have a feeling that the Date subquery is not being used as efficiently as it should be (the subquery should massively reduce the result set to be searched for the partial match)

You can't - ctxcat doesn't support the "functional invocation" which would be needed if another index is used first. So reducing the set of docs to index doesn't help.
If you can find a way to denormalize the information used in the sub-query such that it can be included in the main query index set, that should help performance considerably.
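The "index set" referred to above can be sketched as follows. The table docs and its columns (id, doc_text, doc_date) are hypothetical, and the date literal is assumed to match the session's NLS_DATE_FORMAT; treat this as an illustration of the mechanism, not the poster's actual schema.

```sql
-- Build an index set containing the structured (date) column
begin
  ctx_ddl.create_index_set('docs_iset');
  ctx_ddl.add_index('docs_iset', 'doc_date');
end;
/

-- Create the CTXCAT index with that index set
create index docs_cat_idx on docs (doc_text)
  indextype is ctxsys.ctxcat
  parameters ('index set docs_iset');

-- Text query and structured condition are passed together to CATSEARCH,
-- so the date filter is resolved inside the CTXCAT index itself
select id
from   docs
where  catsearch(doc_text, 'invoice*', 'doc_date >= ''01-JAN-2010''') > 0;
```

With this form there is no separate "date first, text second" step to order: both conditions are answered from the same index structure, which is why columns needed for filtering have to be part of the index set.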

Oracle Text - procedure refreshing index doesn't run

Hi,
I'm experiencing problems refreshing an index using a procedure (see below). When running the command instead of the procedure everything is fine.
Reading through this forum I found that ctxapp role is needed, which I have. Since this didn't work the admin granted execute privileges on any procedures and programmes (for a short test). Even that didn't help.
The admin can run the procedure without any problems.
Help appreciated
Franziska
---The procedure---
create or replace procedure indizes_erneuern
as
begin
ctx_ddl.optimize_index('ind_nachname_assistent', 'REBUILD');
end;
---The error message---
Connecting to the database FAM.
ORA-20000: Oracle Text error:
ORA-01031: insufficient privileges
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CTX_DDL", line 630
ORA-06512: at "FAM.INDIZES_ERNEUERN", line 4
ORA-06512: at line 2
Process exited.
Disconnecting from the database FAM.

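The command works when run directly but fails inside the procedure because privileges granted through a role (such as CTXAPP) are not active inside a definer's-rights stored procedure. You'll have to either issue direct grants or define the procedure with AUTHID CURRENT_USER.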
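The AUTHID CURRENT_USER option could be applied to the procedure from the question roughly like this. It is a sketch only; the alternative grant shown in the comment assumes the schema name FAM taken from the error stack, and which direct grants are actually required is not confirmed by the thread.

```sql
create or replace procedure indizes_erneuern
  authid current_user   -- run with the caller's privileges, including enabled roles
as
begin
  ctx_ddl.optimize_index('ind_nachname_assistent', 'REBUILD');
end;
/

-- Alternative (assumption): keep definer's rights and have a privileged user
-- grant the package directly instead of through a role, for example:
--   grant execute on ctxsys.ctx_ddl to fam;
```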

Problem with blob column index created using Oracle Text.

Hi,
I'm running Oracle Database 10g 10.2.0.1.0 standard edition one, on windows server 2003 R2 x64.
I have a table with a blob column which contains pdf document.
Then, I create an index using the following script so that I can do fulltext search using Oracle Text.
CREATE INDEX DMCS.T_DMCS_FILE_DF_FILE_IDX ON DMCS.T_DMCS_FILE
(DF_FILE)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('DATASTORE CTXSYS.DEFAULT_DATASTORE');
However, the index is not searchable and I check the following tables created by database for my index and found them to be empty as well !!
DR$T_DMCS_FILE_DF_FILE_IDX$I
DR$T_DMCS_FILE_DF_FILE_IDX$K
DR$T_DMCS_FILE_DF_FILE_IDX$N
DR$T_DMCS_FILE_DF_FILE_IDX$R
I wonder what's wrong with it.
My user has been granted the CTXAPP role, and my other tables, which store plain text and use Oracle Text, work fine. I even exported the blob column contents and saved them as PDF files, and they are fine.
However the database seems like not indexing my blob column although the index can be created without error.
Please advise.
Really appreciate anyone who can help.
Thank you.

The situation is I have already loaded a few pdf document into the table's blob column.
After I create the Oracle text index on this blob column, I find the system generated index tables listed in my earlier posting are empty, except for the 4th table.
Normally we'll see words inside the table where those are the words indexed by oracle text on my document.
As a result, no matter how i search for the index using select statement with contains operator, it will not give me any result.
I feel weird why the blob is not indexed. The content of the blob are actually valid because I tested this by export the content back to pdf and I can still view and search within the pdf.
Regards,
Jap.

Problem creating Oracle text index

Hi,
I am trying to create an index in Oracle 9i using Oracle Text.
First i gave this grant as SYSDBA:
"GRANT ALL ON CTX_DDL TO <USERNAME>"
Then i executed the following :
EXECUTE CTX_DDL.CREATE_SECTION_GROUP('MYPATHGROUP','PATH_SECTION_GROUP');
CREATE INDEX SDS_SLIDE_XML_IDX ON SDS_SLIDE_DATA (SLIDE_XML)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('SECTION GROUP MYPATHGROUP');
but I got the following error :
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-50857: oracle error in dricon.get_primary_key
ORA-00980: synonym translation is no longer valid
ORA-06512: at "CTXSYS.DRUE", line 157
ORA-06512: at "CTXSYS.TEXTINDEXMETHODS", line 186
Any ideas?
Rgds
Vikram.

Oracle Text will not index the content inside the portlet of the pages. The portlet is treated as a portlet instance item and only the relevant attributes are searched for, like display name of the portlet, etc.

Document management system using oracle text

I plan to create a document management system using Oracle Text with the following features:
1) document comparison
2) document search
and more...
Can Oracle Text be used to display documents of various formats by converting them to HTML? And can search keywords be highlighted in the document?
Please help!

Have you ever considered doing this in Oracle Application Express (free on top of the Oracle database)? How about something like:
http://download-west.oracle.com/docs/cd/B31036_01/doc/appdev.22/b28839/up_dn_files.htm
Index the files using the CONTEXT index, and perhaps the docs' meta with it using the Oracle Text MULTI_COLUMN_DATASTORE, and then when you write your query for a report on the documents include a search string.
I've created a number of APEX-based document management systems and it is quite easy once you get the hang of using this environment. I suggest looking at some of the tutorials/how-to documents and you'll be on your way quickly.
Start with the upload application. Once you can get your documents in, create a report that shows everything except the document. Verify all of this works correctly.
Add some "items" to the page for the report, and include them as bind variables in the where clause.
After that, add your Oracle Text index to the database, and toss in a "text-field" item to the APEX page. Modify your report query, adding the CONTAINS clause, and use the newly created item as a bind variable. There's your keyword search.
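A report query of the kind described might look like the sketch below. The table doc_store, its columns, and the page item P1_SEARCH are hypothetical names chosen for illustration; only the CONTAINS/SCORE pattern is the point.

```sql
-- Report source: keyword search driven by an APEX text-field item
select d.id,
       d.filename,
       d.description,
       score(1) as relevance          -- relevance of the match labelled 1
from   doc_store d
where  contains(d.doc, :P1_SEARCH, 1) > 0
order  by score(1) desc;
```

The item must hold a non-empty, syntactically valid Oracle Text query, so in practice the page usually needs a condition or default for the case where the search box is blank.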
Linking to Oracle Apps is done through API's and may be over database links.
Hope it helps. Though not a step-by-step how to document, this should point you in the right direction. Get familiar with APEX as that covers most of what you described.
-Ron

Oracle Text Help with XML column values

Hello. In addition to being new to Oracle Text, I am inheriting an Oracle Text application and have a couple of questions.
First, A context-based index has been set-up on a CLOB column which contains an XML formatted document. The Auto Section Group parameter has been set to created zones for each tag of the XML document. I have found that when using a browser to display the content of the CLOB, some of the column values have trouble displaying in the browser, where I receive an XML processing error. I believe this is due to the fact that some of the XML document rows contain URLs that are not embedded in the CDATA tag. In any case, if the browser has trouble displaying the XML, will oracle text have trouble indexing the XML and creating the section group zones?
Second, I understand that the NOT operator takes a right operand term and left operand term. Can either of the terms be the results of the WITHIN operator, i.e. "dogs not (cats within animals)".
Thank you.

I bet you just whipped that out, and I thank you with all my
heart, its amazing to me how many ways I tried to do what you did.
Thanks
I have a second question relating to the same problem and
that is in referencing the over state. Currently, I can write
'text' into the text field and see what I have coming in from xml
in its place during the 'up' state.
However, when the timeline hits the 'over' state, the
textfield will display nothing, or 'text' if I have that written
in. I suspect that I am not referencing the'over' state correctly.
Should I add one line of code sort of referencing the text
field and not just the button while in the over state?

Oracle text indexed view is possible

Oracle text indexed view is possible???

ok,
My table name is T_DOC :
ID----------------> NUMBER(30)
DESCRIPTION-------> VARCHAR2(2000 BYTE)
DOC---------------> BLOB
FILENAME----------> VARCHAR2(2000 BYTE)
MIMETYPE----------> VARCHAR2(2000 BYTE)
LAST_UPDATE_DATE--> DATE
T_DOC
| Id | DESCRIPTION | DOC | FILENAME | MIMETYPE | LAST_UPDATE_DATE |
| 1 | THE DOG | *(!BLOB) | THE_CAT.PDF | application/pdf | 20/05/2010 15:06:15 |
| 2 | THE BIRD | **(!BLOB) | THE_BIRD.PDF | application/pdf | 20/05/2010 15:06:15 |
| 3 | THE HUMAN AND CAT | ***(!BLOB) | THE_HUMAN.PDF | application/pdf | 20/05/2010 15:06:15 |
* is a document .pdf with content: "the dog and cat"
** is a document .pdf with content: "the bird in house"
*** is a document .pdf with content: "the human from USA"
Index the columns DESCRIPTION, DOC (document content), FILENAME
begin
ctx_ddl.create_preference('idxDoc_lx', 'BASIC_LEXER');
ctx_ddl.set_attribute ('idxDoc_lx', 'MIXED_CASE', 'NO');
end;
begin
ctx_ddl.create_preference('idxDoc_ds', 'MULTI_COLUMN_DATASTORE');
ctx_ddl.set_attribute ('idxDoc_ds', 'COLUMNS', 'DOC, FILENAME, DESCRIPTION');
end;
CREATE INDEX IDX_DOC
ON T_DOC (FILENAME)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('lexer idxDoc_lx
datastore idxDoc_ds
filter CTXSYS.AUTO_FILTER
sync (on commit)');
Search Query:
select ID
from T_DOC
where CONTAINS (DOCUMENTO, 'CAT', 1) > 0
Result: ID = 1
Why is ID 3 not also returned?

How to compute a global SCORE over a few oracle text indexed tables?

Dear experts!
I want to search a website with Oracle Text. The website consists of four tables:
- site
- chapter
- text
- binaries
Each table has two or three columns which should be indexed with oracle text. So I have created a MULTI_COLUMN_DATASTORE oracle text index on each table - So I have four indexes on my website.
When I want to search over the website I have to join my 4 tables (4 contain clauses). So how do I get a global SCORE over these 4 contains clauses?
The next question is can I change the weight of my text indexes (useful for the search hit list)? For example the highest weight has the site index, the second highest weight the chapter index and so on?
Thanks
Markus

If it's a simple JOIN, then you could just add the scores for each CONTAINS clause
select score(1)+score(2)+score(3)+score(4)
from table1 t1,table2 t2, table3 t3,table4 t4
where [join conditions]
and (contains(t1.col, 'xxx', 1) > 0 or
contains(t2.col, 'xxx', 2) > 0 or
... etc)
then to change the weight you just add a multiplying factor.
Can't help thinking it's probably more complex than this, though.
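The "multiplying factor" idea can be sketched with the four tables from the question. The join keys, indexed column names, and weights below are all hypothetical, and whether the optimizer can evaluate several OR-ed CONTAINS predicates through their domain indexes depends on the execution plan, so this is an outline rather than a tested solution.

```sql
select s.site_id,
       -- Heavier weight for hits in higher-level tables
       4 * score(1) + 3 * score(2) + 2 * score(3) + 1 * score(4) as global_score
from   site s
join   chapter  c on c.site_id    = s.site_id
join   text     t on t.chapter_id = c.chapter_id
join   binaries b on b.chapter_id = c.chapter_id
where  contains(s.title,   :search_string, 1) > 0
   or  contains(c.title,   :search_string, 2) > 0
   or  contains(t.body,    :search_string, 3) > 0
   or  contains(b.content, :search_string, 4) > 0
order  by global_score desc;
```

A CONTAINS clause that does not match contributes a score of 0, so the weighted sum only rewards the tables where the term was actually found.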

Update Results not Displayed in Oracle Text search with Transactional Index

Hi,
I am working on a solution utilising Oracle Text to give me a probable list of matching records. The problem is that the table I am searching is prepopulated with seed data, and the application we are building assigns a record and updates the details (columns) against it. These details are what we search on, using a Multi Column Datastore index which is refreshed every hour and also has the transactional parameter specified. Unfortunately the Transactional Index does not pick up the updated details; it only seems to work if I insert a new record (which will never happen). This sounds like a bug to me. Any assistance would be greatly appreciated.

Barbara,
I think you may have alluded to my problem. I haven't updated the "dummy" column.
The table structure is as follows:
CREATE TABLE WAGN (
     WAGN               VARCHAR2(8) NOT NULL PRIMARY KEY,
     last_name          VARCHAR2(240),
     first_name          VARCHAR2(240),
     middle_name          VARCHAR2(240),
     date_of_birth     DATE,
     gender               VARCHAR2(1),
     status               VARCHAR2(1) NOT NULL,
     signature          RAW(64));
The preference creation is:
BEGIN
     ctx_ddl.create_preference('WAGN_NAME_SRCH', 'MULTI_COLUMN_DATASTORE');
     ctx_ddl.set_attribute('WAGN_NAME_SRCH', 'columns', 'last_name, first_name, middle_name, date_of_birth, gender');
END;
The Index Creation statement is:
CREATE INDEX wagn_srch_idx1 ON WAGN(signature) --Dummy Column
INDEXTYPE IS ctxsys.CONTEXT
PARAMETERS ('DATASTORE WAGN_NAME_SRCH SYNC(EVERY "SYSDATE+60/24/60" PARALLEL 10) TRANSACTIONAL');
And a typical update statement is (contained with PL/SQL):
     UPDATE WAGN
          SET status = x_wagn_assigned_status,
               last_name = p_employee_details.last_name,
               first_name = p_employee_details.first_name,
               middle_name = p_employee_details.middle_name,
               date_of_birth = p_employee_details.date_of_birth,
               gender = p_employee_details.gender
          WHERE WAGN = l_wagn;
So my guess is that because the dummy column (signature) is not updated it is not being reflected in the transactional memory area.
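If that guess is right, one common way to test it is to touch the indexed column in the same statement, since Oracle Text queues a row for reindexing only when the column the index is created on is updated. A minimal sketch, using bind variables as hypothetical stand-ins for the PL/SQL variables in the original update:

```sql
update wagn
set    status        = :status,
       last_name     = :last_name,
       first_name    = :first_name,
       middle_name   = :middle_name,
       date_of_birth = :date_of_birth,
       gender        = :gender,
       signature     = signature   -- no-op assignment to the indexed (dummy) column
where  wagn = :wagn;
```

Setting the column to itself does not change the data, but it lets the index maintenance code see the row as modified, so the new name values should be picked up by the transactional query and the next sync.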

Help with creating oracle text index on 2 columns with partial html data

Hi,
I need to create an oracle text index on 2 columns.
TITLE - varchar(255) = contains plain text data
DESCRIPTION - CLOB = contains partial HTML data
This is what I created.
begin
ctx_ddl.create_preference ('Title_Description_Pref', 'MULTI_COLUMN_DATASTORE');
ctx_ddl.set_attribute('Title_Description_Pref', 'columns', 'TITLE, DESCRIPTION');
end;
begin
ctx_ddl.create_preference ('bid_lexer', 'BASIC_LEXER');
ctx_ddl.set_attribute('bid_lexer', 'index_stems', 'ENGLISH');
ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
end;
create index Bid_Title_Index on Bid(title) indextype is ctxsys.context parameters ('LEXER bid_lexer sync (every "sysdate+(1/24)")');
create index Bid_Title_Desc_Index on Bid(description) indextype is ctxsys.context parameters ('LEXER bid_lexer DATASTORE Title_Description_Pref sync (every "sysdate+(1/24)") filter ctxsys.null_filter section group htmgroup');
The problem is when I do a CONTAINS(description, '$(auction)')>0. I get results where the descriptions have the "auction" word (which is correct). But, the results also returned rows where the search word is inside an IMG tag. e.g. <img src="http://auction.de/120483" alt="Auction Logo"/>.
What I would like is to exclude rows where the search word is inside HTML tag attributes, results expected are rows having <a>Auction</a> or <p>For Auction</p> ... etc. Basically stripping the html tags and leave the text contents.
I'd appreciate some input.
Thanks,
Amiel

[Oracle Text]How to register additional datas when indexing documents ?

Hello,
For the moment we index documents (Word, excel, pdf, ppt, html, xml...) from the filesystem and it works well.
Now we need to attach some information to each document, and we must be able to search on these attributes. For instance:
We can index a Word document and we would like some additional index information such as:
YEAR
SIZE
NUMBER
This information is stored in a table; the table also contains the path to the documents on the filesystem.
We are able to query text on the index combined with a filter on the columns above.
We thought about storing this information directly in the index, but we don't know if it's a good solution (in terms of speed, structure...).
So, is there any way to index the documents on the filesystem with extra information at index time?
Is it possible? How can we do that?
What do you think about that?
Thanks in advance

1. If you're using 12c, you can use ctx_doc.policy_languages. (https://docs.oracle.com/database/121/CCREF/cdocpkg011.htm#CCREF24102)
2. If you want multiple stoplists based on each document's language, you have to use the multi-lexer. For world_lexer, there is one stoplist; since the stoplists are somewhat dynamic (you can add but not remove them), the most accurate way to fetch the list is using ctx_report.describe_index or ctx_report.create_index_script and parse the report.

How do I get Oracle Text to index files on a file server?

I am new to Oracle (I'm a MS-SQL DBA looking for a Full-Text Search solution that is better than linking to a MS index server.)
So - Here's the objective:
I have Oracle Server(Express) installed on a Windows server.
I would like for Oracle to build a Full-Text Catalog of the files on a separate file server based on file paths in a table in the database.
(No desire to store terabytes of images and documents inside the database)
I can get Oracle text up and running, using the URL_Datastore:
CREATE TABLE files (id NUMBER PRIMARY KEY, issue_id NUMBER, path VARCHAR(255) UNIQUE, ot_format VARCHAR(6), ot_version VARCHAR(10));
The Compaq server is a remote windows server on my local workgroup, so the fully qualified path is just "compaq" and the URL is valid:
INSERT INTO files VALUES (9,9,'file://Compaq/FTQ/00000003.pdf',NULL,NULL);
INSERT INTO files VALUES (13,13,'file://Compaq/FTQ/01.txt',NULL,NULL);
CREATE INDEX file_index ON files(path) INDEXTYPE IS ctxsys.context
PARAMETERS ('datastore ctxsys.URL_DATASTORE format column ot_format');
but when I enter:
Select * from CTX_User_Index_errors, I see the following errors:
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/00000003.pdf
DRG-11609: URL store: unable to open local file specified by file://Compaq/FTQ/01.txt
Did I miss something?
Do I need to install anything on the file server?
I would like to convince my company that Oracle can be much quicker than Microsoft's Indexing Service because it can avoid joining two large result sets (one result set from Full_text (indexing service) and one for specific data contained in fields in the MS-SQL database.) Full Text Searches commonly take 40 - 60 seconds where there are 1.5 million multi-page PDF files for a particular set that I sample search on. Without this massive join, I believe I can get the search to run in under 10 seconds.

Thank you!
File_Datastore worked fine.
I was staying away from File_Datastore because the information I gathered from googling suggested that file_datastore would only work locally.
Now I just have to get Oracle to pull data out of tables in a MS-SQL database on the local network (don't have a clue yet), and then have it index compiled file paths.
Then MS-SQL can query Oracle with index and full-text criteria and Oracle can send back a result set
It may sound like a bad way of performing Full-Text Queries, but anything will be better than the way things are currently running. We are currently performing Full Text Searches on a table that is rebuilt nightly, so the table containing millions of file paths is not live..
It would be so much better if we just migrated to Oracle, but we currently do not have the resources.
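The switch to File_Datastore mentioned above might look roughly like this against the table from the question. The UNC path format is an assumption (the thread does not show the final paths), and it only works if the operating-system account running the database can read the share.

```sql
-- Assumption: store plain file paths instead of file:// URLs
update files set path = '\\Compaq\FTQ\00000003.pdf' where id = 9;
update files set path = '\\Compaq\FTQ\01.txt'       where id = 13;
commit;

-- Recreate the index with the predefined file datastore preference
drop index file_index;
create index file_index on files(path) indextype is ctxsys.context
parameters ('datastore ctxsys.file_datastore format column ot_format');

-- Check for documents that could not be read or filtered
select err_textkey, err_text
from   ctx_user_index_errors
where  err_index_name = 'FILE_INDEX';
```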
