Oracle Text and filters

Hello everybody,
here is my question:
We have a system in which users should be able to upload documents into the database (they will be stored in a BLOB column of a table), run full-text searches on them (I think we should use INSO_FILTER for that), and download the documents they are interested in.
My question is about support for INSO_FILTER, because it does not come from Oracle: will there be a version for Linux, will Oracle make its own filter, and what support will Oracle provide?
Thanks
Stanislav

How about INSO support in 8.1.7EE? I've seen conflicting reports about it, especially about the ctxhx binary missing from the Linux install. Thanks.
INSO is supported in 8.1.7EE; it is NOT supported in 8.1.6EE.
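A minimal sketch of the setup described in the question, assuming INSO_FILTER is available on the platform (see the 8.1.7EE note above). The table, column, and index names are hypothetical and do not come from the thread.

```sql
-- Hypothetical table: uploaded documents are stored in a BLOB column.
CREATE TABLE user_docs (
  id        NUMBER PRIMARY KEY,
  file_name VARCHAR2(255),
  content   BLOB
);

-- CONTEXT index that runs each BLOB through the INSO filter so that
-- binary formats (Word, PDF, ...) are converted to indexable text.
CREATE INDEX user_docs_ctx ON user_docs (content)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('filter ctxsys.inso_filter');

-- Full-text query; the returned ids identify the BLOBs to offer for download.
SELECT id, file_name, SCORE(1) AS relevance
FROM   user_docs
WHERE  CONTAINS(content, 'oracle', 1) > 0
ORDER  BY relevance DESC;
```

Rows inserted after the index is created only become searchable once the index has been synchronised.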

Similar Messages

Oracle Text and Workspace Manager

Has anybody used Workspace Manager and Oracle Text together? How is the Oracle Text index handled? Can users in different workspaces submit documents, have them indexed, and be the only ones to see those documents?

Hi,
I do not have much experience with Oracle Text and am unsure exactly how it works. I would suggest filing a TAR to request this information.
Regards,
Ben

Oracle Text and Custom item type

I have a custom item type defined that has custom attributes. With Oracle Text disabled, I can use a custom search to find these items either by their standard attributes (name, description, etc.) or by their custom attributes. However, as soon as I enable Oracle Text (and allow the indexing process to complete), I can no longer locate these items in a search.
I assume that I need to do something to tell Oracle Text to index these types of items, but I could not find anything in the documentation. Any assistance would be appreciated.
Also, assuming we get around the above issue: if I have Oracle Text enabled, does this mean I will not be able to find these items after they are created until the next scheduled index update? Is there a way around this besides killing Oracle Text?
Rgds/Mark M.
Portal 9.0.2.6

The indexes need to be synchronised before the items can be searched and returned, so until the next scheduled index update a new item will not be returned. To update the indexes explicitly, run the following procedure in the Portal database, connected as the schema owner.
SQL> exec wwv_context.sync();
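For comparison, a plain Oracle Text CONTEXT index outside Portal can be synchronised with CTX_DDL.SYNC_INDEX. This is only a sketch: the index name MYINDEX is hypothetical, and it assumes the calling schema has EXECUTE on CTX_DDL.

```sql
-- Process pending inserts/updates/deletes for one CONTEXT index.
-- 'MYINDEX' is a placeholder name, not an index from this thread.
BEGIN
  ctx_ddl.sync_index('MYINDEX');
END;
/
```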

Oracle Text and TREC

Hello,
I am new to SQL, Oracle, and Oracle Text, and I need to use Oracle Text to index about 2GB of files (located on the filesystem), each of which contains multiple documents. These documents are all in SGML format, with the data I need sitting between the DOCNO and TEXT tags.
So far I understand I need to create a table similar to the following...
create table "DocTable" ("Docno" number, "Text" text)
...and then use either a CONTEXT or a CTXCAT index, but I'm not sure which.
In general I'm not too sure what to do. Any help is appreciated.
Thanks :)

It actually depends on what you are searching for.
A file datastore works as follows.
1) You need a location where all your files are stored, say /mydocs, so you create a preference:
begin
ctx_ddl.create_preference('COMMON_DIR','FILE_DATASTORE');
ctx_ddl.set_attribute('COMMON_DIR','PATH','/mydocs');
end;
/
2) Now create a table that lists all the file names. The doc id is for your own reference; it can be any number you prefer, but it has to be unique because it is the primary key.
create table mytable(id number primary key, docs varchar2(2000));
insert into mytable values(111555,'first.txt');
insert into mytable values(111556,'second.txt');
commit;
3) Now create the index, which actually fetches the documents from the file location:
create index myindex on mytable(docs)
indextype is ctxsys.context
parameters ('datastore COMMON_DIR');
Queries on the table will now use the CONTAINS operator, because you have created a CONTEXT index.
So you first need to determine what kind of queries you need to make. On that basis you can choose the index type.
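A query against the index created above might look like the following sketch. The search term 'oracle' is only an example, and the label 1 simply ties SCORE to the CONTAINS call.

```sql
-- Rank the files listed in mytable by how well they match the search term.
-- The docs column holds file names; the indexed text comes from the files.
SELECT id, docs, SCORE(1) AS relevance
FROM   mytable
WHERE  CONTAINS(docs, 'oracle', 1) > 0
ORDER  BY relevance DESC;
```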

Manual Install of Oracle Text and XDB for 11g

I can't find the Metalink manual install docs for Oracle Text and XDB for 11g. For unknown reasons the DB was created without them. I see the docs for 9i and 10g, but not for 11g.
Thanks!

Hi,
I think you are in the APEX forum.
This question seems more suitable for the installation forum: Database Installation
Also, please see if this thread helps: How do I install ORACLE TEXT
Ta,
Trent

Oracle Text and cyrillic charsets

I'm trying to index a column containing a mix of Word documents and text documents (in a combination of KOI8-R, ISO-8859-5 and UTF-8) stored in a BLOB column via Oracle Text.
(The database has a native charset of AL32UTF8.)
The table looks like this:
BLOB_TEST (
ID NUMBER,
DATA BLOB,
FMT VARCHAR2,
CSET VARCHAR2,
LANG VARCHAR2(10)
)
I set up my index to use inso_filter with the format and charset options.
CREATE INDEX blob_ling ON blob_test(data)
indextype IS ctxsys.CONTEXT
PARAMETERS('datastore ctxsys.direct_datastore
lexer ctxsys.world_lexer
filter ctxsys.inso_filter
stoplist ctxsys.default_stoplist
language column lang
format column fmt
charset column cset' );
However, when I run full-text queries using non-ASCII (in this case Cyrillic) search terms, only the Word documents ever produce hits.
That is, given a Word document and a text document containing the exact same Cyrillic string, a CONTAINS query for that exact string returns only the Word document.
What could be causing this behavior, and how do I actually enable character set filtering?

When I used BASIC_LEXER with the same environment with cyrillic documents, all works perfectly.
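The workaround in the reply could be sketched as below, swapping WORLD_LEXER for a BASIC_LEXER preference. The preference name cyr_basic_lexer is hypothetical. The language column clause is left out on the assumption that a single BASIC_LEXER does not need a per-row language. The comment about FMT and CSET values reflects my understanding of the format and charset column options and should be checked against the Oracle Text Reference for your release.

```sql
-- Hypothetical preference name; BASIC_LEXER is the lexer type the reply
-- reports as working with Cyrillic documents.
BEGIN
  ctx_ddl.create_preference('cyr_basic_lexer', 'BASIC_LEXER');
END;
/

DROP INDEX blob_ling;

-- Assumption: FMT holds 'TEXT' or 'BINARY' per row, and CSET holds an
-- Oracle character set name (for example CL8KOI8R or CL8ISO8859P5)
-- for the rows whose FMT is 'TEXT'.
CREATE INDEX blob_ling ON blob_test (data)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('datastore ctxsys.direct_datastore
               lexer     cyr_basic_lexer
               filter    ctxsys.inso_filter
               stoplist  ctxsys.default_stoplist
               format    column fmt
               charset   column cset');
```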

Index rules in oracle text and query using matches

Dear All,
I would like to ask about rules and matches function in oracle text.
I followed an example in oracle text application developer's guide.
I have a rule table like this :
1 oracle
2 larry or ellison
3 oracle and text
4 market share
Then I create an index on that table, which is needed for calling the MATCHES function. Here is the syntax:
create index queryx on queries(query_string)
indextype is ctxsys.ctxrule;
Then I noticed the following rows in the DR$QUERYX$I table:
LARRY 0 2 2 1 (BLOB)
MARKET 0 4 4 1 (BLOB) {MARKET} {SHARE}
ORACLE 0 1 1 1 (BLOB)
ORACLE 0 3 3 1 (BLOB) {TEXT}
ELLISON 0 2 2 1 (BLOB)
What I want to ask is: why don't the words 'share' and 'text' appear as tokens in the DR$QUERYX$I table?
When we use the MATCHES function, it searches the index, and consequently it won't find the word 'share'. So when, for example, I run a query like this:
select query_id from queries where matches(query_string,' It only share ten percent of all products sold')>0
it returns no rows, since none of the words in ' It only share ten percent of all products sold' is in the index table. But it could arguably be assigned to category 4, whose rule is 'market share'.
I tried this on a larger set of data and got the same result.
Here is my generated rules from my document collection :
1 {REQUIREMENTS} & {ELICITATION}
1 {REQUIREMENTS} ~ {ELICITATION} & {ACTOR}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} & {FURPS}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} & {PROC}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} & {SPEED}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} & {DOCUME}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} & {PLACED}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} ~ {PLACED} & {UNNECESSARY}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} ~ {PLACED} ~ {UNNECESSARY} & {MISUSE}
1 {INTERPRETATION} ~ {REQUIREMENTS}
2 {DESIGN} & {REPRESENTATION}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} & {OCTOBER}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} & {PROCEDURAL}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} & {STRICT}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} ~ {STRICT} & {GRASP}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} ~ {STRICT} ~ {GRASP} & {MANY} & {LAYER}
2 {DESIGN} ~ {REPRESENTATION} ~ {MAY}
3 {PM} & {TESTING} & {ATTRIBUTI}
And this is the index table result with ctxrule :
(only the token_text column shown)
PM
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
INTERPRETATION
So when I try to classify a document containing the word 'outline', it should produce category 1 (based on the rules), but since the word 'outline' is not in the index table, MATCHES returns 0, which means the document is not classified into any category. I don't understand why this happens. Does anybody know? I would really appreciate any help.
Thank you very much.

Hm, I see. It does make sense. Nice to know.
But then there is the second example I gave, where I used a larger table, as shown below:
Here is my generated rules from my document collection :
1 {REQUIREMENTS} & {ELICITATION}
1 {REQUIREMENTS} ~ {ELICITATION} & {ACTOR}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} & {FURPS}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} & {PROC}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} & {SPEED}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} & {DOCUME}
1 {INTERPRETATION} ~ {REQUIREMENTS}
2 {DESIGN} & {REPRESENTATION}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} & {OCTOBER}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} & {PROCEDURAL}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} & {STRICT}
2 {DESIGN} ~ {REPRESENTATION} ~ {MAY}
3 {PM} & {TESTING} & {ATTRIBUTI}
As far as I know, the sign ' ~ ' means 'OR' and '&' means 'AND'. So based on the 4th line in my table:
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
it can be concluded that if any of the words listed there is queried, category '1' will appear as a result. But before we can use MATCHES to query it, we need to create an index on the rules table. I did that, and the result was:
(only the token_text column shown)
PM
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
INTERPRETATION
There were no words other than PM, DESIGN, REQUIREMENTS and INTERPRETATION. Why don't the words ELICITATION, ACTOR, FURPS and OUTLINE appear in the index result?
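For reference, the small rule-table example from the question can be reproduced end to end with the sketch below. The column sizes and the sample document text are assumptions. Two points may help when reading the results. Rule 4 is the phrase 'market share', so as far as I understand a document has to contain both words in sequence to match it. Also, in Oracle Text query syntax ~ is the NOT operator, while | is OR and & is AND.

```sql
-- Rule table as described in the question (column sizes are assumptions).
CREATE TABLE queries (
  query_id     NUMBER PRIMARY KEY,
  query_string VARCHAR2(80)
);

INSERT INTO queries VALUES (1, 'oracle');
INSERT INTO queries VALUES (2, 'larry or ellison');
INSERT INTO queries VALUES (3, 'oracle and text');
INSERT INTO queries VALUES (4, 'market share');
COMMIT;

-- CTXRULE index: indexes the stored queries, not the documents.
CREATE INDEX queryx ON queries (query_string)
  INDEXTYPE IS ctxsys.ctxrule;

-- Classify one incoming document: returns the ids of the rules it satisfies.
-- The document text here is a made-up sample.
SELECT query_id
FROM   queries
WHERE  MATCHES(query_string,
               'Oracle announced that its market share in databases increased over the last year.') > 0;
```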

Oracle Text and APEX

Hello
I tried the Oracle white paper "Oracle Text Web Applications".
I created the table and populated it with the relevant URL links:
create table htmldb_documentation(
id number,
doc_title varchar2(4000),
doc_url varchar2(4000))
then created the index
create index htmldb_doc_ctxidx on htmldb_documentation(doc_url)
indextype is ctxsys.context
parameters ('datastore CTXSYS.URL_DATASTORE')
Then I ran my SQL for the report in Toad and in APEX SQL Workshop > SQL Commands before creating an APEX region based on a SQL report:
select score(1) relevance, doc_title, doc_url
from htmldb_documentation
where CONTAINS (doc_url, :P1_SEARCH, 1) > 0
order by 1 desc
After running the APEX Report I get error
report error:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 1
I also ran these grant commands after I received this error
grant ctxapp to demo;
grant execute on ctx_cls to demo;
grant execute on ctx_ddl to demo;
grant execute on ctx_doc to demo;
grant execute on ctx_output to demo;
grant execute on ctx_query to demo;
grant execute on ctx_report to demo;
grant execute on ctx_thes to demo;
grant execute on ctx_ulexer to demo;
Any ideas? I'm running APEX 3.1.0.00.32 on Oracle 10.2.0.1 on Windows XP.
If I replace the bind variable :P1_SEARCH with a literal value, the error disappears.

Couple of things to check:
1) do you have an item called P1_SEARCH in your application?
2) If so, make sure that it has a value; otherwise, Oracle Text gets confused and will throw that error.
You may want to consider using a PL/SQL Function Returning SQL Query that will only append the CONTAINS clause if P1_SEARCH has some value.
Thanks,
- Scott -
http://sumnertechnologies.com
http://spendolini.blogspot.com
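The suggestion above could be sketched as a region source of type "PL/SQL Function Body Returning SQL Query". This is an illustration under assumptions, not the reply's own code: it keeps the same three columns in both branches and returns no rows when P1_SEARCH is empty, which is one possible design choice.

```sql
-- Region source: PL/SQL function body returning a SQL query (assumed type).
DECLARE
  l_sql VARCHAR2(4000);
BEGIN
  IF :P1_SEARCH IS NOT NULL THEN
    -- Only reference CONTAINS when there is something to search for,
    -- so the Oracle Text parser never sees an empty query string.
    l_sql := 'select score(1) relevance, doc_title, doc_url '
          || 'from htmldb_documentation '
          || 'where contains(doc_url, :P1_SEARCH, 1) > 0 '
          || 'order by 1 desc';
  ELSE
    -- Same column list, but no rows until the user enters a search term.
    l_sql := 'select 0 relevance, doc_title, doc_url '
          || 'from htmldb_documentation '
          || 'where 1 = 0';
  END IF;
  RETURN l_sql;
END;
```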

Intermedia text (oracle text) and excel

Hi,
Is it possible to index just one single sheet instead of the whole Excel workbook in Oracle Text? Or is it possible to name just one sheet (e.g. table1) in a 'contains' query? Additionally, is it possible to search for the number of occurrences of a term within one document (i.e. Excel sheet)? Thanks in advance for your help.
Best Regards,
Dan

Hi Dan,
Is it possible to index just one single sheet instead of the whole Excel workbook in Oracle Text?
No.
Or is it possible to name just one sheet (e.g. table1) in a 'contains' query?
No.
Additionally, is it possible to search for the number of occurrences of a term within one document (i.e. Excel sheet)?
I am not sure at the moment; maybe with some kind of ranking.
The problem is that the INSO filter converts the Excel file to an HTML file, and Oracle reads this generated file. That file contains no information about Excel sheets.
Regards,
Thomas

Oracle Text and MINUS character

Hi all,
I have the following problem:
- I have created an Oracle Text index on a VARCHAR2 column:
BEGIN
ctx_ddl.create_preference('SUBSTRING_PREF','BASIC_WORDLIST');
ctx_ddl.set_attribute('SUBSTRING_PREF','SUBSTRING_INDEX','TRUE');
END;
CREATE INDEX IDX_TEXT_1 ON MY_TABLE
(COLUMN1)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('wordlist SUBSTRING_PREF memory 50m')
NOPARALLEL;
- I execute the following SELECT:
SELECT mt.*
FROM MY_TABLE mt
WHERE contains(mt.COLUMN1, 'test%') > 0;
It returns all records where column1 contains the entry "test" + something else.
BUT it also returns the records where column1 contains entries such as "my-test" + something else or "owr-test" + something else, and so on.
It should return only the records with "test" + something else entries.
How can I change the index or the query to achieve this?
Best regards

By default, the hyphen is treated as a break character and the words on either side are indexed as separate tokens. To change this behavior, you can create a lexer, set the printjoins attribute of the lexer to include the hyphen, then use that lexer in your index parameters. Then, strings of characters containing a hyphen will be indexed as one token, including the hyphen. Please see the example below.
SCOTT@orcl_11gR2> CREATE TABLE my_table (column1 VARCHAR2(60))
/
Table created.
SCOTT@orcl_11gR2> INSERT ALL
INTO my_table VALUES ('test')
INTO my_table VALUES ('testing')
INTO my_table VALUES ('my-test')
INTO my_table VALUES ('owr-test')
SELECT * FROM DUAL
/
4 rows created.
SCOTT@orcl_11gR2> BEGIN
   ctx_ddl.create_preference('SUBSTRING_PREF','BASIC_WORDLIST');
   ctx_ddl.set_attribute('SUBSTRING_PREF','SUBSTRING_INDEX','TRUE');
   CTX_DDL.CREATE_PREFERENCE ('test_lex', 'BASIC_LEXER');
   CTX_DDL.SET_ATTRIBUTE ('test_lex', 'PRINTJOINS', '-');
END;
/
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> CREATE INDEX IDX_TEXT_1 ON MY_TABLE (COLUMN1)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS
   ('wordlist SUBSTRING_PREF
      LEXER        test_lex
      memory    50m')
NOPARALLEL
/
Index created.
SCOTT@orcl_11gR2> SELECT token_text FROM dr$idx_text_1$i
/
TOKEN_TEXT
MY-TEST
OWR-TEST
TEST
TESTING
4 rows selected.
SCOTT@orcl_11gR2> SELECT mt.*
FROM   MY_TABLE mt
WHERE contains (mt.COLUMN1, 'test%') > 0
/
COLUMN1
test
testing
2 rows selected.

Oracle Text and table joins

Is it possible to create an index on a view as opposed to a table, when using Oracle Text?

Thanks for the earlier response.
I was also looking at Oracle Ultra Search. I figured out that Ultra Search provides an out-of-the-box query application for free-form text search, parameterized search and so on.
I also read that Oracle Ultra Search uses Oracle Text as the underlying search technology.
I have a requirement where most of my searches will be on structured data spread across multiple tables. But I also have a requirement to index some static content which resides on the file system.
Is it possible to mix and match query results, i.e. first perform a search on the structured data, then perform another search on the static document data, and then merge the results from both and display them back to the user?
Thanks for the help

Oracle Text And Oracle Ultra search

Hi all,
I have a problem. I have some files in the server file system, e.g. C:/docs. I want to search these MS Office Word files to see whether they contain a word. I tried Oracle Ultra Search, but when I choose a File data source it offers a field prefilled with file:///, and when I enter C:/docs it gives me file://localhost/C:/docs. Can I configure Oracle Ultra Search to search the server file system? Which directory does Oracle Ultra Search look in for files?
How can I index and search the files in the file system other than with Oracle Ultra Search? Can Oracle Text do the job?
Sorry to bother you, and thank you in advance for your help
Antonis.

Hi,
Searching for a word in a document (MS Word) can definitely be done using Oracle Text.
But your other requirements can make things a bit complicated.
Oracle Text has something called a file datastore (FILE_DATASTORE), for which you need to specify the file path(s) and the individual file names too.
Once you have specified the path and the file names, Oracle Text can read the files and index them, and you can search them with simple queries.
If the number of files is huge, entering all the file names can be difficult. In that case you can give the following a try.
-- You can use a Perl script to read the file names in a directory, write them to a file in a particular format (one file name per line), and then load that text file into a table using ctxload.
You can go through the FILE_DATASTORE and ctxload examples in the Oracle Text Reference.
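The file datastore approach described above might be set up as in this sketch for Word files under C:\docs. All preference, table, index, and file names are hypothetical, and it assumes the database server process can read that directory. The filter clause is there because Word files are binary; in later releases the INSO filter was renamed, so the preference name may differ on your version.

```sql
-- Point a file datastore at the directory holding the documents.
BEGIN
  ctx_ddl.create_preference('DOCS_DIR', 'FILE_DATASTORE');
  ctx_ddl.set_attribute('DOCS_DIR', 'PATH', 'C:\docs');
END;
/

-- One row per file; the indexed column stores the file name only.
CREATE TABLE office_docs (
  id        NUMBER PRIMARY KEY,
  file_name VARCHAR2(255)
);

INSERT INTO office_docs VALUES (1, 'report.doc');  -- placeholder file name
COMMIT;

-- The index reads each named file from C:\docs and filters it to text.
CREATE INDEX office_docs_ctx ON office_docs (file_name)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('datastore DOCS_DIR filter ctxsys.inso_filter');

-- Find the files whose contents mention the search word.
SELECT id, file_name
FROM   office_docs
WHERE  CONTAINS(file_name, 'budget') > 0;
```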

Oracle Text and Replication

We need to use an Oracle Text index on an application table in a read-only materialised-view replicated environment. The Oracle Text index needs to exist at both the master site and the materialised-view site.
We see two possible alternatives:
(1) Replicate read-only copies of the Oracle Text Index tables and indexes to the materialised view site, in addition to replication of the application table.
or
(2) Replicate only the master application table and create local instances of the Oracle Text index at both the master and materialised view sites.
Which method is best? Any experience, advice or references that you can provide would be much appreciated.
Many thanks for your help,
Peter

See:
Introduction to Advanced Replication: http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14226/repoverview.htm#i15730
Introduction to Streams: http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14229/strms_over.htm#i1006084
Understanding Streams Replication: http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14228/gen_rep.htm#i1007573

Offline search engine based on Oracle Text and XML SDK?

Hi,
I am pretty new to Oracle Text, so I am not sure whether the question is well posed.
We have an intranet with lots of MS Word, PDF and HTML files. The site is starting to get out of hand; there is too much information distributed all over it.
We need a search engine for these files. The front end would be HTML or XML based on XSQL. The back end, that is, the search engine, would look like this: I would store the HTML, PDF and Word files in the file system and regularly reload them into Oracle as BFILE columns. I could use Oracle Text to index the content and use the XDK to build the search part.
Is Oracle Text feasible for this?
Any hint would be highly appreciated.
Tamas Szecsy

We need a search engine for these files. The front end would be HTML or XML based on XSQL. The back end, that is, the search engine, would look like this: I would store the HTML, PDF and Word files in the file system and regularly reload them into Oracle as BFILE columns. I could use Oracle Text to index the content and use the XDK to build the search part.
Is Oracle Text feasible for this?

You can store HTML, PDF and Word documents in the file system or in a database column and make them searchable using Oracle Text.
You can write a JSP or PSP that executes the SQL statement and then displays the result. Take a look at the Indagine sample code (Java; it's a little bit old: version 8.1.5) or the "Text search with PL/SQL Server Pages" notes. All are available from otn.oracle.com/products/text -> sample code.

Difference between Oracle Text and XML DB?

We are currently storing XML documents indexed with Oracle Text. I understand that XML DB is faster at retrieving XML based on conditional searches within the XML.
Is there a place where I could find the differences between these two?

Oracle Text offers XPath-like searching, assuming that you don't need to worry about little things like namespaces :). Oracle XML DB offers native XML storage, indexing and searching that is fully compliant with the relevant XML standards.
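To make the contrast concrete, here is a rough sketch of both styles. The tables, the /order/customer path, and the search values are hypothetical. The first query uses Oracle Text's INPATH operator, which needs a path section group. The second uses the XML DB existsNode function on an XMLType column.

```sql
-- Oracle Text: XML kept as text in a CLOB, indexed with a path section group.
CREATE TABLE xml_docs (
  id  NUMBER PRIMARY KEY,
  doc CLOB
);

CREATE INDEX xml_docs_ctx ON xml_docs (doc)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('section group ctxsys.path_section_group');

-- XPath-like text search: the word "oracle" inside /order/customer.
SELECT id
FROM   xml_docs
WHERE  CONTAINS(doc, 'oracle INPATH(/order/customer)') > 0;

-- XML DB: native XMLType storage queried with a real XPath expression.
CREATE TABLE xml_docs_native (
  id  NUMBER PRIMARY KEY,
  doc XMLTYPE
);

SELECT id
FROM   xml_docs_native
WHERE  existsNode(doc, '/order[customer="Oracle"]') = 1;
```

The Text query matches a word within an element, while the XML DB query compares the element's whole value, which is one practical difference between the two approaches.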
