Oracle Text Query of abbreviated word / name

I'm new to Oracle Text, so please excuse the (probably) simple question. I want to create a search that ignores special characters and/or spaces inside an abbreviated name. I'm not sure if it's possible, but I would like all of the rows below to be returned if someone queries for "ABC" in one form or another.
Would this be something I'd add to a thesaurus? I see there is a STOPLIST but I'm not sure if there is the opposite of a stoplist.
Thanks in advance!
Regards,
Rich
set def off;
drop table docs;
CREATE TABLE docs (id NUMBER PRIMARY KEY, text VARCHAR2(200));
INSERT INTO docs VALUES(1, 'ABC are my favorite letters.');
INSERT INTO docs VALUES(2, 'My favorite letters are A,B,C');
INSERT INTO docs VALUES(3, 'The best letters are A.B.C.');
INSERT INTO docs VALUES(4, 'Three of the word letters are A-B-C.');
INSERT INTO docs VALUES(5, 'A B C are great letters.');
INSERT INTO docs VALUES(6, 'AB and C are easy letters to remember');
INSERT INTO docs VALUES(7, 'What if we used A, B, & C?');
commit;
begin
ctx_ddl.drop_preference('english_lexar');
end;
begin
ctx_ddl.create_preference('english_lexar', 'BASIC_LEXER');
ctx_ddl.set_attribute('english_lexar', 'printjoins', '_-');
ctx_ddl.set_attribute('english_lexar', 'skipjoins', '-.');
--ctx_ddl.set_attribute ( 'english_lexar', 'index_themes', 'YES');
ctx_ddl.set_attribute ( 'english_lexar', 'index_text', 'YES');
ctx_ddl.set_attribute ( 'english_lexar', 'index_stems', 'SPANISH');
ctx_ddl.set_attribute ( 'english_lexar', 'mixed_case', 'YES');
ctx_ddl.set_attribute ( 'english_lexar', 'base_letter', 'YES');
end;
begin
ctx_ddl.drop_preference('STEM_FUZZY_PREF');
end;
begin
ctx_ddl.create_preference('STEM_FUZZY_PREF', 'BASIC_WORDLIST');
ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_MATCH','ENGLISH');
ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_SCORE','0');
ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_NUMRESULTS','5000');
ctx_ddl.set_attribute('STEM_FUZZY_PREF','SUBSTRING_INDEX','TRUE');
ctx_ddl.set_attribute('STEM_FUZZY_PREF','PREFIX_INDEX','TRUE');
ctx_ddl.set_attribute('STEM_FUZZY_PREF','STEMMER','ENGLISH');
end;
begin
ctx_ddl.drop_preference('wildcard_pref');
end;
begin
Ctx_Ddl.create_Preference('wildcard_pref', 'BASIC_WORDLIST');
ctx_ddl.set_attribute('wildcard_pref', 'wildcard_maxterms', 100) ;
end;
DROP index myindex;
create index myindex on docs (text)
indextype is ctxsys.context
parameters ( 'LEXER english_lexar Wordlist wildcard_pref' );
EXEC CTX_DDL.SYNC_INDEX('myindex', '2M');
SELECT SCORE(1), id, text FROM docs WHERE CONTAINS(text, 'ABC', 1) > 0;
It may be that my SQL statement isn't taking advantage of the Text options -- i.e. I'm forgetting something obvious :)

Indexes are case-insensitive by default, so let's ignore that.
You can make wal-mart and wal*mart match walmart by defining "-" and "*" as SKIPJOINS characters. However, you cannot make wal mart match walmart, other than by using NDATA.
NDATA does seem to work: any of the variations "wal mart", "walmart", "wal*mart" and "wal-mart" manages to match both walmart and wal mart. See example:
SQL> create table testcase (text varchar2(2000));
Table created.
SQL> insert into testcase values ('<nd>walmart</nd>');
1 row created.
SQL> insert into testcase values ('<nd>wal mart</nd>');
1 row created.
SQL> exec ctx_ddl.drop_section_group('tcsg')
PL/SQL procedure successfully completed.
SQL> exec ctx_ddl.create_section_group('tcsg', 'xml_section_group')
PL/SQL procedure successfully completed.
SQL> exec ctx_ddl.add_ndata_section('tcsg', 'nd', 'nd')
PL/SQL procedure successfully completed.
SQL> create index testcase_index on testcase(text)
2 indextype is ctxsys.context
3 parameters ('section group tcsg')
4 /
Index created.
SQL> select * from testcase where contains (text, 'ndata(nd, wal mart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>
SQL> select * from testcase where contains (text, 'ndata(nd, wal-mart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>
SQL> select * from testcase where contains (text, 'ndata(nd, wal*mart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>
SQL> select * from testcase where contains (text, 'ndata(nd, walmart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>
Edited by: Roger Ford on Jun 21, 2012 10:22 AM
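
Applied to the ABC rows from the question, the SKIPJOINS point above can be illustrated with a rough simulation. This is only a sketch in Python, not how BASIC_LEXER is implemented: the `tokens` function and the choice of '-', '.' and ',' as skipjoin characters are assumptions made for illustration. It suggests that the punctuated forms collapse to one token while the space-separated forms do not, which is where NDATA comes in.

```python
import re

def tokens(text, skipjoins="-.,"):
    """Rough stand-in for a lexer with SKIPJOINS (an assumption, not Oracle's
    code): delete the skipjoin characters, then split on anything that is
    not a letter or digit."""
    stripped = "".join(ch for ch in text if ch not in skipjoins)
    return [t.upper() for t in re.split(r"[^A-Za-z0-9]+", stripped) if t]

# The seven rows from the original question.
docs = {
    1: "ABC are my favorite letters.",
    2: "My favorite letters are A,B,C",
    3: "The best letters are A.B.C.",
    4: "Three of the word letters are A-B-C.",
    5: "A B C are great letters.",
    6: "AB and C are easy letters to remember",
    7: "What if we used A, B, & C?",
}

# Rows whose simulated token list contains the single token ABC.
matches = [doc_id for doc_id, text in docs.items() if "ABC" in tokens(text)]
print(matches)
```

In this simulation rows 5, 6 and 7 still produce separate tokens, because whitespace always ends a token.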


Oracle Text query: Escaping characters and specifying progression sequences

How can I combine the escaping of a search string and the specification of progression sequences within an oracle text query
so that in all cases the correct results are delivered (see example below)?
The scenario in which to use this is the following:
+ Database: Oracle Database 10g Enterprise Edition Release 10.2.0.2.0
+ Requirement: Hitlist of results ordered by score whereby the different part within
the result list are specified using progression sequences within oracle text query
Example:
create table service_provider (
id number,
name_c varchar(100),
uri_c varchar(255)
);
insert into service_provider values (1,'ABB Company Mgmt','http://www.abb-company-mgmt.de');
insert into service_provider values (2,'Dr. Abbas Ming','http://www.dr-abbas-ming.de');
insert into service_provider values (3,'SABBATA United','http://www.sabbata-united.de');
insert into service_provider values (4,'ABB','http://www.abb.de');
insert into service_provider values (5,'AND Company Mgmt','http://www.and-company-mgmt.de');
insert into service_provider values (6,'Dr. Andas Ming','http://www.dr-andas-ming.de');
insert into service_provider values (7,'SANDATA United','http://www.sandata-united.de');
insert into service_provider values (8,'AND','http://www.and.de');
Query 1: works correctly in this case
select * from (
select /*+ FIRST_ROWS */ score(1), this_.*
from service_provider this_
where
CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
'<progression>' ||
'<seq>abb</seq>' ||
'<seq>abb%</seq>' ||
'<seq>%abb%</seq>' ||
'<seq>fuzzy(abb,1,100,WEIGHT)</seq>' ||
'</progression></textquery></QUERY>', 1 ) > 0
order by score(1) desc, this_.NAME_C
) where rownum < 21
delivers
76     4     ABB     http://www.abb.de
76     1     ABB Company Mgmt     http://www.abb-company-mgmt.de
51     2     Dr. Abbas Ming     http://www.dr-abbas-ming.de
26     3     SABBATA United     http://www.sabbata-united.de
Query 2: produces error
select * from (
select /*+ FIRST_ROWS */ score(1), this_.*
from service_provider this_
where
CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
'<progression>' ||
'<seq>and</seq>' ||
'<seq>and%</seq>' ||
'<seq>%and%</seq>' ||
'<seq>fuzzy(and,1,100,WEIGHT)</seq>' ||
'</progression></textquery></QUERY>', 1 ) > 0
order by score(1) desc, this_.NAME_C
) where rownum < 21
produces ORA-29902, ORA-20000, DRG-50901 because AND is a reserved word in oracle text
So we need escaping ...
Query 3: does not work correctly
select * from (
select /*+ FIRST_ROWS */ score(1), this_.*
from service_provider this_
where
CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
'<progression>' ||
'<seq>{abb}</seq>' ||
'<seq>{abb%}</seq>' ||
'<seq>{%abb%}</seq>' ||
'<seq>fuzzy({abb},1,100,WEIGHT)</seq>' ||
'</progression></textquery></QUERY>', 1 ) > 0
order by score(1) desc, this_.NAME_C
) where rownum < 21
delivers
76     4     ABB     http://www.abb.de
76     1     ABB Company Mgmt     http://www.abb-company-mgmt.de
Query 4: does not produce an error, but also does not work correctly
select * from (
select /*+ FIRST_ROWS */ score(1), this_.*
from service_provider this_
where
CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
'<progression>' ||
'<seq>{and}</seq>' ||
'<seq>{and%}</seq>' ||
'<seq>{%and%}</seq>' ||
'<seq>fuzzy({and},1,100,WEIGHT)</seq>' ||
'</progression></textquery></QUERY>', 1 ) > 0
order by score(1) desc, this_.NAME_C
) where rownum < 21
delivers
76     8     AND     http://www.and.de
76     5     AND Company Mgmt     http://www.and-company-mgmt.de

Anywhere that you just use the word by itself, enclose it in {}, but anywhere that you add % on either side or both don't enclose it in {}. Please see the demonstration below.
SCOTT@10gXE> SELECT * FROM v$version
2 /
BANNER
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Product
PL/SQL Release 10.2.0.1.0 - Production
CORE     10.2.0.1.0     Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
SCOTT@10gXE> create table service_provider
2    (id     number,
3      name_c     varchar(100),
4      uri_c     varchar(255))
5 /
Table created.
SCOTT@10gXE> insert all
2 into service_provider values (1,'ABB Company Mgmt','http://www.abb-company-mgmt.de')
3 into service_provider values (2,'Dr. Abbas Ming','http://www.dr-abbas-ming.de')
4 into service_provider values (3,'SABBATA United','http://www.sabbata-united.de')
5 into service_provider values (4,'ABB','http://www.abb.de')
6 into service_provider values (5,'AND Company Mgmt','http://www.and-company-mgmt.de')
7 into service_provider values (6,'Dr. Andas Ming','http://www.dr-andas-ming.de')
8 into service_provider values (7,'SANDATA United','http://www.sandata-united.de')
9 into service_provider values (8,'AND','http://www.and.de')
10 into service_provider values (9,'EBB','fuzzy test')
11 into service_provider values (10,'OND','fuzzy test')
12 select * from dual
13 /
10 rows created.
SCOTT@10gXE> CREATE INDEX your_index
2 ON service_provider (name_c)
3 INDEXTYPE IS CTXSYS.CONTEXT
4 PARAMETERS ('STOPLIST CTXSYS.EMPTY_STOPLIST')
5 /
Index created.
SCOTT@10gXE> VARIABLE search_string VARCHAR2 (100)
SCOTT@10gXE> EXEC :search_string := 'abb'
PL/SQL procedure successfully completed.
SCOTT@10gXE> COLUMN name_c FORMAT A20 WORD_WRAPPED
SCOTT@10gXE> COLUMN uri_c FORMAT A40
SCOTT@10gXE> select *
2 from   (select /*+ FIRST_ROWS */ score(1), this_.*
3           from   service_provider this_
4           where CONTAINS
5                 (this_.NAME_C ,
6                  '<QUERY>
7                  <textquery grammar="CONTEXT">
8                    <progression>
9                      <seq>{'         || :search_string || '}</seq>
10                      <seq>'         || :search_string || '%</seq>
11                      <seq>%'         || :search_string || '%</seq>
12                      <seq>fuzzy({' || :search_string || '},1,100,WEIGHT)</seq>
13                    </progression>
14                 </textquery>
15                  </QUERY>', 1 ) > 0
16           order by score(1) desc, this_.NAME_C)
17 where rownum < 21
18 /
SCORE(1)         ID NAME_C               URI_C
        76          4 ABB                  http://www.abb.de
        76          1 ABB Company Mgmt     http://www.abb-company-mgmt.de
        51          2 Dr. Abbas Ming       http://www.dr-abbas-ming.de
        26          3 SABBATA United       http://www.sabbata-united.de
         4          9 EBB                  fuzzy test
SCOTT@10gXE> EXEC :search_string := 'and'
PL/SQL procedure successfully completed.
SCOTT@10gXE> /
SCORE(1)         ID NAME_C               URI_C
        76          8 AND                  http://www.and.de
        76          5 AND Company Mgmt     http://www.and-company-mgmt.de
        51          6 Dr. Andas Ming       http://www.dr-andas-ming.de
        26          7 SANDATA United       http://www.sandata-united.de
         5         10 OND                  fuzzy test
SCOTT@10gXE>
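
The rule demonstrated above (braces around the bare word, no braces around the wildcarded forms) can also be applied on the client side before the string is bound to CONTAINS. The following is a minimal sketch in Python; `build_progression` is a hypothetical helper name, and stripping braces from the input is an extra precaution assumed here, not something taken from the thread.

```python
def build_progression(search_string):
    """Build the progression template used in the thread for one search word.
    Hypothetical helper: the word is escaped with {} only where it stands
    alone, and left unescaped where % wildcards are attached."""
    # Drop braces so the input cannot close the {...} escape early (assumption).
    term = search_string.replace("{", "").replace("}", "").strip()
    steps = [
        "{" + term + "}",                      # exact word, escaped
        term + "%",                            # prefix match, unescaped
        "%" + term + "%",                      # substring match, unescaped
        "fuzzy({" + term + "},1,100,WEIGHT)",  # fuzzy match on the escaped word
    ]
    seqs = "".join("<seq>" + s + "</seq>" for s in steps)
    return ('<QUERY><textquery grammar="CONTEXT"><progression>'
            + seqs + "</progression></textquery></QUERY>")

query = build_progression("and")
print(query)
```

The resulting string would then be passed to CONTAINS as a bind variable, in the same way as :search_string in the demonstration.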

Oracle text query

Hi,
I have a View object with various attributes (eg, name1, name2, name3, address1, address2, address3 etc). A query/table component based on this view object works just fine. However, I wish to replace name1, name2, name3 and other attributes in the query with just 'name'. These attributes are still to be shown in the result table. This new 'name' attribute will be used in an Oracle Text query clause, instead of individual searches on each attribute.
My plan was to simply make the various name1, name2 etc attributes non-'queryable' in the View def to hide them from the query. Then I'd add a transient 'name' attribute. My hope was, that I could override the getWhereClause() in the ViewObjectImpl and simply tack on the oracle text clause to the WHERE (example below):
WHERE CONTAINS (
SOMECOLUMN,
            '<query>
   <textquery lang="ENGLISH" grammar="CONTEXT">TRANSIENT_ATTR_VALUE
..... Oracle Text query grammar stuff here .... </query>') > 0
How do I access the transient value in the ViewObjectImpl to add the above SQL? Or am I going about this in completely the wrong way?
thanks,
Barry.

Based on what I found in
http://www.oracle.com/technology/oramag/oracle/09-nov/o69frame.html?_template=/ocom/print
and
http://blogs.oracle.com/smuenchadf/examples/
136.     Introducing a Checkbox to Toggle a Custom SQL Predicate on an LOV's Search Form. [11.1.1.0.0] 19-NOV-2008
I have the following implementation, which seems to work. Does anyone see any problems with this?
With regard to SQL injection, does ViewCriteriaItem sanitise the 'val' from the query, or should I do that manually here myself?
    @Override
    public java.lang.String getCriteriaItemClause(ViewCriteriaItem vci) {
        if ("OraTextTransientAttrib".equals(vci.getAttributeDef().getName())) {
            if (vci.getViewCriteria().isCriteriaForQuery()) {
                String val = (String)vci.getValue();
                logger.debug("Doing oracle text name search on '" + val + "'");
                // simplified version of my oracle text query
                return "CONTAINS ('<query>..... " + val + "....</query>') > 0 ";
            } else {
                // SQL predicate for no changes to the results
                // spaces needed if you have several of these blocks
                return " 1=1 ";
            }
        }
        // other blocks for other similar oracle text attribs
        return super.getCriteriaItemClause(vci);
    }
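
On the SQL injection question, one conservative option is to reduce the user's value to plain words before it is concatenated into the CONTAINS string. The sketch below is written in Python purely to show the idea (the same logic is a few lines in Java); `to_safe_text_query` is a hypothetical helper, and whether ViewCriteriaItem already sanitises the value is not something this sketch answers.

```python
import re

def to_safe_text_query(user_input, joiner=" AND "):
    """Keep only runs of letters and digits and wrap each one in {} so it is
    treated as a literal word. Quotes, parentheses, angle brackets and
    operator characters are dropped entirely (a deliberately strict,
    assumed policy)."""
    words = re.findall(r"[^\W_]+", user_input)
    return joiner.join("{" + w + "}" for w in words)

print(to_safe_text_query("o'reilly') > 0 --"))
```

This trades expressiveness for safety: users lose wildcards and operators, so it only fits a simple "all of these words" search.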

"MS" reserved word in oracle text query?

Wondering if anyone has run into the string "MS" behaving as a reserved word in oracle text queries. For example, this specification returns all records from Texas:
'<query>
<textquery>
<progression>
<seq> TX WITHIN CUSTOMER_STATE </seq>
</progression>
</textquery>
</query>'
But this one does NOT find any results for Mississippi:
'<query>
<textquery>
<progression>
<seq> MS WITHIN CUSTOMER_STATE </seq>
</progression>
</textquery>
</query>'
I've confirmed we have data that should match, and I've tried escaping it with the sequences as described in the SQL docs (I've tried single quotes, pairs of single quotes, braces, and combinations of those) . And trying to find info on the web is tough since all web queries that contain 'MS' bring back tons of Microsoft-relevant information.
Can anyone nudge me in the right direction for a better google-search, or some materials in these forums (my initial searches here didn't turn anything up either).
Thanks for any feedback!
jh

> Wondering if anyone has run into the string "MS" behaving as a reserved word in oracle text queries.
Maybe because »MS« is in the default English stoplist?:
English Default Stoplist.
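
If the stoplist explanation is right, a quick client-side check can flag query terms that would be ignored. This is a sketch in Python under stated assumptions: `ASSUMED_STOPWORDS` is a small hand-picked subset used for illustration only (the full list is in the Oracle Text Reference), and `stopwords_in` is a hypothetical helper.

```python
# Illustrative subset only; NOT the complete default English stoplist.
ASSUMED_STOPWORDS = {"a", "and", "the", "of", "mr", "mrs", "ms"}

def stopwords_in(query_terms, stoplist=ASSUMED_STOPWORDS):
    """Return the query terms that appear in the stoplist, ignoring case."""
    return [t for t in query_terms if t.lower() in stoplist]

print(stopwords_in(["TX"]))   # state code that is not in the subset
print(stopwords_in(["MS"]))   # state code that collides with a stopword
```

Another thread on this page builds its index with STOPLIST CTXSYS.EMPTY_STOPLIST, which avoids the collision at the cost of indexing every word.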

Using Oracle Text to search through WORD, EXCEL and PDF documents

Hello again,
What I would like to know is if I have a WORD or PDF document stored in a table. Is it possible to use Oracle Text to search through the actual WORD or PDF document?
Thanks
Doug

Yes you can do context sensitive searches on both PDF and Word docs. With the PDF you need to make sure they are text and not images. Some scanners will create PDFs that are nothing more than images of document.
Below is a code sample that I made some time back to demonstrate the searching capabilities of Oracle Text. Note that the example makes use of the inso_filter, which is no longer shipped with Oracle beginning with patch set 10.1.0.4. See Metalink note 298017.1 for the changes. See the following link for more information on developing with Oracle Text.
http://download-west.oracle.com/docs/cd/B14117_01/text.101/b10729/toc.htm
begin example.
-- The following needs to be executed
-- as sys.
DROP DIRECTORY docs_dir;
CREATE OR REPLACE DIRECTORY docs_dir
AS 'C:\sql\oracle_text\documents';
GRANT READ ON DIRECTORY docs_dir TO text;
-- End sys ran SQL
DROP TABLE db_docs CASCADE CONSTRAINTS PURGE;
CREATE TABLE db_docs (
id NUMBER,
format VARCHAR2(10),
location VARCHAR2(50),
document BLOB,
CONSTRAINT i_db_docs_p PRIMARY KEY(id)
);
-- Several notes need to be made about this anonymous block.
-- First the 'DOCS_DIR' parameter is a directory object name.
-- This directory object name must be in upper case.
DECLARE
f_lob BFILE;
b_lob BLOB;
document_name VARCHAR2(50);
BEGIN
document_name := 'externaltables.doc';
INSERT INTO db_docs
VALUES (1, 'binary', 'C:\sql\oracle_text\documents\externaltables.doc', empty_blob())
RETURN document INTO b_lob;
f_lob := BFILENAME('DOCS_DIR', document_name);
DBMS_LOB.FILEOPEN(f_lob, DBMS_LOB.FILE_READONLY);
DBMS_LOB.LOADFROMFILE(b_lob, f_lob, DBMS_LOB.GETLENGTH(f_lob));
DBMS_LOB.FILECLOSE(f_lob);
COMMIT;
END;
-- build the index
-- Note that this index differs from the one for files stored on the
-- file system in that the datastore parameter is ctxsys.default_datastore
-- and not ctxsys.file_datastore. FILE_DATASTORE is for documents that
-- exist on the file system. DEFAULT_DATASTORE is for documents
-- that are stored in the column.
create index db_docs_ctx on db_docs(document)
indextype is ctxsys.context
parameters (
'datastore ctxsys.default_datastore
filter ctxsys.inso_filter
format column format');
--search for something that is known to not be in the document.
SELECT SCORE(1), id, location
FROM db_docs
WHERE CONTAINS(document, 'Jenkinson', 1) > 0;
--search for something that is known to be in the document.
SELECT SCORE(1), id, location
FROM db_docs
WHERE CONTAINS(document, 'Albright', 1) > 0;

Oracle Text Query taking too long

When we run a query:
select docid from Tbl1 where contains(doc,'queryterm',1)>0;
on 2 million docs it runs in <2 seconds
When we run an insert into another table based on a search:
insert into Tbl2 (col1,col2) select 10,col2 from Tbl1 where rowid<2000; (10 in the select statement is a constant)
it runs in <2 seconds
Here's the kicker:
insert into Tbl2 (col1,col2) select col1,col2 from Tbl1 where contains(doc,'queryterm',1)>0;
it runs in 60 seconds and produces ~2k rows
Is there any hint that we can use to fix this?
TIA!

We've looked hard at the performance notes for Oracle Text, the Application guide and the FAQ on it.
We've dropped the index on the table being inserted, turn off logging and used the Parallel hint on the insert. There is still a bit of a disconnect between insert speed, select speed and both together. The index was built using the parallel option so the queries should be parallel if I understand the performance hints correctly.

Oracle Text query parser - sample code

I've posted a new entry on my "searchtech" blog which includes code for a "Google-like" query syntax parser:
https://blogs.oracle.com/searchtech/entry/oracle_text_query_parser
Currently it's just sample code, but if it goes down well we might include it, or something similar, in a future release of the product.
I'd very much welcome feedback on it, either here on the forum, or on the blog, or directly to my email address (which is included in the download file).
Thanks, everyone.

When I select the "open in browser" option for each now, I get formatted, readable code, which can easily be copied and pasted into a file without the extra txt extension, and I much prefer that. So, for me, that is a sufficient fix.
It seems like this is handy, virtually idiot-proof code, easy to create the package, easy to use it, and provides the Google-like search that users expect, without raising errors or producing unexpected results. Frequently, on the OraFAQ forums, where I am a moderator, when there are various ways to solve a problem and I provide a Text solution, the complaint is that it is too complicated to create all of the formatting to fix potential problems with user input. Your code solves that problem and I hope it will be included in the next version. If you don't mind, I will post an announcement in the OraFAQ Text forum with the permanent link that you provided.

Indexing accentuated word in oracle text

Hello.
I have some problems understanding how oracle text works with accentuated words.
I want to store French words encoded in UTF-8, for example the French word libération, which is stored as 'libÂ©ration' (UTF-8 conversion)
in the database (note that the database is UTF-8 encoded).
begin
ctx_ddl.create_preference('doc_lexer_perigee', 'BASIC_LEXER');
ctx_ddl.set_attribute('doc_lexer_perigee', 'printjoins', '_-');
ctx_ddl.set_attribute('doc_lexer_perigee', 'BASE_LETTER', 'YES');
ctx_ddl.set_attribute('doc_lexer_perigee','index_themes','yes');
end;
Above is the definition of the lexer used when indexing french documents.
Below is some lines found in oracle documentation :
base_letter
Specify whether characters that have diacritical marks (umlauts, cedillas, acute
accents, and so on) are converted to their base form before being stored in the Text
index. The default is NO (base-letter conversion disabled). For more information on
base-letter conversions and base_letter_type, see Base-Letter Conversion on
page 15-2.
As I understand the above, the word 'libération', stored as 'libÂ©ration', should also be indexed as 'liberation'.
But when I search documents containing the word 'liberation', oracle found no documents matching my query.
Is there anything I have misunderstood about base_letter conversion ?

Indeed, I think I have found a solution to my problem (I changed the value of the NLS_LANG parameter): things seem to work as I want now.
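
Why a character-set mismatch would defeat base-letter conversion can be shown outside the database. The sketch below is Python and only approximates the idea behind base_letter by stripping Unicode combining marks; it is not Oracle's implementation, and `base_letters` is a hypothetical helper. It suggests that a correctly decoded word reduces to its base form, while UTF-8 bytes misread as a single-byte character set do not.

```python
import unicodedata

def base_letters(word):
    """Approximate base-letter conversion: decompose each character (NFD)
    and drop the combining marks. An illustration, not Oracle's algorithm."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

word = "lib\u00e9ration"            # the French word with e-acute
print(base_letters(word))

# The same word when its UTF-8 bytes are wrongly interpreted as Latin-1,
# which is the kind of mismatch a wrong client character set can cause.
mojibake = word.encode("utf-8").decode("latin-1")
print(mojibake, base_letters(mojibake))
```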

Creating of the word-frequency histogram from the Oracle Text

I need to build a "word-frequency histogram" from an Oracle Text index: a list of the tokens in the index where, for each token, I get the documents that contain it and the frequency of the token in each document. Does anybody know how to get this data out of an Oracle Text index so that the result can be saved to a table or to a text file?

You can use ctx_report.token_info to decipher the token_info column, but I don't think the report format that it produces is what you want. You can use a query template and specify algorithm=count to obtain the number of times a token appears in the indexed column. You can do that for every token by using the dr$...$i table, as shown below. Formatting is preserved by prefacing the code with pre enclosed in square brackets on the line above all of the code and /pre in square brackets on the line below all of the code.
SCOTT@10gXE> create table otntest
2    (doc_id       number primary key,
3      document varchar2(100))
4 /
Table created.
SCOTT@10gXE> insert all
2 into otntest values (1, 'This is a test for generating a histogram')
3 into otntest values (2, 'Histogram shows the list of documents that contain that token and frequency')
4 into otntest values (3, 'frequency histogram frequency histogram frequency')
5 select * from dual
6 /
3 rows created.
SCOTT@10gXE> create index otntest_ctx_idx
2 on otntest(document)
3 indextype is ctxsys.context
4 /
Index created.
SCOTT@10gXE> column token_text format a30
SCOTT@10gXE> select t.doc_id, i.token_text, score (1) as token_count
2 from   otntest t,
3          (select distinct token_text
4           from   dr$otntest_ctx_idx$i) i
5           where contains
6                 (document,
7                  '<query>
8                  <textquery grammar="CONTEXT">'
9                  || i.token_text ||
10                  '</textquery>
11                  <score datatype="INTEGER" algorithm="COUNT"/>
12                  </query>',
13                  1) > 0
14 order by doc_id, token_text
15 /
    DOC_ID TOKEN_TEXT                     TOKEN_COUNT
         1 GENERATING                               1
         1 HISTOGRAM                                1
         1 TEST                                     1
         2 CONTAIN                                  1
         2 DOCUMENTS                                1
         2 FREQUENCY                                1
         2 HISTOGRAM                                1
         2 LIST                                     1
         2 SHOWS                                    1
         2 TOKEN                                    1
         3 FREQUENCY                                3
         3 HISTOGRAM                                2
12 rows selected.
SCOTT@10gXE>
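
For comparison, the shape of the requested histogram can be sketched outside the database. This Python sketch counts every word in the three sample rows; unlike the query above it does not apply a stoplist, so words such as "this" and "that" are counted too. The `histogram` function is a hypothetical helper.

```python
import re
from collections import Counter

docs = {
    1: "This is a test for generating a histogram",
    2: "Histogram shows the list of documents that contain that token and frequency",
    3: "frequency histogram frequency histogram frequency",
}

def histogram(documents):
    """Map each document id to a Counter of its lower-cased words."""
    return {doc_id: Counter(re.findall(r"[a-z0-9]+", text.lower()))
            for doc_id, text in documents.items()}

hist = histogram(docs)
for doc_id, counts in hist.items():
    for token, n in sorted(counts.items()):
        print(doc_id, token.upper(), n)
```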

Oracle Text Example

Can someone post a quick example of an Oracle Text query?

Ben,
Thanks for the quick answer! I was teaching an APEX class and encouraging them to use the forum. I said "I bet someone answers this in an hour or less". You did it in 13 minutes! I tried to ask a question that didn't require any research, so I hope you didn't invest much time in it.
Thanks again,
Tyler
Tyler Muth
http://tylermuth.wordpress.com
"Applied Oracle Security: Developing Secure Database and Middleware Environments": http://sn.im/aos.book

Oracle companion cd themes for oracle text

Hi, I want to install the Oracle Companion CD, but I don't understand what I have to download to run it.
I need to use themes in Oracle Text queries, and Oracle 10g XE doesn't support themes.
Is there a simple procedure that I can follow to install the Companion CD correctly?
I have Windows XP and Oracle 10g.
The important file is droldus.dat.
Thank you very much!

I only need to install the Companion CD to test Oracle Text themes on my computer,
and nothing else.
But which applications do I have to install?
HTML DB? Oracle Workflow Server? I can't tell. Are they useful in my case, and must I install them?

Differences between Oracle Text Soundex Search & Standard Soundex

Hi all,
I want to ask some questions:
1. Does Oracle Text soundex searching use the soundex matching algorithm invented
by Donald Knuth?
2. Why does an Oracle Text soundex search return different results from a standard
soundex?
3. Can anybody describe how the Oracle Text soundex searching process works?
Thanx,
Robby

Hi Ron,
thank for your reply.
I've already read the thread and about the soundex matching algorithm invented by Donald Knuth,
but sorry, I still don't understand Oracle soundex searching.
According to Knuth's algorithm, the first letter is the important key for searching.
E.g. with standard soundex the word "PEEL" will find "PILE" or "P???" and so on,
but with an Oracle Text soundex search the word "PEEL" will find "PILE", "BEEL", "BELL", "FEEL", "VERE" etc.
Is Oracle Text soundex search not using Knuth's algorithm? If it is, then how does the process work?
Thanks,
Robby
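
The "standard soundex" side of this comparison can be checked with a short implementation. The sketch below is Python and follows the classic American Soundex rules (the scheme usually credited to Russell and Odell and described by Knuth); it says nothing about how Oracle Text implements its own soundex operator. Under these rules PEEL and PILE share a code, while BELL, FEEL and VERE do not, because the first letter is kept verbatim.

```python
CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}

def soundex(word):
    """Classic American Soundex: first letter plus up to three digits."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")   # code of the previous significant letter
    for c in letters[1:]:
        if c in "HW":
            continue              # H and W do not separate equal codes
        code = CODES.get(c, "")   # vowels map to "" and reset the run
        if code and code != prev:
            digits.append(code)
        prev = code
    return (first + "".join(digits) + "000")[:4]

for w in ["PEEL", "PILE", "BEEL", "BELL", "FEEL", "VERE"]:
    print(w, soundex(w))
```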

Index rules in oracle text and query using matches

Dear All,
I would like to ask about rules and matches function in oracle text.
I followed an example in oracle text application developer's guide.
I have a rule table like this :
1 oracle
2 larry or ellison
3 oracle and text
4 market share
then, I create an index to that table. This is needed for calling matches function. Here is the syntax :
create index queryx on queries(query_string)
indextype is ctxsys.ctxrule;
then, I noticed that the result on DR$QUERYX$I table as follows :
LARRY 0 2 2 1 (BLOB)
MARKET 0 4 4 1 (BLOB) {MARKET} {SHARE}
ORACLE 0 1 1 1 (BLOB)
ORACLE 0 3 3 1 (BLOB) {TEXT}
ELLISON 0 2 2 1 (BLOB)
What I want to ask is: why don't the words 'share' and 'text' appear as tokens in the DR$QUERYX$I table?
When we use the matches function, it searches the index and consequently it won't find the word 'share'. So when, for example, I run a query like this:
select query_id from queries where matches(query_string,' It only share ten percent of all products sold')>0
it will return no rows, since no word in ' It only share ten percent of all products sold' is in the index table. But the document could arguably be assigned to category 4, whose rule is 'market share'.
I tried this in a larger set of data and get same result.
Here is my generated rules from my document collection :
1 {REQUIREMENTS} & {ELICITATION}
1 {REQUIREMENTS} ~ {ELICITATION} & {ACTOR}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} & {FURPS}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} & {PROC}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} & {SPEED}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} & {DOCUME}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} & {PLACED}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} ~ {PLACED} & {UNNECESSARY}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} ~ {PLACED} ~ {UNNECESSARY} & {MISUSE}
1 {INTERPRETATION} ~ {REQUIREMENTS}
2 {DESIGN} & {REPRESENTATION}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} & {OCTOBER}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} & {PROCEDURAL}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} & {STRICT}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} ~ {STRICT} & {GRASP}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} ~ {STRICT} ~ {GRASP} & {MANY} & {LAYER}
2 {DESIGN} ~ {REPRESENTATION} ~ {MAY}
3 {PM} & {TESTING} & {ATTRIBUTI}
And this is the index table result with ctxrule :
(only the token_text column shown)
PM
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
INTERPRETATION
So when I try to classify a document with the word 'outline' in it, it should produce category 1 (based on the rules), but since the word 'outline' is not in the index table, matches returns 0, meaning that the document is not classified into any category. I don't understand why this happens. Does anybody know about this? I would really appreciate any help.
Thank you very much.

Hm, I see. It does make sense. Nice to know.
But then there is the second example I gave, where I used a larger table, as shown below:
Here is my generated rules from my document collection :
1 {REQUIREMENTS} & {ELICITATION}
1 {REQUIREMENTS} ~ {ELICITATION} & {ACTOR}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} & {FURPS}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} & {PROC}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} & {SPEED}
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} & {DOCUME}
1 {INTERPRETATION} ~ {REQUIREMENTS}
2 {DESIGN} & {REPRESENTATION}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} & {OCTOBER}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} & {PROCEDURAL}
2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} & {STRICT}
2 {DESIGN} ~ {REPRESENTATION} ~ {MAY}
3 {PM} & {TESTING} & {ATTRIBUTI}
As far as I know, the sign ' ~ ' means 'OR' and '&' means 'and' . So based on the 4th line in my table :
1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
it can be concluded that if any of the words stated there is queried, category '1' will appear as a result. But before we can use 'matches' to query it, we need to create an index on the rules table. I did it and the result was:
(only the token_text column shown)
PM
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
DESIGN
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
REQUIREMENTS
INTERPRETATION
There were no words other than PM, DESIGN, REQUIREMENTS and INTERPRETATION. Why don't the words ELICITATION, ACTOR, FURPS and OUTLINE appear in the index result?
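
One point worth checking in the reasoning above: in the Oracle Text query operators, '~' is documented as NOT (OR is '|'), and NOT binds more tightly than AND. Under that reading, which is an assumption layered on the poster's rules, rule 4 requires REQUIREMENTS to be present and OUTLINE to be absent. The Python sketch below evaluates a rule against a document on that basis; `rule_matches` is a hypothetical helper and ignores stemming and truncated tokens such as DOCUME.

```python
import re

def rule_matches(rule, document):
    """Evaluate a rule made of {WORD} terms joined by '~' (NOT) and '&' (AND).
    Each '&'-separated part needs its first word present and every word
    after a '~' absent. Assumes NOT has higher precedence than AND."""
    words = set(re.findall(r"[A-Z0-9]+", document.upper()))
    for conjunct in rule.split("&"):
        terms = re.findall(r"\{([^}]*)\}", conjunct)
        if not terms:
            continue
        required, excluded = terms[0], terms[1:]
        if required not in words or any(t in words for t in excluded):
            return False
    return True

rule4 = "{REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}"
print(rule_matches(rule4, "an outline of the plan"))
print(rule_matches(rule4, "the requirements are listed"))
```

Read this way, a document containing only 'outline' would not be expected to fall into category 1.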

Oracle Text word count in 10g?

Given a clob column full of text in 10g and a particular word, is there a way to return the frequency (word count) of this word in the documents search? Not a count of the records returned but an actual count of the number of times the word is in the documents searched. I couldn't seem to find how in the Oracle text documentation, seems like it would be a simple operation, so I may be looking in the wrong place. Any tips?

In 10g, you can specify algorithm="count" within a query template. I have demonstrated it in 11g below, but have used it in 10g previously and it is in the 10g documentation.
SCOTT@orcl_11g> drop table t;
Table dropped.
SCOTT@orcl_11g>
SCOTT@orcl_11g> create table t (id varchar2(20) primary key, text varchar2(2000));
Table created.
SCOTT@orcl_11g>
SCOTT@orcl_11g> insert into t values ('1', 'the cat cat cat dog dog sat on the big brown mat');
1 row created.
SCOTT@orcl_11g> insert into t values ('2', 'the big brown mat sat on the big brown mat');
1 row created.
SCOTT@orcl_11g>
SCOTT@orcl_11g> create index ti on t(text) indextype is ctxsys.context;
Index created.
SCOTT@orcl_11g>
SCOTT@orcl_11g> variable search_string varchar2(10)
SCOTT@orcl_11g> exec :search_string := 'cat'
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> select :search_string, id, score (0) as frequency_count
2 from   t
3 where contains
4            (text,
5             '<query>
6             <textquery lang="ENGLISH" grammar="CONTEXT">'
7             || :search_string ||
8            '</textquery>
9             <score datatype="INTEGER" algorithm="COUNT"/>
10           </query>',
11             0) > 0
12 /
:SEARCH_STRING                   ID                   FREQUENCY_COUNT
cat                              1                                  3
SCOTT@orcl_11g> exec :search_string := 'cat dog'
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> /
:SEARCH_STRING                   ID                   FREQUENCY_COUNT
cat dog                          1                                  1
SCOTT@orcl_11g> exec :search_string := 'brown mat'
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> /
:SEARCH_STRING                   ID                   FREQUENCY_COUNT
brown mat                        1                                  1
brown mat                        2                                  2
SCOTT@orcl_11g>
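
The counts above are consistent with the search string being treated as a phrase when it contains more than one word. A small Python sketch that counts phrase occurrences in the same two rows reproduces the figures; `phrase_count` is a hypothetical helper, and it is only a cross-check on the sample data, not a description of how the COUNT algorithm works internally.

```python
def phrase_count(text, phrase):
    """Count how many times the words of phrase occur consecutively in text."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return 0
    return sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)

doc1 = "the cat cat cat dog dog sat on the big brown mat"
doc2 = "the big brown mat sat on the big brown mat"

for phrase in ["cat", "cat dog", "brown mat"]:
    print(phrase, phrase_count(doc1, phrase), phrase_count(doc2, phrase))
```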

Querying Oracle Text using phrase with equivalence operator and NEAR

Hello,
I have two queries I'm running that are returning puzzling results. One query is a subset of the other. The queries use a NEAR operator and an equivalence operator.
Query 1:
NEAR((sister,father,mother=yo mama=mi madre),20) This is returning 3 results
I believe Query 1 should return all records containing the words sister AND father AND (mother OR yo mama OR mi madre) that are within 20 words of each other.
Query 2 (a subset of Query 1):
NEAR((sister,father,mother=yo mama),20) This is returning 5 results
I believe Query 2 should return all records containing the words sister AND father AND (mother OR yo mama) that are within 20 words of each other.
Why would Query 1 be returning fewer results than Query 2, when Query 2 is a subset of Query 1? Shouldn't Query 1 return at least the same amount or more results than Query 2?
~Mimi

For future questions about Oracle Text, you can try the Oracle Text forum at: Text
There you have a better chance of receiving an answer.
