Oracle Text Query of abbreviated word / name

I'm new to Oracle Text so please excuse the (probably) simple question. I want to be able to create a search that excludes (includes?) special characters and/or spaces between an abbreviated name. I'm not sure if it's possible but I would like to be able to return all of the below results if someone queried for "ABC" in one form or another.
Would this be something I'd add to a thesaurus? I see there is a STOPLIST but I'm not sure if there is the opposite of a stoplist.
Thanks in advance!
Regards,
Rich
set def off;
drop table docs;
CREATE TABLE docs (id NUMBER PRIMARY KEY, text VARCHAR2(200));
INSERT INTO docs VALUES(1, 'ABC are my favorite letters.');
INSERT INTO docs VALUES(2, 'My favorite letters are A,B,C');
INSERT INTO docs VALUES(3, 'The best letters are A.B.C.');
INSERT INTO docs VALUES(4, 'Three of the word letters are A-B-C.');
INSERT INTO docs VALUES(5, 'A B C are great letters.');
INSERT INTO docs VALUES(6, 'AB and C are easy letters to remember');
INSERT INTO docs VALUES(7, 'What if we used A, B, & C?');
commit;
begin
ctx_ddl.drop_preference('english_lexar');
end;
begin
ctx_ddl.create_preference('english_lexar', 'BASIC_LEXER');
ctx_ddl.set_attribute('english_lexar', 'printjoins', '_-');
ctx_ddl.set_attribute('english_lexar', 'skipjoins', '-.');
--ctx_ddl.set_attribute ( 'english_lexar', 'index_themes', 'YES');
ctx_ddl.set_attribute ( 'english_lexar', 'index_text', 'YES');
ctx_ddl.set_attribute ( 'english_lexar', 'index_stems', 'SPANISH');
ctx_ddl.set_attribute ( 'english_lexar', 'mixed_case', 'YES');
ctx_ddl.set_attribute ( 'english_lexar', 'base_letter', 'YES');
end;
begin
ctx_ddl.drop_preference('STEM_FUZZY_PREF');
end;
begin
  ctx_ddl.create_preference('STEM_FUZZY_PREF', 'BASIC_WORDLIST');
  ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_MATCH','ENGLISH');
  ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_SCORE','0');
  ctx_ddl.set_attribute('STEM_FUZZY_PREF','FUZZY_NUMRESULTS','5000');
  ctx_ddl.set_attribute('STEM_FUZZY_PREF','SUBSTRING_INDEX','TRUE');
  ctx_ddl.set_attribute('STEM_FUZZY_PREF','PREFIX_INDEX','TRUE');
  ctx_ddl.set_attribute('STEM_FUZZY_PREF','STEMMER','ENGLISH');
end;
begin
ctx_ddl.drop_preference('wildcard_pref');
end;
begin
    Ctx_Ddl.create_Preference('wildcard_pref', 'BASIC_WORDLIST');
    ctx_ddl.set_attribute('wildcard_pref', 'wildcard_maxterms', 100) ;
end;
DROP index myindex;
create index myindex on docs (text)
  indextype is ctxsys.context
  parameters ( 'LEXER english_lexar Wordlist wildcard_pref' );
EXEC CTX_DDL.SYNC_INDEX('myindex', '2M');
SELECT SCORE(1), id, text FROM docs WHERE CONTAINS(text, 'ABC', 1) > 0;It may be that my SQL statement isn't taking advantage of the Text options -- i.e. I'm forgetting something obvious :)

Indexes are case-insensitive by default, so let's ignore that.
You can make wal-mart and wal*mart match walmart by defining "-" and "*" as SKIPJOINS characters. However, you cannot make wal mart match walmart, other than by using NDATA.
NDATA does seem to work - any variation of wal mart walmart wal*mart and wal-mart do manage to match both walmart and wal mart. See example:
SQL> create table testcase (text varchar2(2000));
Table created.
SQL> insert into testcase values ('<nd>walmart</nd>');
1 row created.
SQL> insert into testcase values ('<nd>wal mart</nd>');
1 row created.
SQL> exec ctx_ddl.drop_section_group('tcsg')
PL/SQL procedure successfully completed.
SQL> exec ctx_ddl.create_section_group('tcsg', 'xml_section_group')
PL/SQL procedure successfully completed.
SQL> exec ctx_ddl.add_ndata_section('tcsg', 'nd', 'nd')
PL/SQL procedure successfully completed.
SQL> create index testcase_index on testcase(text)
  2  indextype is ctxsys.context
  3  parameters ('section group tcsg')
  4  /
Index created.
SQL> select * from testcase where contains (text, 'ndata(nd, wal mart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>
SQL> select * from testcase where contains (text, 'ndata(nd, wal-mart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>
SQL> select * from testcase where contains (text, 'ndata(nd, wal*mart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>
SQL> select * from testcase where contains (text, 'ndata(nd, walmart)') > 0;
TEXT
<nd>walmart</nd>
<nd>wal mart</nd>Edited by: Roger Ford on Jun 21, 2012 10:22 AM

Similar Messages

  • Oracle Text query: Escaping characters and specifying progression sequences

    How can I combine the escaping of a search string and the specification of progression sequences within an oracle text query
    so that in all cases the correct results are delivered (see example below)?
    The scenario in which to use this is the following:
    + Database: Oracle Database 10g Enterprise Edition Release 10.2.0.2.0
    + Requirement: Hitlist of results ordered by score whereby the different part within
    the result list are specified using progression sequences within oracle text query
    Example:
    create table service_provider (
    id number,
    name_c varchar(100),
    uri_c varchar(255)
    insert into service_provider values (1,'ABB Company Mgmt','http://www.abb-company-mgmt.de');
    insert into service_provider values (2,'Dr. Abbas Ming','http://www.dr-abbas-ming.de');
    insert into service_provider values (3,'SABBATA United','http://www.sabbata-united.de');
    insert into service_provider values (4,'ABB','http://www.abb.de');
    insert into service_provider values (5,'AND Company Mgmt','http://www.and-company-mgmt.de');
    insert into service_provider values (6,'Dr. Andas Ming','http://www.dr-andas-ming.de');
    insert into service_provider values (7,'SANDATA United','http://www.sandata-united.de');
    insert into service_provider values (8,'AND','http://www.and.de');
    Query 1: works correctly in this case
    select * from (
    select /*+ FIRST_ROWS */ score(1), this_.*
    from service_provider this_
    where
    CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
    '<progression>' ||
    '<seq>abb</seq>' ||
    '<seq>abb%</seq>' ||
    '<seq>%abb%</seq>' ||
    '<seq>fuzzy(abb,1,100,WEIGHT)</seq>' ||
    '</progression></textquery></QUERY>', 1 ) > 0
    order by score(1) desc, this_.NAME_C
    ) where rownum < 21
    delivers
    76     4     ABB     http://www.abb.de
    76     1     ABB Company Mgmt     http://www.abb-company-mgmt.de
    51     2     Dr. Abbas Ming     http://www.dr-abbas-ming.de
    26     3     SABBATA United     http://www.sabbata-united.de
    Query 2: procudes error
    select * from (
    select /*+ FIRST_ROWS */ score(1), this_.*
    from service_provider this_
    where
    CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
    '<progression>' ||
    '<seq>and</seq>' ||
    '<seq>and%</seq>' ||
    '<seq>%and%</seq>' ||
    '<seq>fuzzy(and,1,100,WEIGHT)</seq>' ||
    '</progression></textquery></QUERY>', 1 ) > 0
    order by score(1) desc, this_.NAME_C
    ) where rownum < 21
    produces ORA-29902, ORA-20000, DRG-50901 because AND is a reserved word in oracle text
    So we need escaping ...
    Query 3: does not work correctly
    select * from (
    select /*+ FIRST_ROWS */ score(1), this_.*
    from service_provider this_
    where
    CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
    '<progression>' ||
    '<seq>{abb}</seq>' ||
    '<seq>{abb%}</seq>' ||
    '<seq>{%abb%}</seq>' ||
    '<seq>fuzzy({abb},1,100,WEIGHT)</seq>' ||
    '</progression></textquery></QUERY>', 1 ) > 0
    order by score(1) desc, this_.NAME_C
    ) where rownum < 21
    delivers
    76     4     ABB     http://www.abb.de
    76     1     ABB Company Mgmt     http://www.abb-company-mgmt.de
    Query 4: does not produce an error, but also does not work correctly
    select * from (
    select /*+ FIRST_ROWS */ score(1), this_.*
    from service_provider this_
    where
    CONTAINS ( this_.NAME_C , '<QUERY><textquery grammar="CONTEXT">' ||
    '<progression>' ||
    '<seq>{and}</seq>' ||
    '<seq>{and%}</seq>' ||
    '<seq>{%and%}</seq>' ||
    '<seq>fuzzy({and},1,100,WEIGHT)</seq>' ||
    '</progression></textquery></QUERY>', 1 ) > 0
    order by score(1) desc, this_.NAME_C
    ) where rownum < 21
    delivers
    76     8     AND     http://www.and.de
    76     5     AND Company Mgmt     http://www.and-company-mgmt.de

    Anywhere that you just use the word by itself, enclose it in {}, but anywhere that you add % on either side or both don't enclose it in {}. Please see the demonstration below.
    SCOTT@10gXE> SELECT * FROM v$version
      2  /
    BANNER
    Oracle Database 10g Express Edition Release 10.2.0.1.0 - Product
    PL/SQL Release 10.2.0.1.0 - Production
    CORE     10.2.0.1.0     Production
    TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
    NLSRTL Version 10.2.0.1.0 - Production
    SCOTT@10gXE> create table service_provider
      2    (id     number,
      3       name_c     varchar(100),
      4       uri_c     varchar(255))
      5  /
    Table created.
    SCOTT@10gXE> insert all
      2  into service_provider values (1,'ABB Company Mgmt','http://www.abb-company-mgmt.de')
      3  into service_provider values (2,'Dr. Abbas Ming','http://www.dr-abbas-ming.de')
      4  into service_provider values (3,'SABBATA United','http://www.sabbata-united.de')
      5  into service_provider values (4,'ABB','http://www.abb.de')
      6  into service_provider values (5,'AND Company Mgmt','http://www.and-company-mgmt.de')
      7  into service_provider values (6,'Dr. Andas Ming','http://www.dr-andas-ming.de')
      8  into service_provider values (7,'SANDATA United','http://www.sandata-united.de')
      9  into service_provider values (8,'AND','http://www.and.de')
    10  into service_provider values (9,'EBB','fuzzy test')
    11  into service_provider values (10,'OND','fuzzy test')
    12  select * from dual
    13  /
    10 rows created.
    SCOTT@10gXE> CREATE INDEX your_index
      2  ON service_provider (name_c)
      3  INDEXTYPE IS CTXSYS.CONTEXT
      4  PARAMETERS ('STOPLIST CTXSYS.EMPTY_STOPLIST')
      5  /
    Index created.
    SCOTT@10gXE> VARIABLE search_string VARCHAR2 (100)
    SCOTT@10gXE> EXEC :search_string := 'abb'
    PL/SQL procedure successfully completed.
    SCOTT@10gXE> COLUMN name_c FORMAT A20 WORD_WRAPPED
    SCOTT@10gXE> COLUMN uri_c  FORMAT A40
    SCOTT@10gXE> select *
      2  from   (select /*+ FIRST_ROWS */ score(1), this_.*
      3            from   service_provider this_
      4            where  CONTAINS
      5                  (this_.NAME_C ,
      6                   '<QUERY>
      7                   <textquery grammar="CONTEXT">
      8                     <progression>
      9                       <seq>{'         || :search_string || '}</seq>
    10                       <seq>'         || :search_string || '%</seq>
    11                       <seq>%'         || :search_string || '%</seq>
    12                       <seq>fuzzy({' || :search_string || '},1,100,WEIGHT)</seq>
    13                     </progression>
    14                  </textquery>
    15                   </QUERY>', 1 ) > 0
    16            order  by score(1) desc, this_.NAME_C)
    17  where  rownum < 21
    18  /
      SCORE(1)         ID NAME_C               URI_C
            76          4 ABB                  http://www.abb.de
            76          1 ABB Company Mgmt     http://www.abb-company-mgmt.de
            51          2 Dr. Abbas Ming       http://www.dr-abbas-ming.de
            26          3 SABBATA United       http://www.sabbata-united.de
             4          9 EBB                  fuzzy test
    SCOTT@10gXE> EXEC :search_string := 'and'
    PL/SQL procedure successfully completed.
    SCOTT@10gXE> /
      SCORE(1)         ID NAME_C               URI_C
            76          8 AND                  http://www.and.de
            76          5 AND Company Mgmt     http://www.and-company-mgmt.de
            51          6 Dr. Andas Ming       http://www.dr-andas-ming.de
            26          7 SANDATA United       http://www.sandata-united.de
             5         10 OND                  fuzzy test
    SCOTT@10gXE>

  • Oracle text query

    Hi,
    I have a View object with various attributes (eg, name1, name2, name3, address1, address2, address3 etc). A query/table component based on this view object works just fine. However, I wish to replace name1, name2, name3 and other attributes in the query with just 'name'. These attributes are still to be shown in the result table. This new 'name' attribute will be used in an Oracle Text query clause, instead of individual searches on each attribute.
    My plan was to simply make the various name1, name2 etc attributes non-'queryable' in the View def to hide them from the query. Then I'd add a transient 'name' attribute. My hope was, that I could override the getWhereClause() in the ViewObjectImpl and simply tack on the oracle text clause to the WHERE (example below):
    WHERE CONTAINS (
    SOMECOLUMN,
                '<query>
       <textquery lang="ENGLISH" grammar="CONTEXT">TRANSIENT_ATTR_VALUE
    ..... Oracle Text query grammar stuff here ....  </query>') > 0How do I access the transient value in the ViewObjectImpl to add the above SQL? Or am I going about this in completely the wrong way?
    thanks,
    Barry.

    Based on what I found in
    http://www.oracle.com/technology/oramag/oracle/09-nov/o69frame.html?_template=/ocom/print
    and
    http://blogs.oracle.com/smuenchadf/examples/
    136.     Introducing a Checkbox to Toggle a Custom SQL Predicate on an LOV's Search Form. [11.1.1.0.0] 19-NOV-2008
    I have the following implementation, which seems to work. Does anyone see any problems with this?
    With regard to SQL injection, does ViewCriteriaItem sanitise the 'val' from the query, or should I do that manually here myself?
        @Override
        public java.lang.String getCriteriaItemClause(ViewCriteriaItem vci) {
            if ("OraTextTransientAttrib".equals(vci.getAttributeDef().getName())) {
                if (vci.getViewCriteria().isCriteriaForQuery()) {
                    String val = (String)vci.getValue();
                    logger.debug("Doing oracle text name search on '" + val + "'");
                    // simplified version of my oracle text query
                    return "CONTAINS ('<query>..... " + val + "....</query>') > 0 ";
                } else {
                    // SQL predicate for no changes to the results
                    // spaces needed if you have several of these blocks
                    return " 1=1 ";
            // other blocks for other similar oracle text attribs
            return super.getCriteriaItemClause(vci);
        }

  • "MS" reserved word in oracle text query?

    Wondering if anyone has run into the string "MS" behaving as a reserved word in oracle text queries. For example, this specification returns all records from Texas:
    '<query>
    <textquery>
    <progression>
    <seq> TX WITHIN CUSTOMER_STATE </seq>
    </progression>
    </textquery>
    </query>'
    But this one does NOT find any results for Mississippi:
    '<query>
    <textquery>
    <progression>
    <seq> MS WITHIN CUSTOMER_STATE </seq>
    </progression>
    </textquery>
    </query>'
    I've confirmed we have data that should match, and I've tried escaping it with the sequences as described in the SQL docs (I've tried single quotes, pairs of single quotes, braces, and combinations of those) . And trying to find info on the web is tough since all web queries that contain 'MS' bring back tons of Microsoft-relevant information.
    Can anyone nudge me in the right direction for a better google-search, or some materials in these forums (my initial searches here didn't turn anything up either).
    Thanks for any feedback!
    jh

    Wondering if anyone has run into the string "MS" behaving as a reserved word in oracle text queries.Maybe because »MS« is in the default english stoplist?:
    English Default Stoplist.

  • Using Oracle Text to search through WORD, EXCEL and PDF documents

    Hello again,
    What I would like to know is if I have a WORD or PDF document stored in a table. Is it possible to use Oracle Text to search through the actual WORD or PDF document?
    Thanks
    Doug

    Yes you can do context sensitive searches on both PDF and Word docs. With the PDF you need to make sure they are text and not images. Some scanners will create PDFs that are nothing more than images of document.
    Below is code sample that I made some time back to demonstrate the searching capabilities of Oracle Text. Note that the example makes use of the inso_filter that is no longer shipped with Oracle begging with Patch set 10.1.0.4. See metalink note 298017.1 for the changes. See the following link for more information on developing with Oracle Text.
    http://download-west.oracle.com/docs/cd/B14117_01/text.101/b10729/toc.htm
    begin example.
    -- The following needs to be executed
    -- as sys.
    DROP DIRECTORY docs_dir;
    CREATE OR REPLACE DIRECTORY docs_dir
    AS 'C:\sql\oracle_text\documents';
    GRANT READ ON DIRECTORY docs_dir TO text;
    -- End sys ran SQL
    DROP TABLE db_docs CASCADE CONSTRAINTS PURGE;
    CREATE TABLE db_docs (
    id NUMBER,
    format VARCHAR2(10),
    location VARCHAR2(50),
    document BLOB,
    CONSTRAINT i_db_docs_p PRIMARY KEY(id)
    -- Several notes need to be made about this anonymous block.
    -- First the 'DOCS_DIR' parameter is a directory object name.
    -- This directory object name must be in upper case.
    DECLARE
    f_lob BFILE;
    b_lob BLOB;
    document_name VARCHAR2(50);
    BEGIN
    document_name := 'externaltables.doc';
    INSERT INTO db_docs
    VALUES (1, 'binary', 'C:\sql\oracle_text\documents\externaltables.doc', empty_blob())
    RETURN document INTO b_lob;
    f_lob := BFILENAME('DOCS_DIR', document_name);
    DBMS_LOB.FILEOPEN(f_lob, DBMS_LOB.FILE_READONLY);
    DBMS_LOB.LOADFROMFILE(b_lob, f_lob, DBMS_LOB.GETLENGTH(f_lob));
    DBMS_LOB.FILECLOSE(f_lob);
    COMMIT;
    END;
    -- build the index
    -- Note that this index differs than the file system stored file
    -- in that paramter datastore is ctxsys.defautl_datastore and not
    -- ctxsys.file_datastore. FILE_DATASTORE is for documents that
    -- exist on the file system. DEFAULT_DATASTORE is for documents
    -- that are stored in the column.
    create index db_docs_ctx on db_docs(document)
    indextype is ctxsys.context
    parameters (
    'datastore ctxsys.default_datastore
    filter ctxsys.inso_filter
    format column format');
    --search for something that is known to not be in the document.
    SELECT SCORE(1), id, location
    FROM db_docs
    WHERE CONTAINS(document, 'Jenkinson', 1) > 0;
    --search for something that is known to be in the document.  
    SELECT SCORE(1), id, location
    FROM db_docs
    WHERE CONTAINS(document, 'Albright', 1) > 0;

  • Oracle Text Query taking too long

    When we run a query:
    select docid from Tbl1 where contains(doc,'queryterm',1)>0;
    on 2 million docs it runs in <2 seconds
    When we run an insert into another table based on a search:
    insert into Tbl2 (col1,col2) select 10,col2 from Tbl1 where rowid<2000; (10 in the select statement is a constant)
    it runs in <2 seconds
    Here's the kicker:
    insert into Tbl2 (col1,col2) select col1,col2 from Tbl1 where contains(doc,'queryterm',1)>0;
    it runs in 60 seconds and produces ~2k rows
    Is there any hint that we can use to fix this?
    TIA!

    We've looked hard at the performance notes for Oracle Text, the Application guide and the FAQ on it.
    We've dropped the index on the table being inserted, turn off logging and used the Parallel hint on the insert. There is still a bit of a disconnect between insert speed, select speed and both together. The index was built using the parallel option so the queries should be parallel if I understand the performance hints correctly.

  • Oracle Text query parser - sample code

    I've posted a new entry on my "searchtech" blog which includes code for a "Google-like" query syntax parser:
    https://blogs.oracle.com/searchtech/entry/oracle_text_query_parser
    Currently it's just sample code, but if it goes down well we might include it, or something similar, in a future release of the product.
    I'd very much welcome feedback on it, either here on the forum, or on the blog, or directly to me email address (which is included in the download file).
    Thanks, everyone.

    When I select the "open in browser" option for each now, I get formatted, readable code, which can easily be copied and pasted into a file without the extra txt extension, and I much prefer that. So, for me, that is a sufficient fix.
    It seems like this is handy, virtually idiot-proof code, easy to create the package, easy to use it, and provides the Google-like search that users expect, without raising errors or producing unexpected results. Frequently, on the OraFAQ forums, where I am a moderator, when there are various ways to solve a problem and I provide a Text solution, the complaint is that it is too complicated to create all of the formatting to fix potential problems with user input. Your code solves that problem and I hope it will be included in the next version. If you don't mind, I will post an announcement in the OraFAQ Text forum with the permanent link that you provided.

  • Indexing accentuated word in oracle text

    Hello.
    I have some problems understanding how oracle text works with accentuated words.
    I want to store french words encoded in utf8, for example the french word libération which is encoded as 'lib©ration'(utf8 conversion)
    in the database.(note that the database in utf8 encoded).
    begin
    ctx_ddl.create_preference('doc_lexer_perigee', 'BASIC_LEXER');
    ctx_ddl.set_attribute('doc_lexer_perigee', 'printjoins', '_-');
    ctx_ddl.set_attribute('doc_lexer_perigee', 'BASE_LETTER', 'YES');
    ctx_ddl.set_attribute('doc_lexer_perigee','index_themes','yes');
    end;
    Above is the definition of the lexer used when indexing french documents.
    Below is some lines found in oracle documentation :
    base_letter
    Specify whether characters that have diacritical marks (umlauts, cedillas, acute
    accents, and so on) are converted to their base form before being stored in the Text
    index. The default is NO (base-letter conversion disabled). For more information on
    base-letter conversions and base_letter_type, see Base-Letter Conversion on
    page 15-2.
    According to what I understand above, the word 'libération' stores as 'lib©ration' should also be stored as 'liberation'.
    But when I search documents containing the word 'liberation', oracle found no documents matching my query.
    Is there anything I have misunderstood about base_letter conversion ?

    Indeed, i think I have found a solution to my problems(changed the value of the NLS_LANG parameter) : things seem to work as I want now

  • Creating of the word-frequency histogram from the Oracle Text

    I need make from the Oracle Text index of the "word-frequency histogram", this is list of the tokens in this index, where each token contains the list of documents that contain that token and frequency this token in the every document. Don´t anybody know how to get this data from Oracle Text index so that result will save to the table or to the text file?

    You can use ctx_report.token_info to decipher the token_info column, but I don't think the report format that it produces is what you want. You can use a query template and specify algorithm=count to obtain the number of times a token appears in the indexed column. You can do that for every token by using the dr$...$i table, as shown below. Formatting is preserved by prefacing the code with pre enclosed in square brackets on the line above all of the code and /pre in square brackets on the line below all of the code.
    SCOTT@10gXE> create table otntest
      2    (doc_id       number primary key,
      3       document  varchar2(100))
      4  /
    Table created.
    SCOTT@10gXE> insert all
      2  into otntest values (1, 'This is a test for generating a histogram')
      3  into otntest values (2, 'Histogram shows the list of documents that contain that token and frequency')
      4  into otntest values (3, 'frequency histogram frequency histogram frequency')
      5  select * from dual
      6  /
    3 rows created.
    SCOTT@10gXE> create index otntest_ctx_idx
      2  on otntest(document)
      3  indextype is ctxsys.context
      4  /
    Index created.
    SCOTT@10gXE> column token_text format a30
    SCOTT@10gXE> select t.doc_id, i.token_text, score (1) as token_count
      2  from   otntest t,
      3           (select distinct token_text
      4            from   dr$otntest_ctx_idx$i) i
      5            where  contains
      6                  (document,
      7                   '<query>
      8                   <textquery grammar="CONTEXT">'
      9                   || i.token_text ||
    10                   '</textquery>
    11                   <score datatype="INTEGER" algorithm="COUNT"/>
    12                   </query>',
    13                   1) > 0
    14  order  by doc_id, token_text
    15  /
        DOC_ID TOKEN_TEXT                     TOKEN_COUNT
             1 GENERATING                               1
             1 HISTOGRAM                                1
             1 TEST                                     1
             2 CONTAIN                                  1
             2 DOCUMENTS                                1
             2 FREQUENCY                                1
             2 HISTOGRAM                                1
             2 LIST                                     1
             2 SHOWS                                    1
             2 TOKEN                                    1
             3 FREQUENCY                                3
             3 HISTOGRAM                                2
    12 rows selected.
    SCOTT@10gXE>

  • Oracle Text Example

    Can someone post a quick example of an Oracle Text query?

    Ben,
    Thanks for the quick answer! I was teaching an APEX class and encouraging them to use the forum. I said "I bet someone answers this in an hour or less". You did it in 13 minutes! I tried to ask a question that didn't require any research, so I hope you didn't invest much time in it.
    Thanks again,
    Tyler
    Tyler Muth
    http://tylermuth.wordpress.com
    "Applied Oracle Security: Developing Secure Database and Middleware Environments": http://sn.im/aos.book

  • Oracle companion cd themes for oracle text

    Hi i want to install oracle cd companion, but i have not understood what i have to download to run it.
    I need use themes for oracle text query's,and oracle 10g xe don't support themes.
    Is there a simply procedure that i can follow to install companion correctly?
    I have windows xp and oracle 10g.
    Important file is droldus.dat..
    Thank you very much!!!

    I have to install companion cd only for testing oracle text's themes on my computer..
    and nothing else...
    but which are applications that i have installed??
    html db??oracle workflow server??i can't understand!!!Are useful in my case, or i must install it?

  • Differences Oracle Text Soundex Search & Standar Soundex

    Hi all,
    I want to ask some question:
    1. Is Oracle Text soundex searching using soundex matching algorithm invented
    by Donald Knuth?
    2. Why Oracle Text soundex search returns different results to a standard
    soundex?
    3. Can anybody describe how Oracle Text soundex searching process?
    Thanx,
    Robby

    Hi Ron,
    thank for your reply.
    I've already read the thread and soundex matching algorithm invented by Donald Knuth.
    but sorry i still don't understand about oracle soundex searching.
    According to Knuth's algorithm the first letter is the important key to searching.
    i.e with standard soundex a word "PEEL" will find "PILE" or "P???" and so on.
    but with oracle text soundex search a word "PEEL" will find "PILE", "BEEL", "BELL", "FEEL", "VERE" etc.
    Is oracle text soundex search not using Knuth's algorithm? if is then how the process work?
    Thanks,
    Robby

  • Index rules in oracle text and query using matches

    Dear All,
    I would like to ask about rules and matches function in oracle text.
    I followed an example in oracle text application developer's guide.
    I have a rule table like this :
    1 oracle
    2 larry or ellison
    3 oracle and text
    4 market share
    then, I create an index to that table. This is needed for calling matches function. Here is the syntax :
    create index queryx on queries(query_string)
    indextype is ctxsys.ctxrule;
    then, I noticed that the result on DR$QUERYX$I table as follows :
    LARRY 0 2 2 1 (BLOB)
    MARKET 0 4 4 1 (BLOB) {MARKET} {SHARE}
    ORACLE 0 1 1 1 (BLOB)
    ORACLE 0 3 3 1 (BLOB) {TEXT}
    ELLISON 0 2 2 1 (BLOB)
    What I want to ask is why doesn't the words 'share' and 'text' appear in the DR$QUERYX$ table?
    When we use matches function, it then search on the index result and consequently it wion't find the 'share' word. so when for example I do query like this :
    select query_id from queries where matches(query_string,' It only share ten percent of all products sold')>0
    it will give 0 result since the no word in ' It only share ten percent of all products sold' was in index table. But actually it could possibly be categorized as the 4 category which rules is 'market share'
    I tried this in a larger set of data and get same result.
    Here is my generated rules from my document collection :
    1 {REQUIREMENTS} & {ELICITATION}
    1 {REQUIREMENTS} ~ {ELICITATION} & {ACTOR}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} & {FURPS}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} & {PROC}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} & {SPEED}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} & {DOCUME}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} & {PLACED}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} ~ {PLACED} & {UNNECESSARY}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} ~ {DOCUME} ~ {PLACED} ~ {UNNECESSARY} & {MISUSE}
    1 {INTERPRETATION} ~ {REQUIREMENTS}
    2 {DESIGN} & {REPRESENTATION}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} & {OCTOBER}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} & {PROCEDURAL}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} & {STRICT}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} ~ {STRICT} & {GRASP}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} ~ {STRICT} ~ {GRASP} & {MANY} & {LAYER}
    2 {DESIGN} ~ {REPRESENTATION} ~ {MAY}
    3 {PM} & {TESTING} & {ATTRIBUTI}
    And this is the index table result with ctxrule :
    (only the token_text column shown)
    PM
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    INTERPRETATION
    so when I try to classify a document with the word ouline inside it, it should produce category 1 (based on the rules) but since there are no word 'outline' in index tabel, the matches will return 0 means that the document is not classifiedto any category. I don't understand why it happen. Anybody knows about this? I would really appreciate any help.
    Thank you very much.

    Hm, I see. It do make sense. so nice to know.
    But then in the second example I gift where I used larger table, as shown below :
    Here is my generated rules from my document collection :
    1 {REQUIREMENTS} & {ELICITATION}
    1 {REQUIREMENTS} ~ {ELICITATION} & {ACTOR}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} & {FURPS}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} & {PROC}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} & {SPEED}
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE} ~ {PROC} ~ {SPEED} & {DOCUME}
    1 {INTERPRETATION} ~ {REQUIREMENTS}
    2 {DESIGN} & {REPRESENTATION}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} & {OCTOBER}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} & {PROCEDURAL}
    2 {DESIGN} ~ {REPRESENTATION} & {MAY} & {FOUNDATI} ~ {OCTOBER} ~ {PROCEDURAL} & {STRICT}
    2 {DESIGN} ~ {REPRESENTATION} ~ {MAY}
    3 {PM} & {TESTING} & {ATTRIBUTI}
    As far as I know, the sign ' ~ ' means 'OR' and '&' means 'and' . So based on the 4th line in my table :
    1 {REQUIREMENTS} ~ {ELICITATION} ~ {ACTOR} ~ {FURPS} ~ {OUTLINE}
    it can be concluded that if any of the words stated there been queried, so the category '1' will appear as a result. But then before we can use 'matches' to query it, we need ti create index for the rules table . I did it and the result were :
    (only the token_text column shown)
    PM
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    DESIGN
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    REQUIREMENTS
    INTERPRETATION
    there were no words other than PM, DESIGN< REQUIREMENTS and INTERPRETATION. Why the words REQUIREMENTS, ELICITATION, ACTOR, FURPS, OUTLINE don't appear in the index result?

  • Oracle Text word count in 10g?

    Given a clob column full of text in 10g and a particular word, is there a way to return the frequency (word count) of this word in the documents search? Not a count of the records returned but an actual count of the number of times the word is in the documents searched. I couldn't seem to find how in the Oracle text documentation, seems like it would be a simple operation, so I may be looking in the wrong place. Any tips?

    In 10g, you can specify algorithm="count" within a query template. I have demonstrated in 11g below, but have used it in 10g previously and it is in the 10g documentaiton.
    SCOTT@orcl_11g> drop table t;
    Table dropped.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> create table t (id varchar2(20) primary key, text varchar2(2000));
    Table created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> insert into t values ('1', 'the cat cat cat dog dog sat on the big brown mat');
    1 row created.
    SCOTT@orcl_11g> insert into t values ('2', 'the big brown mat sat on the big brown mat');
    1 row created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> create index ti on t(text) indextype is ctxsys.context;
    Index created.
    SCOTT@orcl_11g>
    SCOTT@orcl_11g> variable search_string varchar2(10)
    SCOTT@orcl_11g> exec :search_string := 'cat'
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11g> select :search_string, id, score (0) as frequency_count
      2  from   t
      3  where  contains
      4             (text,
      5              '<query>
      6              <textquery lang="ENGLISH" grammar="CONTEXT">'
      7              || :search_string ||
      8             '</textquery>
      9              <score datatype="INTEGER" algorithm="COUNT"/>
    10            </query>',
    11              0) > 0
    12  /
    :SEARCH_STRING                   ID                   FREQUENCY_COUNT
    cat                              1                                  3
    SCOTT@orcl_11g> exec :search_string := 'cat dog'
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11g> /
    :SEARCH_STRING                   ID                   FREQUENCY_COUNT
    cat dog                          1                                  1
    SCOTT@orcl_11g> exec :search_string := 'brown mat'
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11g> /
    :SEARCH_STRING                   ID                   FREQUENCY_COUNT
    brown mat                        1                                  1
    brown mat                        2                                  2
    SCOTT@orcl_11g>

  • Querying Oracle Text using phrase with equivalence operator and NEAR

    Hello,
    I have two queries I'm running that are returning puzzling results. One query is a subset of the other. The queries use a NEAR operator and an equivalence operator.
    Query 1:
    NEAR((sister,father,mother=yo mama=mi madre),20) This is returning 3 results
    I believe Query 1 should return all records containing the words sister AND father AND (mother OR yo mama OR mi madre) that are within 20 words of each other.
    Query 2 (a subset of Query 1):
    NEAR((sister,father,mother=yo mama),20) This is returning 5 results
    I believe Query 2 should return all records containing the words sister AND father AND (mother OR yo mama) that are within 20 words of each other.
    Why would Query 1 be returning fewer results than Query 2, when Query 2 is a subset of Query 1? Shouldn't Query 1 return at least the same amount or more results than Query 2?
    ~Mimi

    For future questions about Oracle Text, you can try the Oracle Text forum at: Text
    There you have more chances of recieveing an awnser.

Maybe you are looking for

  • Office Web Apps Server Certificate For External

    Hi guys, I am requesting a DigiCert certificate for my environment Exchange 2013. Can I include the SAN name for Office Web Apps server, such as externalowa.domain.com in to the Exchange generated certificate? From theory wise it seems logic, but kin

  • Unable to copy hidden package "Configuration Manager Client Package" to local DP

    Content status lists it as successfully distributed to all DPs (including the local site server) However, component status for SMS_DISTRIBUTION_MANAGER continually logs these events: Distribution Manager failed to process package "Configuration Manag

  • Profile Manager - Why create Enrollment Profiles?

    So a similar question was asked previously: Why use an enrollment profile? I've read through it and I don't think the answers provided tell the whole story, so I'd like to ask again adding some of my own thought and clarifications on the previous thr

  • Mac mini and sony 46X2000

    I have a mac mini, os 10.4.11 connected to my SONY46X2000 via VGA cable. It works perfectly most of the time, screen res is set to 1024x1280 and the display is recognised as a sonyTV. I am however experiencing the odd drop out, where the screen goes

  • Idoc to RNIF with pdf attachment

    Hi I am implementing a Idoc to RNIF with pdf attachment. I have to pick the pdf file from an ftp server in PI based on the invoice number in the idoc. I could not find a blog or documentation on how to attach the pdf file. The only viable solution I