Fuzzy Searches

Is there anywhere I can find the algorithm Oracle uses for the CONTEXT fuzzy search, as in:
SELECT surname from person_source where contains(surname,'?Humphrey') > 0;
I would like to build a function for use outside CONTEXT that incorporates the same algorithm.
Fran

Similar Messages

  • Trying to pass a variable to fuzzy search

    I'm trying to write code like this:
       for x in 1 .. 6 loop
           v_searchword := listgetat(replace(p_searchphrase,',',' '),x,' ');
           for c1 in (select * from
                          (select score(1) as score, searchterms, suggestions from suggestions_table
                           where contains(searchterms,'fuzzy({'||v_searchword||'},,,weight)',1)>0
                           order by score desc)
                       where rownum < 10) loop                                    
            end loop;                  
       end loop;
    Someone passes in a long search phrase. I separate it into words and take up to the first 6. The set of words is looped through. Each word in turn is assigned to the v_searchword variable. I then do an Oracle Text fuzzy search on that word. The above code, however, gives me an Oracle Text parser error (DRG-50901: text query parser syntax error...).
    I've modified the code so that the all-important line reads *where contains(searchterms,'fuzzy({v_searchword},,,weight)',1)>0*, and whilst that doesn't produce a syntax error, it doesn't produce any results, either! Words that I know will generate suggestions when I do a manual fuzzy search in plain SQL (such as "womman" and "tomartoe") don't generate anything in this case, because (I think) instead of searching for 'womman' or 'tomartoe' it's actually just searching for the word 'v_searchword' each time.
    Could someone tell me how to write my code so that the correct word is passed into the contains function each time, please? It seems syntactically not very difficult, but I'm stumped!

    If any value for v_searchword is null, it results in invalid syntax, because the query then searches for {}. This happens when there is no such element, such as no sixth word in a string of five words. You might also want to remove duplicate spaces from the string. Please see the demonstration below, which first reproduces and then corrects the error simply by adding a condition that v_searchword is not null.
    SCOTT@orcl_11g> create table suggestions_table
      2    (searchterms  varchar2 (30),
      3       suggestions  varchar2 (20))
      4  /
    Table created.
    SCOTT@orcl_11g> insert all
      2  into suggestions_table values ('woman', null)
      3  into suggestions_table values ('women', null)
      4  into suggestions_table values ('tomato', null)
      5  into suggestions_table values ('tomatoes', null)
      6  select * from dual
      7  /
    4 rows created.
    SCOTT@orcl_11g> create index your_index
      2  on suggestions_table (searchterms)
      3  indextype is ctxsys.context
      4  /
    Index created.
    SCOTT@orcl_11g> CREATE OR REPLACE FUNCTION listgetat
      2       (p_string    VARCHAR2,
      3        p_element   INTEGER,
      4        p_separator VARCHAR2 DEFAULT ' ')
      5       RETURN          VARCHAR2
      6  AS
      7    v_string      VARCHAR2 (32767);
      8  BEGIN
      9    -- ensure there are starting and ending separators:
    10    v_string := p_separator || p_string || p_separator;
    11    -- remove all double separators:
    12    WHILE INSTR (v_string, p_separator || p_separator) > 0 LOOP
    13        v_string := REPLACE (v_string, p_separator || p_separator, p_separator);
    14    END LOOP;
    15    -- check if element exists:
    16    IF LENGTH (v_string) - LENGTH (REPLACE (v_string, p_separator, '')) >
    17         LENGTH (p_separator) * p_element
    18    THEN
    19        v_string := SUBSTR (v_string,
    20                      INSTR (v_string, p_separator, 1, p_element)
    21                      + LENGTH (p_separator));
    22        RETURN SUBSTR (v_string, 1, INSTR (v_string, p_separator) - 1);
    23    ELSE
    24        RETURN NULL;
    25    END IF;
    26  END listgetat;
    27  /
    Function created.
    SCOTT@orcl_11g> -- reproduction of error:
    SCOTT@orcl_11g> create or replace procedure test_proc
      2    (p_searchphrase     in varchar2)
      3  as
      4    v_searchword    varchar2 (100);
      5  begin
      6       for x in 1 .. 6 loop
      7           v_searchword := listgetat(replace(p_searchphrase,',',' '),x,' ');
      8 
      9           for c1 in (select * from
    10                    (select score(1) as score, searchterms, suggestions from suggestions_table
    11                     where contains(searchterms,'fuzzy({'||v_searchword||'},,,weight)',1)>0
    12                     order by score desc)
    13                 where rownum < 10) loop
    14              dbms_output.put_line
    15             (lpad (c1.score, 3) || ' ' ||
    16              rpad (c1.searchterms, 30) || ' ' ||
    17              v_searchword);
    18            end loop;
    19       end loop;
    20  end test_proc;
    21  /
    Procedure created.
    SCOTT@orcl_11g> show errors
    No errors.
    SCOTT@orcl_11g> exec test_proc ('womman,and,tomartoe')
    38 woman                          womman
    25 women                          womman
    29 tomato                         tomartoe
    26 tomatoes                       tomartoe
    BEGIN test_proc ('womman,and,tomartoe'); END;
    ERROR at line 1:
    ORA-29902: error in executing ODCIIndexStart() routine
    ORA-20000: Oracle Text error:
    DRG-50901: text query parser syntax error on line 1, column 8
    ORA-06512: at "SCOTT.TEST_PROC", line 9
    ORA-06512: at line 1
    SCOTT@orcl_11g> -- correction of error:
    SCOTT@orcl_11g> create or replace procedure test_proc
      2    (p_searchphrase     in varchar2)
      3  as
      4    v_searchword    varchar2 (100);
      5  begin
      6       for x in 1 .. 6 loop
      7           v_searchword := listgetat(replace(p_searchphrase,',',' '),x,' ');
      8           -- check if xth word exists:
      9           if v_searchword is not null then
    10             for c1 in (select * from
    11                      (select score(1) as score, searchterms, suggestions from suggestions_table
    12                       where contains(searchterms,'fuzzy({'||v_searchword||'},,,weight)',1)>0
    13                       order by score desc)
    14                   where rownum < 10) loop
    15             dbms_output.put_line
    16               (lpad (c1.score, 3) || ' ' ||
    17                rpad (c1.searchterms, 30) || ' ' ||
    18                v_searchword);
    19              end loop;
    20           end if;
    21       end loop;
    22  end test_proc;
    23  /
    Procedure created.
    SCOTT@orcl_11g> show errors
    No errors.
    SCOTT@orcl_11g> exec test_proc ('womman,and,tomartoe')
    38 woman                          womman
    25 women                          womman
    29 tomato                         tomartoe
    26 tomatoes                       tomartoe
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11g>
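
The null check above guards against an empty {} in the query. A search word that itself contains a closing brace could break the {...} escape sequence in a similar way. As a rough sketch only, the same split-and-guard logic can be written in Python as below. The function name build_fuzzy_queries and the max_words parameter are made up for illustration, and the doubling of } inside braces is an assumption based on the Oracle Text documentation on escape characters, so check it against your version.

```python
import re

def build_fuzzy_queries(search_phrase, max_words=6):
    """Split a phrase on commas/whitespace and build one fuzzy query per word.

    Empty words are skipped, which mirrors the 'v_searchword is not null'
    check in the PL/SQL correction above.
    """
    words = [w for w in re.split(r"[,\s]+", search_phrase) if w]
    queries = []
    for word in words[:max_words]:
        # Assumption: inside {...}, a closing brace must be doubled.
        escaped = word.replace("}", "}}")
        queries.append("fuzzy({" + escaped + "},,,weight)")
    return queries

print(build_fuzzy_queries("womman,and,,tomartoe"))
```

Each returned string would then be passed to contains() as a bind variable rather than concatenated into the SQL text.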

  • Help with fuzzy search (doesn't work if change order of certain 2 letters)

    Hi,
    need some help with fuzzy search. It's pretty simple - we use fuzzy search on varchar2 columns that contain first name and last_name. The problem is that i don't really understand why it can't find name in some cases.
    Say i want to search for 'Taekpaul'. Then
    where CONTAINS(first_name,'fuzzy(TAEKPAUL)',1) > 0 - works
    where CONTAINS(first_name,'fuzzy(TAEKPALU)',1) > 0 - works (changed order of the 2 last letters)
    where CONTAINS(first_name,'fuzzy(TEAKPAUL)',1) > 0 - doesn't work, finds 'Tejpaul' that is completely unrelated (changed 2nd, 3rd order)
    How can i make it find 'Taekpaul' even if i search for TEAKPAUL? Is it related to index? Like Text index should be created with some different parameters?
    Thanks!
    Edited by: Maitreya2 on Mar 3, 2010 2:08 PM

    Thanks, adding '!' worked :)
    Do you know where i can read more about '!' and other special characters? I think i didn't see anything like that here: http://download.oracle.com/docs/cd/B14117_01/text.101/b10730/cqoper.htm#BABBJGFJ
    I also started using JARO_WINKLER_SIMILARITY function that is actually better i think for what i do. But it's very buggy - sometimes Oracle crashes and kills connection when you try to use it.
    Ahha, it's here: http://download.oracle.com/docs/cd/B19306_01/text.102/b14218/cqspcl.htm
    So, ! is soundex. Whatever it means..
    Edited by: Maitreya2 on Mar 5, 2010 12:14 PM
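
For anyone wondering what soundex means here: it reduces a word to its first letter plus three digits derived from the consonants that follow, so words that sound alike tend to get the same code. The sketch below implements the classic American Soundex rules in Python. It is only an illustration, and it is an assumption that the ! operator in Oracle Text produces exactly these codes. Under these rules TAEKPAUL and TEAKPAUL share one code, which would be consistent with the ! search finding Taekpaul.

```python
# Letter groups of the classic American Soundex scheme.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Return the 4-character Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue              # H and W do not separate equal codes
        code = CODES.get(ch, "")  # vowels map to "" and reset 'prev'
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]

for name in ("TAEKPAUL", "TAEKPALU", "TEAKPAUL", "TEJPAUL"):
    print(name, soundex(name))
```

Because vowels are ignored after the first letter, swapping the A and E changes nothing in the code, whereas an edit-distance style fuzzy match treats the swap as two changed characters.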

  • Fuzzy search not returning results?

    I'm executing a phonetic search on the nm_resource column. my application allows a call center employee to search on the resource name (nm_resource), if the resource is not found then they will enter a new one. The problem is someone may have already entered the resource name but spelled it incorrectly resulting in duplicate records for the same resource name. To enable the call center to retrieve records that may have the same sound but are spelled differently we have implemented the fuzzy search capability of Oracle text. Things have been going very nicely for the most part with the exception of this one issue we're trying to understand.
    Using the query below, we're searching for the resource name "rosies"; the actual record in the database was entered as "rosy's". The search returns (rosies, rosie's, rosys) and does not return ---> rosy's <--- the record I'm interested in.
    Is it reasonable to expect rosy's to be returned in the result set? My query should return the maximum number of fuzzy expansions and all fuzzy scores.
    select score(1), nm_resource, ADDR_RSRC_ST_LN_1, id_resource, ADDR_RSRC_CITY FROM caps_resource where
    CONTAINS (nm_resource,'fuzzy(rosies, 0, 5000, weight)',1)>0
    union
    select /*+index(caps_resource ind_caps_resource_8)*/ 10, nm_resource, ADDR_RSRC_ST_LN_1, id_resource, ADDR_RSRC_CITY from caps_resource 
    where NM_RESOURCE_UPPER like upper(replace(replace('%' || 'rosies' || '%',' '), '-'))
    and rownum<500 order by 1 DESC;
    Any help explaining this is much appreciated.
    Regards,

    When you index "Rosy's", by default it sees the apostrophe as a delimiter and tokenizes and indexes "Rosy" and "s" separately. So, you could only find it by searching for the singular form, or for the singular form obtained by stemming. However, if you set the apostrophe as a skipjoin, then it tokenizes and indexes "Rosys" as one token that you can then find by searching for "rosies". Please see the demonstration below. You might also be interested in soundex, which can be used with Oracle Text, or the functions in the utl_match package or metaphone.
    SCOTT@orcl_11g> CREATE TABLE caps_resource
      2    (nm_resource  VARCHAR2 (30))
      3  /
    Table created.
    SCOTT@orcl_11g> INSERT ALL
      2  INTO caps_resource VALUES ('Rosy''s')
      3  SELECT * FROM DUAL
      4  /
    1 row created.
    SCOTT@orcl_11g> SELECT * FROM caps_resource
      2  /
    NM_RESOURCE
    Rosy's
    SCOTT@orcl_11g> CREATE INDEX your_text_idx ON caps_resource (nm_resource)
      2  INDEXTYPE IS CTXSYS.CONTEXT
      3  PARAMETERS
      4       ('STOPLIST CTXSYS.EMPTY_STOPLIST')
      5  /
    Index created.
    SCOTT@orcl_11g> SELECT token_text FROM dr$your_text_idx$i
      2  /
    TOKEN_TEXT
    ROSY
    S
    SCOTT@orcl_11g> SELECT * FROM caps_resource
      2  WHERE  CONTAINS (nm_resource, 'FUZZY (rosies, 0, 5000, weight)') > 0
      3  /
    no rows selected
    SCOTT@orcl_11g> DROP INDEX your_text_idx
      2  /
    Index dropped.
    SCOTT@orcl_11g> BEGIN
      2    CTX_DDL.CREATE_PREFERENCE ('your_lexer', 'BASIC_LEXER');
      3    CTX_DDL.SET_ATTRIBUTE ('your_lexer', 'SKIPJOINS', '''');
      4  END;
      5  /
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11g> CREATE INDEX your_text_idx ON caps_resource (nm_resource)
      2  INDEXTYPE IS CTXSYS.CONTEXT
      3  PARAMETERS
      4    ('STOPLIST CTXSYS.EMPTY_STOPLIST
      5        LEXER       your_lexer')
      6  /
    Index created.
    SCOTT@orcl_11g> SELECT token_text FROM dr$your_text_idx$i
      2  /
    TOKEN_TEXT
    ROSYS
    SCOTT@orcl_11g> SELECT * FROM caps_resource
      2  WHERE  CONTAINS (nm_resource, 'FUZZY (rosies, 0, 5000, weight)') > 0
      3  /
    NM_RESOURCE
    Rosy's
    SCOTT@orcl_11g>
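
The effect of the SKIPJOINS attribute on tokenization can be pictured with a small Python sketch. This is a deliberately simplified model, not the real BASIC_LEXER: the tokenize function and its skipjoins parameter are hypothetical, and it assumes every non-alphanumeric character other than a skipjoin acts as a delimiter.

```python
import re

def tokenize(text, skipjoins=""):
    """Very simplified model of lexer tokenization.

    Characters listed in `skipjoins` are dropped without splitting the word;
    every other non-alphanumeric character acts as a token delimiter.
    """
    for ch in skipjoins:
        text = text.replace(ch, "")
    return [t.upper() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

print(tokenize("Rosy's"))                 # default: apostrophe splits the word
print(tokenize("Rosy's", skipjoins="'"))  # apostrophe treated as a skipjoin
```

With the default settings the fuzzy expansion of "rosies" has only the short tokens ROSY and S to match against, while with the skipjoin it can match the single token ROSYS.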

  • How to implement fuzzy search in Query variables

    Dear Experts,
    Fuzzy search is easily implemented in ABAP, but I do not know how to implement fuzzy search in Query variables.
    Our company has a report with an input variable for the customer code, and the user wants to enter 3 characters as a fuzzy search. For example,
    the customer code has 10 characters, and she wants to enter only the first 3 -- EAE *
    and hopes the results will be displayed. If you have any solution, please advise.
    Many thanks.
    Best Regards.
    Steve

    closed

  • No active external product for the fuzzy search (FBL1N)

    Hi,
    I am in transaction FBL1N and want to search the vendor by F4 help.
    1.  When I hit F-4, the Search window appears.
    2.  It gives a pop-up message "No active external product for the fuzzy search".
    3.  When we open the help for the pop-up message it says :
    No active external product for the fuzzy search
    Message no. F2807
    Diagnosis
    The connection of an external product is required for the fuzzy search.
    For more information, see Note 176559.
    I have looked through SAP Note 176559, but it was not really relevant.
    Regards,
    Rohidas Shinde

  • Passing parameters for fuzzy search

    Hello,
    I am using Oracle 11.2 and do fuzzy search as following:
    Create table tb_test(Nm varchar2(32));
    create index fuzzy_idx on tb_test(Nm) indextype is ctxsys.context parameters(' Wordlist STEM_FUZZY_PREF');
    select * from tb_test where contains(Nm, 'fuzzy(Wndy,,,weight)',1) >0;
    The query works fine for hardcoded string 'Wndy'. I just wonder how can I use parameter to pass the match string in PLSQL?
    Thanks,

    try this (not tested):
    Procedure findMatchNm(nmStr in VARCHAR2)
    IS
      oraCursor SYS_REFCURSOR;   -- weakly typed cursor for the dynamic query
      str_val   varchar2(100);
    BEGIN
      -- build the query string once, then pass it as a bind variable
      str_val := 'fuzzy('||nmStr||',,,weight)';
      OPEN oraCursor FOR
        'SELECT NM FROM TB_test WHERE contains(Nm, :s, 1)>0' USING str_val;
      LOOP
        FETCH...
      END LOOP;
    END;
    Edited by: stefan nebesnak on Jan 17, 2013 12:49 PM
    using bind variable

  • How score() function works in HANA fuzzy search

    Hi, i am confused by the score() returned value when i use this in fuzzy search in HANA
    CREATE COLUMN TABLE test_similar_calculation_mode
    ( id INTEGER PRIMARY KEY, s text);
    INSERT INTO test_similar_calculation_mode VALUES ('1','stringg');
    INSERT INTO test_similar_calculation_mode VALUES ('2','string theory');
    INSERT INTO test_similar_calculation_mode VALUES ('3','this is a very very very long string');
    INSERT INTO test_similar_calculation_mode VALUES ('4','this is another very long string');
    SELECT TO_INT(SCORE()*100)/100 AS score, id, s FROM test_similar_calculation_mode WHERE CONTAINS(s, 'theory', FUZZY(0.9, 'similarCalculationMode=compare')) ORDER BY score DESC;
    the returned list is just as below
    SCORE      ID      S
    0.84            2       string theory
    Why is this row returned with a score of 0.84 when I set 0.9 as the threshold in the FUZZY function?
    From my understanding, the S field is a TEXT data type, so the string is actually divided into a list of separate words, and the score should therefore be 1.0. Is that right?
    Any hints are very much appreciated, thanks.

    Hi William,
    By default the score() function returns a TF/IDF score for text data types.
    To get back the fuzzy score, you have to use the search option 'textSearch=compare' (or 'ts=compare'). Without other options, this gives an average score of all tokens ('string' and 'theory') and you get a score of 0.7 as a result.
    To ignore the additional token 'string' in the database, you have to specify another option that tells the score function to use the tokens from the user input only ('cnmt=input').
    So you should use
    SELECT TO_INT(SCORE()*100)/100 AS score, id, s
    FROM test_similar_calculation_mode
    WHERE CONTAINS(s, 'theory', FUZZY(0.9, 'scm=compare, ts=compare, cnmt=input'))
    ORDER BY score DESC;
    to get the expected results.
    Regards,
    Jörg

  • E-Recruiting TREX and fuzzy search

    Hello,
    is fuzzy search available for the e-Recruiting module. We are using TREX 7.0 and e-Recruiting 6.0.
    If that functionality is available, how does it exactly work and is it customizable?
    thanks
    Koen

    Hi Guys
    Sorry to jump on this thread with nothing to add except my problem.
    We are currently in the process of implementing E-Recruitment 3.0 as an extension to ECC 5. I have configured the system and have been doing some unit testing, and have found that when I
    - try to "Apply Directly" in tab "Career and Job". If I enter the reference details of the specific job, it finds that specific job. Or if I leave the input field blank and hit the search button, it returns all jobs posted. However, if I try using the "Search for Jobs" link option, I get a consistent error "An internal error occurred. Please try again later"
    When I check transaction SLG1, the error log says the termination occurred in a program called CL_HRRCF_ABDTRACT_CONTROLLER==CM001 line 56 and CL_HRRCF_SEARCH_MASK_GROUP====CM00M line 15
    The same thing happens when searching for candidates to assign requisitions to.
    Has anyone come across this problem before? Any help will be greatly appreciated, as we have a tight deadline.
    Cheers

  • Text 10g fuzzy search performance

    Hello to everybody in this community,
    im new to this and I got a question which belongs to Oracle Text 10g.
    My Setup:
    Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit
    8 Cores with each 2,5 GHz
    64 GB RAM
    What I'd like to do:
    I'd like to compare a large amount of row sets with each other in a way that human caused mistakes (eg spelling, typing mistakes) will be tolerated.
    So my TEXT CONTEXT setup is as follows:
    MULTI_COLUMN_DATASTORE with each Column to compare.
    begin
      ctx_ddl.create_preference('my_datastore', 'MULTI_COLUMN_DATASTORE');
      ctx_ddl.set_attribute('my_datastore', 'columns', 'column1, ...');
    end;
    BASIC_LEXER - with GERMAN settings:
    begin
       ctx_ddl.create_preference('my_lexer', 'BASIC_LEXER');
       ctx_ddl.set_attribute('my_lexer', 'index_themes', 'NO');
       ctx_ddl.set_attribute('my_lexer', 'index_text', 'YES');
       ctx_ddl.set_attribute('my_lexer', 'alternate_spelling', 'GERMAN');
       ctx_ddl.set_attribute('my_lexer', 'composite', 'GERMAN');
       ctx_ddl.set_attribute('my_lexer', 'index_stems', 'GERMAN');
       ctx_ddl.set_attribute('my_lexer', 'new_german_spelling', 'YES');
    end;
    BASIC_WORDLIST - with GERMAN settings:
    begin
       ctx_ddl.create_preference('my_wordlist', 'BASIC_WORDLIST');
       ctx_ddl.set_attribute('my_wordlist','FUZZY_MATCH','GERMAN');
       ctx_ddl.set_attribute('my_wordlist','FUZZY_SCORE','60'); --defaults
       ctx_ddl.set_attribute('my_wordlist','FUZZY_NUMRESULTS','100'); --defaults
       --ctx_ddl.set_attribute('my_wordlist','SUBSTRING_INDEX','TRUE'); --commented out due to long creation time of index
       ctx_ddl.set_attribute('my_wordlist','STEMMER','GERMAN');
    end;
    And a BASIC_SECTION_GROUP with a field_section for each column.
    begin
      ctx_ddl.create_section_group(
        group_name => 'my_section_group',
        group_type => 'BASIC_SECTION_GROUP'
      );
      ctx_ddl.add_field_section(
        group_name   => 'my_section_group',
        section_name => 'column1',
        tag          => 'column1'
      );
    end;
    I create the index with
    create index idx_myfulltextindex on fulltexttest(column1)
    indextype is ctxsys.context
    parameters ('datastore my_datastore
                 section group my_section_group
                 lexer my_lexer
                 wordlist my_wordlist
                 stoplist ctxsys.empty_stoplist')
    Everything works functionally fine.
    In my test scenario i got a table with around 100.000 Rows which has a primary key which is not in the CONTEXT index.
    The Problem:
    I do a query like:
    SELECT SCORE(1), a.*
    FROM fulltexttest a
    WHERE CONTAINS(a.column1, 'FUZZY(({TEST}),,,W) WITHIN COLUMN1', 1) > 0
      AND a.primkey BETWEEN 1000 AND 4000
    This will do a fulltext search in a set of 3000 rows. The response time here is nearly immediate. Maybe a second.
    If I do the same in a cursor many times (>1000) with different search terms, it takes a long time, of course. On average it does 1 query per second.
    I thought this could not be that slow and i tested the same with:
    SELECT SCORE(1), a.*
    FROM fulltexttest a
    WHERE CONTAINS(a.column1, '({TEST}) WITHIN COLUMN1', 1) > 0
      AND a.primkey BETWEEN 1000 AND 4000
    NOTE there is no Fuzzy search anymore...
    With this it is up to 20 times faster.
    The cpu of the server reaches about 15% load while processing the fuzzy query.
    So:
    If I do a fuzzy search, it seems not to access the index. I thought I was telling the index to compute the results of 100 expansions in advance.
    Am I doing it wrong? Or is it not possible to build an Index especially for fuzzy search ?
    Are there any suggestions to increase the performance? Note that I read the guide (7 Tuning Oracle Text) already. None of the hints caused remedy.
    I would appreciate if anyone is able to help me in this case... Or just give a hint.
    Thank you,
    Dominik

    Here is a simplified example, first without, then with SDATA.  Please note the differences in indexes, queries, and execution plans.
    SCOTT@orcl12c> CREATE TABLE fulltexttest
      2    (primkey  NUMBER PRIMARY KEY,
      3      column1  VARCHAR2(30))
      4  /
    Table created.
    SCOTT@orcl12c> CREATE SEQUENCE seq
      2  /
    Sequence created.
    SCOTT@orcl12c> INSERT INTO fulltexttest
      2  SELECT seq.NEXTVAL, object_name
      3  FROM  all_objects
      4  /
    89826 rows created.
    SCOTT@orcl12c> create index idx_myfulltextindex
      2  on fulltexttest(column1)
      3  indextype is ctxsys.context
      4  /
    Index created.
    SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
    SCOTT@orcl12c> SELECT SCORE(1), a.*
      2  FROM  fulltexttest a
      3  WHERE  CONTAINS
      4            (a.column1,
      5            'FUZZY(({TEST}),,,W)',
      6            1) > 0
      7  AND    a.primkey BETWEEN 1 AND 4000
      8  /
      SCORE(1)    PRIMKEY COLUMN1
            53        247 SQL$TEXT
            53        248 I_SQL$TEXT_PKEY
            53        249 I_SQL$TEXT_HANDLE
    3 rows selected.
    Execution Plan
    Plan hash value: 2971213997
    | Id  | Operation                            | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT                     |                     |     1 |    42 |    13   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | FULLTEXTTEST        |     1 |    42 |    13   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS        |                     |       |       |            |          |
    |   3 |    BITMAP AND                        |                     |       |       |            |          |
    |   4 |     BITMAP CONVERSION FROM ROWIDS    |                     |       |       |            |          |
    |   5 |      SORT ORDER BY                   |                     |       |       |            |          |
    |*  6 |       DOMAIN INDEX                   | IDX_MYFULLTEXTINDEX |  2500 |       |     4   (0)| 00:00:01 |
    |   7 |     BITMAP CONVERSION FROM ROWIDS    |                     |       |       |            |          |
    |   8 |      SORT ORDER BY                   |                     |       |       |            |          |
    |*  9 |       INDEX RANGE SCAN               | SYS_C0035980        |  2500 |       |     9   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
      6 - access("CTXSYS"."CONTAINS"("A"."COLUMN1",'FUZZY(({TEST}),,,W)',1)>0)
      9 - access("A"."PRIMKEY">=1 AND "A"."PRIMKEY"<=4000)
    Note
      - dynamic statistics used: dynamic sampling (level=2)
    SCOTT@orcl12c> SET AUTOTRACE OFF
    SCOTT@orcl12c> DROP INDEX idx_myfulltextindex
      2  /
    Index dropped.
    SCOTT@orcl12c> create index idx_myfulltextindex
      2  on fulltexttest(column1)
      3  indextype is ctxsys.context
      4  FILTER BY primkey
      5  /
    Index created.
    SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
    SCOTT@orcl12c> SELECT SCORE(1), a.*
      2  FROM  fulltexttest a
      3  WHERE  CONTAINS
      4            (a.column1,
      5            'FUZZY(({TEST}),,,W) AND SDATA (primkey BETWEEN 1 AND 4000)',
      6            1) > 0
      7  /
      SCORE(1)    PRIMKEY COLUMN1
            53        247 SQL$TEXT
            53        248 I_SQL$TEXT_PKEY
            53        249 I_SQL$TEXT_HANDLE
    3 rows selected.
    Execution Plan
    Plan hash value: 1298620335
    | Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
    |   0 | SELECT STATEMENT             |                     |    41 |  1722 |    12   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID | FULLTEXTTEST        |    41 |  1722 |    12   (0)| 00:00:01 |
    |*  2 |   DOMAIN INDEX               | IDX_MYFULLTEXTINDEX |       |       |     4   (0)| 00:00:01 |
    Predicate Information (identified by operation id):
      2 - access("CTXSYS"."CONTAINS"("A"."COLUMN1",'FUZZY(({TEST}),,,W) AND SDATA (primkey
                  BETWEEN 1 AND 4000)',1)>0)
    Note
      - dynamic statistics used: dynamic sampling (level=2)
    SCOTT@orcl12c>

  • How to find a data including% using fuzzy searching

    Hello,
    I have a column that has a value "wwt%abc" . How can I find this data using SQL*PLUS and fuzzy searching condition? I wish to use
    the statement like this:
    select name from mytable where name like ...
    thanks for any help

    Susan,
    I think you could use the ESCAPE clause to specify that % is to be interpreted literally.
    select * from mytable where name like '%wwt\%abc%' escape '\';
    I hope this helps.
    Cheema
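
The difference the ESCAPE clause makes can be checked with a small self-contained script. This sketch uses Python with an in-memory SQLite database purely as a stand-in for Oracle, on the assumption that LIKE ... ESCAPE behaves the same way for this case. The table and the three sample rows are invented for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("wwt%abc",), ("wwtXabc",), ("wwtabc",)],
)

# Without ESCAPE, % is a wildcard, so all three rows match.
unescaped = conn.execute(
    "SELECT name FROM mytable WHERE name LIKE '%wwt%abc%' ORDER BY name"
).fetchall()

# With ESCAPE '\', the sequence \% matches a literal percent sign only.
escaped = conn.execute(
    r"SELECT name FROM mytable WHERE name LIKE '%wwt\%abc%' ESCAPE '\' ORDER BY name"
).fetchall()

print(unescaped)
print(escaped)
```

Only the escaped query narrows the result to the row that really contains the percent sign.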

  • Concatenated datastore fuzzy searches and performance...

    Oracle 8.1.7:
    I am using the concatenated datastore and indexing two columns.
    The query I am executing includes an exact match on one column and a fuzzy match on the second column.
    When I execute the query, performance should improve as the exact match column is set to return fewer values.
    This is the case when we execute an exact match search on both columns.
    However, when one column is an exact match and the second column is a fuzzy match this is not true.
    Is this normal processing??? and why??? Is this a bug??
    If you need more information please let me know.
    We are under a deadline and this is our final road block.
    TIA
    Colleen Geislinger

    This is more information about our scenario:
    We have two groups in the datastore:
    concat:
    1.) hierarchy:(example text) 321826 325123 543123
    2.) page: Actual document text.
    321826 325123 543123 represent IDs in a hierarchy structure. As you move from left to right, each number occurs less often, so there should be fewer exact matches.
    Example: In this index all pages have 321826 as the first value. A few pages have 543123 and all others will have some other number as the last value.
    if I do this query:
    contains(concat,(321826 within hierarchy ) and ('personnel') within page)
    it takes about 10 seconds because 321826 will hit all pages.
    if I do this query:
    contains(concat,(543123 within hierarchy ) and ('personnel') within page)
    it takes only about 1 second because 543123 will hit just a few pages.
    BUT:::::::
    Fuzzy search....
    if I do this query:
    search A.) contains(concat,(321826 within hierarchy ) and ?('personnel') within page)
    it takes about 30 seconds because 321826 will hit all pages. This performance is acceptable for this case.
    BUT if I do this query:
    search B.) contains(concat,(543123 within hierarchy ) and ?('personnel') within page)
    it takes about 30 seconds even though 543123 will hit only a few pages.
    This should be faster than 30 seconds because you're searching over only a fraction of material for the fuzzy search part.
    We've played with different variations on the () and the '' but nothing seems to change this.
    Any advice on how to make search B.) faster??
    We don't understand why we see the different speeds in the exact match and we DON'T see the different speeds in the fuzzy search...
    I can send you some test data with the index and query scripts if you want.
    Our indexes are on large tables (2,000,000) rows.
    TIA
    Colleen Geislinger.

  • Email address validation, is there a way to use Regex or other fuzzy searching?

    I would like to use PL/SQL for email address validation. Is there a way to use Regex (regular expressions) or some other fuzzy searching for that? Using % and _ wildcards only takes you so far...
    I need something that will verify alphanumeric characters (no ",'.:#@&*^ etc.). Any ideas?
    Current code:
    if email not like '_%@_%.__%' or email like '%@%@%' or email like '% %' or email like '%"%' or email like '%''%' or email like '%
    %' then
    The last line is to make sure there are no line breaks in the middle of the email address. Is there a better way to signify a line break, like \n or an ASCII equivalent?
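
One way to picture the regex approach is the sketch below. It is written in Python only so that it can be run and tested on its own. Oracle 10g and later provide REGEXP_LIKE for use in SQL and PL/SQL, and CHR(10) is the line feed character there. The pattern itself is an assumption about what should count as valid (one @, no spaces or quotes, a dotted domain) and is not a full RFC-compliant validator. The name is_valid_email is made up for this example.

```python
import re

# Assumed rules for this sketch: one "@", no whitespace or quotes,
# alphanumerics plus . _ % + - in the local part, and a dotted domain
# ending in a label of at least two letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}")

def is_valid_email(email):
    # fullmatch requires the whole string to match, so an embedded or
    # trailing line break makes the address invalid.
    return EMAIL_RE.fullmatch(email) is not None

for candidate in ("jane.doe@example.com", "a@@b.com", "a b@c.com"):
    print(candidate, is_valid_email(candidate))
```

A single anchored pattern replaces the chain of LIKE conditions, including the one that guards against line breaks.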

    Michael:
    As noted in the previous post, DBI is a Perl package that allows Perl to talk to various databases, including Oracle. We use DBI on several UNIX servers, and it does not require ODBC, and I have always found it to be extremely quick. Things may be different in the Windows world.
    If you are spooling files out to run through Perl anyway, you may want to take a look at DBI. You could probably modify your existing scripts to use DBI fairly easily. The basic structure using DBI is like:
    use DBI;
    my $dbh;       # A database handle
    my $sth;       # A statement handle
    my $sqlstr;    # SQL statement
    my @db_vars;   # Variables for your db columns
    # Connect to the database
    $dbh = DBI->connect( "dbi:Oracle:service_name","user/password");
    $sqlstr = 'SELECT * FROM emp WHERE id = ?'; # even takes bind variables
    # Prepare statement
    $sth = $dbh->prepare($sqlstr);
    $sth->execute(12345);  # Execute with values for bind if desired
    # Walk the "cursor", one row per iteration
    while (@db_vars = $sth->fetchrow_array()) {
       # your processing here
    }

  • Fuzzy searching and concatenated datastore query performance problems.

    I am using the concatenated datastore and indexing two columns.
    The query I am executing includes an exact match on one column and a fuzzy match on the second column.
    When I execute the query, performance should improve as the exact match column is set to return fewer values.
    This is the case when we execute an exact match search on both columns.
    However, when one column is an exact match and the second column is a fuzzy match this is not true.
    Is this normal processing??? and why??? Is this a bug??
    If you need more information please let me know.
    We are under a deadline and this is our final road block.
    TIA
    Colleen Geislinger

    I see that you have posted the message in the Oracle text forum, good! You should get a better, more timely answer there.
    Larry

  • [WTA] Perform Fuzzy/Matching/Search of Similarity Text

    This is my sample data:
    With
    vCAR_MODEL AS (
    Select '1' AS MODEL_ID, 'CITY' AS CAR_MODEL FROM DUAL UNION ALL
    Select '2' AS MODEL_ID, 'HOOOONDA' AS CAR_MODEL FROM DUAL UNION ALL
    Select '3' AS MODEL_ID, 'CRUZE' AS CAR_MODEL FROM DUAL UNION ALL
    Select '5' AS MODEL_ID, 'HONDA CRUZE' AS CAR_MODEL FROM DUAL
    ),
    vCAR_MODEL_DETAIL AS (
    Select '1' AS MODEL_DETAIL_ID, 'HONDA @ CITY' AS CAR_MODEL , SYSTIMESTAMP + 1 AS UPDATE_DATE FROM DUAL UNION ALL
    Select '2' AS MODEL_DETAIL_ID, 'HONDA,CITY' AS CAR_MODEL, SYSTIMESTAMP + 2 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '3' AS MODEL_DETAIL_ID, 'HONDA|| CITY' AS CAR_MODEL, SYSTIMESTAMP + 3 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '4' AS MODEL_DETAIL_ID, 'CIIIITY @ HOOOONDA' AS CAR_MODEL, SYSTIMESTAMP + 4 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '5' AS MODEL_DETAIL_ID, 'HONDA' AS CAR_MODEL,SYSTIMESTAMP + 5 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '6' AS MODEL_DETAIL_ID, 'CHEVY @ CRUZE' AS CAR_MODEL,SYSTIMESTAMP + 6 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '7' AS MODEL_DETAIL_ID, 'CRUZE' AS CAR_MODEL,SYSTIMESTAMP + 7 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '8' AS MODEL_DETAIL_ID, 'HONDA CRUZE' AS CAR_MODEL,SYSTIMESTAMP + 8 AS UPDATE_DATE  FROM DUAL
    )
    Select * from vCAR_MODEL_DETAIL
    ------------------------------------------------------------------------------------
    MODEL_DETAIL_ID     CAR_MODEL     UPDATE_DATE
    1     HONDA @ CITY     6-May-13
    2     HONDA, CITY     7-May-13
    3     HONDA|| CITY     8-May-13
    4     CIIIITY @ HOOOONDA     9-May-13
    5     HONDA     10-May-13
    6     CHEVY @ CRUZE     11-May-13
    7     CRUZE     12-May-13
    8     HONDA CRUZE     13-May-13
    and what I want actually is:
    With
    vCAR_MODEL AS (
    Select '1' AS MODEL_ID, 'CITY' AS CAR_MODEL FROM DUAL UNION ALL
    Select '2' AS MODEL_ID, 'HOOOONDA' AS CAR_MODEL FROM DUAL UNION ALL
    Select '3' AS MODEL_ID, 'CRUZE' AS CAR_MODEL FROM DUAL UNION ALL
    Select '5' AS MODEL_ID, 'HONDA CRUZE' AS CAR_MODEL FROM DUAL
    ),
    vCAR_MODEL_DETAIL AS (
    --Select '1' AS MODEL_DETAIL_ID, 'HONDA @ CITY' AS CAR_MODEL , SYSTIMESTAMP + 1 AS UPDATE_DATE FROM DUAL UNION ALL
    --Select '2' AS MODEL_DETAIL_ID, 'HONDA,CITY' AS CAR_MODEL, SYSTIMESTAMP + 2 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '3' AS MODEL_DETAIL_ID, 'HONDA|| CITY' AS CAR_MODEL, SYSTIMESTAMP + 3 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '4' AS MODEL_DETAIL_ID, 'CIIIITY @ HOOOONDA' AS CAR_MODEL, SYSTIMESTAMP + 4 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '5' AS MODEL_DETAIL_ID, 'HONDA' AS CAR_MODEL,SYSTIMESTAMP + 5 AS UPDATE_DATE  FROM DUAL UNION ALL
    --Select '6' AS MODEL_DETAIL_ID, 'CHEVY @ CRUZE' AS CAR_MODEL,SYSTIMESTAMP + 6 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '7' AS MODEL_DETAIL_ID, 'CRUZE' AS CAR_MODEL,SYSTIMESTAMP + 7 AS UPDATE_DATE  FROM DUAL UNION ALL
    Select '8' AS MODEL_DETAIL_ID, 'HONDA CRUZE' AS CAR_MODEL,SYSTIMESTAMP + 8 AS UPDATE_DATE  FROM DUAL
    )
    Select * from vCAR_MODEL_DETAIL
    ------------------------------------------------------------------------------------
    MODEL_DETAIL_ID     CAR_MODEL     UPDATE_DATE
    3     HONDA|| CITY     8-May-13
    4     CIIIITY @ HOOOONDA     9-May-13
    5     HONDA     10-May-13
    7     CRUZE     12-May-13
    8     HONDA CRUZE     13-May-13
    The main table is "vCAR_MODEL" and the detail table is "vCAR_MODEL_DETAIL". The purpose is to fuzzy search "vCAR_MODEL_DETAIL" using the values in "vCAR_MODEL".
    The detail row is picked using the MAX of the "UPDATE_DATE" column.
    My problem is: how do I perform a fuzzy search across those symbols when cross joining against the main table?
    Any ideas?
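
    The matching idea in this question (ignore the symbols, tolerate stretched spellings such as HOOOONDA) can be sketched outside the database. The Python below is a rough illustration only: normalize, fuzzy_contains and the 0.8 threshold are hypothetical names and values, and it does not attempt the MAX(UPDATE_DATE) selection or claim to reproduce the exact result set wanted above. Inside Oracle, the UTL_MATCH package offers edit-distance and Jaro-Winkler functions that could play the role of the similarity score.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list:
    """Upper-case, turn symbols into spaces, collapse repeated characters, split into tokens."""
    cleaned = re.sub(r"[^A-Z0-9]+", " ", text.upper())
    # "HOOOONDA" -> "HONDA", "CIIIITY" -> "CITY" (genuine double letters collapse too)
    collapsed = re.sub(r"([A-Z0-9])\1+", r"\1", cleaned)
    return collapsed.split()


def fuzzy_contains(model: str, detail: str, threshold: float = 0.8) -> bool:
    """True when every token of `model` has a sufficiently similar token in `detail`."""
    detail_tokens = normalize(detail)
    return all(
        any(SequenceMatcher(None, m, d).ratio() >= threshold for d in detail_tokens)
        for m in normalize(model)
    )


if __name__ == "__main__":
    models = ["CITY", "HOOOONDA", "CRUZE", "HONDA CRUZE"]
    details = ["HONDA|| CITY", "CIIIITY @ HOOOONDA", "CHEVY @ CRUZE"]
    # Emulates the cross join: every model is compared with every detail row.
    for model in models:
        for detail in details:
            print(model, "|", detail, "|", fuzzy_contains(model, detail))
```

    Normalizing both sides the same way keeps the comparison symmetric, so a model value with stretched letters still lines up with a clean detail value.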

    From Text Area, I got an answer
