Concatenated datastore fuzzy searches and performance...

Oracle 8.1.7:
I am using the concatenated datastore and indexing two columns.
The query I am executing includes an exact match on one column and a fuzzy match on the second column.
When I execute the query, performance should improve as the exact match column is set to return fewer values.
This is the case when we execute an exact match search on both columns.
However, when one column is an exact match and the second column is a fuzzy match this is not true.
Is this normal processing, and if so, why? Or is this a bug?
If you need more information please let me know.
We are under a deadline and this is our final roadblock.
TIA
Colleen Geislinger

This is more information about our scenario:
We have two groups in the datastore:
concat:
1.) hierarchy:(example text) 321826 325123 543123
2.) page: Actual document text.
321826 325123 543123 represent IDs in a hierarchy structure. As you move from left to right, each number occurs less often, so there should be fewer exact matches.
Example: In this index all pages have 321826 as the first value. A few pages have 543123 and all others will have some other number as the last value.
if I do this query:
contains(concat,(321826 within hierarchy ) and ('personnel') within page)
it takes about 10 seconds because 321826 will hit all pages.
if I do this query:
contains(concat,(543123 within hierarchy ) and ('personnel') within page)
it takes only about 1 second because 543123 will hit just a few pages.
BUT:
Fuzzy search...
if I do this query:
search A.) contains(concat,(321826 within hierarchy ) and ?('personnel') within page)
it takes about 30 seconds because 321826 will hit all pages. This performance is acceptable for this case.
BUT if I do this query:
search B.) contains(concat,(543123 within hierarchy ) and ?('personnel') within page)
it takes about 30 seconds even though 543123 will hit only a few pages.
This should be faster than 30 seconds because the fuzzy part only has to search a fraction of the material.
We've played with different variations on the () and the '' but nothing seems to change this.
Any advice on how to make search B.) faster?
We don't understand why we see different speeds with the exact match but DON'T see different speeds with the fuzzy search...
I can send you some test data with the index and query scripts if you want.
Our indexes are on large tables (2,000,000 rows).
TIA
Colleen Geislinger.
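
For anyone reproducing this, search B.) can be written out as a complete statement, and CTX_QUERY.EXPLAIN can be used to look at how the fuzzy term is expanded inside the query. This is only a sketch: the table name pages, the index name pages_concat_idx and the result table fuzzy_explain are assumptions (none of them appear in the post), and the explain table layout should be checked against the Oracle Text reference for your release.

```sql
-- Search B.) as a full statement (table name "pages" is hypothetical).
SELECT COUNT(*)
FROM   pages
WHERE  CONTAINS(concat,
         '(543123 WITHIN hierarchy) AND (?personnel WITHIN page)') > 0;

-- Result table for CTX_QUERY.EXPLAIN (name "fuzzy_explain" is hypothetical).
CREATE TABLE fuzzy_explain (
  explain_id   VARCHAR2(30),
  id           NUMBER,
  parent_id    NUMBER,
  operation    VARCHAR2(30),
  options      VARCHAR2(30),
  object_name  VARCHAR2(64),
  position     NUMBER,
  cardinality  NUMBER
);

BEGIN
  -- Index name "pages_concat_idx" is hypothetical.
  CTX_QUERY.EXPLAIN(
    index_name    => 'pages_concat_idx',
    text_query    => '(543123 WITHIN hierarchy) AND (?personnel WITHIN page)',
    explain_table => 'fuzzy_explain',
    sharelevel    => 0,
    explain_id    => 'search_b');
END;
/

-- One row per node of the parsed query; the words the fuzzy term
-- was expanded to show up as children of the fuzzy node.
SELECT id, parent_id, operation, options, object_name
FROM   fuzzy_explain
WHERE  explain_id = 'search_b'
ORDER  BY id;
```

Running the same explain for search A.) and comparing the two outputs would show whether the fuzzy expansion is identical in both cases, which is one possible reason the timings do not differ.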

Similar Messages

  • Fuzzy searching and concatenated datastore query performance problems.

    I am using the concatenated datastore and indexing two columns.
    The query I am executing includes an exact match on one column and a fuzzy match on the second column.
    When I execute the query, performance should improve as the exact match column is set to return fewer values.
    This is the case when we execute an exact match search on both columns.
    However, when one column is an exact match and the second column is a fuzzy match this is not true.
    Is this normal processing, and if so, why? Or is this a bug?
    If you need more information please let me know.
    We are under a deadline and this is our final roadblock.
    TIA
    Colleen Geislinger

    I see that you have posted the message in the Oracle text forum, good! You should get a better, more timely answer there.
    Larry

  • Text 10g fuzzy search performance

    Hello to everybody in this community,
    I'm new to this and I have a question about Oracle Text 10g.
    My Setup:
    Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit
    8 cores at 2.5 GHz each
    64 GB RAM
    What I'd like to do:
    I'd like to compare a large number of row sets with each other in a way that tolerates human errors (e.g. spelling or typing mistakes).
    So my TEXT CONTEXT setup is as follows:
    MULTI_COLUMN_DATASTORE with each Column to compare.
    begin
      ctx_ddl.create_preference('my_datastore', 'MULTI_COLUMN_DATASTORE');
      ctx_ddl.set_attribute('my_datastore', 'columns', 'column1, ...');
    end;
    BASIC_LEXER - with GERMAN settings:
    begin
       ctx_ddl.create_preference('my_lexer', 'BASIC_LEXER');
       ctx_ddl.set_attribute('my_lexer', 'index_themes', 'NO');
       ctx_ddl.set_attribute('my_lexer', 'index_text', 'YES');
       ctx_ddl.set_attribute('my_lexer', 'alternate_spelling', 'GERMAN');
       ctx_ddl.set_attribute('my_lexer', 'composite', 'GERMAN');
       ctx_ddl.set_attribute('my_lexer', 'index_stems', 'GERMAN');
       ctx_ddl.set_attribute('my_lexer', 'new_german_spelling', 'YES');
    end;
    BASIC_WORDLIST - with GERMAN settings:
    begin
       ctx_ddl.create_preference('my_wordlist', 'BASIC_WORDLIST');
       ctx_ddl.set_attribute('my_wordlist','FUZZY_MATCH','GERMAN');
       ctx_ddl.set_attribute('my_wordlist','FUZZY_SCORE','60'); --defaults
       ctx_ddl.set_attribute('my_wordlist','FUZZY_NUMRESULTS','100'); --defaults
       --ctx_ddl.set_attribute('my_wordlist','SUBSTRING_INDEX','TRUE'); --commented out due to long creation time of index
       ctx_ddl.set_attribute('my_wordlist','STEMMER','GERMAN');
    end;
    And a BASIC_SECTION_GROUP with a field_section for each column.
    begin
      ctx_ddl.create_section_group(
        group_name => 'my_section_group',
        group_type => 'BASIC_SECTION_GROUP'
      );
      ctx_ddl.add_field_section(
        group_name   => 'my_section_group',
        section_name => 'column1',
        tag          => 'column1'
      );
    end;
    I create the index with
    create index idx_myfulltextindex on fulltexttest(column1)
    indextype is ctxsys.context
    parameters ('datastore my_datastore
                 section group my_section_group
                 lexer my_lexer
                 wordlist my_wordlist
                 stoplist ctxsys.empty_stoplist')
    Everything works functionally fine.
    In my test scenario I have a table with around 100,000 rows and a primary key that is not in the CONTEXT index.
    The Problem:
    I do a query like:
    SELECT SCORE(1), a.*
    FROM fulltexttest a
    WHERE CONTAINS(a.column1, 'FUZZY(({TEST}),,,W) WITHIN COLUMN1', 1) > 0
      AND a.primkey BETWEEN 1000 AND 4000
    This will do a fulltext search in a set of 3000 rows. The response time here is nearly immediate, maybe a second.
    If I run the same query in a cursor many times (>1000) with different search terms, it takes a long time of course. On average it does 1 query per second.
    I thought it should not be that slow, so I tested the same with:
    SELECT SCORE(1), a.*
    FROM fulltexttest a
    WHERE CONTAINS(a.column1, '({TEST}) WITHIN COLUMN1', 1) > 0
      AND a.primkey BETWEEN 1000 AND 4000
    NOTE there is no Fuzzy search anymore...
    With this it is up to 20 times faster.
    The cpu of the server reaches about 15% load while processing the fuzzy query.
    So:
    If I do a fuzzy search, it seems not to access the index. I thought I was telling the index to compute the results of 100 expansions in advance.
    Am I doing it wrong? Or is it not possible to build an Index especially for fuzzy search ?
    Are there any suggestions to increase the performance? Note that I have already read the guide (7 Tuning Oracle Text). None of the hints helped.
    I would appreciate if anyone is able to help me in this case... Or just give a hint.
    Thank you,
    Dominik

    Here is a simplified example, first without, then with SDATA.  Please note the differences in indexes, queries, and execution plans.
    SCOTT@orcl12c> CREATE TABLE fulltexttest
      2    (primkey  NUMBER PRIMARY KEY,
      3      column1  VARCHAR2(30))
      4  /
    Table created.
    SCOTT@orcl12c> CREATE SEQUENCE seq
      2  /
    Sequence created.
    SCOTT@orcl12c> INSERT INTO fulltexttest
      2  SELECT seq.NEXTVAL, object_name
      3  FROM  all_objects
      4  /
    89826 rows created.
    SCOTT@orcl12c> create index idx_myfulltextindex
      2  on fulltexttest(column1)
      3  indextype is ctxsys.context
      4  /
    Index created.
    SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
    SCOTT@orcl12c> SELECT SCORE(1), a.*
      2  FROM  fulltexttest a
      3  WHERE  CONTAINS
      4            (a.column1,
      5            'FUZZY(({TEST}),,,W)',
      6            1) > 0
      7  AND    a.primkey BETWEEN 1 AND 4000
      8  /
      SCORE(1)    PRIMKEY COLUMN1
            53        247 SQL$TEXT
            53        248 I_SQL$TEXT_PKEY
            53        249 I_SQL$TEXT_HANDLE
    3 rows selected.
    Execution Plan
    Plan hash value: 2971213997
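    ------------------------------------------------------------------------------------------------------------
    | Id  | Operation                            | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                     |                     |     1 |    42 |    13   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | FULLTEXTTEST        |     1 |    42 |    13   (0)| 00:00:01 |
    |   2 |   BITMAP CONVERSION TO ROWIDS        |                     |       |       |            |          |
    |   3 |    BITMAP AND                        |                     |       |       |            |          |
    |   4 |     BITMAP CONVERSION FROM ROWIDS    |                     |       |       |            |          |
    |   5 |      SORT ORDER BY                   |                     |       |       |            |          |
    |*  6 |       DOMAIN INDEX                   | IDX_MYFULLTEXTINDEX |  2500 |       |     4   (0)| 00:00:01 |
    |   7 |     BITMAP CONVERSION FROM ROWIDS    |                     |       |       |            |          |
    |   8 |      SORT ORDER BY                   |                     |       |       |            |          |
    |*  9 |       INDEX RANGE SCAN               | SYS_C0035980        |  2500 |       |     9   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------------------------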
    Predicate Information (identified by operation id):
      6 - access("CTXSYS"."CONTAINS"("A"."COLUMN1",'FUZZY(({TEST}),,,W)',1)>0)
      9 - access("A"."PRIMKEY">=1 AND "A"."PRIMKEY"<=4000)
    Note
      - dynamic statistics used: dynamic sampling (level=2)
    SCOTT@orcl12c> SET AUTOTRACE OFF
    SCOTT@orcl12c> DROP INDEX idx_myfulltextindex
      2  /
    Index dropped.
    SCOTT@orcl12c> create index idx_myfulltextindex
      2  on fulltexttest(column1)
      3  indextype is ctxsys.context
      4  FILTER BY primkey
      5  /
    Index created.
    SCOTT@orcl12c> SET AUTOTRACE ON EXPLAIN
    SCOTT@orcl12c> SELECT SCORE(1), a.*
      2  FROM  fulltexttest a
      3  WHERE  CONTAINS
      4            (a.column1,
      5            'FUZZY(({TEST}),,,W) AND SDATA (primkey BETWEEN 1 AND 4000)',
      6            1) > 0
      7  /
      SCORE(1)    PRIMKEY COLUMN1
            53        247 SQL$TEXT
            53        248 I_SQL$TEXT_PKEY
            53        249 I_SQL$TEXT_HANDLE
    3 rows selected.
    Execution Plan
    Plan hash value: 1298620335
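    ----------------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
    ----------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |                     |    41 |  1722 |    12   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID | FULLTEXTTEST        |    41 |  1722 |    12   (0)| 00:00:01 |
    |*  2 |   DOMAIN INDEX               | IDX_MYFULLTEXTINDEX |       |       |     4   (0)| 00:00:01 |
    ----------------------------------------------------------------------------------------------------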
    Predicate Information (identified by operation id):
      2 - access("CTXSYS"."CONTAINS"("A"."COLUMN1",'FUZZY(({TEST}),,,W) AND SDATA (primkey
                  BETWEEN 1 AND 4000)',1)>0)
    Note
      - dynamic statistics used: dynamic sampling (level=2)
    SCOTT@orcl12c>
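
    Separately from the SDATA approach above, it may be worth testing the optional arguments of the FUZZY operator, which take a similarity score and a cap on the number of expansions, so a single query can ask for fewer candidate terms than the wordlist setting of 100. This is only a sketch against the same fulltexttest table; the values 70 and 20 are arbitrary assumptions, and whether it helps will depend on the data.

    ```sql
    -- FUZZY(term, score, numresults, weight):
    -- 70 = minimum similarity score (assumed value),
    -- 20 = maximum number of expanded terms (assumed value),
    -- W  = weight results by similarity, as in the original query.
    SELECT SCORE(1), a.*
    FROM   fulltexttest a
    WHERE  CONTAINS
             (a.column1,
              'FUZZY({TEST}, 70, 20, W)',
              1) > 0
    AND    a.primkey BETWEEN 1 AND 4000;
    ```

    Comparing timings for a few different numresults values against the original FUZZY(({TEST}),,,W) form would show how much of the cost comes from the size of the expansion.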

  • Concatenated datastore performance with other predicates

    Hi
    I am using context indexes with a concatenated datastore.
    The query is like this -
    select *
    from my_table
    where contains ( my_column, 'token_1 within xx or token_2 within yy ', 1 ) > 0
    and some_other_column = 'xxx'
    There is no index on "some_other_column".
    Would it help to include "some_other_column" in the concatenated datastore? Will this increase the performance of the query, or does it always depend on the type of data we have?
    How is the query of a concatenated datastore fired? Is the $I table queried for each token in the query?
    Thanks and regards
    Pratap

    Yes, it should generally be faster to include "some_other_column" in the
    list for the concatenated datastore.
    The query would then be
    select * from my_table where contains
    ( my_column, '(token_1 within xx or token_2 within yy) and (xxx within some_other_column)', 1 ) > 0
    Note that this is not exactly the same as your query - for example if some_other_column contained "abc xxx xyz" then my query would be a hit but yours would not. If you know the column will only ever contain one word, then they are identical.
    - Roger
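
    A minimal sketch of what that setup could look like with a MULTI_COLUMN_DATASTORE (available in releases that ship that datastore type). The preference, section group and index names are hypothetical, and it assumes xx and yy are columns of my_table, as the section names in the query suggest.

    ```sql
    BEGIN
      -- Hypothetical preference name; lists every column to concatenate.
      ctx_ddl.create_preference('my_concat_ds', 'MULTI_COLUMN_DATASTORE');
      ctx_ddl.set_attribute('my_concat_ds', 'columns',
                            'xx, yy, some_other_column');

      -- One field section per column so WITHIN can target each of them.
      ctx_ddl.create_section_group('my_concat_sg', 'BASIC_SECTION_GROUP');
      ctx_ddl.add_field_section('my_concat_sg', 'xx', 'xx', TRUE);
      ctx_ddl.add_field_section('my_concat_sg', 'yy', 'yy', TRUE);
      ctx_ddl.add_field_section('my_concat_sg', 'some_other_column',
                                'some_other_column', TRUE);
    END;
    /

    -- Hypothetical index name; my_column is the column named in CONTAINS.
    CREATE INDEX my_table_ctx_idx ON my_table (my_column)
      INDEXTYPE IS ctxsys.context
      PARAMETERS ('datastore my_concat_ds section group my_concat_sg');
    ```

    One thing to keep in mind with this design: the index is only marked for re-synchronization when my_column itself is updated, so a change to some_other_column alone would need an accompanying update of my_column to be picked up.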

  • Firefox clears the searchbox on ebay when i perform a search and nothing happens!

    Firefox clears the search-box on ebay when i perform a search and nothing happens! The page just refreshes and i can't find any items. It works as normal when i try with IE though..
    This happened every time Firefox opened, starting about 3-4 months ago.

    Start Firefox in Safe Mode to check if one of the extensions is causing the problem (switch to the DEFAULT theme: Firefox (Tools) > Add-ons > Appearance/Themes).
    *Don't make any changes on the Safe mode start window.
    *https://support.mozilla.com/kb/Safe+Mode

  • Perform search and replace in a custom command?

    Hello,
    Does anyone know how to perform a search and replace for selected text in a custom command?
    My specific dilemma is this: I have a large document containing [very poorly formatted] songs. For example:
    <p>Verse1 Verse1 Verse1 Verse1 Verse1</p>
    <p>Verse1 Verse1 Verse1 Verse1 Verse1</p>
    <p>Verse1 Verse1 Verse1 Verse1 Verse1</p>
    <p>Chorus Chorus Chorus Chorus</p>
    <p>Chorus Chorus Chorus Chorus</p>
    <p>Chorus Chorus Chorus Chorus</p>
    <p>Verse2 Verse2 Verse2 Verse2 Verse2</p>
    <p>Verse2 Verse2 Verse2 Verse2 Verse2</p>
    <p>Verse2 Verse2 Verse2 Verse2 Verse2</p>
    <p>Verse3 Verse3 Verse3 Verse3 Verse3</p>
    <p>Verse3 Verse3 Verse3 Verse3 Verse3</p>
    <p>Verse3 Verse3 Verse3 Verse3 Verse3</p>
    I would like to be able to select all the text in a verse and run my custom command on it that would remove the "<p>" and replace "</p>" with "<br />".
    Thanks,
    Kelso

    replaceAll method in String is available only from 1.4 onwards.
    http://java.sun.com/j2se/1.4.1/docs/api/java/lang/String.html#replaceAll(java.lang.String, java.lang.String).
    You could use a method to search and replace a String with another...
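    public String replace(String source, String search, String replace) {
        // An empty search string would never advance, so return the input unchanged.
        if (search.length() == 0) {
            return source;
        }
        StringBuffer result = new StringBuffer();
        int start = 0;
        int index = 0;
        // Copy the text before each match, then the replacement.
        while ((index = source.indexOf(search, start)) >= 0) {
            result.append(source.substring(start, index));
            result.append(replace);
            start = index + search.length();
        }
        // Append whatever follows the last match.
        result.append(source.substring(start));
        return result.toString();
    }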

  • E-Recruiting TREX and fuzzy search

    Hello,
    Is fuzzy search available for the e-Recruiting module? We are using TREX 7.0 and e-Recruiting 6.0.
    If that functionality is available, how does it exactly work and is it customizable?
    thanks
    Koen

    Hi Guys
    Sorry to jump on this thread with nothing to add except my problem.
    We are currently in the process of implementing E-Recruitment 3.0 as an extension to ECC 5. I have configured the system and have been doing some unit testing, and have found that when I
    - try to "Apply Directly" in tab "Career and Job": if I enter the reference details of the specific job, it finds the specific job. Or if I leave the input field blank and hit the search button, it returns all jobs posted. However, if I try using the "Search for Jobs" link option, I get a consistent error "An internal error occurred. Please try again later"
    When I check transaction SLG1, the error log says the termination occurred in a program called CL_HRRCF_ABDTRACT_CONTROLLER==CM001 line 56 and CL_HRRCF_SEARCH_MASK_GROUP====CM00M line 15
    Same thing happens when searching for candidates to assign requisitions to.
    Has anyone come across this problem before? Any help will be greatly appreciated, as we have a tight deadline.
    Cheers

  • Difference in WS performance between Search and Retrieve operations?

    All,
    We are currently working on a new repository and planning to use MDM webservices on top of that repository for searching and retrieving the data.
    Now I'm curious about the difference in performance between the Search and the Retrieve operations and also within the Retrieve operation, between the different identification methods (internal ID, auto ID, remote key, unique field and display field).
    The webservices guide states that the identification methods are listed in order of best performance, but what are the actual performance differences between these methods (e.g. a retrieve on internal ID is x times faster than a retrieve on remote key, which in turn is x times faster than a retrieve on display fields, which in turn is x times faster than a search operation on the same display field)?
    Of course the performance depends on a lot of other things as well, but I just want to get a feeling for the performance relative to each other (keeping all other variables that can influence the performance the same!).
    I hope that some of you have experience with all the possibilities and can share performance measurements of the different operations relative to each other.  Thanks in advance.
    Regards,
    Marcel Herber

    Hi,
    Did you implement webservices at your site?
    We also have a similar scenario where we have to search for records in MDM from SAP PI based on certain criteria. I am concerned about the SAP MDM performance, since we have a heavy amount of data being loaded every 30 minutes.
    Please let me know the performance aspects of using webservices.
    Thanks
    Ganesh Kotti

  • Store XML on Oracle and perform search

    Hi,
    I need to be able to store 10,000 XML documents in Oracle so
    I can perform attribute searches against these documents.
    Is Oracle 8i a must? Are relational tables the way to go?
    What would be a good way to store XML documents and retrieve them?
    We are currently storing them as BLOBs, where we can't
    perform any searching.
    Many Thanks.
    Kevin

    Oracle XML Team wrote:
    : Kevin Lu (guest) wrote:
    : : Hi,
    : : I need to be able to store 10,000 XML documents in Oracle so
    : : I can performance attribute search against these documents
    : : Is Oracle 8i a must? Is relational tables the way to go?
    : : What would a good way to store XML document and retrieve them.
    : : We are currently storing them as BLOB, where we can't do
    : : preform searching functionality.
    : : Many Thanks.
    : : Kevin
    : 8i is the way to go but interMedia's support of XML attribute
    : searching is not in the current release but has been announced
    : for 8.1.6.
    I posted this as a follow-up to a previous query, but I need to
    accomplish the same. That is, store XML data in a CLOB but search
    (and select rows) based on XML element or attribute values of the
    XML documents in the CLOB column. Where can I learn more about
    the interMedia search? Thanks again.
    Prasad

  • I searched and searched (fuzzy logic 4)

    Hello!
    I have been reading for over an hour now and I cannot find the right answer about fuzzy logic 4.
    I used live update to get the most recent drivers for my mobo, an MSI 845 PE Max2.
    (Just to be complete: I have the HT option in the BIOS while having a P4 2.4GHz, for those who want to know, BIOS 1.20).....
    My problem:
    I used the "auto" function in fuzzy-logic and waited 3 minutes; the FSB is boosted to 164. Then the system hangs and restarts. (That's normal, right?) The CPU temp is 64C when it hangs..
    Then after the reboot, nothing has changed and the FSB is back to 133???
    What is wrong? How can I overclock my system with fuzzy logic??
    Sander

    I would NOT use Core-Center with your board either. Since it is an 845 series, I don't believe that you even have the "Core-Cell" chip on your motherboard. I suggest that you use "Speed-Fan", as this seems to be the utility that most of the members have success with, and if you want a utility to overclock your FSB with, download Clock-Generator at http://www.cpuid.com  .....Sean REILLY875

  • Scoring messed up using concatenated datastore Index

    Hi,
    Here is my table structure....
    CREATE TABLE SRCH_KEYWORD_SEARCH_SME
    SYS_ID NUMBER(10) NOT NULL,
    PAPER_NO VARCHAR2(10),
    PRODIDX_ID VARCHAR2(10),
    RESULT_TITLE VARCHAR2(255),
    RESULT_DESCR VARCHAR2(1000) NOT NULL,
    ABSTRACT CLOB,
    SRSLT_CATEGORY_ID VARCHAR2(10) NOT NULL,
    SRSLT_SUB_CATEGORY_ID VARCHAR2(10) NOT NULL,
    ACTIVE_FLAG VARCHAR2(1) DEFAULT 'Y' NOT NULL,
    EVENT_START_DATE DATE,
    EVENT_END_DATE DATE,
    Here is the Concatenated Datastore preference...
       -- Drop any existing storage preference.
       CTX_DDL.drop_preference('SEARCH_STORAGE_PREF');
       -- Create new storage preference.
       CTX_DDL.create_preference('SEARCH_STORAGE_PREF', 'BASIC_STORAGE');
          CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'I_TABLE_CLAUSE', 'tablespace searchidx');
          CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'K_TABLE_CLAUSE', 'tablespace searchidx');
          CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'R_TABLE_CLAUSE', 'tablespace searchidx lob (data) store as (disable storage in row cache)');
          CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'N_TABLE_CLAUSE', 'tablespace searchidx');
          CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'I_INDEX_CLAUSE', 'tablespace searchidx  compress 2');
          CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'P_TABLE_CLAUSE', 'tablespace searchidx');
       -- Drop any existing datastore preference.
       CTX_DDL.drop_preference('SEARCH_DATA_STORE');
       CTX_DDL.DROP_SECTION_GROUP('SEARCH_DATA_STORE_SG');
       -- Create new multi-column datastore preference.
       CTX_DDL.create_preference('SEARCH_DATA_STORE','MULTI_COLUMN_DATASTORE');
       CTX_DDL.set_attribute('SEARCH_DATA_STORE','columns','abstract, srslt_category_id, srslt_sub_category_id, active_flag');
       CTX_DDL.set_attribute('SEARCH_DATA_STORE', 'FILTER','N,N,N,N');
       -- Create new section group preference.
       CTX_DDL.create_section_group ('SEARCH_DATA_STORE_SG','BASIC_SECTION_GROUP');
       CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'abstract',              'abstract',             TRUE);
       CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'srslt_category_id',     'srslt_category_id',    TRUE);
       CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'srslt_sub_category_id', 'srslt_sub_category_id',TRUE);
       CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'active_flag',           'active_flag',          TRUE);
    Here is the context Index
    CREATE INDEX SRCH_KEYWORD_SEARCH_I ON SRCH_KEYWORD_SEARCH_SME(ABSTRACT)
       INDEXTYPE IS CTXSYS.CONTEXT
          PARAMETERS('STORAGE search_storage_pref DATASTORE SEARCH_DATA_STORE SECTION GROUP SEARCH_DATA_STORE_SG' )
    Here is the Query # 1 I am trying out...
    SELECT /*+ FIRST_ROWS(10) */
           SCORE(1) score_nbr,
           k.SYS_ID,
           k.RESULT_TITLE,
    FROM   SRCH_KEYWORD_SEARCH_SME k
    WHERE  CONTAINS (k.ABSTRACT, '<query><textquery><progression><seq>{hitchhiker} WITHIN abstract</seq></progression></textquery></query>',1) > 0
    ORDER BY SCORE(1) DESC;
    Here is the result for Query # 1...
    score_nbr   sys_id     result_title
    54          99220      SME Releases New Book The Hitchhiker's Guide to Lean  72
    43          116583     Lean Leadership Package                               72
    32          132392     The Hitchhikers Guide to Lean: Lessons from the Road  72
    11          132017     Lean Manufacturing A Plant Floor Guide Book Summary   72
    11          137106     Managing Factory Maintenance, Second Edition          72
    11          132082     Lean Pocket Guide
    Here is the Query # 2 I am trying out...
    SELECT /*+ FIRST_ROWS(10) */
           SCORE(1) score_nbr,
           k.SYS_ID,
           k.RESULT_TITLE,
    FROM   SRCH_KEYWORD_SEARCH_SME k
    WHERE  CONTAINS (k.ABSTRACT, '<query><textquery><progression><seq>{hitchhiker} WITHIN abstract AND Y WITHIN active_flag</seq></progression></textquery></query>',1) > 0
    ORDER BY SCORE(1) DESC
    Here is the result for Query # 2...
    score_nbr sys_id     result_title
    3         132017     Lean Manufacturing: A Plant Floor Guide Book Summary                    72
    3         137106     Managing Factory Maintenance, Second Edition                            72
    3         132082     Lean Pocket Guide                                                       72
    3         132083     The Toyota Way: 14 Management Principles From the World's Greatest...   72
    3         132417     Lean Manufacturing: A Plant Floor Guide                                 72
    3         132091     Breaking the Cost Barrier: A Proven Approach to Managing and...         72
    3         99318      Conflicting pairs                                                       72
    3         132393     One-Piece Flow: Cell Design for Transforming the Production Process     72
    3         137091     Learning to See: Value Stream Mapping to Create Value & Eliminate MUDA  72
    3         137090     The Purchasing Machine: How the Top 10 Companies Use Best Practices...  72
    3         137393     Passion for Manufacturing
    My question is, why did the score drop all the way to 3 for ALL the results the above query returned when I used the AND clause
    and added the 2nd column used in the datastore to my query condition?
    Also I want to use progressive relaxation technique in the queries to use stemming & fuzzy search option too.
    Help me out please....
    Thanks in advance.
    - Richard.

    Yes, it's in the doc - it's known as the weight operator.
    http://download.oracle.com/docs/cd/B28359_01/text.111/b28304/cqoper.htm#i998379
    "term*n      Returns documents that contain term. Calculates score by multiplying the raw score of term by n, where n is a number from 0.1 to 10."
    We're just using the operator twice as the limit on "n" is 10 (for no obvious reason I know of!). This is perfectly safe, and common practice.
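
    As a sketch of how the weight operator could be applied to Query # 2 above: Oracle Text scores an AND by taking the lower score of its operands, so assuming the flat score of 3 comes from the very common Y in active_flag, weighting that operand twice should leave the hitchhiker term as the lower score. This is an assumption about the cause, not something confirmed in the thread, and the progression template is left out here to keep the example short.

    ```sql
    SELECT /*+ FIRST_ROWS(10) */
           SCORE(1) score_nbr,
           k.sys_id,
           k.result_title
    FROM   srch_keyword_search_sme k
    WHERE  CONTAINS (k.abstract,
             -- *10*10 applies the weight operator twice, since n is limited to 10
             '({hitchhiker} WITHIN abstract) AND ((Y WITHIN active_flag)*10*10)',
             1) > 0
    ORDER  BY SCORE(1) DESC;
    ```

    If the scores returned by this form follow the ones from Query # 1, that would support the explanation that the AND was reporting the flag column's score.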

  • Fuzzy Searches

    Is there anywhere I can find the algorithm Oracle uses for the CONTEXT fuzzy search (as in SELECT surname from person_source where
    contains(surname,'?Humphrey') > 0;)?
    I would like to build a function for use outside CONTEXT that incorporates the same algorithm.
    Fran

    This is more information about our scenario:
    We have two groups in the datastore:
    concat:
    1.) hierarchy:(example text) 321826 325123 543123
    2.) page: Actual document text.
    321826 325123 543123 represents ids in a hierarchy structure. As you move from left to right the number of times the number occurs is less so there should be less exact matches.
    Example: In this index all pages have 321826 as the first value. A few pages have 543123 and all others will have some other number as the last value.
    if I do this query:
    contains(concat,(321826 within hierarchy ) and ('personnel') within page)
    it takes about 10 seconds because it 321826 will hit all pages.
    if I do this query:
    contains(concat,(543123 within hierarchy ) and ('personnel') within page)
    it takes only about 1 second because it 543123 will hit just a few pages.
    BUT:::::::
    Fuzzy search....
    if I do this query:
    search A.) contains(concat,(321826 within hierarchy ) and ?('personnel') within page)
    it takes about 30 seconds because it 321826 will hit all pages. This is okay for performance for this.
    BUT if I do this query:
    search B.) contains(concat,(543123 within hierarchy ) and ?('personnel') within page)
    it takes about 30 seconds even though 543123 will hit only a few pages.
    This should be faster than 30 seconds because you're searching over only a fraction of material for the fuzzy search part.
    We've played with different variations on the () and the '' but nothing seems to change this.
    Any advice on how to make search B.) faster??
    We don't understand why we see different speeds with the exact match but DON'T see them with the fuzzy search...
    I can send you some test data with the index and query scripts if you want.
    Our indexes are on large tables (2,000,000 rows).
    TIA
    Colleen Geislinger.

  • Drilldown Searches and Free-Form Searches

    Hi All,
    Can you please explain the concept of Drilldown Searches and Free-Form Searches?
    I read the documentation in the help portal.
    Could you shed more light on these concepts with some examples?
    Thanks in advance,
    Mugdha Kulkarni

    Hi Mugdha,
    MDM provides two types of searches:
    Drilldown Search:
    With drilldown search, you can make selections from each search tab, where each tab corresponds to a lookup field in the table.
    You can also make selections for each of the attributes linked to the selected category, and each of the qualifiers of a qualified table record.
    Freeform Search:
    With free-form search, you can perform searches on any field that does not look up its values from a subtable.
    Free-form search also allows you to do "fuzzy" searches with a variety of search operators.
    It accepts typed values for one or more fields (like a traditional DBMS query form) and a keyword search that can match keywords in any or all of the fields in a table.
    At each step along the way, the system narrows down the choice of values for each search dimension to show only those that are valid given the current result set based on the previous search selections.
    The result is an extremely flexible and powerful search capability, delivered through an exceptionally smooth and intuitive process.
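    The narrowing step described above can be pictured with a small Python sketch. This is only a toy model of the idea, not how MDM implements it; the function name, the records and the field names are all made up for illustration.

```python
# Toy model of drilldown narrowing; records and field names are hypothetical.
def drilldown(records, selections):
    """Return the records matching every selection, plus the values that are
    still available for each field within that reduced result set."""
    matching = [
        r for r in records
        if all(r.get(field) == value for field, value in selections.items())
    ]
    remaining = {}
    for r in matching:
        for field, value in r.items():
            remaining.setdefault(field, set()).add(value)
    return matching, remaining


products = [
    {"category": "Printers", "manufacturer": "Acme", "color": "black"},
    {"category": "Printers", "manufacturer": "Bolt", "color": "white"},
    {"category": "Scanners", "manufacturer": "Acme", "color": "white"},
]

# Selecting a category leaves only the manufacturers and colors that still occur.
matching, remaining = drilldown(products, {"category": "Printers"})
print(len(matching))
print(sorted(remaining["manufacturer"]))
```

    Each further selection is applied to the already reduced set, so the choices offered for the other fields can only shrink.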
    Hope this clears your doubts.
    Regards,
    Rashmi Jadhav

  • How Fuzzy score and Score() function works in HANA?

    Hi,
    I read the fuzzy search part of the HANA developer guide, but I don't understand how HANA calculates score() and the fuzzy score.
    According to the developer guide, score() is calculated using TF/IDF. I tried to calculate TF/IDF as described on the wiki page, but it gives different values. The score() value also changes with the x value of fuzzy(x).
    See this example:
    select score() as sc, *
    from COMPANIES2
    where contains(Companyname,'IBM',fuzzy(0.7))
    it returns
    SC;                   ID;  COMPANYNAME;  CONTACT
    0.7599999904632568;   6;   IBM Corp;     M. Master
    and for
    select score() as sc, *
    from COMPANIES2
    where contains(Companyname,'IBM',fuzzy(0.2))
    it returns
    SC;                   ID;  COMPANYNAME;           CONTACT
    0.16945946216583252;  2;   SAP in Walldorf Corp;  Master Mister
    0.8392000198364258;   6;   IBM Corp;              M. Master
    and the table content of COMPANIES2 is
    ID;  COMPANYNAME;           CONTACT
    1;   SAP Corp;              Mister Master
    2;   SAP in Walldorf Corp;  Master Mister
    3;   ASAP;                  Nister Naster
    4;   ASAP Corp;             Mixter Maxter
    5;   BSAP orp;              Imster Marter
    6;   IBM Corp;              M. Master
    Please provide any formula or algorithm for the above.
    Thanks,
    Somnath A. Kadam

    Hi Somnath,
    It seems that the column "Companyname" has data type "SHORTTEXT". Here is a quote from the SAP HANA Developer Guide, Ch. 10.2.4.8 (p659):
    "Text types support a more sophisticated kind of fuzzy search. Texts are tokenized (split into terms), and the fuzzy comparison is performed term by term.
    When searching with 'SAP' for example, a record like 'SAP Deutschland AG & Co. KG' gets a high score, because the term 'SAP' exists in both texts. A record like 'SAPPHIRE NOW Orlando' gets a lower score, because 'SAP' is just a part of the longer term 'SAPPHIRE' (3 of 8 characters match)."
    So for text columns the score calculation is much more complex than tf-idf.
    As for the different fuzzy scores, there is an explanation in the FAQ section (Ch. 10.2.4.14, p736, "Is the score between request and result always stable for TEXT columns?").
    Basically, the similarity score of each token is used to calculate the overall result only if it is higher than the threshold given in fuzzy(). Any token with a lower similarity score is excluded. Therefore, a slight change in the threshold may change the overall score considerably.
    Here is an example.
    I added id 7 "SAP ASAP" to the data you used.
    Note that the similarity score between "ASAP" and "BSAP" is slightly over 0.74 and similarity score between "SAP" and "BSAP" is 0.75:
    For
        select score() as sc, * from COMPANIES2  where contains(COMPANYNAME,'BSAP',fuzzy(0.74))
    We get:
    <...omitted...>
    0.7474510073661804;    7;    SAP ASAP;        M. Master
    Now change the  threshold to 0.75 and the result is:
    <...omitted...>
    0.5588234663009644;    7;    SAP ASAP;        M. Master
    ID 7 now gets a lower score because "ASAP" is excluded and only "SAP" is used to calculate the overall result.
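    The jump in the score can be pictured with a small Python sketch. This is not HANA's formula (which is not given in this thread) and it will not reproduce the numbers above. The aggregation (sum of the kept token scores divided by the total number of tokens), the ">=" comparison and the value 0.745 standing in for "slightly over 0.74" are all assumptions made for illustration.

```python
def overall_score(token_scores, threshold):
    """Toy aggregation (an assumption, not HANA's algorithm): tokens whose
    similarity is below the threshold contribute nothing, and the sum of the
    remaining scores is divided by the total number of tokens."""
    if not token_scores:
        return 0.0
    kept = [s for s in token_scores.values() if s >= threshold]
    return sum(kept) / len(token_scores)


# Similarity of each token of "SAP ASAP" to the search term "BSAP".
# 0.745 is an assumed stand-in for "slightly over 0.74".
sims = {"SAP": 0.75, "ASAP": 0.745}

print(overall_score(sims, 0.74))  # both tokens count
print(overall_score(sims, 0.75))  # "ASAP" drops out, so the total falls sharply
```

    The point is only that a token crossing the threshold is either fully counted or not counted at all, so the overall score changes in steps rather than smoothly.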
    As for tf-idf, it is used in the so-called freestyle search across multiple columns.
    An example from the same guide:
         select score() as sc, * from companies2 where contains((companyname,contact), 'IBM Master', FUZZY(0.7));
    Result:
    0.8103122115135193;    6;    IBM Corp;    M. Master
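    For comparison, this is what a plain textbook tf-idf calculation looks like in Python. It is only the generic formula (term frequency times the log of inverse document frequency) applied to the company names from the question, with whitespace tokenization as an assumption. HANA's score() also folds in the fuzzy similarity and its own weighting, so these values should not be expected to match the results above.

```python
import math


def tf_idf(term, doc, corpus):
    """Textbook tf-idf: (term count / tokens in doc) * ln(N / docs containing term)."""
    tokens = doc.lower().split()
    if not tokens:
        return 0.0
    tf = tokens.count(term) / len(tokens)
    df = sum(1 for d in corpus if term in d.lower().split())
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)


# Company names from the COMPANIES2 table in the question.
names = [
    "SAP Corp",
    "SAP in Walldorf Corp",
    "ASAP",
    "ASAP Corp",
    "BSAP orp",
    "IBM Corp",
]

for name in names:
    print(name, tf_idf("ibm", name, names))
```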
    Regards
    Roger Tao

  • Oracle Text Concatenated Datastore

    I have read this:
    http://www.oracle.com/technology/sample_code/products/text/htdocs/concatenated_text_datastore/cdstore_readme.html
    I've been trying to follow the 'Installation' section.
    I've downloaded cdstore.sql but I get error ORA-00942 (table does not exist) because ctx_user_cdstore_cols does not exist (at line 618 in the file).
    Indeed, the table created is 'ctx_cdstore_cols' and not 'ctx_user_cdstore_cols'.
    I've changed it to ctx_cdstore_cols and now get ORA-00904 because CDSTORE_NAME is not a column of ctx_cdstores.
    Anyway, I believe that this code should work as is so there is something big I must be missing.
    Has anyone managed to install this package and how please?

    It's not a problem with the concatenated datastore, it's about operator precedence.
    If you search for 'A or B within SECTION', "within" has a higher precedence than "or", so this becomes 'A or (B within SECTION)'. What you need to say is '(A or B) within SECTION', or in your case '(BROOKS or BONDS) within name'
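    If the query string is assembled in application code, a small helper can add the parentheses every time. A minimal Python sketch; the function name is made up, and it only builds the string that would be passed to CONTAINS:

```python
def any_within(terms, section):
    """Build '(A or B ...) within SECTION' so that OR is grouped before WITHIN."""
    if not terms:
        raise ValueError("at least one term is required")
    return "(" + " or ".join(terms) + ") within " + section


print(any_within(["BROOKS", "BONDS"], "name"))  # (BROOKS or BONDS) within name
```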
    Hope this helps.
    Roger
