Progressive relaxation

I need to progressively relax my query. What I want is as follows:
If someone searched for say "Acrylic Crochet Throws" then
- the results with exact match should come on top
- followed by the match with stemming but all three words next to each other
- then a near search with max span of 15
- after this I want to repeat the above three conditions with combination of any two words i.e. "Acrylic Crochet", "Acrylic Throws" and "Crochet Throws"
Similarly, if the searched string has 4 words, then it should first search for all 4 words, then combinations of 3 words, then combinations of two words.
Can someone guide me on how to achieve this?
Madhup

In the example below, I have written a user-defined function to return the sequences for the progressive query relaxation. I wrote it to handle groups of up to four words. If you want more, just add more nested loops to the function. I included examples of what the function returns. I demonstrated a query with a context index and contains, so that you can see from the scores that it works. However, I recommend that you use a ctxcat index and catsearch, as in the last query. There is a bug with context and contains not returning any rows if the first criterion of the progressive query relaxation is not met and the optimizer uses the domain index.
SCOTT@10gXE> -- table and data:
SCOTT@10gXE> CREATE TABLE test_tab
2    (test_col VARCHAR2 (50))
3 /
Table created.
SCOTT@10gXE> INSERT ALL
2 INTO test_tab VALUES ('acrylic crochet throws')
3 INTO test_tab VALUES ('acrylic crochets throwing')
4 INTO test_tab VALUES ('acrylic word2 crochet word4 throws')
5 INTO test_tab VALUES ('acrylic crochet')
6 INTO test_tab VALUES ('acrylic throws')
7 INTO test_tab VALUES ('crochet throws')
8 INTO test_tab VALUES ('acrylics crocheting')
9 INTO test_tab VALUES ('acrylics throwing')
10 INTO test_tab VALUES ('crocheting throw')
11 INTO test_tab VALUES ('acrylic word2 crochet')
12 INTO test_tab VALUES ('acrylic word2 throws')
13 INTO test_tab VALUES ('crochet word2 throws')
14 SELECT * FROM DUAL
15 /
12 rows created.
SCOTT@10gXE> -- function:
SCOTT@10gXE> CREATE OR REPLACE FUNCTION seqs
2    (p_words        IN VARCHAR2)
3    RETURN            VARCHAR2
4 AS
5    v_words            VARCHAR2 (32767) := LTRIM (RTRIM (p_words));
6    v_result        VARCHAR2 (32767);
7    v_spaces        INTEGER;
8    v_string        VARCHAR2 (32767);
9    TYPE t_varchar2 IS TABLE OF VARCHAR2(255) INDEX BY BINARY_INTEGER;
10    t_words            t_varchar2;
11 BEGIN
12    WHILE INSTR (v_words, '  ') > 0 LOOP
13       v_words := REPLACE (v_words, '  ', ' ');
14    END LOOP;
15    v_result := v_result || CHR(10) || '<seq>' || v_words || '</seq>';
16    v_result := v_result || CHR(10) || '<seq>$' || REPLACE (v_words, ' ', ' $') || '</seq>';
17    v_result := v_result || CHR(10) || '<seq>NEAR((' || REPLACE (v_words, ' ', ', ') || '), 15, TRUE)</seq>';
18    v_spaces := LENGTH (v_words) - LENGTH (REPLACE (v_words, ' ', ''));
19    IF v_spaces > 1 THEN
20       v_string := v_words || ' ';
21       FOR i IN 1 .. v_spaces + 1 LOOP
22         t_words (i) := SUBSTR (v_string, 1, INSTR (v_string, ' ') - 1);
23         v_string := SUBSTR (v_string, INSTR (v_string, ' ') + 1);
24       END LOOP;
25       FOR n IN REVERSE 1 .. v_spaces - 1 LOOP
26         FOR x IN 1 .. 3 LOOP
27           v_result := v_result || CHR(10) || '<seq>';
28           v_string := '';
29
30           FOR i IN 0 + 1 .. GREATEST (LEAST (v_spaces + 1 - n, v_spaces + 1), 0+1)
31           LOOP
32             IF n >= 1 THEN
33            FOR j IN i + 1 .. GREATEST (LEAST (v_spaces + 2 - n, v_spaces + 1), i+1)
34            LOOP
35              IF n >= 2 THEN
36                FOR k IN j + 1 .. GREATEST (LEAST (v_spaces + 3 - n, v_spaces + 1), j+1)
37                LOOP
38                  IF n >= 3 THEN
39                 FOR l IN k + 1 .. GREATEST (LEAST (v_spaces + 4 - n, v_spaces + 1), k+1)
40                 LOOP
41                   v_string := v_string || t_words(i) || ' ' || t_words(j) || ' ' || t_words(k) || ' ' || t_words(l);
42                   v_string := v_string || CHR(10) || 'OR ';
43                 END LOOP;
44                  ELSE
45                 v_string := v_string || t_words(i) || ' ' || t_words(j) || ' ' || t_words(k);
46                 v_string := v_string || CHR(10) || 'OR ';
47                  END IF;
48                END LOOP;
49              ELSE
50                v_string := v_string || t_words(i) || ' ' || t_words(j);
51                v_string := v_string || CHR(10) || 'OR ';
52              END IF;
53            END LOOP;
54             ELSE
55            v_string := v_string || t_words(i);
56            v_string := v_string || CHR(10) || 'OR ';
57             END IF;
58           END LOOP;
59
60           v_string := RTRIM (RTRIM (v_string, 'OR '), CHR(10));
61           IF x = 2 THEN
62             v_string := '$' || REPLACE (v_string, ' ', ' $');
63           ELSIF x = 3 THEN
64             v_string := 'NEAR(('
65            || REPLACE (REPLACE (v_string, ' ', ','), CHR(10) || 'OR,',
66                     '), 15, TRUE)' || CHR(10) || 'OR NEAR((')
67            || '), 15, TRUE)';
68           END IF;
69           v_result := v_result || v_string;
70           v_result := v_result || '</seq>';
71         END LOOP;
72       END LOOP;
73    END IF;
74    RETURN LTRIM (v_result, CHR(10));
75 END seqs;
76 /
Function created.
SCOTT@10gXE> SHOW ERRORS
No errors.
SCOTT@10gXE> -- examples of what seqs function returns:
SCOTT@10gXE> VARIABLE g_words VARCHAR2(2000)
SCOTT@10gXE> EXEC :g_words := 'word1 word2 word3 word4'
PL/SQL procedure successfully completed.
SCOTT@10gXE> SELECT seqs (:g_words) FROM DUAL
2 /
SEQS(:G_WORDS)
<seq>word1 word2 word3 word4</seq>
<seq>$word1 $word2 $word3 $word4</seq>
<seq>NEAR((word1, word2, word3, word4), 15, TRUE)</seq>
<seq>word1 word2 word3
OR word1 word2 word4
OR word1 word3 word4
OR word2 word3 word4</seq>
<seq>$word1 $word2 $word3
OR $word1 $word2 $word4
OR $word1 $word3 $word4
OR $word2 $word3 $word4</seq>
<seq>NEAR((word1,word2,word3), 15, TRUE)
OR NEAR((word1,word2,word4), 15, TRUE)
OR NEAR((word1,word3,word4), 15, TRUE)
OR NEAR((word2,word3,word4), 15, TRUE)</seq>
<seq>word1 word2
OR word1 word3
OR word1 word4
OR word2 word3
OR word2 word4
OR word3 word4</seq>
<seq>$word1 $word2
OR $word1 $word3
OR $word1 $word4
OR $word2 $word3
OR $word2 $word4
OR $word3 $word4</seq>
<seq>NEAR((word1,word2), 15, TRUE)
OR NEAR((word1,word3), 15, TRUE)
OR NEAR((word1,word4), 15, TRUE)
OR NEAR((word2,word3), 15, TRUE)
OR NEAR((word2,word4), 15, TRUE)
OR NEAR((word3,word4), 15, TRUE)</seq>
SCOTT@10gXE> EXEC :g_words := 'Acrylic Crochet Throws'
PL/SQL procedure successfully completed.
SCOTT@10gXE> SELECT seqs (:g_words) FROM DUAL
2 /
SEQS(:G_WORDS)
<seq>Acrylic Crochet Throws</seq>
<seq>$Acrylic $Crochet $Throws</seq>
<seq>NEAR((Acrylic, Crochet, Throws), 15, TRUE)</seq>
<seq>Acrylic Crochet
OR Acrylic Throws
OR Crochet Throws</seq>
<seq>$Acrylic $Crochet
OR $Acrylic $Throws
OR $Crochet $Throws</seq>
<seq>NEAR((Acrylic,Crochet), 15, TRUE)
OR NEAR((Acrylic,Throws), 15, TRUE)
OR NEAR((Crochet,Throws), 15, TRUE)</seq>
SCOTT@10gXE> -- query with context index and contains
SCOTT@10gXE> -- (not recommended, just used to show score)
SCOTT@10gXE> CREATE INDEX test_idx1 ON test_tab (test_col)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 /
Index created.
SCOTT@10gXE> SELECT test_col, score (1)
2 FROM   test_tab
3 WHERE CONTAINS (test_col,
4                  '<query>
5                  <textquery>
6                    <progression>'
7                    || seqs (:g_words)
8                    || '</progression>
9                  </textquery>
10                </query>',
11                  1) > 0
12 /
TEST_COL                                             SCORE(1)
acrylic crochet throws                                     84
acrylic crochets throwing                                  67
acrylic word2 crochet word4 throws                         54
acrylic crochet                                            34
acrylic throws                                             34
crochet throws                                             34
acrylics crocheting                                        17
acrylics throwing                                          17
crocheting throw                                           17
acrylic word2 crochet                                       3
acrylic word2 throws                                        3
crochet word2 throws                                        3
12 rows selected.
SCOTT@10gXE> -- query with ctxcat index and catsearch (recommended):
SCOTT@10gXE> CREATE INDEX test_idx2 ON test_tab (test_col)
2 INDEXTYPE IS CTXSYS.CTXCAT
3 /
Index created.
SCOTT@10gXE> SELECT test_col
2 FROM   test_tab
3 WHERE CATSEARCH (test_col,
4                  '<query>
5                  <textquery>
6                    <progression>'
7                    || seqs (:g_words)
8                    || '</progression>
9                  </textquery>
10                </query>',
11                  NULL) > 0
12 /
TEST_COL
acrylic crochet throws
acrylic crochets throwing
acrylic word2 crochet word4 throws
acrylic crochet
acrylic throws
crochet throws
acrylics crocheting
acrylics throwing
crocheting throw
acrylic word2 crochet
acrylic word2 throws
crochet word2 throws
12 rows selected.
SCOTT@10gXE>
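
The answer above suggests adding more nested loops to go beyond four words. A recursive variant avoids that limit. What follows is only a sketch and not the original seqs function: the name seqs_any and the nested procedure combos are made up for illustration, and it assumes the same three steps per group size (exact phrase, stemmed phrase, NEAR with a span of 15). The result is still limited to 32767 characters, so a long search string will overflow it.

```sql
CREATE OR REPLACE FUNCTION seqs_any (p_words IN VARCHAR2)
  RETURN VARCHAR2
AS
  TYPE t_varchar2 IS TABLE OF VARCHAR2 (255) INDEX BY PLS_INTEGER;
  t_words   t_varchar2;
  v_count   PLS_INTEGER := 0;
  v_word    VARCHAR2 (255);
  v_plain   VARCHAR2 (32767);
  v_stem    VARCHAR2 (32767);
  v_near    VARCHAR2 (32767);
  v_result  VARCHAR2 (32767);

  -- Appends every combination of p_size words, chosen from word p_from
  -- onwards, to v_plain, v_stem and v_near, keeping the original word order.
  PROCEDURE combos
    (p_from  IN PLS_INTEGER,
     p_size  IN PLS_INTEGER,
     p_plain IN VARCHAR2,
     p_stem  IN VARCHAR2,
     p_near  IN VARCHAR2)
  IS
    v_sep   VARCHAR2 (4);
  BEGIN
    IF p_size = 0 THEN
      -- One combination is complete: join it to the others with OR.
      IF v_plain IS NOT NULL THEN
        v_sep := CHR(10) || 'OR ';
      END IF;
      v_plain := v_plain || v_sep || LTRIM (p_plain);
      v_stem  := v_stem  || v_sep || LTRIM (p_stem);
      v_near  := v_near  || v_sep
                 || 'NEAR((' || LTRIM (p_near, ',') || '), 15, TRUE)';
      RETURN;
    END IF;
    FOR i IN p_from .. v_count - p_size + 1 LOOP
      combos (i + 1, p_size - 1,
              p_plain || ' '  || t_words (i),
              p_stem  || ' $' || t_words (i),
              p_near  || ','  || t_words (i));
    END LOOP;
  END combos;
BEGIN
  -- Split the input on whitespace.
  LOOP
    v_word := REGEXP_SUBSTR (p_words, '[^[:space:]]+', 1, v_count + 1);
    EXIT WHEN v_word IS NULL;
    v_count := v_count + 1;
    t_words (v_count) := v_word;
  END LOOP;
  IF v_count = 0 THEN
    RETURN NULL;
  END IF;
  -- All words first, then one word fewer each time, down to pairs.
  FOR n IN REVERSE LEAST (2, v_count) .. v_count LOOP
    v_plain := NULL;
    v_stem  := NULL;
    v_near  := NULL;
    combos (1, n, NULL, NULL, NULL);
    v_result := v_result
                || CHR(10) || '<seq>' || v_plain || '</seq>'
                || CHR(10) || '<seq>' || v_stem  || '</seq>'
                || CHR(10) || '<seq>' || v_near  || '</seq>';
  END LOOP;
  RETURN LTRIM (v_result, CHR(10));
END seqs_any;
/
```

It can be called in place of seqs in the queries above, for example SELECT seqs_any (:g_words) FROM DUAL. The number of combinations grows quickly with the number of words, so capping the input at a handful of words is probably wise.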

Similar Messages

Query using progressive relaxation takes more time for execution

Hi Gurus,
I am creating a query using a context index and progressive relaxation.
I started using progressive relaxation after getting input from the forum {thread:id=2333942}. Using progressive relaxation takes more than 7 seconds for every query. Is there any way we can improve the performance of the query?
create table test_sh4 (text1 clob,text2 clob,text3 clob);
begin
   ctx_ddl.create_preference ('nd_mcd', 'multi_column_datastore');
   ctx_ddl.set_attribute
      ('nd_mcd',
       'columns',
       'replace (text1, '' '', '''') nd1,
        text1 text1,
        replace (text2, '' '', '''') nd2,
        text2 text2');
   ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
   ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+');
   ctx_ddl.create_section_group ('test_sg', 'basic_section_group');
   ctx_ddl.add_field_section ('test_sg', 'text1', 'text1', true);
   ctx_ddl.add_field_section ('test_sg', 'nd1', 'nd1', true);
   ctx_ddl.add_field_section ('test_sg', 'text2', 'text2', true);
   ctx_ddl.add_field_section ('test_sg', 'nd2', 'nd2', true);
end;
/
create index IX_test_sh4 on test_sh4 (text3)   indextype is ctxsys.context   parameters    ('datastore     nd_mcd   lexer test_lex1 section group     test_sg') ;
alter index IX_test_sh4 REBUILD PARAMETERS ('REPLACE SYNC (ON COMMIT)') ;-- sync index on every commit.
SELECT SCORE(1) score,t.* FROM test_sh4 t WHERE CONTAINS (text3, '
<query>
<textquery>
<progression>
<seq>{GIFT GRILL STAPLES CARD} within text1</seq>
<seq>{GIFTGRILLSTAPLESCARD} within nd1</seq>
<seq>{GIFT GRILL STAPLES CARD} within text2</seq>
<seq>{GIFTGRILLSTAPLESCARD} within nd2</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text1</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text2</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES%) or (%GRILL% and %STAPLES% and %CARD%) or (%GIFT% and %STAPLES% and %CARD%) or (%GIFT% and %GRILL% and %CARD%)) within text1</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES%) or (%GRILL% and %STAPLES% and %CARD%) or (%GIFT% and %STAPLES% and %CARD%) or (%GIFT% and %GRILL% and %CARD%)) within text2</seq>
<seq>((%STAPLES% and %CARD%) or (%GIFT% and %GRILL%) or (%GRILL% and %CARD%) or (%GIFT% and %CARD%) or (%GIFT% and %STAPLES%) or (%GRILL% and %STAPLES%)) within text1</seq>
<seq>((%STAPLES% and %CARD%) or (%GIFT% and %GRILL%) or (%GRILL% and %CARD%) or (%GIFT% and %CARD%) or (%GIFT% and %STAPLES%) or (%GRILL% and %STAPLES%)) within text2</seq>
<seq>((%GIFT% , %GRILL% , %STAPLES% , %CARD%)) within text1</seq>
<seq>((%GIFT% , %GRILL% , %STAPLES% , %CARD%)) within text2</seq>
<seq>((!GIFT and !GRILL and !STAPLES and !CARD)) within text1</seq>
<seq>((!GIFT and !GRILL and !STAPLES and !CARD)) within text2</seq>
<seq>((!GIFT and !GRILL and !STAPLES) or (!GRILL and !STAPLES and !CARD) or (!GIFT and !STAPLES and !CARD) or (!GIFT and !GRILL and !CARD)) within text1</seq>
<seq>((!GIFT and !GRILL and !STAPLES) or (!GRILL and !STAPLES and !CARD) or (!GIFT and !STAPLES and !CARD) or (!GIFT and !GRILL and !CARD)) within text2</seq>
<seq>((!STAPLES and !CARD) or (!GIFT and !GRILL) or (!GRILL and !CARD) or (!GIFT and !CARD) or (!GIFT and !STAPLES) or (!GRILL and !STAPLES)) within text1</seq>
<seq>((!STAPLES and !CARD) or (!GIFT and !GRILL) or (!GRILL and !CARD) or (!GIFT and !CARD) or (!GIFT and !STAPLES) or (!GRILL and !STAPLES)) within text2</seq>
<seq>((!GIFT , !GRILL , !STAPLES , !CARD)) within text1</seq>
<seq>((!GIFT , !GRILL , !STAPLES , !CARD)) within text2</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES and ?CARD)) within text1</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES and ?CARD)) within text2</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES) or (?GRILL and ?STAPLES and ?CARD) or (?GIFT and ?STAPLES and ?CARD) or (?GIFT and ?GRILL and ?CARD)) within text1</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES) or (?GRILL and ?STAPLES and ?CARD) or (?GIFT and ?STAPLES and ?CARD) or (?GIFT and ?GRILL and ?CARD)) within text2</seq>
<seq>((?STAPLES and ?CARD) or (?GIFT and ?GRILL) or (?GRILL and ?CARD) or (?GIFT and ?CARD) or (?GIFT and ?STAPLES) or (?GRILL and ?STAPLES)) within text1</seq>
<seq>((?STAPLES and ?CARD) or (?GIFT and ?GRILL) or (?GRILL and ?CARD) or (?GIFT and ?CARD) or (?GIFT and ?STAPLES) or (?GRILL and ?STAPLES)) within text2</seq>
<seq>((?GIFT , ?GRILL , ?STAPLES , ?CARD)) within text1</seq>
<seq>((?GIFT , ?GRILL , ?STAPLES , ?CARD)) within text2</seq>
</progression>
</textquery>
<score datatype="FLOAT" algorithm="default"/>
</query>',1) >0 ORDER BY score(1) DESC

Progressive relaxation works best when you're only selecting a limited number of rows. If you fetch ALL the rows which satisfy the query, then all the steps in the relaxation will have to run regardless.
If you fetch - say - the first 10 results, then if the first step of the relaxation provides 10 results then there is no need to execute the next step (in fact, due to internal buffering, that won't be exactly true but it's conceptually correct).
The simplest way to do this is to reword the query as follows (the <progression> block is omitted here for brevity):
SELECT * FROM (
SELECT SCORE(1) score,t.* FROM test_sh4 t WHERE CONTAINS (text3, '
<query>
<textquery>
</textquery>
<score datatype="FLOAT" algorithm="default"/>
</query>',1) >0 ORDER BY score(1) DESC
) WHERE ROWNUM <= 10
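
If the results are consumed from PL/SQL, another way to stop early is to leave a cursor loop once enough rows have been fetched. The block below is a minimal sketch of that pattern, not a tuned solution: the two <seq> steps are shortened placeholders for the full progression, and it assumes that rows come back in relaxation order without an ORDER BY, which should be checked on your version.

```sql
SET SERVEROUTPUT ON
DECLARE
  max_rows  PLS_INTEGER := 10;
  counter   PLS_INTEGER := 0;
BEGIN
  -- c is the implicit loop record: one row of the query per iteration,
  -- with the selected columns available as c.scr and c.text1.
  FOR c IN (SELECT SCORE (1) scr, t.text1
            FROM   test_sh4 t
            WHERE  CONTAINS (text3,
                     '<query>
                        <textquery>
                          <progression>
                            <seq>{GIFT GRILL STAPLES CARD} within text1</seq>
                            <seq>(GIFT and GRILL and STAPLES and CARD) within text1</seq>
                          </progression>
                        </textquery>
                      </query>', 1) > 0)
  LOOP
    DBMS_OUTPUT.PUT_LINE (c.scr || ' ' || DBMS_LOB.SUBSTR (c.text1, 80, 1));
    counter := counter + 1;
    -- Stop fetching as soon as enough rows have been seen.
    EXIT WHEN counter >= max_rows;
  END LOOP;
END;
/
```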
You've discovered that leading wildcards don't work too well unless you use SUBSTRING_INDEX. I would encourage you to avoid them altogether if possible, or push them down much lower in the progressive relaxation. Usually, GIFT% is a useful expression (it matches GIFTS, GIFTED, etc.), while %GIFT% is generally no more effective.
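
For reference, SUBSTRING_INDEX is an attribute of the BASIC_WORDLIST preference. The following is a sketch of how it might be added to the index from the question. The preference name test_wl and the prefix lengths are assumptions chosen for illustration, and both options make the index larger and slower to build, so they are worth measuring before adopting.

```sql
BEGIN
  -- test_wl is a hypothetical preference name.
  ctx_ddl.create_preference ('test_wl', 'BASIC_WORDLIST');
  -- Helps queries with a leading wildcard such as %GIFT%.
  ctx_ddl.set_attribute ('test_wl', 'SUBSTRING_INDEX', 'TRUE');
  -- Helps queries with a trailing wildcard such as GIFT%.
  ctx_ddl.set_attribute ('test_wl', 'PREFIX_INDEX', 'TRUE');
  ctx_ddl.set_attribute ('test_wl', 'PREFIX_MIN_LENGTH', '3');
  ctx_ddl.set_attribute ('test_wl', 'PREFIX_MAX_LENGTH', '6');
END;
/
-- Recreate the index so that the wordlist preference takes effect.
DROP INDEX IX_test_sh4;
CREATE INDEX IX_test_sh4 ON test_sh4 (text3)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('datastore nd_mcd lexer test_lex1 section group test_sg
               wordlist test_wl sync (on commit)');
```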
There are a lot of steps in your progressive relaxation. If you wanted to reduce the number of steps, you could change:
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text1</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text2</seq>
to
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)*2) within text1 ACCUM ((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text2</seq>
I don't know if this would have any performance benefits - but it's worth trying it to see.

Progressive Relaxation with error

Hi,
I tried to use progressive relaxation to calculate matching scores (between t1.album and t2.title) and store them in a table. However, after executing the following script, I got these errors:
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 34
ORA-06512: at line 26
And the outer cursor stopped at some point. I've been working on it for many days already, but I still can't figure it out...
Oh, one more question: what is 'c' in the statement 'for c in (......)'? I got this from http://www.oracle.com/technology/products/text/htdocs/prog_relax.html, but I don't understand what it does...
Thanks a lot!!!!!!
Here is the script:
DECLARE
max_rows integer := 300000;
counter integer := 0;
current_album t1.album%TYPE;
CURSOR album_cursor IS
SELECT distinct album FROM t1;
BEGIN
OPEN album_cursor;
LOOP
FETCH album_cursor INTO current_album;
EXIT WHEN album_cursor%NOTFOUND;
for c in (select score(1) scr, artist, Title from t2 where contains (Title, '
<query>
<textquery>'||'{'||current_album||'}'||'
<progression>
<seq><rewrite>transform((TOKENS, "{", "}", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", "OR"))</rewrite></seq>
</progression>
</textquery>
</query>
', 1) > 0)
LOOP
counter := counter + 1;
INSERT INTO ALBUM_MATCHED(SEQNUM, SCORE, ARTIST, t2_TITLE, t1_ALBUM)
VALUES(counter, c.scr, c.artist, c.Title, current_album);
commit;
EXIT when counter >= max_rows;
END LOOP;
END LOOP;
CLOSE album_cursor;
END;

Hi raford,
I also suspect that it's a problem caused by special characters. That's why I defined the following skipjoins for the matching column indexes:
exec CTX_DDL.CREATE_PREFERENCE ('spe_cha_lexer', 'BASIC_LEXER');
exec CTX_DDL.SET_ATTRIBUTE('spe_cha_lexer', 'SKIPJOINS' , '\,\&\=\{\}\\\[\]\-\;\~\|\$\!\>\*\%\_''\<\:\?\.\+\/\"@#');
create index t1_album_idx on t1 (album)
indextype is ctxsys.context
parameters ('lexer spe_cha_lexer wordlist wildcard_pref sync (on commit)');
create index t2_title_idx on t2 (title)
indextype is ctxsys.context
parameters ('lexer spe_cha_lexer wordlist wildcard_pref sync (on commit)');
Here are the table structures of t1 and t2, and the result table album_matched:
create table t1(album varchar2(500));
create table t2(artist varchar2(500), title varchar2(4000));
create table album_matched(SEQNUM number, SCORE number, ARTIST varchar2(500), t2_TITLE varchar2(4000), t1_ALBUM varchar2(500));
Here is some sample data:
for t1:
Madeline Porter
Harry Porter
???? R & H ?????
mandy's candy
For t2.artist, you can put anything, and for t2.title:
Harry Porter
Basically, t1.album, t2.artist and t2.title can contain anything, since we downloaded this data from a website. Therefore, there might be a lot of special characters in it, some of which might have been mis-encoded into English from French or some other language.
I also found that the scoring is not right. Both tables contain 'Harry Porter', but the matching score is only 68, and I don't know why.
Thanks a lot!
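
If special characters in the album value are indeed the cause, as suspected above, one option is to clean the value before it is concatenated into the query template. The function below is only a sketch under that assumption. The name clean_term is made up for illustration, and it simply drops everything that is not a letter, digit, or whitespace, which may be too aggressive for some data.

```sql
CREATE OR REPLACE FUNCTION clean_term (p_text IN VARCHAR2)
  RETURN VARCHAR2
AS
BEGIN
  -- Inner call: remove every character that is not alphanumeric or whitespace.
  -- Outer call: collapse runs of whitespace into a single space.
  -- TRIM then removes any leading or trailing space.
  RETURN TRIM (REGEXP_REPLACE (
                 REGEXP_REPLACE (p_text, '[^[:alnum:][:space:]]'),
                 '[[:space:]]+', ' '));
END clean_term;
/
-- Returns mandys candy
SELECT clean_term ('mandy''s candy') FROM DUAL;
```

In the script above, the template would then use '{'||clean_term(current_album)||'}'. A value made up entirely of special characters cleans to NULL, so such albums would need to be skipped before calling contains.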

Progressive relaxation matches ctxrule

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
"CORE 10.2.0.1.0 Production"
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
Is progressive relaxation supported by matches with ctxrule indexes??

No, you can't use progressive relaxation with CTXRULE.

Search inside sections with progressive relaxation

Hi,
I am trying to search for keywords within XML. My search will be made on particular sections inside the XML. Say, I will search for the keyword Dog inside the tag <Animal>.
My questions are:
1. Is there any other way we can do it without using the within operator?
2. If we are using within, will I be able to apply progressive relaxation to the search keywords?
3. Also, what is the use of SDATA? If I am using sdata, do I define all the tags in the XML (the tags in the XML are not generic and may vary for each XML)?
Thanks in advance.
Regards,
Loganathan

I am trying to search for keywords within XML. My search will be made on particular sections inside the XML. Say, I will search for the keyword Dog inside the tag <Animal>.
1. Is there any other way we can do it without using the within operator?
Not that I know of. If you search without the within operator, then it would find the word in any tag.
2. If we are using within, will I be able to apply progressive relaxation to the search keywords?
Yes.
3. Also, what is the use of SDATA? If I am using sdata, do I define all the tags in the XML (the tags in the XML are not generic and may vary for each XML)?
Sdata is for structured data, not unstructured text data. The sdata needs to be in a separate column. If your tags are variable, then you can use ctxsys.auto_section_group.
A lot of what you are asking about was discussed and demonstrated in the following recent thread:
Query Templates and WITHIN
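
As a rough illustration of the points above, the following sketch indexes XML with ctxsys.auto_section_group and uses within inside a progression. The table xml_docs, its sample row, and the index name are made up for the example. With the auto section group, section names are taken from the XML tags, so the name after within is assumed to match the tag's case.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE xml_docs (id NUMBER, doc CLOB);
INSERT INTO xml_docs VALUES
  (1, '<Zoo><Animal>Dog</Animal><Plant>Dogwood</Plant></Zoo>');
COMMIT;

-- Each XML tag becomes a section without being declared up front.
CREATE INDEX xml_docs_idx ON xml_docs (doc)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('section group ctxsys.auto_section_group');

-- Exact word first, then stemmed, then fuzzy, all limited to <Animal>.
SELECT id, SCORE (1)
FROM   xml_docs
WHERE  CONTAINS (doc,
         '<query>
            <textquery>
              <progression>
                <seq>dog WITHIN Animal</seq>
                <seq>$dog WITHIN Animal</seq>
                <seq>?dog WITHIN Animal</seq>
              </progression>
            </textquery>
          </query>', 1) > 0;
```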

Please help explain strange behaviour of progressive relaxation..

Hello
I am building an Oracle Text query and I stumbled on a behaviour which I cannot explain...
Here is the rundown:
I have created a synonym thusly:
ctx_thes.create_relation('GR_THESAURUS','LEBOURGNEUF','SYN','BOURGNEUF');
when I issue this select:
SELECT      SCORE(1) NIVEAU_RECHERCHE, NOM,NO_MATRC,NO_NOM_REGST
FROM DUMMY
WHERE CONTAINS(nom, 'SYN(LEBOURGNEUF, GR_THESAURUS)', 1) > 0 ;
I get the expected result set, which contains entries with either 'LEBOURGNEUF' or 'BOURGNEUF'. So far so good.
Now, when I combine search criteria using progressive relaxation, thus:
SELECT SCORE(1) NIVEAU_RECHERCHE, DUMMY.NO_MATRC,DUMMY.NO_NOM_REGST,DUMMY.NOM
FROM DUMMY
WHERE CONTAINS(nom, '<query>
<textquery lang="FRENCH" grammar="CONTEXT">LEBOURGNEUF golf
<progression>
<seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="DEFAULT"/>
</query>', 1) > 0;
I do NOT get any results matching the synonym 'BOURGNEUF'; only those with 'LEBOURGNEUF' are returned...
Further, if I intentionally make a syntax error in the line
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
say like this:
<seq><rewrite>transform((TOKENS, "xSYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
no error is returned and I get the same result set...
So this leads me to conclude that only the first two lines of the query are parsed/executed...
Does anyone here have any idea what is going on here?
In the preceding query I need to add
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
and possibly
<seq><rewrite>transform((TOKENS, "NT(", ",2,GR_THESAURUS)", "OR"))</rewrite></seq>
Can this be done?
Why are there no errors when I execute the query (in sqldeveloper)?
Any hints will be greatly appreciated!
Cheers
Edited by: user8848610 on 2009-10-29 07:46

now it works... although the simpler and cleaner solution of using transform would have been perfect, this function does a somewhat adequate job... I put it here so maybe it will help others ;)
     FUNCTION BuildSearchPredicate (texte IN NOM_ASSJT.NOM%TYPE) RETURN VARCHAR2 IS
     sSQL VARCHAR2(5000);
     sSeq1 VARCHAR2(1000);
     sSeq2 VARCHAR2(1000);
     sSeq3 VARCHAR2(1000);
     iFirst NUMBER(1);
     iPosition NUMBER;
     iToken NUMBER;
     CURSOR curWords(line_text IN VARCHAR2)      IS
               select regexp_substr(line_text, '[^ ]+', 1, level) word
               from dual
               connect by regexp_substr(line_text, '[^ ]+', 1, level) is not null;
     BEGIN
     sSQL := sSQL || '<query><textquery lang="FRENCH" grammar="CONTEXT"><progression> ';
     sSeq1 := '';
     iFirst := 1;
     iToken := 0;
     FOR r_curWord IN curWords(texte)
     LOOP
     iToken := iToken + 1;
     IF iFirst = 0 THEN
               sSeq1 := sSeq1 || ' AND {' || trim(r_curWord.word) || '}';
               sSeq2 := sSeq2 || ' OR {' || trim(r_curWord.word) || '}';
               sSeq3 := sSeq3 || ' OR SYN(' || trim(r_curWord.word) || ',GR_THESAURUS)';
          ELSE
               sSeq1 := '{' || trim(r_curWord.word) || '}';
               sSeq2 := '{' || trim(r_curWord.word) || '}';
               sSeq3 := 'SYN(' || trim(r_curWord.word) || ',GR_THESAURUS)';
               iFirst := 0;
          END IF;
     END LOOP;
     sSQL := sSQL || '<seq>' || sSeq1 || '</seq>';
     iPosition := instr(sSeq1, ' AND ');
     IF instr(sSeq1, ' AND ',iPosition + 1) > 0 THEN -- we must have at least 2 AND operator for this to make sense
          WHILE iPosition > 0
          LOOP
               IF instr(substr(sSeq1, iPosition + 5), ' AND ') > 0 THEN
                    sSQL := sSQL || '<seq>' || substr(sSeq1, iPosition + 5) || '</seq>';
               END IF;
               iPosition := instr(sSeq1, ' AND ', iPosition + 1);
          END LOOP;
     END IF;
     IF iToken > 1 THEN -- no use in having OR if there is only one word
          sSQL := sSQL || '<seq>' || sSeq2 || '</seq>';
     END IF;
     sSQL := sSQL || '<seq>' || sSeq3 || '</seq>';
     RETURN sSql;
     END BuildSearchPredicate;
END GR_RECH;
this will combine search words using AND and OR and SYN like so:
SELECT GR_RECH.BuildSearchPredicate('LEBOURGNEUF GOLF') FROM DUAL;
will result in :
<query><textquery lang="FRENCH" grammar="CONTEXT"><progression> <seq>{LEBOURGNEUF} AND {GOLF}</seq><seq>{LEBOURGNEUF} OR {GOLF}</seq><seq>SYN(LEBOURGNEUF,GR_THESAURUS) OR SYN(GOLF,GR_THESAURUS)</seq>
Thanks for your help!
cheers
gth

Oracle Text progressive relaxation

hello,
We're in the process of evaluating the Oracle Text search engine. It was going well until yesterday, when we added synonyms to our progressive search criteria and it stopped working, depending on where we place the synonym search. If we place it first, everything else stops working (stemming, fuzzy...). If we place it last, then the synonym search stops working. I saw a reference to a bug in this forum that seemed similar to the problem; I believe it mentioned that it had been fixed in 10.2.0.3 (this is the version we're on).
The following is a sample of the PL/SQL code we're executing:
select score(1), nm_resource, ADDR_RSRC_ST_LN_1, id_resource, ADDR_RSRC_CITY FROM caps_resource where
CONTAINS (nm_resource,
     '<query>
     <textquery lang="ENGLISH" grammar="CONTEXT">' || res_name ||
     '<progression>
<seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "ACCUM"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "NEAR"))</rewrite></seq>
<seq>' || 'SYN(' || REPLACE('' || res_name || '', ' ', ',IMPACT_tst) AND SYN(') || ',IMPACT_tst)' || '</seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="default"/>
</query>', 1)>0

Here is a suggested alternative. I have used a syns function that I wrote for another user on another thread, that checks for all combinations of words that could amount to a synonym, up to the maximum number of words that you can specify as an input parameter. I have then used that in a separate contains clause and combined the score results, using the score derived from the synonym only when there is no score from the progressive rewrites, ordering by the progressive rewrites first. Also notice that the ordering must be done in an inner sub-query then the rows limited in an outer subquery. Otherwise it can select the first 100 rows, then order them, instead of the other way around, which can produce an entirely different result set.
SCOTT@orcl_11g> CREATE TABLE caps_resource
2    (nm_resource       VARCHAR2 (30))
3 /
Table created.
SCOTT@orcl_11g> INSERT ALL
2 INTO caps_resource VALUES ('Delagarza,Lorenzo')
3 INTO caps_resource VALUES ('Diana De La Garza')
4 INTO caps_resource VALUES ('De La Garza,Fred')
5 INTO caps_resource VALUES ('somebody else')
6 SELECT * FROM DUAL
7 /
4 rows created.
SCOTT@orcl_11g> CREATE INDEX your_index ON caps_resource (nm_resource)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 /
Index created.
SCOTT@orcl_11g> BEGIN
2    CTX_THES.CREATE_THESAURUS ('impact_tst');
3    CTX_THES.CREATE_RELATION ('impact_tst', 'Delagarza', 'SYN', 'De La Garza');
4 END;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> create or replace function syns
2    (p_words in varchar2,
3      p_thes     in varchar2,
4      p_num     in number default 3) -- maximum number of words per synonym phrase
5      return varchar2
6 as
7    v_words_in      varchar2 (32767) := ltrim (p_words) || ' ';
8    v_words_out     varchar2 (32767);
9 begin
10    while instr (v_words_in, '  ') > 0 loop
11       v_words_in := replace (v_words_in, '  ', ' ');
12    end loop;
13    while length (v_words_in) > 1
14    loop
15       for i in reverse 1 .. least (p_num, (length (v_words_in) - length (replace (v_words_in, ' ', ''))))
16       loop
17         if instr (ctx_thes.syn (substr (v_words_in, 1, instr (v_words_in, ' ', 1, i) - 1), p_thes), '|') > 0
18           or i = 1 then
19           v_words_out := v_words_out
20           || ' AND ('
21           || ctx_thes.syn (substr (v_words_in, 1, instr (v_words_in, ' ', 1, i) - 1), p_thes)
22           || ')';
23           v_words_in := substr (v_words_in, instr (v_words_in, ' ', 1, i) + 1);
24           exit;
25         end if;
26       end loop;
27    end loop;
28    return ltrim (v_words_out, ' AND ');
29 end syns;
30 /
Function created.
SCOTT@orcl_11g> show errors
No errors.
SCOTT@orcl_11g> VARIABLE res_name VARCHAR2 (100)
SCOTT@orcl_11g> EXEC :res_name := 'De La Garza'
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> SELECT syns (:res_name, 'impact_tst') FROM DUAL
2 /
SYNS(:RES_NAME,'IMPACT_TST')
({DE LA GARZA}|{DELAGARZA})
SCOTT@orcl_11g> SELECT the_score, nm_resource
2 FROM   (select DECODE (score(1), 0, SCORE(2), SCORE(1)) AS the_score, nm_resource
3           FROM   caps_resource
4           where CONTAINS (nm_resource,
5           '<query>
6           <textquery lang="ENGLISH" grammar="CONTEXT">' || :res_name ||
7            '<progression>
8               <seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite></seq>
9               <seq><rewrite>transform((TOKENS, "?{", "}", "AND"))</rewrite></seq>
10               <seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
11               <seq><rewrite>transform((TOKENS, "?{", "}", "OR"))</rewrite></seq>
12               <seq><rewrite>transform((TOKENS, "{", "}", "ACCUM"))</rewrite></seq>
13               <seq><rewrite>transform((TOKENS, "{", "}", "NEAR"))</rewrite></seq>
14             </progression>
15           </textquery>
16           <score datatype="INTEGER" algorithm="default"/>
17            </query>', 1) > 0
18           OR     CONTAINS (nm_resource, syns (:res_name, 'impact_tst'), 2) > 0
19           ORDER BY SCORE (1) DESC, SCORE (2) DESC)
20 WHERE ROWNUM < 100
21 /
THE_SCORE NM_RESOURCE
        76 Diana De La Garza
        76 De La Garza,Fred
         5 Delagarza,Lorenzo
SCOTT@orcl_11g>

Progressive relaxation doesn't progress

I just discovered that in a contains() query with a <progression> tag and multiple <seq> conditions, the query does not return any results (i.e., does not evaluate any subsequent conditions) if the first condition fails (i.e., returns no rows).
Is this the correct behavior? It seems like a bug to me. I don't see it mentioned in the documentation anywhere.
Thanks,
Rory

If you use a ctxcat index and catsearch, instead of a context index and contains, the optimizer uses the domain index and returns the correct results quickly, as shown in the comparison below. The only bad thing about catsearch is that, as far as I know, the score function doesn't work with it.
SCOTT@10gXE> CREATE TABLE presidents
2    (id   NUMBER,
3      name VARCHAR2(60))
4 /
Table created.
SCOTT@10gXE> INSERT INTO presidents VALUES (1, 'William Jefferson Clinton')
2 /
1 row created.
SCOTT@10gXE> BEGIN
2    FOR i IN 1 .. 40 LOOP
3       INSERT INTO presidents
4       SELECT object_id, object_name
5       FROM     all_objects;
6    END LOOP;
7 END;
8 /
PL/SQL procedure successfully completed.
SCOTT@10gXE> SELECT COUNT(*) FROM presidents
2 /
COUNT(*)
    481081
SCOTT@10gXE> CREATE INDEX presidents_idx
2 ON presidents (name)
3 INDEXTYPE IS CTXSYS.CONTEXT
4 /
Index created.
SCOTT@10gXE> EXEC DBMS_STATS.GATHER_TABLE_STATS ('SCOTT', 'PRESIDENTS')
PL/SQL procedure successfully completed.
SCOTT@10gXE> SET TIMING ON
SCOTT@10gXE> SET AUTOTRACE ON EXPLAIN
SCOTT@10gXE> select id, name
2 from   presidents
3 where contains(name,
4                 '<query>
5                 <textquery>
6                   <progression>
7                     <seq>{William} {Clinton}</seq>
8                     <seq>{William} ; {Clinton}</seq>
9                   </progression>
10                 </textquery>
11                  </query>',1) <> 0
12 /
        ID NAME
         1 William Jefferson Clinton
Elapsed: 00:01:35.74
Execution Plan
Plan hash value: 3740813417
| Id | Operation         | Name       | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT |            | 23933 |   514K|   746 (35)| 00:00:09 |
|* 1 | TABLE ACCESS FULL| PRESIDENTS | 23933 |   514K|   746 (35)| 00:00:09 |
Predicate Information (identified by operation id):
   1 - filter("CTXSYS"."CONTAINS"("NAME",'<query>
              <textquery>                       <progression>
              <seq>{William} {Clinton}</seq>                         <seq>{William} ;
              {Clinton}</seq>                       </progression>
               </textquery>                   </query>',1)<>0)
SCOTT@10gXE> SET TIMING OFF
SCOTT@10gXE> SET AUTOTRACE OFF
SCOTT@10gXE> DROP INDEX presidents_idx
2 /
Index dropped.
SCOTT@10gXE> CREATE INDEX presidents_idx
2 ON presidents (name)
3 INDEXTYPE IS CTXSYS.CTXCAT
4 /
Index created.
SCOTT@10gXE> EXEC DBMS_STATS.GATHER_TABLE_STATS ('SCOTT', 'PRESIDENTS')
PL/SQL procedure successfully completed.
SCOTT@10gXE> SET TIMING ON
SCOTT@10gXE> SET AUTOTRACE ON EXPLAIN
SCOTT@10gXE> select id, name
2 from   presidents
3 where catsearch(name,
4                 '<query>
5                 <textquery>
6                   <progression>
7                     <seq>{William} {Clinton}</seq>
8                     <seq>{William} ; {Clinton}</seq>
9                   </progression>
10                 </textquery>
11                  </query>',null) > 0
12 /
        ID NAME
         1 William Jefferson Clinton
Elapsed: 00:00:01.94
Execution Plan
Plan hash value: 777849224
| Id | Operation                   | Name           | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                | 24160 |   519K|   486   (1)| 00:00:06 |
|   1 | TABLE ACCESS BY INDEX ROWID| PRESIDENTS     | 24160 |   519K|   486   (1)| 00:00:06 |
|* 2 |   DOMAIN INDEX              | PRESIDENTS_IDX |       |       |            |          |
Predicate Information (identified by operation id):
   2 - access("CTXSYS"."CATSEARCH"("NAME",'<query>                     <textquery>
                                 <progression>                         <seq>{William}
              {Clinton}</seq>                         <seq>{William} ; {Clinton}</seq>
                       </progression>                     </textquery>
              </query>',NULL)>0)
SCOTT@10gXE>
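
For anyone assembling these templates in application code, here is a minimal Python sketch of how the <progression> string used in the two queries above could be built from a list of steps. The function name build_progression and its grammar parameter are my own inventions for illustration, not part of Oracle Text. I am also assuming that &, < and > in user-supplied terms need to be XML-escaped because the template is XML.

```python
from xml.sax.saxutils import escape


def build_progression(steps, grammar=None):
    """Return a <query> template whose <seq> steps are tried in order."""
    if not steps:
        raise ValueError("at least one relaxation step is required")
    # Assumption: the template is parsed as XML, so escape &, < and >
    # to keep user text from breaking the markup.
    seqs = "".join(f"<seq>{escape(step)}</seq>" for step in steps)
    attr = f' grammar="{grammar}"' if grammar else ""
    return (
        f"<query><textquery{attr}>"
        f"<progression>{seqs}</progression>"
        "</textquery></query>"
    )


# The same two steps as in the presidents example above.
template = build_progression(["{William} {Clinton}", "{William} ; {Clinton}"])
print(template)
```

The resulting string would normally be passed to contains or catsearch as a bind variable rather than concatenated into the SQL text.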

Progressive relaxation doesn't execute all sequences

I'm trying to execute the following query:
select * from UNIMI_GA.ENTITA_RC entitarc0_ where CONTAINS(entitarc0_.VALORE_INDICIZZATO, '
<query><textquery lang="ITALIAN" grammar="CONTEXT">
<progression>
<seq>$esame NEAR $microbiologiche</seq>
<seq>$esame AND $microbiologiche</seq>
<seq>$esame ACCUM $microbiologiche</seq>
<seq>?esame ACCUM ?microbiologiche</seq>
</progression>
</textquery><score datatype="INTEGER" algorithm="DEFAULT"/></query>
', 1)>0
Scenario 1
There is no record whose VALORE_INDICIZZATO field contains both "esame" NEAR/AND "microbiologiche" (in exact or stemmed form).
But there are loads of records in which that field contains "esame" or "microbiologiche".
I would expect those records to be returned by this query, but this does not happen. The only way I found to get what I want is to delete the first two seq nodes in the XML (those containing the NEAR and AND operators).
Scenario 2
If I add a new record with both "esame" and "microbiologiche" in VALORE_INDICIZZATO and execute the query again, the query returns the newly inserted record and all the records that contain "esame" or "microbiologiche".
Is this behaviour correct?
Thanks
Davide

I believe bug 5060137 was introduced in 10.2.0.1 and fixed in the 10.2.0.3 patch set. I don't have access to Metalink, so I can't tell you exactly where to find the patch set, just that others have found it and used it to fix the problem. I imagine someone on Metalink can help you locate it.
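
Independently of the bug, the ladder of steps in the query above (stemmed NEAR, stemmed AND, stemmed ACCUM, fuzzy ACCUM) can be generated from the search words instead of being typed by hand. The following Python sketch only builds the <seq> lines. The function name relaxation_steps is hypothetical, and it assumes the words are plain tokens with no reserved words or special characters in them.

```python
def relaxation_steps(words):
    """Return query strings ordered from strictest to loosest."""
    if len(words) < 2:
        raise ValueError("need at least two words to combine")
    stems = ["$" + w for w in words]   # $ is the stem operator
    fuzzy = ["?" + w for w in words]   # ? is the fuzzy operator
    return [
        " NEAR ".join(stems),
        " AND ".join(stems),
        " ACCUM ".join(stems),
        " ACCUM ".join(fuzzy),
    ]


# Reproduces the four <seq> lines of the query above.
for step in relaxation_steps(["esame", "microbiologiche"]):
    print(f"<seq>{step}</seq>")
```

On an unpatched 10.2.0.1 the later steps may still not be reached when the first one returns nothing, so this does not replace the patch set.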

Progression not yielding the desired result

Hi
I have written a text query using the progressive relaxation method. It is not giving me the desired results. Here are the query details:
I have created an Intermedia index on a table with the following specs:
BEGIN
CTX_DDL.DROP_PREFERENCE('CTXSYS.COMPANY_SEARCH_MULTI');
     CTX_DDL.CREATE_PREFERENCE('CTXSYS.COMPANY_SEARCH_MULTI', 'MULTI_COLUMN_DATASTORE');
     CTX_DDL.SET_ATTRIBUTE(     'CTXSYS.COMPANY_SEARCH_MULTI',
                    'columns',
                    'COMPANY,
                    DESC_N_PRODS,
                    PROD_DESC_N_PRODS,
                    PG_TITLE_GLUSR,
                    PG_KWD_DESC,
                    GEOGRAPHICAL_PROFILE,
                    GLUSR_DESC,
                    SUBCAT_DESC,
                    CTL_DESC,
                    SHORT_PROFILE,
                    LONG_PROFILE');
     CTX_DDL.DROP_SECTION_GROUP('CTXSYS.COMPANY_SEARCH_GROUP');
     CTX_DDL.CREATE_SECTION_GROUP('CTXSYS.COMPANY_SEARCH_GROUP', 'BASIC_SECTION_GROUP');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F1',     'COMPANY');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F2',     'DESC_N_PRODS');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F3',     'PROD_DESC_N_PRODS');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F4',     'PG_TITLE_GLUSR');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F5',     'PG_KWD_DESC');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F6',     'GEOGRAPHICAL_PROFILE');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F7',     'GLUSR_DESC');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F8',     'SUBCAT_DESC');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F9',     'CTL_DESC');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F10','SHORT_PROFILE');
     CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F11','LONG_PROFILE');
     CTX_DDL.DROP_PREFERENCE('CTXSYS.IIL_LEXER');
     CTX_DDL.CREATE_PREFERENCE('CTXSYS.IIL_LEXER','BASIC_LEXER');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_LEXER', 'INDEX_STEMS', 'ENGLISH');
     CTX_DDL.DROP_PREFERENCE('CTXSYS.IIL_FUZZY_PREF');
     CTX_DDL.CREATE_PREFERENCE('CTXSYS.IIL_FUZZY_PREF', 'BASIC_WORDLIST');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','FUZZY_MATCH','ENGLISH');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','FUZZY_SCORE','60');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','FUZZY_NUMRESULTS','100');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','SUBSTRING_INDEX','TRUE');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','PREFIX_INDEX','TRUE');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','PREFIX_MIN_LENGTH','1');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','PREFIX_MAX_LENGTH','3');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','WILDCARD_MAXTERMS','15000');
     CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','STEMMER','ENGLISH');
END;
CREATE INDEX COMPANY_SEARCH_IM on COMPANY_SEARCH(DUMMY) INDEXTYPE IS
CTXSYS.CONTEXT PARAMETERS
('DATASTORE CTXSYS.COMPANY_SEARCH_MULTI SECTION GROUP CTXSYS.COMPANY_SEARCH_GROUP MEMORY 50M
LEXER CTXSYS.IIL_LEXER WORDLIST CTXSYS.IIL_FUZZY_PREF STOPLIST CTXSYS.IIL_STOPLIST');
Now if I want to search for a string - acrylic crochet
My progressive clause is as follows:
<QUERY>
<TEXTQUERY>
<PROGRESSION>
<SEQ>(acrylic crochet) within F2</SEQ>
<SEQ>($acrylic $crochet) within F2</SEQ>
<SEQ>(acrylic crochet) within F3</SEQ>
<SEQ>($acrylic $crochet) within F3</SEQ>
<SEQ>(NEAR((acrylic,crochet))) within F2</SEQ>
</PROGRESSION>
</TEXTQUERY>
</QUERY>
The data set has a record where F2 Contains following text:
Manufacturers and exporters of yarns like acrylic yarn, viscose yarns, acrylic blended yarn, acrylic knitting yarn, spun yarn, blended yarns, braided thread, chenille yarn, cotton yarn, crochet yarn, dupion silk yarns etc
My problem is that this record does not appear in the search results.
The record does appear if I use only the NEAR clause, as shown below:
<QUERY>
<TEXTQUERY>
<PROGRESSION>
<SEQ>(NEAR((acrylic,crochet))) within F2</SEQ>
</PROGRESSION>
</TEXTQUERY>
</QUERY>
Please advise what could be wrong: is my index defined properly, does my progressive clause have a problem, or is there something else that I have missed entirely?
Regards
Madhup

The discussion linked below covers the same bug that you have encountered, along with some workarounds.
Re: progressive relaxation doesn't progress

Order of words, fuzzy and utl_match

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
"CORE     10.2.0.1.0     Production"
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
create table category(cat_id number(20),cat_type varchar2(3000));
create table category_match(cat_id number(20),cat_type varchar2(3000));
Insert into category (CAT_ID,CAT_TYPE) values (12790,'AUTO CONSULTANTS');
INSERT INTO CATEGORY (CAT_ID,CAT_TYPE) VALUES (23803,'AUTO CONSULTANT');
Insert into category (CAT_ID,CAT_TYPE) values (23804,'CONSULTANT FOR AUTO FINANCE');
Insert into category_match (CAT_ID,CAT_TYPE) values (12790,'AUTO CONSULTANTS');
INSERT INTO CATEGORY_match (CAT_ID,CAT_TYPE) VALUES (23803,'AUTO CONSULTANT');
Insert into category_match (CAT_ID,CAT_TYPE) values (23804,'CONSULTANT FOR AUTO FINANCE');
CREATE INDEX "LOOKING4"."MYINDEX" ON "CATEGORY_MATCH"
    ("CAT_TYPE")
INDEXTYPE IS "CTXSYS"."CONTEXT" ;
CREATE INDEX "LOOKING4"."CAT_TYPE_IDX" ON "CATEGORY"
    ("CAT_TYPE")
INDEXTYPE IS "CTXSYS"."CTXCAT" ;
select cat_id,CAT_TYPE,UTL_MATCH.edit_distance_similarity(CAT_TYPE,'AUTO CONSULTANT') from (
select * from category where catsearch(cat_type,
'<query>
      <textquery grammar="context">
      <progression>
      <seq>auto consultant</seq>
      <seq>?(auto) and ?(consultant)</seq>
        </progression>
        </textquery>
</query>'
,NULL)>0
)where rownum<5
23803     AUTO CONSULTANT     100
12790     AUTO CONSULTANTS     94
23804     CONSULTANT FOR AUTO FINANCE     26
update category set cat_type='CONSULTANTS AUTO' WHERE CAT_ID=12790
select cat_id,CAT_TYPE,UTL_MATCH.edit_distance_similarity(CAT_TYPE,'AUTO CONSULTANT') from (
select * from category where catsearch(cat_type,
'<query>
      <textquery grammar="context">
      <progression>
      <seq>auto consultant</seq>
      <seq>?(auto) and ?(consultant)</seq>
        </progression>
        </textquery>
</query>'
,NULL)>0
)where rownum<5
23803     AUTO CONSULTANT     100
12790     CONSULTANTS AUTO     32
23804     CONSULTANT FOR AUTO FINANCE     26
select score(1),cat_id,cat_type from CATEGORY_MATCH where cat_id in(
select cat_id from category where catsearch(cat_type,
'<query>
      <textquery grammar="context">
      <progression>
      <seq>auto consultant</seq>
      <seq>?(auto) and ?(consultant)</seq>
        </progression>
        </textquery>
</query>'
,NULL)>0) AND
contains(cat_type,'?(auto) and ?(consultant)',1)>0
9     23803     AUTO CONSULTANT
9     12790     AUTO CONSULTANTS
9     23804     CONSULTANT FOR AUTO FINANCE
I have been using catsearch in order to use progressive relaxation.
There are many "cat_types" like those for "cat_id" 23803 and 12790, where the order of the words in the phrase changes.
There are up to 10 words in each row of the "cat_type" column.
Among others, I have referred to
Achieving functionality of many preferences using one context index
and
Re: Fuzzy search - more accurate score??
There is very little chance of a word being repeated within a row.
UTL_MATCH seems to work perfectly only when the words appear in the same order.
If you can suggest a way to get very close scores for cat_id 23803 and 12790, it would be much appreciated.
Thanks and regards
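
One way to make an edit-distance score insensitive to word order is to sort the words of both phrases before comparing them. The Python sketch below illustrates the idea outside the database. It uses its own Levenshtein implementation, and the 0 to 100 normalization is an assumption on my part that may not match UTL_MATCH.EDIT_DISTANCE_SIMILARITY exactly. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance, computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Scale the distance to 0..100 (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100
    return round(100 * (1 - edit_distance(a, b) / longest))


def order_insensitive_similarity(a, b):
    """Upper-case, split and sort the words of each phrase, then compare."""
    def normalize(phrase):
        return " ".join(sorted(phrase.upper().split()))
    return similarity(normalize(a), normalize(b))


same_order = order_insensitive_similarity("AUTO CONSULTANTS", "AUTO CONSULTANT")
swapped = order_insensitive_similarity("CONSULTANTS AUTO", "AUTO CONSULTANT")
print(same_order, swapped)
```

With this normalization both spellings of cat_id 12790 get the same score against 'AUTO CONSULTANT'. Sorting can misalign phrases that contain different words, so it is a heuristic rather than a general fix.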

{code}
select *
    FROM   (SELECT score(1),score(2),score(3),score(4),GREATEST (SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3) g_scores,
                  UTL_MATCH.EDIT_DISTANCE_SIMILARITY (CAT_TYPE,'AUTO CONSULTANT') EDS,
                  CAT_ID, CAT_TYPE
              FROM   category_match
              WHERE CONTAINS (cat_type, 'solar water heater* 10 * 10', 1) > 0
              OR     CONTAINS (cat_type, 'NEAR ((?solar, ?water ,?heater), 0, TRUE) * 10 * 10', 2) > 0
             OR     CONTAINS (cat_type, 'NEAR ((?solar, ?water ,?heater), 0, FALSE) * 10 * 10', 3) > 0
             or     CONTAINS (CAT_TYPE, '(?solar AND ?water AND ?heater) * 10 * 10', 4) > 0
             order by g_scores desc, EDS desc)
   WHERE ROWNUM<100
100     100     100     100     100     23     4                  SOLAR WATER HEATER-ANU
100     100     100     100     100     22     26901          SOLAR WATER HEATER SUDARSHAN SAUR
100     100     100     100     100     21     30                  SOLAR WATER HEATER INDUSTRIAL
100     100     100     100     100     20     17379          SOLAR WATER HEATER DEALERS-TATA
100     100     100     100     100     20     26906          SOLAR WATER HEATER NUETECH
100     100     100     100     100     20     11465          SOLAR WATER HEATER DEALERS-ANU
100     100     100     100     100     20     21                  SOLAR WATER HEATER-ZING TATA BP
100     100     100     100     100     20     11463          SOLAR WATER HEATER MANUFACTURERS-ANU
100     100     100     100     100     19     8                  SOLAR WATER HEATER MANUFACTURERS
100     100     100     100     100     19     23                  SOLAR WATER HEATER EVACUATED TUBE
100     100     100     100     100     19     49                  SOLAR WATER HEATER-HOTMAX NOVA TATA BP
100     100     100     100     100     19     13357          SOLAR WATER HEATER INDUSTRIAL DEALERS
100     100     100     100     100     18     16300          SOLAR WATER HEATER-TECHNOMAX
100     100     100     100     100     18     9                  SOLAR WATER HEATER DEALERS-TATA BP
100     100     100     100     100     18     20                  SOLAR WATER HEATER-ZING
100     100     100     100     100     18     18                  SOLAR WATER HEATER-ORB SOLAR
100     100     100     100     100     18     22552          SOLAR WATER HEATER-KOTAK URJA
100     100     100     100     100     18     26908          SOLAR WATER HEATER SUPREME
100     100     100     100     100     17     26907          SOLAR WATER HEATER TECHNOMAX"
100     100     100     100     100     17     13322          SOLAR WATER HEATER DISTRIBUTORS
100     100     100     100     100     17     22                  SOLAR WATER HEATER-ETC TATA BP
100     100     100     100     100     17     48                  SOLAR WATER HEATER-VAJRA PLUS TATA BP
100     100     100     100     100     17     27084          SOLAR WATER HEATER SALES
100     100     100     100     100     16     16236          SOLAR WATER HEATER DEALERS-RACOLD
100     100     100     100     100     16     15                  SOLAR WATER HEATER-NUTECH
100     100     100     100     100     16     1                  SOLAR WATER HEATER DEALERS
100     100     100     100     100     15     2                  SOLAR WATER HEATER DEALERS-TATA BP SOLAR
100     100     100     100     100     15     31                  SOLAR WATER HEATER DOMESTIC
100     100     100     100     100     15     13                  SOLAR WATER HEATER DEALERS-V GUARD
100     100     100     100     100     14     17                  SOLAR WATER HEATER-KAMAL SOLAR
100     100     100     100     100     13     11467          SOLAR WATER HEATER DEALERS-GILMA
100     100     100     100     100     13     19                  SOLAR WATER HEATER-GILMA
100     100     100     100     100     13     10                  SOLAR WATER HEATER REPAIRS & SERVICES-TATA SOLAR
100     100     100     100     100     12     10578          SOLAR WATER HEATER
100     100     100     100     100     11     3                  SOLAR WATER HEATER REPAIRS & SERVICES
0      0     100     100     98     25     10120          WATER HEATER SOLAR INDUSTRIAL
0      0     100     100     98     20     12953          WATER HEATER SOLAR-RACOLD
0      0     100     100     98     17     10119          WATER HEATER SOLAR RESIDENCIAL
{code}
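
To make the arithmetic behind g_scores explicit: each looser tier is penalized by one more point, and the best remaining value wins. The Python sketch below reproduces that calculation for two score patterns from the output above. The tie-breaker on text length is purely a hypothetical illustration of how rows that all score 100 could be ordered so that the shortest phrase comes first. It is not something the query above does.

```python
def tiered_score(scores):
    """Mirror GREATEST(SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3)."""
    return max(score - penalty for penalty, score in enumerate(scores))


# (cat_type, (score1, score2, score3, score4)) patterns seen in the output.
rows = [
    ("SOLAR WATER HEATER DEALERS", (100, 100, 100, 100)),
    ("WATER HEATER SOLAR-RACOLD", (0, 0, 100, 100)),
    ("SOLAR WATER HEATER", (100, 100, 100, 100)),
]

# Hypothetical tie-breaker: among equal tiered scores, shorter text first.
ranked = sorted(rows, key=lambda r: (-tiered_score(r[1]), len(r[0])))
for cat_type, scores in ranked:
    print(tiered_score(scores), cat_type)
```

In SQL the same tie-breaker would be an extra ORDER BY expression after g_scores, for example on the length of cat_type.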
The query is technically working correctly,
but is there any way to get 10578 on top?
The requirement is:
---first
solar water heater
solar water heater dealers
solar water heater manufacturers
solar water heater distributors
solar water heater sales
solar water heater repairs and servicing
---followed by
SOLAR WATER HEATER REPAIRS & SERVICES-TATA SOLAR
SOLAR WATER HEATER-KAMAL SOLAR
SOLAR WATER HEATER DEALERS-TATA BP SOLAR   etc etc
So if the end user types in "solar water", the top row should be a row from the table that contains what the end user entered followed by "dealers", "manufacturers", "distributors", "sales" or "repairs and servicing".
So if a row contains "solar water dealer", it shows up on top.
Or, if "solar water dealer" is not there and "solar water manufacturers", "solar water distributors" etc. are not present either, the top row should be
a row from the table that contains what the end user entered PLUS "heater", followed by "dealers", "manufacturers", "distributors", "sales" or "repairs and servicing".
So "solar water heater dealers" shows up on top.
These words ("dealers", "manufacturers", "distributors", "sales", "repairs and servicing" etc.) remain constant.
What I am using right now is:
{code}
create or replace
procedure HOME_OLD (
p_cat_type in varchar2,
P_LOC IN NUMBER,
P_MAX IN NUMBER,
P_MIN IN NUMBER,
P_OUT OUT SYS_REFCURSOR
)
as
VARIAB varchar2(500);
VARIAB2 varchar2(500);
VARIAB3 varchar2(500);
VARIAB4 varchar2(500);
begin
--VARIAB2:='?'||replace(P_CAT_TYPE,' ',', ?');
--VARIAB3:='?'||replace(P_CAT_TYPE,' ',' ?');
--DBMS_OUTPUT.PUT_LINE(VARIAB2);
--DBMS_OUTPUT.PUT_LINE(VARIAB3);
SELECT stragg(cat_id) into variab
   FROM   (SELECT GREATEST (SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3) score,
                                  CAT_ID, CAT_TYPE
              FROM   category_match
              -- exact words in order:
              WHERE CONTAINS (cat_type,get_basic(P_CAT_TYPE), 1) > 0
              -- similar words next to each other in order:
              OR     CONTAINS (cat_type, get_near_syntax(P_CAT_TYPE), 2) > 0
             -- similar words next to each other in any order:
             OR     CONTAINS (cat_type, get_near_syntax_desc(P_CAT_TYPE), 3) > 0
             -- similar words anywhere in any order:
             OR     CONTAINS (cat_type, get_anywhere(P_CAT_TYPE), 4) > 0
             order by score desc)
   where rownum < 3;
DBMS_OUTPUT.PUT_LINE(VARIAB);
open p_out
   FOR select * from(select rownum r,name,address1,telephone,mobile,CAT_TYP,cat_id,
(case when address2=p_loc and ACT_STATUS='Y' then '1' when address2=p_loc then '2' when address2 in
(select NEARBY_LOC from NEAR_BY where LOCALITY_ID=p_loc) and ACT_STATUS='Y'
then '3' when ADDRESS2 in (select NEARBY_LOC from NEAR_BY where LOCALITY_ID=p_loc)
then '4' when ACT_STATUS='Y' and address2<> p_loc then '5' else '6' end) as marker
FROM TEST_TEST
WHERE
CAT_ID in(select * from table(STRING_TO_TABLE_NUM(variab))) and rownum<P_MAX order by marker) where r>P_MIN;
IF VARIAB IS NULL THEN
OPEN P_OUT
   FOR SELECT * FROM(SELECT rownum r,name,address1,telephone,mobile,CATS
   FROM   (SELECT GREATEST (SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3) score,
                                  NAME,ADDRESS1,TELEPHONE,MOBILE,CATS
              FROM   TEST_TEST2
              -- exact words in order:
              WHERE CONTAINS (NAME,get_basic(P_CAT_TYPE), 1) > 0
              -- similar words next to each other in order:
              OR     CONTAINS (NAME, get_near_syntax(P_CAT_TYPE), 2) > 0
             -- similar words next to each other in any order:
             OR     CONTAINS (NAME, get_near_syntax_desc(P_CAT_TYPE), 3) > 0
             -- similar words anywhere in any order:
             OR     CONTAINS (NAME, get_anywhere(P_CAT_TYPE), 4) > 0
             ORDER BY SCORE DESC)
   WHERE ROWNUM < P_MAX)where r>P_MIN;
END IF;
end home_old;
{code}
The flow is to look up what the end user has entered in the category table; if a match exists, find all reg_ids in the test_test materialized view that have selected the matched cat_id.
The test_test materialized view lists each company once for every cat_id selected by that company.
If no match is found in the category table, what the end user has entered could be a company name, so the search falls back to the name column of the test_test2 materialized view.
This materialized view has one entry for each company.
{code}
create or replace
FUNCTION GET_BASIC(P_CAT_TYPE VARCHAR2)
    RETURN VARCHAR2
is
VARIAB2 VARCHAR2(3000);
      begin
VARIAB2:='{'||P_CAT_TYPE||'}*10*10';
return(VARIAB2);
END;
/
create or replace
FUNCTION GET_NEAR_SYNTAX(P_CAT_TYPE VARCHAR2)
    RETURN VARCHAR2
is
VARIAB2 VARCHAR2(3000);
      begin
VARIAB2:='NEAR((?{'||replace(P_CAT_TYPE,' ','}, ?{')||'}),10,TRUE)*10*10';
return(VARIAB2);
END;
/
create or replace
FUNCTION GET_NEAR_SYNTAX_DESC(P_CAT_TYPE VARCHAR2)
    RETURN VARCHAR2
is
VARIAB2 VARCHAR2(3000);
      begin
VARIAB2:='NEAR((?{'||replace(P_CAT_TYPE,' ','}, ?{')||'}),10,FALSE)*10*10';
return(VARIAB2);
END;
/
{code}
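
To see what the three functions above hand to CONTAINS, here is a Python mirror of the same string concatenation, which can be run without a database. It follows the PL/SQL above literally, including replacing every single space, so it assumes the input is trimmed and single-spaced. get_anywhere is not shown in the post, so it is not reproduced here.

```python
def get_basic(phrase):
    """Exact phrase, weighted twice by 10."""
    return "{" + phrase + "}*10*10"


def _near(phrase, ordered):
    # Same effect as REPLACE(p_cat_type, ' ', '}, ?{') in the PL/SQL version.
    terms = phrase.replace(" ", "}, ?{")
    flag = "TRUE" if ordered else "FALSE"
    return "NEAR((?{" + terms + "}),10," + flag + ")*10*10"


def get_near_syntax(phrase):
    """Fuzzy terms within a span of 10, in the order given."""
    return _near(phrase, ordered=True)


def get_near_syntax_desc(phrase):
    """Fuzzy terms within a span of 10, in any order."""
    return _near(phrase, ordered=False)


for build in (get_basic, get_near_syntax, get_near_syntax_desc):
    print(build("solar water heater"))
```

Note that a double space in the input would produce an empty ?{} term, so the input should probably be normalized before these functions are called.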
Can anything be done to improve this whole flow?
Can anything be done to eliminate the near_by, act_status and locality checks in the ORDER BY "marker" clause?
Below is the materialized view creation DDL:
SELECT IN_V.REG_ID,
    IN_V.NAME,
    IN_V.TELEPHONE,
    IN_V.MOBILE,
    IN_V.ADDRESS1,
    IN_V.ADDRESS2,
    IN_V.ACT_STATUS,
    resec.cat_id,
    UPPER(STRAGG(IN_V.CAT_TYPE)) AS cat_typ
FROM
    (SELECT RSC.REG_ID,
      R.NAME,
      RSC.CAT_ID,
      C.CAT_TYPE,
      R.ADDRESS1,
      R.ADDRESS2,
      R.ACT_STATUS,
      R.TELEPHONE,
      R.MOBILE,
      ROW_NUMBER() OVER (PARTITION BY RSC.REG_ID ORDER BY rsc.reg_id) AS TT
    FROM REG_SEG_CAT RSC,
      category C,
      REGISTRATION R
    WHERE C.CAT_ID=RSC.CAT_ID
    AND R.REG_ID =RSC.REG_ID
    ) IN_V,
    REG_SEG_CAT RESEC
WHERE in_v.reg_id=resec.reg_id
AND IN_V.TT      <6
GROUP BY IN_V.REG_ID,
    IN_V.NAME,
    IN_V.TELEPHONE,
    IN_V.MOBILE,
    IN_V.ADDRESS2,
    IN_V.ACT_STATUS,
    IN_V.ADDRESS1,
    resec.cat_id;
and sql>desc test_test
REG_ID
NAME
TELEPHONE
MOBILE
ADDRESS1
ADDRESS2
ACT_STATUS
CAT_ID
CAT_TYP
please let me know if you need more info
Edited by: 946207 on Apr 19, 2013 6:22 PM

Justification for Using Oracle Text

Hello,
Can someone give me good cause (justification) for utilizing Oracle Text over other tools out there that are not tied directly to Oracle?
Apparently it is possible to identify metadata within text and do keyfield and keyword searches this way with other tools, but I question the accuracy, speed, or value in terms of data relationships with this approach. I feel the relationships belong in the database along with the indexes but can't convince anyone of this.
Has anyone experience working with Oracle Text where relationships help to drive the search and can give me good cause to this approach?
thanks

Hi,
Justification depends on your use. For starters:
1) It is included in both standard and enterprise editions of the db at no added charge
2) Uses SQL to query and maintain
3) Includes a number of built-ins for maintenance and optimization
4) It has 4 different index types for various uses
5) It can index any data type
6) UltraSearch is included in both standard and enterprise editions of the db at no additional charge (this is a crawler built on Oracle Text).
As for the integration - it is optimized for Oracle. If you were to build a standalone indexing solution you would probably design it a bit differently, but Oracle Text takes into account the optimizer and database structure.
It has other features (same as some of the other tools) like a knowledge base, classification, clustering, theme extraction, language-specific features, ability to index documents in and out of the database, stopwords, stemming, wildcard, progressive relaxation, and the list goes on.
I guess my question would be, what is the reason for NOT using it? That might give me a better line on the reasoning so that I can respond with something a bit more specific.
Thanks,
Ron

Scoring messed up using concatenated datastore Index

Hi,
Here is my table structure....
CREATE TABLE SRCH_KEYWORD_SEARCH_SME
SYS_ID NUMBER(10) NOT NULL,
PAPER_NO VARCHAR2(10),
PRODIDX_ID VARCHAR2(10),
RESULT_TITLE VARCHAR2(255),
RESULT_DESCR VARCHAR2(1000) NOT NULL,
ABSTRACT CLOB,
SRSLT_CATEGORY_ID VARCHAR2(10) NOT NULL,
SRSLT_SUB_CATEGORY_ID VARCHAR2(10) NOT NULL,
ACTIVE_FLAG VARCHAR2(1) DEFAULT 'Y' NOT NULL,
EVENT_START_DATE DATE,
EVENT_END_DATE DATE,
Here is the Concatenated Datastore preference...
   -- Drop any existing storage preference.
   CTX_DDL.drop_preference('SEARCH_STORAGE_PREF');
   -- Create new storage preference.
   CTX_DDL.create_preference('SEARCH_STORAGE_PREF', 'BASIC_STORAGE');
      CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'I_TABLE_CLAUSE', 'tablespace searchidx');
      CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'K_TABLE_CLAUSE', 'tablespace searchidx');
      CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'R_TABLE_CLAUSE', 'tablespace searchidx lob (data) store as (disable storage in row cache)');
      CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'N_TABLE_CLAUSE', 'tablespace searchidx');
      CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'I_INDEX_CLAUSE', 'tablespace searchidx compress 2');
      CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'P_TABLE_CLAUSE', 'tablespace searchidx');
   -- Drop any existing datastore preference.
   CTX_DDL.drop_preference('SEARCH_DATA_STORE');
   CTX_DDL.DROP_SECTION_GROUP('SEARCH_DATA_STORE_SG');
   -- Create new multi-column datastore preference.
   CTX_DDL.create_preference('SEARCH_DATA_STORE','MULTI_COLUMN_DATASTORE');
   CTX_DDL.set_attribute('SEARCH_DATA_STORE','columns','abstract, srslt_category_id, srslt_sub_category_id, active_flag');
   CTX_DDL.set_attribute('SEARCH_DATA_STORE', 'FILTER','N,N,N,N');
   -- Create new section group preference.
   CTX_DDL.create_section_group ('SEARCH_DATA_STORE_SG','BASIC_SECTION_GROUP');
   CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'abstract',              'abstract',             TRUE);
   CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'srslt_category_id',     'srslt_category_id',    TRUE);
   CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'srslt_sub_category_id', 'srslt_sub_category_id',TRUE);
   CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'active_flag',           'active_flag',          TRUE);
Here is the context Index
CREATE INDEX SRCH_KEYWORD_SEARCH_I ON SRCH_KEYWORD_SEARCH_SME(ABSTRACT)
   INDEXTYPE IS CTXSYS.CONTEXT
      PARAMETERS('STORAGE search_storage_pref DATASTORE SEARCH_DATA_STORE SECTION GROUP SEARCH_DATA_STORE_SG' )
Here is the Query # 1 I am trying out...
SELECT /*+ FIRST_ROWS(10) */
       SCORE(1) score_nbr,
       k.SYS_ID,
       k.RESULT_TITLE,
FROM   SRCH_KEYWORD_SEARCH_SME k
WHERE CONTAINS (k.ABSTRACT, '<query><textquery><progression><seq>{hitchhiker} WITHIN abstract</seq></progression></textquery></query>',1) > 0
ORDER BY SCORE(1) DESC;
Here is the result for Query # 1...
score_nbr   sys_id     result_title
54          99220      SME Releases New Book The Hitchhiker's Guide to Lean  72
43          116583     Lean Leadership Package                               72
32          132392     The Hitchhikers Guide to Lean: Lessons from the Road  72
11          132017     Lean Manufacturing A Plant Floor Guide Book Summary   72
11          137106     Managing Factory Maintenance, Second Edition          72
11          132082     Lean Pocket Guide
Here is the Query # 2 I am trying out...
SELECT /*+ FIRST_ROWS(10) */
       SCORE(1) score_nbr,
       k.SYS_ID,
       k.RESULT_TITLE,
FROM   SRCH_KEYWORD_SEARCH_SME k
WHERE CONTAINS (k.ABSTRACT, '<query><textquery><progression><seq>{hitchhiker} WITHIN abstract AND Y WITHIN active_flag</seq></progression></textquery></query>',1) > 0
ORDER BY SCORE(1) DESC
Here is the result for Query # 2...
score_nbr sys_id     result_title
3         132017     Lean Manufacturing: A Plant Floor Guide Book Summary                                                                                                                                                                                                            72
3         137106     Managing Factory Maintenance, Second Edition                                                                                                                                                                                                                    72
3         132082     Lean Pocket Guide                                                                                                                                                                                                                                               72
3         132083     The Toyota Way: 14 Management Principles From the World's Greatest...                                                                                                                                                                                           72
3         132417     Lean Manufacturing: A Plant Floor Guide                                                                                                                                                                                                                         72
3         132091     Breaking the Cost Barrier: A Proven Approach to Managing and...                                                                                                                                                                                                 72
3         99318      Conflicting pairs                                                                                                                                                                                                                                               72
3         132393     One-Piece Flow: Cell Design for Transforming the Production Process                                                                                                                                                                                             72
3         137091     Learning to See: Value Stream Mapping to Create Value & Eliminate MUDA                                                                                                                                                                                          72
3         137090     The Purchasing Machine: How the Top 10 Companies Use Best Practices...                                                                                                                                                                                          72
3         137393     Passion for Manufacturing
My question is: why did the score go all the way to 3 for ALL the results returned by the query above when I used the AND clause and added the second column used in the datastore to my query condition?
I also want to use the progressive relaxation technique in these queries, together with the stemming and fuzzy search options.
Please help me out.
Thanks in advance.
- Richard.

Yes, it's in the doc - it's known as the weight operator.
http://download.oracle.com/docs/cd/B28359_01/text.111/b28304/cqoper.htm#i998379
"term*n      Returns documents that contain term. Calculates score by multiplying the raw score of term by n, where n is a number from 0.1 to 10."
We're just using the operator twice as the limit on "n" is 10 (for no obvious reason I know of!). This is perfectly safe, and common practice.
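As a rough sketch of what applying the weight twice looks like in a query (the table name docs and column text_col are hypothetical; sys_id and result_title are borrowed from the output shown earlier), the first term below is boosted by 10 x 10 while the second keeps its raw score:

```sql
-- Sketch only: docs and text_col are assumed names, not from the original posts.
-- lean*10*10 applies the weight operator twice, because a single n is limited to 10.
SELECT score(1) AS score_nbr, sys_id, result_title
FROM   docs
WHERE  CONTAINS (text_col, '(lean*10*10) OR (manufacturing)', 1) > 0
ORDER  BY score(1) DESC;
```

Scores are still capped at 100, so the boost mainly changes the relative order of the hits.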

Results ranking

Hi,
I have a database with bibliographic data (title, author...) and full-text documents. I store all of the bibliographic data and part of the full text in a single CLOB column in XML format :
I then defined groups with FIELD_SECTION and mdata_section to be able to search on this column. The default search is set on all data :
select id,score(1) FROM mytable where contains(mycolumn,'mysearch',1) > 0 order by score (1) desc
However, for generic queries I usually get more than 100 results that all have the same score of 100, so the ranking is not very useful for the end user.
As far as I can see, there is no way to adjust the scoring algorithm so that documents that have mysearch in the title section, for example, get a better ranking. Is that correct?
Has anyone tried to improve this ranking in the application? For example, search for the words in the title section first, then search the full text while excluding the ids of the documents found in the first step, and then combine the two hit lists?
Thanks for your help.
Kind regards,
Fred

Progressive relaxation is useful here.
You can search for words in the title section only in the first stage, then look in the other columns in the next stage. Any hits in the first stage of progressive relaxation are guaranteed to score higher than hits in the next stage.
See here: http://www.oracle.com/technology/products/text/htdocs/prog_relax.html
for a discussion of progressive relaxation.
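A minimal sketch of that two-stage idea, using the placeholders from the question (mytable, mycolumn, mysearch) and assuming the title field section is named title:

```sql
-- Sketch only: "title" is an assumed section name; adjust it to the section group.
-- Stage 1 looks in the title section; stage 2 falls back to the whole document.
SELECT id, score(1)
FROM   mytable
WHERE  CONTAINS (mycolumn,
'<query>
  <textquery lang="ENGLISH" grammar="CONTEXT">
    <progression>
      <seq>mysearch WITHIN title</seq>
      <seq>mysearch</seq>
    </progression>
  </textquery>
  <score datatype="INTEGER" algorithm="COUNT"/>
</query>', 1) > 0
ORDER  BY score(1) DESC;
```

Rows matched by the first seq come back with a higher score band than rows matched only by the second, so ordering by score puts the title hits first.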

Unable to use the thesaurus in a relaxation template

I am trying to get a query relaxation template to use the thesaurus but I can't get the syntax correct. Is it possible? If so, please can someone tell me where I'm going wrong?
create table test_table(company_name varchar2(100));
insert into test_table values ('Test Limited');
insert into test_table values ('Test Ltd');
create index idx_test on test_table(company_name) indextype is ctxsys.context;
If my query looks like this:
select company_name, score(1)
from test_table
where CONTAINS (company_NAME,
'<query>
<textquery lang="ENGLISH" grammar="CONTEXT">test ltd
<progression>
<seq><rewrite>transform((TOKENS, "{", "}", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "!", "%", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "${", "}", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "SYN(", ",legal_form)", " "))</rewrite></seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="COUNT"/>
</query>',1)>0;
I get the matching record back
COMPANY_NAME SCORE(1)
Test Ltd 75
But if I move the SYN line to the top like this:
select company_name, score(1)
from test_table
where CONTAINS (company_NAME,
'<query>
<textquery lang="ENGLISH" grammar="CONTEXT">test ltd
<progression>
<seq><rewrite>transform((TOKENS, "SYN(", ",legal_form)", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "!", "%", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "${", "}", " "))</rewrite></seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="COUNT"/>
</query>',1)>0;
I get an error which I think means that the XML line is not valid:
ORA-29902:error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 35
What is the correct format for the line that will apply the thesaurus synonym relation between Limited and Ltd?

There are a lot of things that work well individually, but not in combination with one another. It looks like something goes wrong when you try to combine transform with syn. One possible workaround is to use replace to do your own transformation. Please see the reproduction and solution below.
SCOTT@10gXE> -- test environment:
SCOTT@10gXE> create table test_table(company_name varchar2(100));
Table created.
SCOTT@10gXE> insert into test_table values ('Test Limited');
1 row created.
SCOTT@10gXE> insert into test_table values ('Test Ltd');
1 row created.
SCOTT@10gXE> create index idx_test on test_table(company_name) indextype is ctxsys.context;
Index created.
SCOTT@10gXE> EXEC CTX_THES.CREATE_THESAURUS ('legal_form')
PL/SQL procedure successfully completed.
SCOTT@10gXE> EXEC CTX_THES.CREATE_RELATION ('legal_form', 'Limited', 'SYN', 'Ltd')
PL/SQL procedure successfully completed.
SCOTT@10gXE> COLUMN company_name FORMAT A30
SCOTT@10gXE> -- reproduction of problem:
SCOTT@10gXE> select company_name, score(1)
2 from test_table
3 where CONTAINS (company_NAME,
4 '<query>
5 <textquery lang="ENGLISH" grammar="CONTEXT">test ltd
6 <progression>
7 <seq><rewrite>transform((TOKENS, "SYN(", ",legal_form)", " "))</rewrite></seq>
8 <seq><rewrite>transform((TOKENS, "{", "}", " "))</rewrite></seq>
9 <seq><rewrite>transform((TOKENS, "!", "%", " "))</rewrite></seq>
10 <seq><rewrite>transform((TOKENS, "${", "}", " "))</rewrite></seq>
11 </progression>
12 </textquery>
13 <score datatype="INTEGER" algorithm="COUNT"/>
14 </query>',1)>0
15 /
select company_name, score(1)
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 7
SCOTT@10gXE> -- possible workaround:
SCOTT@10gXE> VARIABLE search_string VARCHAR2(30)
SCOTT@10gXE> EXEC :search_string := 'test ltd'
PL/SQL procedure successfully completed.
SCOTT@10gXE> select company_name, score(1)
2 from test_table
3 where CONTAINS (company_NAME,
4 '<query>
5 <textquery lang="ENGLISH" grammar="CONTEXT">
6 <progression>
7 <seq>' || 'SYN(' || REPLACE(:search_string, ' ', ',legal_form) AND SYN(') || ',legal_form)' || '</seq>
8 <seq>' || '{'    || REPLACE(:search_string, ' ', '} {')                 || '}'           || '</seq>
9 <seq>' || '!'    || REPLACE(:search_string, ' ', '% !')                 || '%'           || '</seq>
10 <seq>' || '${'   || REPLACE(:search_string, ' ', '} ${')                 || '}'           || '</seq>
11 </progression>
12 </textquery>
13 <score datatype="INTEGER" algorithm="COUNT"/>
14 </query>',1)>0
15 /
COMPANY_NAME                     SCORE(1)
Test Limited                           75
Test Ltd                               75
SCOTT@10gXE>
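If the same relaxation is needed in several queries, the REPLACE expressions above could be wrapped in a function. The following is only a sketch: relax_query is a hypothetical name, the thesaurus name legal_form is hard-coded, and it assumes the search string contains plain words with no Oracle Text reserved characters.

```sql
-- Hypothetical helper, not part of the original answer.
CREATE OR REPLACE FUNCTION relax_query (p_search IN VARCHAR2)
  RETURN VARCHAR2
IS
  -- Collapse runs of whitespace so REPLACE sees exactly one space between words.
  v_words VARCHAR2(4000) := TRIM (REGEXP_REPLACE (p_search, '[[:space:]]+', ' '));
BEGIN
  RETURN '<query><textquery lang="ENGLISH" grammar="CONTEXT"><progression>'
      || '<seq>SYN(' || REPLACE (v_words, ' ', ',legal_form) AND SYN(') || ',legal_form)</seq>'
      || '<seq>{'    || REPLACE (v_words, ' ', '} {')  || '}</seq>'
      || '<seq>!'    || REPLACE (v_words, ' ', '% !')  || '%</seq>'
      || '<seq>${'   || REPLACE (v_words, ' ', '} ${') || '}</seq>'
      || '</progression></textquery>'
      || '<score datatype="INTEGER" algorithm="COUNT"/></query>';
END relax_query;
/

-- Usage: same test table as above.
SELECT company_name, score(1)
FROM   test_table
WHERE  CONTAINS (company_name, relax_query ('test ltd'), 1) > 0;
```

Building the string in one place keeps the four stages in the same order for every caller, which matters because the order of the seq elements determines the score bands.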
