Progressive relaxation
I need to progressively relax my query, however, what I want is as follows:
If someone searched for say "Acrylic Crochet Throws" then
- the results with exact match should come on top
- followed by the match with stemming but all three words next to each other
- then a near search with max span of 15
- after this I want to repeat the above three conditions with combination of any two words i.e. "Acrylic Crochet", "Acrylic Throws" and "Crochet Throws"
Similarly, if the search string has 4 words, then it should first search for all 4 words, then combinations of 3 words, then combinations of two words.
Can someone guide me on how to achieve this?
Madhup
In the example below, I have written a user-defined function to return the sequences for the progressive query relaxation. I wrote it to handle groups of up to four words. If you want more, just add more nested loops to the function. I included examples of what the function returns. I demonstrated a query with a context index and contains, so that you can see from the scores that it works. However, I recommend that you use a ctxcat index and catsearch, as in the last query. There is a bug with context and contains not returning any rows if the first criterion of progressive query relaxation is not met and the optimizer uses the domain index.
SCOTT@10gXE> -- table and data:
SCOTT@10gXE> CREATE TABLE test_tab
2 (test_col VARCHAR2 (50))
3 /
Table created.
SCOTT@10gXE> INSERT ALL
2 INTO test_tab VALUES ('acrylic crochet throws')
3 INTO test_tab VALUES ('acrylic crochets throwing')
4 INTO test_tab VALUES ('acrylic word2 crochet word4 throws')
5 INTO test_tab VALUES ('acrylic crochet')
6 INTO test_tab VALUES ('acrylic throws')
7 INTO test_tab VALUES ('crochet throws')
8 INTO test_tab VALUES ('acrylics crocheting')
9 INTO test_tab VALUES ('acrylics throwing')
10 INTO test_tab VALUES ('crocheting throw')
11 INTO test_tab VALUES ('acrylic word2 crochet')
12 INTO test_tab VALUES ('acrylic word2 throws')
13 INTO test_tab VALUES ('crochet word2 throws')
14 SELECT * FROM DUAL
15 /
12 rows created.
SCOTT@10gXE> -- function:
SCOTT@10gXE> CREATE OR REPLACE FUNCTION seqs
2 (p_words IN VARCHAR2)
3 RETURN VARCHAR2
4 AS
5 v_words VARCHAR2 (32767) := LTRIM (RTRIM (p_words));
6 v_result VARCHAR2 (32767);
7 v_spaces INTEGER;
8 v_string VARCHAR2 (32767);
9 TYPE t_varchar2 IS TABLE OF VARCHAR2(255) INDEX BY BINARY_INTEGER;
10 t_words t_varchar2;
11 BEGIN
12 WHILE INSTR (v_words, '  ') > 0 LOOP
13 v_words := REPLACE (v_words, '  ', ' ');
14 END LOOP;
15 v_result := v_result || CHR(10) || '<seq>' || v_words || '</seq>';
16 v_result := v_result || CHR(10) || '<seq>$' || REPLACE (v_words, ' ', ' $') || '</seq>';
17 v_result := v_result || CHR(10) || '<seq>NEAR((' || REPLACE (v_words, ' ', ', ') || '), 15, TRUE)</seq>';
18 v_spaces := LENGTH (v_words) - LENGTH (REPLACE (v_words, ' ', ''));
19 IF v_spaces > 1 THEN
20 v_string := v_words || ' ';
21 FOR i IN 1 .. v_spaces + 1 LOOP
22 t_words (i) := SUBSTR (v_string, 1, INSTR (v_string, ' ') - 1);
23 v_string := SUBSTR (v_string, INSTR (v_string, ' ') + 1);
24 END LOOP;
25 FOR n IN REVERSE 1 .. v_spaces - 1 LOOP
26 FOR x IN 1 .. 3 LOOP
27 v_result := v_result || CHR(10) || '<seq>';
28 v_string := '';
29
30 FOR i IN 0 + 1 .. GREATEST (LEAST (v_spaces + 1 - n, v_spaces + 1), 0+1)
31 LOOP
32 IF n >= 1 THEN
33 FOR j IN i + 1 .. GREATEST (LEAST (v_spaces + 2 - n, v_spaces + 1), i+1)
34 LOOP
35 IF n >= 2 THEN
36 FOR k IN j + 1 .. GREATEST (LEAST (v_spaces + 3 - n, v_spaces + 1), j+1)
37 LOOP
38 IF n >= 3 THEN
39 FOR l IN k + 1 .. GREATEST (LEAST (v_spaces + 4 - n, v_spaces + 1), k+1)
40 LOOP
41 v_string := v_string || t_words(i) || ' ' || t_words(j) || ' ' || t_words(k) || ' ' || t_words(l);
42 v_string := v_string || CHR(10) || 'OR ';
43 END LOOP;
44 ELSE
45 v_string := v_string || t_words(i) || ' ' || t_words(j) || ' ' || t_words(k);
46 v_string := v_string || CHR(10) || 'OR ';
47 END IF;
48 END LOOP;
49 ELSE
50 v_string := v_string || t_words(i) || ' ' || t_words(j);
51 v_string := v_string || CHR(10) || 'OR ';
52 END IF;
53 END LOOP;
54 ELSE
55 v_string := v_string || t_words(i);
56 v_string := v_string || CHR(10) || 'OR ';
57 END IF;
58 END LOOP;
59
60 v_string := RTRIM (RTRIM (v_string, 'OR '), CHR(10));
61 IF x = 2 THEN
62 v_string := '$' || REPLACE (v_string, ' ', ' $');
63 ELSIF x = 3 THEN
64 v_string := 'NEAR(('
65 || REPLACE (REPLACE (v_string, ' ', ','), CHR(10) || 'OR,',
66 '), 15, TRUE)' || CHR(10) || 'OR NEAR((')
67 || '), 15, TRUE)';
68 END IF;
69 v_result := v_result || v_string;
70 v_result := v_result || '</seq>';
71 END LOOP;
72 END LOOP;
73 END IF;
74 RETURN LTRIM (v_result, CHR(10));
75 END seqs;
76 /
Function created.
SCOTT@10gXE> SHOW ERRORS
No errors.
SCOTT@10gXE> -- examples of what seqs function returns:
SCOTT@10gXE> VARIABLE g_words VARCHAR2(2000)
SCOTT@10gXE> EXEC :g_words := 'word1 word2 word3 word4'
PL/SQL procedure successfully completed.
SCOTT@10gXE> SELECT seqs (:g_words) FROM DUAL
2 /
SEQS(:G_WORDS)
<seq>word1 word2 word3 word4</seq>
<seq>$word1 $word2 $word3 $word4</seq>
<seq>NEAR((word1, word2, word3, word4), 15, TRUE)</seq>
<seq>word1 word2 word3
OR word1 word2 word4
OR word1 word3 word4
OR word2 word3 word4</seq>
<seq>$word1 $word2 $word3
OR $word1 $word2 $word4
OR $word1 $word3 $word4
OR $word2 $word3 $word4</seq>
<seq>NEAR((word1,word2,word3), 15, TRUE)
OR NEAR((word1,word2,word4), 15, TRUE)
OR NEAR((word1,word3,word4), 15, TRUE)
OR NEAR((word2,word3,word4), 15, TRUE)</seq>
<seq>word1 word2
OR word1 word3
OR word1 word4
OR word2 word3
OR word2 word4
OR word3 word4</seq>
<seq>$word1 $word2
OR $word1 $word3
OR $word1 $word4
OR $word2 $word3
OR $word2 $word4
OR $word3 $word4</seq>
<seq>NEAR((word1,word2), 15, TRUE)
OR NEAR((word1,word3), 15, TRUE)
OR NEAR((word1,word4), 15, TRUE)
OR NEAR((word2,word3), 15, TRUE)
OR NEAR((word2,word4), 15, TRUE)
OR NEAR((word3,word4), 15, TRUE)</seq>
SCOTT@10gXE> EXEC :g_words := 'Acrylic Crochet Throws'
PL/SQL procedure successfully completed.
SCOTT@10gXE> SELECT seqs (:g_words) FROM DUAL
2 /
SEQS(:G_WORDS)
<seq>Acrylic Crochet Throws</seq>
<seq>$Acrylic $Crochet $Throws</seq>
<seq>NEAR((Acrylic, Crochet, Throws), 15, TRUE)</seq>
<seq>Acrylic Crochet
OR Acrylic Throws
OR Crochet Throws</seq>
<seq>$Acrylic $Crochet
OR $Acrylic $Throws
OR $Crochet $Throws</seq>
<seq>NEAR((Acrylic,Crochet), 15, TRUE)
OR NEAR((Acrylic,Throws), 15, TRUE)
OR NEAR((Crochet,Throws), 15, TRUE)</seq>
SCOTT@10gXE> -- query with context index and contains
SCOTT@10gXE> -- (not recommended, just used to show score)
SCOTT@10gXE> CREATE INDEX test_idx1 ON test_tab (test_col)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 /
Index created.
SCOTT@10gXE> SELECT test_col, score (1)
2 FROM test_tab
3 WHERE CONTAINS (test_col,
4 '<query>
5 <textquery>
6 <progression>'
7 || seqs (:g_words)
8 || '</progression>
9 </textquery>
10 </query>',
11 1) > 0
12 /
TEST_COL SCORE(1)
acrylic crochet throws 84
acrylic crochets throwing 67
acrylic word2 crochet word4 throws 54
acrylic crochet 34
acrylic throws 34
crochet throws 34
acrylics crocheting 17
acrylics throwing 17
crocheting throw 17
acrylic word2 crochet 3
acrylic word2 throws 3
crochet word2 throws 3
12 rows selected.
SCOTT@10gXE> -- query with ctxcat index and catsearch (recommended):
SCOTT@10gXE> CREATE INDEX test_idx2 ON test_tab (test_col)
2 INDEXTYPE IS CTXSYS.CTXCAT
3 /
Index created.
SCOTT@10gXE> SELECT test_col
2 FROM test_tab
3 WHERE CATSEARCH (test_col,
4 '<query>
5 <textquery>
6 <progression>'
7 || seqs (:g_words)
8 || '</progression>
9 </textquery>
10 </query>',
11 NULL) > 0
12 /
TEST_COL
acrylic crochet throws
acrylic crochets throwing
acrylic word2 crochet word4 throws
acrylic crochet
acrylic throws
crochet throws
acrylics crocheting
acrylics throwing
crocheting throw
acrylic word2 crochet
acrylic word2 throws
crochet word2 throws
12 rows selected.
SCOTT@10gXE>
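To avoid repeating the template text in every query, the wrapping could be moved into a small function. This is only a sketch: the name relaxed_query is made up for illustration, and it assumes the seqs function above already exists in the same schema.

```sql
-- Hypothetical helper (not part of the original answer): wraps the output
-- of seqs in the query template so that callers pass only the search words.
CREATE OR REPLACE FUNCTION relaxed_query
  (p_words IN VARCHAR2)
  RETURN VARCHAR2
AS
BEGIN
  RETURN '<query><textquery><progression>'
         || seqs (p_words)
         || '</progression></textquery></query>';
END relaxed_query;
/
-- Usage with the ctxcat index created above:
SELECT test_col
FROM   test_tab
WHERE  CATSEARCH (test_col, relaxed_query ('Acrylic Crochet Throws'), NULL) > 0;
```

Keeping the template in one place means a change to the relaxation steps only has to be made in seqs.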
Similar Messages
-
Query using progressive relaxation take more time for execution
HI Gurus,
I am creating a query using context index and progressive relaxation
I had started using progressive relaxation after getting inputs from forum {thread:id=2333942} . Using progressive relaxation takes more than 7 seconds for every query. Is there any way we can improve the performance of the query?
create table test_sh4 (text1 clob,text2 clob,text3 clob);
begin
ctx_ddl.create_preference ('nd_mcd', 'multi_column_datastore');
ctx_ddl.set_attribute
('nd_mcd',
'columns',
'replace (text1, '' '', '''') nd1,
text1 text1,
replace (text2, '' '', '''') nd2,
text2 text2');
ctx_ddl.create_preference ('test_lex1', 'basic_lexer');
ctx_ddl.set_attribute ('test_lex1', 'whitespace', '/\|-_+');
ctx_ddl.create_section_group ('test_sg', 'basic_section_group');
ctx_ddl.add_field_section ('test_sg', 'text1', 'text1', true);
ctx_ddl.add_field_section ('test_sg', 'nd1', 'nd1', true);
ctx_ddl.add_field_section ('test_sg', 'text2', 'text2', true);
ctx_ddl.add_field_section ('test_sg', 'nd2', 'nd2', true);
end;
create index IX_test_sh4 on test_sh4 (text3) indextype is ctxsys.context parameters ('datastore nd_mcd lexer test_lex1 section group test_sg') ;
alter index IX_test_sh4 REBUILD PARAMETERS ('REPLACE SYNC (ON COMMIT)') ;-- sync index on every commit.
SELECT SCORE(1) score,t.* FROM test_sh4 t WHERE CONTAINS (text3, '
<query>
<textquery>
<progression>
<seq>{GIFT GRILL STAPLES CARD} within text1</seq>
<seq>{GIFTGRILLSTAPLESCARD} within nd1</seq>
<seq>{GIFT GRILL STAPLES CARD} within text2</seq>
<seq>{GIFTGRILLSTAPLESCARD} within nd2</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text1</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text2</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES%) or (%GRILL% and %STAPLES% and %CARD%) or (%GIFT% and %STAPLES% and %CARD%) or (%GIFT% and %GRILL% and %CARD%)) within text1</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES%) or (%GRILL% and %STAPLES% and %CARD%) or (%GIFT% and %STAPLES% and %CARD%) or (%GIFT% and %GRILL% and %CARD%)) within text2</seq>
<seq>((%STAPLES% and %CARD%) or (%GIFT% and %GRILL%) or (%GRILL% and %CARD%) or (%GIFT% and %CARD%) or (%GIFT% and %STAPLES%) or (%GRILL% and %STAPLES%)) within text1</seq>
<seq>((%STAPLES% and %CARD%) or (%GIFT% and %GRILL%) or (%GRILL% and %CARD%) or (%GIFT% and %CARD%) or (%GIFT% and %STAPLES%) or (%GRILL% and %STAPLES%)) within text2</seq>
<seq>((%GIFT% , %GRILL% , %STAPLES% , %CARD%)) within text1</seq>
<seq>((%GIFT% , %GRILL% , %STAPLES% , %CARD%)) within text2</seq>
<seq>((!GIFT and !GRILL and !STAPLES and !CARD)) within text1</seq>
<seq>((!GIFT and !GRILL and !STAPLES and !CARD)) within text2</seq>
<seq>((!GIFT and !GRILL and !STAPLES) or (!GRILL and !STAPLES and !CARD) or (!GIFT and !STAPLES and !CARD) or (!GIFT and !GRILL and !CARD)) within text1</seq>
<seq>((!GIFT and !GRILL and !STAPLES) or (!GRILL and !STAPLES and !CARD) or (!GIFT and !STAPLES and !CARD) or (!GIFT and !GRILL and !CARD)) within text2</seq>
<seq>((!STAPLES and !CARD) or (!GIFT and !GRILL) or (!GRILL and !CARD) or (!GIFT and !CARD) or (!GIFT and !STAPLES) or (!GRILL and !STAPLES)) within text1</seq>
<seq>((!STAPLES and !CARD) or (!GIFT and !GRILL) or (!GRILL and !CARD) or (!GIFT and !CARD) or (!GIFT and !STAPLES) or (!GRILL and !STAPLES)) within text2</seq>
<seq>((!GIFT , !GRILL , !STAPLES , !CARD)) within text1</seq>
<seq>((!GIFT , !GRILL , !STAPLES , !CARD)) within text2</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES and ?CARD)) within text1</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES and ?CARD)) within text2</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES) or (?GRILL and ?STAPLES and ?CARD) or (?GIFT and ?STAPLES and ?CARD) or (?GIFT and ?GRILL and ?CARD)) within text1</seq>
<seq>((?GIFT and ?GRILL and ?STAPLES) or (?GRILL and ?STAPLES and ?CARD) or (?GIFT and ?STAPLES and ?CARD) or (?GIFT and ?GRILL and ?CARD)) within text2</seq>
<seq>((?STAPLES and ?CARD) or (?GIFT and ?GRILL) or (?GRILL and ?CARD) or (?GIFT and ?CARD) or (?GIFT and ?STAPLES) or (?GRILL and ?STAPLES)) within text1</seq>
<seq>((?STAPLES and ?CARD) or (?GIFT and ?GRILL) or (?GRILL and ?CARD) or (?GIFT and ?CARD) or (?GIFT and ?STAPLES) or (?GRILL and ?STAPLES)) within text2</seq>
<seq>((?GIFT , ?GRILL , ?STAPLES , ?CARD)) within text1</seq>
<seq>((?GIFT , ?GRILL , ?STAPLES , ?CARD)) within text2</seq>
</progression>
</textquery>
<score datatype="FLOAT" algorithm="default"/>
</query>',1) >0 ORDER BY score(1) DESC
Progressive relaxation works best when you're only selecting a limited number of rows. If you fetch ALL the rows which satisfy the query, then all the steps in the relaxation will have to run regardless.
If you fetch - say - the first 10 results, then if the first step of the relaxation provides 10 results then there is no need to execute the next step (in fact, due to internal buffering, that won't be exactly true but it's conceptually correct).
The simplest way to do this is to reword the query as
SELECT * FROM (
SELECT SCORE(1) score,t.* FROM test_sh4 t WHERE CONTAINS (text3, '
<query>
<textquery>
</textquery>
<score datatype="FLOAT" algorithm="default"/>
</query>',1) >0 ORDER BY score(1) DESC
)
WHERE ROWNUM <= 10
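As a filled-in sketch of that top-N pattern, the query might look like the following. The table, column and section names come from the question; the choice of only two steps, and the trailing-wildcard form of the second step, are assumptions made to keep the example short.

```sql
-- Sketch only: order by score in the inner query, then limit the rows in
-- the outer query, so that later relaxation steps may not need to run.
SELECT *
FROM  (SELECT SCORE(1) score, t.*
       FROM   test_sh4 t
       WHERE  CONTAINS (text3,
                '<query>
                   <textquery>
                     <progression>
                       <seq>{GIFT GRILL STAPLES CARD} within text1</seq>
                       <seq>(GIFT% and GRILL% and STAPLES% and CARD%) within text1</seq>
                     </progression>
                   </textquery>
                   <score datatype="FLOAT" algorithm="default"/>
                 </query>', 1) > 0
       ORDER  BY SCORE(1) DESC)
WHERE  ROWNUM <= 10;
```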
You've discovered that leading wild cards don't work too well unless you use SUBSTRING_INDEX. I would encourage you to avoid them altogether if possible, or push them down much lower in the progressive relaxation. Usually, GIFT% is a useful expression (matches GIFTS, GIFTED, etc), %GIFT% is generally no more effective.
There are a lot of steps in your progressive relaxation. If you wanted to reduce the number of steps, you could change:
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text1</seq>
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text2</seq>
to
<seq>((%GIFT% and %GRILL% and %STAPLES% and %CARD%)*2) within text1 ACCUM ((%GIFT% and %GRILL% and %STAPLES% and %CARD%)) within text2</seq>
I don't know if this would have any performance benefits - but it's worth trying it to see.
-
Progressive Relaxation with error
Hi,
I tried to use progressive relaxation to calculate matching scores (between t1.album and t2.title) and store them into a table. However, after executing the following script, I got these errors:
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 34
ORA-06512: at line 26
And the outer cursor stopped at some point. I've been working on it for many days already, but I still can't figure it out...
Oh, one more question: what is 'c' in the statement 'for c in (......)'? I got this from http://www.oracle.com/technology/products/text/htdocs/prog_relax.html, but I don't understand what it does...
Thanks a lot!!!!!!
Here is the script:
DECLARE
max_rows integer := 300000;
counter integer := 0;
current_album t1.album%TYPE;
CURSOR album_cursor IS
SELECT distinct album FROM t1;
BEGIN
OPEN album_cursor;
LOOP
FETCH album_cursor INTO current_album;
EXIT WHEN album_cursor%NOTFOUND;
for c in (select score(1) scr, artist, Title from t2 where contains (Title, '
<query>
<textquery>'||'{'||current_album||'}'||'
<progression>
<seq><rewrite>transform((TOKENS, "{", "}", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", " "))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", "OR"))</rewrite></seq>
</progression>
</textquery>
</query>
', 1) > 0)
LOOP
counter := counter + 1;
INSERT INTO ALBUM_MATCHED(SEQNUM, SCORE, ARTIST, t2_TITLE, t1_ALBUM)
VALUES(counter, c.scr, c.artist, c.Title, current_album);
commit;
EXIT when counter >= max_rows;
END LOOP;
END LOOP;
CLOSE album_cursor;
END;
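On the side question about 'c': in a cursor FOR loop, the name after FOR is a record that PL/SQL declares implicitly, with one field per column of the query, so c.scr and c.Title refer to the current row. A minimal sketch, using made-up rows selected from dual rather than the tables in the post:

```sql
SET SERVEROUTPUT ON
BEGIN
  -- c is declared implicitly and exists only inside the loop;
  -- its fields take the column names or aliases of the query.
  FOR c IN (SELECT 1 AS seqnum, 'Harry Porter' AS title FROM dual
            UNION ALL
            SELECT 2, 'Madeline Porter' FROM dual)
  LOOP
    DBMS_OUTPUT.PUT_LINE (c.seqnum || ': ' || c.title);
  END LOOP;
END;
/
```

The loop opens, fetches from and closes the cursor on its own, so no OPEN, FETCH or CLOSE is needed for the inner query.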
**************************************************************************************************************************************************
Hi raford,
I also suspect that it's a problem caused by special characters. That's why I defined the following skipjoins for the matching column indexes:
exec CTX_DDL.CREATE_PREFERENCE ('spe_cha_lexer', 'BASIC_LEXER');
exec CTX_DDL.SET_ATTRIBUTE('spe_cha_lexer', 'SKIPJOINS' , '\,\&\=\{\}\\\(\)\[\]\-\;\~\|\$\!\>\*\%\_''\<\:\?\.\+\/\"@#');
create index t1_album_idx on t1 (album)
indextype is ctxsys.context
parameters ('lexer spe_cha_lexer wordlist wildcard_pref sync (on commit)');
create index t2_title_idx on t2 (title)
indextype is ctxsys.context
parameters ('lexer spe_cha_lexer wordlist wildcard_pref sync (on commit)');
Here is the table structures of t1 and t2, and resultset table album_matched:
create table t1(album varchar2(500));
create table t2(artist varchar2(500), title varchar2(4000));
create table album_matched(SEQNUM number, SCORE number, ARTIST varchar2(500), t2_TITLE varchar2(4000), t1_ALBUM varchar2(500));
Here is some sample data:
for t1:
Madeline Porter
Harry Porter
???? R & H ?????
mandy's candy
for t2.artist, you can put whatever. and for t2.title:
Harry Porter
Basically, you can put everything in t1.album and t2.artist and t2.title, since we downloaded these data from a website. Therefore, there might be a lot of special characters in it, some of which might be miscoded into english from French or some other languages.
And I found the scoring is not right too...both table contain 'Harry Porter', but the matching score has only 68...don't know why...
Thanks a lot!
-
Progressive relaxation matches ctxrule
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
"CORE 10.2.0.1.0 Production"
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
Is progressive relaxation supported by matches with ctxrule indexes?
No, you can't use progressive relaxation with CTXRULE.
-
Search inside sections with progressive relaxation
Hi,
I am trying to search for keywords within XML, so my search will be made on particular sections inside the XML. Say, I will search for the keyword Dog inside the tag <Animal>.
My questions are,
1. Is there any other way we can do it without using the within operator?
2. If we are using within, will i be able to apply the progressive relaxation in the search keywords?
3. Also, what is the use of SDATA. if i am using sdata, do i define all the tags in the XML (the tags in the XML are not generic and it may vary for each XML)?
Thanks in advance.
Regards,
Loganathan
1. Is there any other way we can do it without using the within operator?
Not that I know of. If you search without the within operator, then it would find the word in any tag.
2. If we are using within, will i be able to apply the progressive relaxation in the search keywords?
Yes.
3. Also, what is the use of SDATA. if i am using sdata, do i define all the tags in the XML (the tags in the XML are not generic and it may vary for each XML)?
Sdata is for structured data, not unstructured text data. The sdata needs to be in a separate column. If your tags are variable, then you can use ctxsys.auto_section_group.
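A rough sketch of how WITHIN and progressive relaxation could be combined with the automatic section group is shown below. The table xml_docs (id NUMBER, doc CLOB) and the index name are hypothetical, and the three relaxation steps (exact, stem, fuzzy) are just one possible choice. Section names in the query may need to match the case of the XML tags.

```sql
-- Hypothetical table: xml_docs (id NUMBER, doc CLOB), one XML document per row.
CREATE INDEX xml_docs_idx ON xml_docs (doc)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('SECTION GROUP CTXSYS.AUTO_SECTION_GROUP');
-- Exact word first, then stemmed, then fuzzy,
-- each restricted to the <Animal> tag.
SELECT SCORE(1), id
FROM   xml_docs
WHERE  CONTAINS (doc,
         '<query>
            <textquery>
              <progression>
                <seq>Dog WITHIN Animal</seq>
                <seq>$Dog WITHIN Animal</seq>
                <seq>?Dog WITHIN Animal</seq>
              </progression>
            </textquery>
          </query>', 1) > 0;
```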
A lot of what you are asking about was discussed and demonstrated in the following recent thread:
Query Templates and WITHIN
-
Please help explain strange behaviour of progressive relaxation..
Hello
I am building an Oracle Text query and I stumbled on a behaviour which I cannot explain...
here is the topo:
I have created a synonym thusly:
ctx_thes.create_relation('GR_THESAURUS','LEBOURGNEUF','SYN','BOURGNEUF');
when I issue this select:
SELECT SCORE(1) NIVEAU_RECHERCHE, NOM,NO_MATRC,NO_NOM_REGST
FROM DUMMY
WHERE CONTAINS(nom, 'SYN(LEBOURGNEUF, GR_THESAURUS)', 1) > 0 ;
I get the expected result set, which contains entries with either 'LEBOURGNEUF' or 'BOURGNEUF'. So far so good.
Now, when I combine search criteria using progressive relaxation, thus:
SELECT SCORE(1) NIVEAU_RECHERCHE, DUMMY.NO_MATRC,DUMMY.NO_NOM_REGST,DUMMY.NOM
FROM DUMMY
WHERE CONTAINS(nom, '<query>
<textquery lang="FRENCH" grammar="CONTEXT">LEBOURGNEUF golf
<progression>
<seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="DEFAULT"/>
</query>', 1) > 0;
I do NOT get any results matching the synonym 'BOURGNEUF'.. only those with 'LEBOURGNEUF' are returned...
further, if I intentionally make a syntax error in the line
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
say like this:
<seq><rewrite>transform((TOKENS, "xSYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
no error is returned and I get the same result set...
so this leads me to conclude that only the first two lines of the query are parsed/executed...
does anyone here have any idea what is going on here?
in the preceding query I need to add
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "SYN(", ",GR_THESAURUS)", "OR"))</rewrite></seq>
and possibly
<seq><rewrite>transform((TOKENS, "NT(", ",2,GR_THESAURUS)", "OR"))</rewrite></seq>
can this be done???
why is there no errors when I execute the query (in sqldeveloper) ???
any hints will be greatly appreciated!
Cheers
Edited by: user8848610 on 2009-10-29 07:46
now it works... although the simpler and cleaner solution of using transform would have been perfect, this function does a somewhat adequate job... I put it here so maybe it will help others ;)
FUNCTION BuildSearchPredicate (texte IN NOM_ASSJT.NOM%TYPE) RETURN VARCHAR2 IS
sSQL VARCHAR2(5000);
sSeq1 VARCHAR2(1000);
sSeq2 VARCHAR2(1000);
sSeq3 VARCHAR2(1000);
iFirst NUMBER(1);
iPosition NUMBER;
iToken NUMBER;
CURSOR curWords(line_text IN VARCHAR2) IS
select regexp_substr(line_text, '[^ ]+', 1, level) word
from dual
connect by regexp_substr(line_text, '[^ ]+', 1, level) is not null;
BEGIN
sSQL := sSQL || '<query><textquery lang="FRENCH" grammar="CONTEXT"><progression> ';
sSeq1 := '';
iFirst := 1;
iToken := 0;
FOR r_curWord IN curWords(texte)
LOOP
iToken := iToken + 1;
IF iFirst = 0 THEN
sSeq1 := sSeq1 || ' AND {' || trim(r_curWord.word) || '}';
sSeq2 := sSeq2 || ' OR {' || trim(r_curWord.word) || '}';
sSeq3 := sSeq3 || ' OR SYN(' || trim(r_curWord.word) || ',GR_THESAURUS)';
ELSE
sSeq1 := '{' || trim(r_curWord.word) || '}';
sSeq2 := '{' || trim(r_curWord.word) || '}';
sSeq3 := 'SYN(' || trim(r_curWord.word) || ',GR_THESAURUS)';
iFirst := 0;
END IF;
END LOOP;
sSQL := sSQL || '<seq>' || sSeq1 || '</seq>';
iPosition := instr(sSeq1, ' AND ');
IF instr(sSeq1, ' AND ',iPosition + 1) > 0 THEN -- we must have at least 2 AND operator for this to make sense
WHILE iPosition > 0
LOOP
IF instr(substr(sSeq1, iPosition + 5), ' AND ') > 0 THEN
sSQL := sSQL || '<seq>' || substr(sSeq1, iPosition + 5) || '</seq>';
END IF;
iPosition := instr(sSeq1, ' AND ', iPosition + 1);
END LOOP;
END IF;
IF iToken > 1 THEN -- no use in having OR if there is only one word
sSQL := sSQL || '<seq>' || sSeq2 || '</seq>';
END IF;
sSQL := sSQL || '<seq>' || sSeq3 || '</seq>';
RETURN sSql;
END BuildSearchPredicate;
END GR_RECH;
this will combine search words using AND and OR and SYN like so:
SELECT GR_RECH.BuildSearchPredicate('LEBOURGNEUF GOLF') FROM DUAL;
will result in :
<query><textquery lang="FRENCH" grammar="CONTEXT"><progression> <seq>{LEBOURGNEUF} AND {GOLF}</seq><seq>{LEBOURGNEUF} OR {GOLF}</seq><seq>SYN(LEBOURGNEUF,GR_THESAURUS) OR SYN(GOLF,GR_THESAURUS)</seq>
thanks for your help!
cheers
gth
-
Oracle Text progressive relaxation
hello,
We're in the process of evaluating the Oracle Text search engine. So far so good, until yesterday, when we added synonyms to our progressive search criteria and it stopped working, depending on where we place the synonym search. If we place it first, everything else stops working (stemming, fuzzy...). If we place it last, then the synonym search stops working. I saw a reference to a bug in this conference that seemed similar to the problem; I believe it mentioned that it had been fixed in 10.2.0.3 (this is the version we're on).
The following is a sample of the PL/SQL code we're executing
select score(1), nm_resource, ADDR_RSRC_ST_LN_1, id_resource, ADDR_RSRC_CITY FROM caps_resource where
CONTAINS (nm_resource,
'<query>
<textquery lang="ENGLISH" grammar="CONTEXT">' || res_name ||
'<progression>
<seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", "AND"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "?{", "}", "OR"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "ACCUM"))</rewrite></seq>
<seq><rewrite>transform((TOKENS, "{", "}", "NEAR"))</rewrite></seq>
<seq>' || 'SYN(' || REPLACE('' || res_name || '', ' ', ',IMPACT_tst) AND SYN(') || ',IMPACT_tst)' || '</seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="default"/>
</query>', 1)>0
Here is a suggested alternative. I have used a syns function that I wrote for another user on another thread, which checks for all combinations of words that could amount to a synonym, up to the maximum number of words that you can specify as an input parameter. I have then used that in a separate contains clause and combined the score results, using the score derived from the synonym only when there is no score from the progressive rewrites, ordering by the progressive rewrites first. Also notice that the ordering must be done in an inner sub-query, then the rows limited in an outer sub-query. Otherwise it can select the first 100 rows, then order them, instead of the other way around, which can produce an entirely different result set.
SCOTT@orcl_11g> CREATE TABLE caps_resource
2 (nm_resource VARCHAR2 (30))
3 /
Table created.
SCOTT@orcl_11g> INSERT ALL
2 INTO caps_resource VALUES ('Delagarza,Lorenzo')
3 INTO caps_resource VALUES ('Diana De La Garza')
4 INTO caps_resource VALUES ('De La Garza,Fred')
5 INTO caps_resource VALUES ('somebody else')
6 SELECT * FROM DUAL
7 /
4 rows created.
SCOTT@orcl_11g> CREATE INDEX your_index ON caps_resource (nm_resource)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 /
Index created.
SCOTT@orcl_11g> BEGIN
2 CTX_THES.CREATE_THESAURUS ('impact_tst');
3 CTX_THES.CREATE_RELATION ('impact_tst', 'Delagarza', 'SYN', 'De La Garza');
4 END;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> create or replace function syns
2 (p_words in varchar2,
3 p_thes in varchar2,
4 p_num in number default 3) -- maximum number of words per synonym phrase
5 return varchar2
6 as
7 v_words_in varchar2 (32767) := ltrim (p_words) || ' ';
8 v_words_out varchar2 (32767);
9 begin
10 while instr (v_words_in, '  ') > 0 loop
11 v_words_in := replace (v_words_in, '  ', ' ');
12 end loop;
13 while length (v_words_in) > 1
14 loop
15 for i in reverse 1 .. least (p_num, (length (v_words_in) - length (replace (v_words_in, ' ', ''))))
16 loop
17 if instr (ctx_thes.syn (substr (v_words_in, 1, instr (v_words_in, ' ', 1, i) - 1), p_thes), '|') > 0
18 or i = 1 then
19 v_words_out := v_words_out
20 || ' AND ('
21 || ctx_thes.syn (substr (v_words_in, 1, instr (v_words_in, ' ', 1, i) - 1), p_thes)
22 || ')';
23 v_words_in := substr (v_words_in, instr (v_words_in, ' ', 1, i) + 1);
24 exit;
25 end if;
26 end loop;
27 end loop;
28 return ltrim (v_words_out, ' AND ');
29 end syns;
30 /
Function created.
SCOTT@orcl_11g> show errors
No errors.
SCOTT@orcl_11g> VARIABLE res_name VARCHAR2 (100)
SCOTT@orcl_11g> EXEC :res_name := 'De La Garza'
PL/SQL procedure successfully completed.
SCOTT@orcl_11g> SELECT syns (:res_name, 'impact_tst') FROM DUAL
2 /
SYNS(:RES_NAME,'IMPACT_TST')
({DE LA GARZA}|{DELAGARZA})
SCOTT@orcl_11g> SELECT the_score, nm_resource
2 FROM (select DECODE (score(1), 0, SCORE(2), SCORE(1)) AS the_score, nm_resource
3 FROM caps_resource
4 where CONTAINS (nm_resource,
5 '<query>
6 <textquery lang="ENGLISH" grammar="CONTEXT">' || :res_name ||
7 '<progression>
8 <seq><rewrite>transform((TOKENS, "{", "}", "AND"))</rewrite></seq>
9 <seq><rewrite>transform((TOKENS, "?{", "}", "AND"))</rewrite>/seq>
10 <seq><rewrite>transform((TOKENS, "{", "}", "OR"))</rewrite></seq>
11 <seq><rewrite>transform((TOKENS, "?{", "}", "OR"))</rewrite>/seq>
12 <seq><rewrite>transform((TOKENS, "{", "}", "ACCUM"))</rewrite></seq>
13 <seq><rewrite>transform((TOKENS, "{", "}", "NEAR"))</rewrite></seq>
14 </progression>
15 </textquery>
16 <score datatype="INTEGER" algorithm="default"/>
17 </query>', 1) > 0
18 OR CONTAINS (nm_resource, syns (:res_name, 'impact_tst'), 2) > 0
19 ORDER BY SCORE (1) DESC, SCORE (2) DESC)
20 WHERE ROWNUM < 100
21 /
THE_SCORE NM_RESOURCE
76 Diana De La Garza
76 De La Garza,Fred
5 Delagarza,Lorenzo
SCOTT@orcl_11g>
-
Progressive relaxation doesn't progress
I just discovered that in a contains() query with a <progression> tag and multiple <seq> conditions, the query does not return any results (ie, does not evaluate any subsequent conditions) if the first condition fails (ie, returns no rows).
Is this the correct behavior? It seems like a bug to me. I don't see it mentioned in the documentation anywhere.
Thanks,
Rory
If you use a ctxcat index and catsearch, instead of a context index and contains, the optimizer uses the domain index and returns the correct results quickly, as shown in the comparison below. The only bad thing about catsearch is that, as far as I know, the score function doesn't work with it.
SCOTT@10gXE> CREATE TABLE presidents
2 (id NUMBER,
3 name VARCHAR2(60))
4 /
Table created.
SCOTT@10gXE> INSERT INTO presidents VALUES (1, 'William Jefferson Clinton')
2 /
1 row created.
SCOTT@10gXE> BEGIN
2 FOR i IN 1 .. 40 LOOP
3 INSERT INTO presidents
4 SELECT object_id, object_name
5 FROM all_objects;
6 END LOOP;
7 END;
8 /
PL/SQL procedure successfully completed.
SCOTT@10gXE> SELECT COUNT(*) FROM presidents
2 /
COUNT(*)
481081
SCOTT@10gXE> CREATE INDEX presidents_idx
2 ON presidents (name)
3 INDEXTYPE IS CTXSYS.CONTEXT
4 /
Index created.
SCOTT@10gXE> EXEC DBMS_STATS.GATHER_TABLE_STATS ('SCOTT', 'PRESIDENTS')
PL/SQL procedure successfully completed.
SCOTT@10gXE> SET TIMING ON
SCOTT@10gXE> SET AUTOTRACE ON EXPLAIN
SCOTT@10gXE> select id, name
2 from presidents
3 where contains(name,
4 '<query>
5 <textquery>
6 <progression>
7 <seq>{William} {Clinton}</seq>
8 <seq>{William} ; {Clinton}</seq>
9 </progression>
10 </textquery>
11 </query>',1) <> 0
12 /
ID NAME
1 William Jefferson Clinton
Elapsed: 00:01:35.74
Execution Plan
Plan hash value: 3740813417
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 23933 | 514K| 746 (35)| 00:00:09 |
|* 1 | TABLE ACCESS FULL| PRESIDENTS | 23933 | 514K| 746 (35)| 00:00:09 |
Predicate Information (identified by operation id):
1 - filter("CTXSYS"."CONTAINS"("NAME",'<query>
<textquery> <progression>
<seq>{William} {Clinton}</seq> <seq>{William} ;
{Clinton}</seq> </progression>
</textquery> </query>',1)<>0)
SCOTT@10gXE> SET TIMING OFF
SCOTT@10gXE> SET AUTOTRACE OFF
SCOTT@10gXE> DROP INDEX presidents_idx
2 /
Index dropped.
SCOTT@10gXE> CREATE INDEX presidents_idx
2 ON presidents (name)
3 INDEXTYPE IS CTXSYS.CTXCAT
4 /
Index created.
SCOTT@10gXE> EXEC DBMS_STATS.GATHER_TABLE_STATS ('SCOTT', 'PRESIDENTS')
PL/SQL procedure successfully completed.
SCOTT@10gXE> SET TIMING ON
SCOTT@10gXE> SET AUTOTRACE ON EXPLAIN
SCOTT@10gXE> select id, name
2 from presidents
3 where catsearch(name,
4 '<query>
5 <textquery>
6 <progression>
7 <seq>{William} {Clinton}</seq>
8 <seq>{William} ; {Clinton}</seq>
9 </progression>
10 </textquery>
11 </query>',null) > 0
12 /
ID NAME
1 William Jefferson Clinton
Elapsed: 00:00:01.94
Execution Plan
Plan hash value: 777849224
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 24160 | 519K| 486 (1)| 00:00:06 |
| 1 | TABLE ACCESS BY INDEX ROWID| PRESIDENTS | 24160 | 519K| 486 (1)| 00:00:06 |
|* 2 | DOMAIN INDEX | PRESIDENTS_IDX | | | | |
Predicate Information (identified by operation id):
2 - access("CTXSYS"."CATSEARCH"("NAME",'<query> <textquery>
<progression> <seq>{William}
{Clinton}</seq> <seq>{William} ; {Clinton}</seq>
</progression> </textquery>
</query>',NULL)>0)
SCOTT@10gXE> -
Progressive relaxation doesn't execute all sequences
I'm trying to execute the following query:
select * from UNIMI_GA.ENTITA_RC entitarc0_ where CONTAINS(entitarc0_.VALORE_INDICIZZATO, '
<query><textquery lang="ITALIAN" grammar="CONTEXT">
<progression>
<seq>$esame NEAR $microbiologiche</seq>
<seq>$esame AND $microbiologiche</seq>
<seq>$esame ACCUM $microbiologiche</seq>
<seq>?esame ACCUM ?microbiologiche</seq>
</progression>
</textquery><score datatype="INTEGER" algorithm="DEFAULT"/></query>
', 1)>0
Scenario 1
No record has a VALORE_INDICIZZATO field that contains both "esame" and "microbiologiche" (NEAR/AND, in exact or stemmed form).
But there are loads of records where that field contains either "esame" or "microbiologiche".
I would expect those records to be returned by this query, but this does not happen. The only way I have found to get what I want is to delete the first 2 seq nodes in the XML (those containing the NEAR and AND operators).
Scenario 2
If I add a new record with both "esame" and "microbiologiche" in VALORE_INDICIZZATO and execute the query again, the query returns the newly inserted record and all the records that contain "esame" or "microbiologiche".
Is this behaviour correct?
Thanks
Davide
I believe bug 5060137 was introduced in 10.2.0.1 and fixed in the 10.2.0.3 patch set. I don't have access to Metalink, so I can't tell you exactly where to find the patch set, just that others have found it and used it to fix the problem. I imagine someone on Metalink can help you locate it.
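The behaviour Davide expected in Scenario 1 can be illustrated with a toy model. The sketch below is plain Python, not Oracle Text: the function name `progressive_relaxation`, the sample documents and the lambda predicates are all made up for illustration, and the two predicates only stand in for the AND and ACCUM steps of the template. It shows the intended semantics, where each step is tried in order and a row is reported at the first step that matches it, so rows that match only a later step should still come back.

```python
def progressive_relaxation(docs, sequences):
    """Toy model: try each step in order; report a doc at the first step it matches."""
    results = []
    seen = set()
    for step, matches in enumerate(sequences, start=1):
        for doc_id, text in docs.items():
            if doc_id not in seen and matches(text.lower().split()):
                seen.add(doc_id)
                results.append((step, doc_id))
    return results


# Hypothetical sample rows (not taken from the poster's table).
docs = {
    1: "esame del sangue",
    2: "analisi microbiologiche",
    3: "esame e analisi microbiologiche",
}

# Stand-ins for two steps of the template: both words, then either word.
sequences = [
    lambda words: "esame" in words and "microbiologiche" in words,
    lambda words: "esame" in words or "microbiologiche" in words,
]

hits = progressive_relaxation(docs, sequences)
print(hits)  # [(1, 3), (2, 1), (2, 2)]
```

If document 3 is removed from this model, documents 1 and 2 are still returned at step 2, which is the result the poster expected from the real query.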
-
Progression not yielding the desired result
Hi
I have written a text query using the progressive relaxation method. It is not giving me the desired results. Here are the query details:
I have created an Intermedia index on a table with following specs:
BEGIN
CTX_DDL.DROP_PREFERENCE('CTXSYS.COMPANY_SEARCH_MULTI');
CTX_DDL.CREATE_PREFERENCE('CTXSYS.COMPANY_SEARCH_MULTI', 'MULTI_COLUMN_DATASTORE');
CTX_DDL.SET_ATTRIBUTE( 'CTXSYS.COMPANY_SEARCH_MULTI',
'columns',
'COMPANY,
DESC_N_PRODS,
PROD_DESC_N_PRODS,
PG_TITLE_GLUSR,
PG_KWD_DESC,
GEOGRAPHICAL_PROFILE,
GLUSR_DESC,
SUBCAT_DESC,
CTL_DESC,
SHORT_PROFILE,
LONG_PROFILE');
CTX_DDL.DROP_SECTION_GROUP('CTXSYS.COMPANY_SEARCH_GROUP');
CTX_DDL.CREATE_SECTION_GROUP('CTXSYS.COMPANY_SEARCH_GROUP', 'BASIC_SECTION_GROUP');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F1', 'COMPANY');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F2', 'DESC_N_PRODS');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F3', 'PROD_DESC_N_PRODS');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F4', 'PG_TITLE_GLUSR');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F5', 'PG_KWD_DESC');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F6', 'GEOGRAPHICAL_PROFILE');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F7', 'GLUSR_DESC');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F8', 'SUBCAT_DESC');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F9', 'CTL_DESC');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F10','SHORT_PROFILE');
CTX_DDL.ADD_FIELD_SECTION('CTXSYS.COMPANY_SEARCH_GROUP' , 'F11','LONG_PROFILE');
CTX_DDL.DROP_PREFERENCE('CTXSYS.IIL_LEXER');
CTX_DDL.CREATE_PREFERENCE('CTXSYS.IIL_LEXER','BASIC_LEXER');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_LEXER', 'INDEX_STEMS', 'ENGLISH');
CTX_DDL.DROP_PREFERENCE('CTXSYS.IIL_FUZZY_PREF');
CTX_DDL.CREATE_PREFERENCE('CTXSYS.IIL_FUZZY_PREF', 'BASIC_WORDLIST');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','FUZZY_MATCH','ENGLISH');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','FUZZY_SCORE','60');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','FUZZY_NUMRESULTS','100');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','SUBSTRING_INDEX','TRUE');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','PREFIX_INDEX','TRUE');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','PREFIX_MIN_LENGTH','1');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','PREFIX_MAX_LENGTH','3');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','WILDCARD_MAXTERMS','15000');
CTX_DDL.SET_ATTRIBUTE('CTXSYS.IIL_FUZZY_PREF','STEMMER','ENGLISH');
END;
CREATE INDEX COMPANY_SEARCH_IM on COMPANY_SEARCH(DUMMY) INDEXTYPE IS
CTXSYS.CONTEXT PARAMETERS
('DATASTORE CTXSYS.COMPANY_SEARCH_MULTI SECTION GROUP CTXSYS.COMPANY_SEARCH_GROUP MEMORY 50M
LEXER CTXSYS.IIL_LEXER WORDLIST CTXSYS.IIL_FUZZY_PREF STOPLIST CTXSYS.IIL_STOPLIST');
Now if I want to search for a string - acrylic crochet
My progressive clause is as follows:
<QUERY>
<TEXTQUERY>
<PROGRESSION>
<SEQ>(acrylic crochet) within F2</SEQ>
<SEQ>($acrylic $crochet) within F2</SEQ>
<SEQ>(acrylic crochet) within F3</SEQ>
<SEQ>($acrylic $crochet) within F3</SEQ>
<SEQ>(NEAR((acrylic,crochet))) within F2</SEQ>
</PROGRESSION>
</TEXTQUERY>
</QUERY>
The data set has a record where F2 Contains following text:
Manufacturers and exporters of yarns like acrylic yarn, viscose yarns, acrylic blended yarn, acrylic knitting yarn, spun yarn, blended yarns, braided thread, chenille yarn, cotton yarn, crochet yarn, dupion silk yarns etc
My problem is that this record does not appear in the search results.
The record starts appearing if I use only the NEAR clause, as shown below:
<QUERY>
<TEXTQUERY>
<PROGRESSION>
<SEQ>(NEAR((acrylic,crochet))) within F2</SEQ>
</PROGRESSION>
</TEXTQUERY>
</QUERY>
Please advise what could be wrong: is my index set up properly, does my progressive clause have a problem, or is there something else that I have totally missed?
Regards
Madhup
The discussion in the link below covers the same bug that you have encountered and some workarounds.
Re: progressive relaxation doesn't progress -
Order of words, fuzzy and utl_match
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
"CORE 10.2.0.1.0 Production"
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
create table category(cat_id number(20),cat_type varchar2(3000));
create table category_match(cat_id number(20),cat_type varchar2(3000));
Insert into category (CAT_ID,CAT_TYPE) values (12790,'AUTO CONSULTANTS');
INSERT INTO CATEGORY (CAT_ID,CAT_TYPE) VALUES (23803,'AUTO CONSULTANT');
Insert into category (CAT_ID,CAT_TYPE) values (23804,'CONSULTANT FOR AUTO FINANCE');
Insert into category_match (CAT_ID,CAT_TYPE) values (12790,'AUTO CONSULTANTS');
INSERT INTO CATEGORY_match (CAT_ID,CAT_TYPE) VALUES (23803,'AUTO CONSULTANT');
Insert into category_match (CAT_ID,CAT_TYPE) values (23804,'CONSULTANT FOR AUTO FINANCE');
CREATE INDEX "LOOKING4"."MYINDEX" ON "CATEGORY_MATCH"
("CAT_TYPE")
INDEXTYPE IS "CTXSYS"."CONTEXT" ;
CREATE INDEX "LOOKING4"."CAT_TYPE_IDX" ON "CATEGORY"
("CAT_TYPE")
INDEXTYPE IS "CTXSYS"."CTXCAT" ;
select cat_id,CAT_TYPE,UTL_MATCH.edit_distance_similarity(CAT_TYPE,'AUTO CONSULTANT') from (
select * from category where catsearch(cat_type,
'<query>
<textquery grammar="context">
<progression>
<seq>auto consultant</seq>
<seq>?(auto) and ?(consultant)</seq>
</progression>
</textquery>
</query>'
,NULL)>0
)where rownum<5
23803 AUTO CONSULTANT 100
12790 AUTO CONSULTANTS 94
23804 CONSULTANT FOR AUTO FINANCE 26
update category set cat_type='CONSULTANTS AUTO' WHERE CAT_ID=12790
select cat_id,CAT_TYPE,UTL_MATCH.edit_distance_similarity(CAT_TYPE,'AUTO CONSULTANT') from (
select * from category where catsearch(cat_type,
'<query>
<textquery grammar="context">
<progression>
<seq>auto consultant</seq>
<seq>?(auto) and ?(consultant)</seq>
</progression>
</textquery>
</query>'
,NULL)>0
)where rownum<5
23803 AUTO CONSULTANT 100
12790 CONSULTANTS AUTO 32
23804 CONSULTANT FOR AUTO FINANCE 26
select score(1),cat_id,cat_type from CATEGORY_MATCH where cat_id in(
select cat_id from category where catsearch(cat_type,
'<query>
<textquery grammar="context">
<progression>
<seq>auto consultant</seq>
<seq>?(auto) and ?(consultant)</seq>
</progression>
</textquery>
</query>'
,NULL)>0) AND
contains(cat_type,'?(auto) and ?(consultant)',1)>0
9 23803 AUTO CONSULTANT
9 12790 AUTO CONSULTANTS
9 23804 CONSULTANT FOR AUTO FINANCE
I have been using catsearch for progressive relaxation.
There are many "cat_types" like those of "cat_id" 23803 and 12790, where only the order of the words changes.
There are up to 10 words in each row of the "cat_types" column.
Among others, I have referred to
Achieving functionality of many preferences using one context index
and
Re: Fuzzy search - more accurate score??
There is very little chance of a word being repeated within a row.
UTL_MATCH seems to work perfectly only when the words appear in the same order.
If you can suggest a way to get a very close score for cat_id 23803 and 12790, it would be much appreciated.
Thanks and regards
select *
FROM (SELECT score(1),score(2),score(3),score(4),GREATEST (SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3) g_scores,
UTL_MATCH.EDIT_DISTANCE_SIMILARITY (CAT_TYPE,'AUTO CONSULTANT') EDS,
CAT_ID, CAT_TYPE
FROM category_match
WHERE CONTAINS (cat_type, 'solar water heater* 10 * 10', 1) > 0
OR CONTAINS (cat_type, 'NEAR ((?solar, ?water ,?heater), 0, TRUE) * 10 * 10', 2) > 0
OR CONTAINS (cat_type, 'NEAR ((?solar, ?water ,?heater), 0, FALSE) * 10 * 10', 3) > 0
or CONTAINS (CAT_TYPE, '(?solar AND ?water AND ?heater) * 10 * 10', 4) > 0
order by g_scores desc, EDS desc)
WHERE ROWNUM<100
100 100 100 100 100 23 4 SOLAR WATER HEATER-ANU
100 100 100 100 100 22 26901 SOLAR WATER HEATER SUDARSHAN SAUR
100 100 100 100 100 21 30 SOLAR WATER HEATER INDUSTRIAL
100 100 100 100 100 20 17379 SOLAR WATER HEATER DEALERS-TATA
100 100 100 100 100 20 26906 SOLAR WATER HEATER NUETECH
100 100 100 100 100 20 11465 SOLAR WATER HEATER DEALERS-ANU
100 100 100 100 100 20 21 SOLAR WATER HEATER-ZING TATA BP
100 100 100 100 100 20 11463 SOLAR WATER HEATER MANUFACTURERS-ANU
100 100 100 100 100 19 8 SOLAR WATER HEATER MANUFACTURERS
100 100 100 100 100 19 23 SOLAR WATER HEATER EVACUATED TUBE
100 100 100 100 100 19 49 SOLAR WATER HEATER-HOTMAX NOVA TATA BP
100 100 100 100 100 19 13357 SOLAR WATER HEATER INDUSTRIAL DEALERS
100 100 100 100 100 18 16300 SOLAR WATER HEATER-TECHNOMAX
100 100 100 100 100 18 9 SOLAR WATER HEATER DEALERS-TATA BP
100 100 100 100 100 18 20 SOLAR WATER HEATER-ZING
100 100 100 100 100 18 18 SOLAR WATER HEATER-ORB SOLAR
100 100 100 100 100 18 22552 SOLAR WATER HEATER-KOTAK URJA
100 100 100 100 100 18 26908 SOLAR WATER HEATER SUPREME
100 100 100 100 100 17 26907 SOLAR WATER HEATER TECHNOMAX"
100 100 100 100 100 17 13322 SOLAR WATER HEATER DISTRIBUTORS
100 100 100 100 100 17 22 SOLAR WATER HEATER-ETC TATA BP
100 100 100 100 100 17 48 SOLAR WATER HEATER-VAJRA PLUS TATA BP
100 100 100 100 100 17 27084 SOLAR WATER HEATER SALES
100 100 100 100 100 16 16236 SOLAR WATER HEATER DEALERS-RACOLD
100 100 100 100 100 16 15 SOLAR WATER HEATER-NUTECH
100 100 100 100 100 16 1 SOLAR WATER HEATER DEALERS
100 100 100 100 100 15 2 SOLAR WATER HEATER DEALERS-TATA BP SOLAR
100 100 100 100 100 15 31 SOLAR WATER HEATER DOMESTIC
100 100 100 100 100 15 13 SOLAR WATER HEATER DEALERS-V GUARD
100 100 100 100 100 14 17 SOLAR WATER HEATER-KAMAL SOLAR
100 100 100 100 100 13 11467 SOLAR WATER HEATER DEALERS-GILMA
100 100 100 100 100 13 19 SOLAR WATER HEATER-GILMA
100 100 100 100 100 13 10 SOLAR WATER HEATER REPAIRS & SERVICES-TATA SOLAR
100 100 100 100 100 12 10578 SOLAR WATER HEATER
100 100 100 100 100 11 3 SOLAR WATER HEATER REPAIRS & SERVICES
0 0 100 100 98 25 10120 WATER HEATER SOLAR INDUSTRIAL
0 0 100 100 98 20 12953 WATER HEATER SOLAR-RACOLD
0 0 100 100 98 17 10119 WATER HEATER SOLAR RESIDENCIAL
{code}
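The g_scores column in the output above follows directly from the GREATEST expression in the query. Here is a small Python sketch of that arithmetic (the function name `combined_score` is illustrative and not part of the original query): each later, looser CONTAINS clause has a larger penalty subtracted from its score, so a row matched by a stricter clause ranks ahead of a row matched only by the looser ones.

```python
def combined_score(scores):
    """Mirror GREATEST(SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3)."""
    return max(score - penalty for penalty, score in enumerate(scores))


# A row matched by all four clauses, e.g. the SOLAR WATER HEATER rows.
print(combined_score([100, 100, 100, 100]))  # 100

# A row matched only by clauses 3 and 4, e.g. WATER HEATER SOLAR INDUSTRIAL.
print(combined_score([0, 0, 100, 100]))  # 98
```

Because every row in the first group gets the same combined score of 100, the order inside that group is decided only by the secondary sort key (EDS in the query above).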
the query is working accurately technically
but is there any way to get 10578 on top
the requirement is
---first
solar water heater
solar water heater dealers
solar water heater manufacturers
solar water heater distributors
solar water heater sales
solar water heater repairs and servicing
---followed by
SOLAR WATER HEATER REPAIRS & SERVICES-TATA SOLAR
SOLAR WATER HEATER-KAMAL SOLAR
SOLAR WATER HEATER DEALERS-TATA BP SOLAR etc etc
So if the end user types in "solar water", the top rows should be those that contain what the user entered followed by "dealers", "manufacturers", "distributors", "sales" or "repairs and servicing".
So if a row contains "solar water dealer", it shows up on top.
Otherwise (if none of "solar water dealer", "solar water manufacturers", "solar water distributors" etc. is present), the top rows should be those that contain what the user entered PLUS "heater", followed by "dealers", "manufacturers", "distributors", "sales" or "repairs and servicing".
So "solar water heater dealers" shows up on top.
These words ("dealers", "manufacturers", "distributors", "sales", "repairs and servicing" etc.) remain constant.
What I am using right now is:
{code}
create or replace
procedure HOME_OLD
(
p_cat_type in varchar2,
P_LOC IN NUMBER,
P_MAX IN NUMBER,
P_MIN IN NUMBER,
P_OUT OUT SYS_REFCURSOR
)
as
VARIAB varchar2(500);
VARIAB2 varchar2(500);
VARIAB3 varchar2(500);
VARIAB4 varchar2(500);
begin
--VARIAB2:='?'||replace(P_CAT_TYPE,' ',', ?');
--VARIAB3:='?'||replace(P_CAT_TYPE,' ',' ?');
--DBMS_OUTPUT.PUT_LINE(VARIAB2);
--DBMS_OUTPUT.PUT_LINE(VARIAB3);
SELECT stragg(cat_id) into variab
FROM (SELECT GREATEST (SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3) score,
CAT_ID, CAT_TYPE
FROM category_match
-- exact words in order:
WHERE CONTAINS (cat_type,get_basic(P_CAT_TYPE), 1) > 0
-- similar words next to each other in order:
OR CONTAINS (cat_type, get_near_syntax(P_CAT_TYPE), 2) > 0
-- similar words next to each other in any order:
OR CONTAINS (cat_type, get_near_syntax_desc(P_CAT_TYPE), 3) > 0
-- similar words anywhere in any order:
OR CONTAINS (cat_type, get_anywhere(P_CAT_TYPE), 4) > 0
order by score desc)
where rownum < 3;
DBMS_OUTPUT.PUT_LINE(VARIAB);
open p_out
FOR select * from(select rownum r,name,address1,telephone,mobile,CAT_TYP,cat_id,
(case when address2=p_loc and ACT_STATUS='Y' then '1' when address2=p_loc then '2' when address2 in
(select NEARBY_LOC from NEAR_BY where LOCALITY_ID=p_loc) and ACT_STATUS='Y'
then '3' when ADDRESS2 in (select NEARBY_LOC from NEAR_BY where LOCALITY_ID=p_loc)
then '4' when ACT_STATUS='Y' and address2<> p_loc then '5' else '6' end) as marker
FROM TEST_TEST
WHERE
CAT_ID in(select * from table(STRING_TO_TABLE_NUM(variab))) and rownum<P_MAX order by marker) where r>P_MIN;
IF VARIAB IS NULL THEN
OPEN P_OUT
FOR SELECT * FROM(SELECT rownum r,name,address1,telephone,mobile,CATS
FROM (SELECT GREATEST (SCORE(1), SCORE(2) - 1, SCORE(3) - 2, SCORE(4) - 3) score,
NAME,ADDRESS1,TELEPHONE,MOBILE,CATS
FROM TEST_TEST2
-- exact words in order:
WHERE CONTAINS (NAME,get_basic(P_CAT_TYPE), 1) > 0
-- similar words next to each other in order:
OR CONTAINS (NAME, get_near_syntax(P_CAT_TYPE), 2) > 0
-- similar words next to each other in any order:
OR CONTAINS (NAME, get_near_syntax_desc(P_CAT_TYPE), 3) > 0
-- similar words anywhere in any order:
OR CONTAINS (NAME, get_anywhere(P_CAT_TYPE), 4) > 0
ORDER BY SCORE DESC)
WHERE ROWNUM < P_MAX)where r>P_MIN;
END IF;
end home_old;
{code}
The flow is to look up what the end user has entered in the category table. If a match exists, find all reg_ids in the test_test materialized view that have selected the matched cat_id.
The test_test materialized view lists each company once for every cat_id selected by that company.
If no match is found in the category table, what the end user entered could be a company name, so the name column of the test_test2 materialized view is searched instead.
This materialized view has one entry for each company.
{code}
create or replace
FUNCTION GET_BASIC(P_CAT_TYPE VARCHAR2)
RETURN VARCHAR2
is
VARIAB2 VARCHAR2(3000);
begin
VARIAB2:='{'||P_CAT_TYPE||'}*10*10';
return(VARIAB2);
END;
create or replace
FUNCTION GET_NEAR_SYNTAX(P_CAT_TYPE VARCHAR2)
RETURN VARCHAR2
is
VARIAB2 VARCHAR2(3000);
begin
VARIAB2:='NEAR((?{'||replace(P_CAT_TYPE,' ','}, ?{')||'}),10,TRUE)*10*10';
return(VARIAB2);
END;
create or replace
FUNCTION GET_NEAR_SYNTAX_DESC(P_CAT_TYPE VARCHAR2)
RETURN VARCHAR2
is
VARIAB2 VARCHAR2(3000);
begin
VARIAB2:='NEAR((?{'||replace(P_CAT_TYPE,' ','}, ?{')||'}),10,FALSE)*10*10';
return(VARIAB2);
END;
{code}
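To see what the three helper functions above hand to CONTAINS, here is a Python sketch that mirrors their string concatenation. The Python names are only illustrative stand-ins for GET_BASIC, GET_NEAR_SYNTAX and GET_NEAR_SYNTAX_DESC; GET_ANYWHERE is not shown in the post, so it is not reproduced here.

```python
def get_basic(phrase):
    """Mirror GET_BASIC: the whole phrase as one escaped term, weighted twice."""
    return "{" + phrase + "}*10*10"


def get_near_syntax(phrase, ordered=True):
    """Mirror GET_NEAR_SYNTAX (ordered=True) and GET_NEAR_SYNTAX_DESC (ordered=False)."""
    terms = "?{" + phrase.replace(" ", "}, ?{") + "}"
    order_flag = "TRUE" if ordered else "FALSE"
    return "NEAR((" + terms + "),10," + order_flag + ")*10*10"


print(get_basic("solar water heater"))
# {solar water heater}*10*10
print(get_near_syntax("solar water heater"))
# NEAR((?{solar}, ?{water}, ?{heater}),10,TRUE)*10*10
print(get_near_syntax("solar water heater", ordered=False))
# NEAR((?{solar}, ?{water}, ?{heater}),10,FALSE)*10*10
```

One thing the mirror makes visible: because the split is a plain replace on a single space, an input with two consecutive spaces produces an empty `?{}` term, so the input probably needs its whitespace normalised before these functions are called.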
can anything be done to ameliorate this whole flow
can anything be done to eliminate the near_by and act_status and locality checking in ordering by "marker" clause
below is the materialized view creation ddl
SELECT IN_V.REG_ID,
IN_V.NAME,
IN_V.TELEPHONE,
IN_V.MOBILE,
IN_V.ADDRESS1,
IN_V.ADDRESS2,
IN_V.ACT_STATUS,
resec.cat_id,
UPPER(STRAGG(IN_V.CAT_TYPE)) AS cat_typ
FROM
(SELECT RSC.REG_ID,
R.NAME,
RSC.CAT_ID,
C.CAT_TYPE,
R.ADDRESS1,
R.ADDRESS2,
R.ACT_STATUS,
R.TELEPHONE,
R.MOBILE,
ROW_NUMBER() OVER (PARTITION BY RSC.REG_ID ORDER BY rsc.reg_id) AS TT
FROM REG_SEG_CAT RSC,
category C,
REGISTRATION R
WHERE C.CAT_ID=RSC.CAT_ID
AND R.REG_ID =RSC.REG_ID
) IN_V,
REG_SEG_CAT RESEC
WHERE in_v.reg_id=resec.reg_id
AND IN_V.TT <6
GROUP BY IN_V.REG_ID,
IN_V.NAME,
IN_V.TELEPHONE,
IN_V.MOBILE,
IN_V.ADDRESS2,
IN_V.ACT_STATUS,
IN_V.ADDRESS1,
resec.cat_id;
and sql>desc test_test
REG_ID
NAME
TELEPHONE
MOBILE
ADDRESS1
ADDRESS2
ACT_STATUS
CAT_ID
CAT_TYP
please let me know if you need more info
Edited by: 946207 on Apr 19, 2013 6:22 PM -
Justification for Using Oracle Text
Hello,
Can someone give me good cause (justification) for utilizing Oracle Text over other tools out there that are not tied directly to Oracle?
Apparently it is possible to identify metadata within text and do keyfield and keyword searches this way with other tools, but I question the accuracy, speed, or value in terms of data relationships with this approach. I feel the relationships belong in the database along with the indexes but can't convince anyone of this.
Has anyone experience working with Oracle Text where relationships help to drive the search and can give me good cause to this approach?
thanks
Hi,
Justification depends on your use. For starters:
1) It is included in both standard and enterprise editions of the db at no added charge
2) Uses SQL to query and maintain
3) Includes a number of built-ins for maintenance and optimization
4) It has 4 different index types for various uses
5) It can index any data type
6) UltraSearch is included in both standard and enterprise editions of the db at no additional charge (this is a crawler built on Oracle Text).
As for the integration - it is optimized for Oracle. If you were to build a standalone indexing solution you would probably design it a bit different, but Oracle Text takes into account the optimizer and database structure.
It has other features (same as some of the other tools) like a knowledge base, classification, clustering, theme extraction, language-specific features, ability to index documents in and out of the database, stopwords, stemming, wildcard, progressive relaxation, and the list goes on.
I guess my question would be, what is the reason for NOT using it? That might give me a better line on the reasoning so that I can respond with something a bit more specific.
Thanks,
Ron -
Scoring messed up using concatenated datastore Index
Hi,
Here is my table structure....
CREATE TABLE SRCH_KEYWORD_SEARCH_SME
SYS_ID NUMBER(10) NOT NULL,
PAPER_NO VARCHAR2(10),
PRODIDX_ID VARCHAR2(10),
RESULT_TITLE VARCHAR2(255),
RESULT_DESCR VARCHAR2(1000) NOT NULL,
ABSTRACT CLOB,
SRSLT_CATEGORY_ID VARCHAR2(10) NOT NULL,
SRSLT_SUB_CATEGORY_ID VARCHAR2(10) NOT NULL,
ACTIVE_FLAG VARCHAR2(1) DEFAULT 'Y' NOT NULL,
EVENT_START_DATE DATE,
EVENT_END_DATE DATE,
Here is the Concatenated Datastore preference...
-- Drop any existing storage preference.
CTX_DDL.drop_preference('SEARCH_STORAGE_PREF');
-- Create new storage preference.
CTX_DDL.create_preference('SEARCH_STORAGE_PREF', 'BASIC_STORAGE');
CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'I_TABLE_CLAUSE', 'tablespace searchidx');
CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'K_TABLE_CLAUSE', 'tablespace searchidx');
CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'R_TABLE_CLAUSE', 'tablespace searchidx lob (data) store as (disable storage in row cache)');
CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'N_TABLE_CLAUSE', 'tablespace searchidx');
CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'I_INDEX_CLAUSE', 'tablespace searchidx compress 2');
CTX_DDL.set_attribute('SEARCH_STORAGE_PREF', 'P_TABLE_CLAUSE', 'tablespace searchidx');
-- Drop any existing datastore preference.
CTX_DDL.drop_preference('SEARCH_DATA_STORE');
CTX_DDL.DROP_SECTION_GROUP('SEARCH_DATA_STORE_SG');
-- Create new multi-column datastore preference.
CTX_DDL.create_preference('SEARCH_DATA_STORE','MULTI_COLUMN_DATASTORE');
CTX_DDL.set_attribute('SEARCH_DATA_STORE','columns','abstract, srslt_category_id, srslt_sub_category_id, active_flag');
CTX_DDL.set_attribute('SEARCH_DATA_STORE', 'FILTER','N,N,N,N');
-- Create new section group preference.
CTX_DDL.create_section_group ('SEARCH_DATA_STORE_SG','BASIC_SECTION_GROUP');
CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'abstract', 'abstract', TRUE);
CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'srslt_category_id', 'srslt_category_id', TRUE);
CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'srslt_sub_category_id', 'srslt_sub_category_id',TRUE);
CTX_DDL.add_field_section('SEARCH_DATA_STORE_SG', 'active_flag', 'active_flag', TRUE);
Here is the context Index
CREATE INDEX SRCH_KEYWORD_SEARCH_I ON SRCH_KEYWORD_SEARCH_SME(ABSTRACT)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('STORAGE search_storage_pref DATASTORE SEARCH_DATA_STORE SECTION GROUP SEARCH_DATA_STORE_SG' )
Here is the Query # 1 I am trying out...
SELECT /*+ FIRST_ROWS(10) */
SCORE(1) score_nbr,
k.SYS_ID,
k.RESULT_TITLE,
FROM SRCH_KEYWORD_SEARCH_SME k
WHERE CONTAINS (k.ABSTRACT, '<query><textquery><progression><seq>{hitchhiker} WITHIN abstract</seq></progression></textquery></query>',1) > 0
ORDER BY SCORE(1) DESC;
Here is the result for Query # 1...
score_nbr sys_id result_title
54 99220 SME Releases New Book The Hitchhiker's Guide to Lean 72
43 116583 Lean Leadership Package 72
32 132392 The Hitchhikers Guide to Lean: Lessons from the Road 72
11 132017 Lean Manufacturing A Plant Floor Guide Book Summary 72
11 137106 Managing Factory Maintenance, Second Edition 72
11 132082 Lean Pocket Guide
Here is the Query # 2 I am trying out...
SELECT /*+ FIRST_ROWS(10) */
SCORE(1) score_nbr,
k.SYS_ID,
k.RESULT_TITLE,
FROM SRCH_KEYWORD_SEARCH_SME k
WHERE CONTAINS (k.ABSTRACT, '<query><textquery><progression><seq>{hitchhiker} WITHIN abstract AND Y WITHIN active_flag</seq></progression></textquery></query>',1) > 0
ORDER BY SCORE(1) DESC
Here is the result for Query # 2...
score_nbr sys_id result_title
3 132017 Lean Manufacturing: A Plant Floor Guide Book Summary 72
3 137106 Managing Factory Maintenance, Second Edition 72
3 132082 Lean Pocket Guide 72
3 132083 The Toyota Way: 14 Management Principles From the World's Greatest... 72
3 132417 Lean Manufacturing: A Plant Floor Guide 72
3 132091 Breaking the Cost Barrier: A Proven Approach to Managing and... 72
3 99318 Conflicting pairs 72
3 132393 One-Piece Flow: Cell Design for Transforming the Production Process 72
3 137091 Learning to See: Value Stream Mapping to Create Value & Eliminate MUDA 72
3 137090 The Purchasing Machine: How the Top 10 Companies Use Best Practices... 72
3 137393 Passion for Manufacturing
My question is: why did the score drop all the way to 3 for ALL the results returned by the above query when I used the AND clause
and added the 2nd column used in the datastore to my query condition?
Also, I want to use the progressive relaxation technique in the queries so that stemming and fuzzy search are applied too.
Help me out please....
Thanks in advance.
- Richard.
Yes, it's in the doc - it's known as the weight operator.
http://download.oracle.com/docs/cd/B28359_01/text.111/b28304/cqoper.htm#i998379
"term*n Returns documents that contain term. Calculates score by multiplying the raw score of term by n, where n is a number from 0.1 to 10."
We're just using the operator twice as the limit on "n" is 10 (for no obvious reason I know of!). This is perfectly safe, and common practice. -
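The arithmetic behind chaining the weight operator can be sketched in a few lines of Python. This is only an illustration: `weighted_score` is a made-up name, and the cap at 100 is an assumption based on Oracle Text scores being reported on a 0 to 100 scale, not something stated in the quoted documentation.

```python
def weighted_score(raw_score, *weights, cap=100):
    """Multiply a raw score by each weight in turn, then cap the result (assumed cap)."""
    score = raw_score
    for weight in weights:
        if not 0.1 <= weight <= 10:
            raise ValueError("each weight must be between 0.1 and 10")
        score *= weight
    return min(cap, score)


print(weighted_score(3, 10))      # 30  -> one weight of 10
print(weighted_score(3, 10, 10))  # 100 -> 3 * 10 * 10 = 300, capped at the assumed 100
```

Under that assumption, writing `*10*10` pushes any non-zero raw score of 1 or more to the top of the scale, which is why it is handy when only the fact of a match matters and not its relevance.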
Hi,
I have a database with bibliographic data (title, author...) and full text documents. I store all bibliographic data and a part of the full text in a single CLOB column in a XML format :
I then defined groups with FIELD_SECTION and mdata_section to be able to search on this column . The default search is set on all data :
select id,score(1) FROM mytable where contains(mycolumn,'mysearch',1) > 0 order by score (1) desc
However, for generic queries I usually get more than 100 results that all have the same score of 100, so it is not very useful for the end user.
As far as I can see, there is no way to improve the score algorithm so that documents that have mysearch in the title section, for example, get a better ranking. Is that correct?
Have any of you tried to improve this ranking in the application? For example, search for the words in the title section first, then search the full text while excluding the ids of the documents found in the first step, and then combine the two hit lists?
Thanks for your help.
Kind regards,
Fred
Progressive relaxation is useful here.
You can search for words in the title section only in the first stage, then look in the other columns in the next stage. Any hits in the first stage of progressive relaxation are guaranteed to score higher than hits in the next stage.
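A minimal sketch of how such a two-stage template might be assembled in application code, written in Python. The helper name `title_first_query` and the section name `title` are assumptions (the poster's real section names are not shown); the result is meant to be passed as the query string of CONTAINS, ideally through a bind variable.

```python
from xml.sax.saxutils import escape


def title_first_query(search, section="title"):
    """Build a progressive relaxation template: title section first, then anywhere."""
    term = escape(search)  # XML-escape &, < and > so the template stays well formed
    steps = [
        "(" + term + ") WITHIN " + section,  # stage 1: hits in the title section
        "(" + term + ")",                    # stage 2: hits anywhere in the document
    ]
    seqs = "".join("<seq>" + step + "</seq>" for step in steps)
    return "<query><textquery><progression>" + seqs + "</progression></textquery></query>"


print(title_first_query("lean guide"))
```

Note that this only handles XML escaping. Characters that are operators in the Oracle Text query language are not escaped here and would need separate treatment.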
See here: http://www.oracle.com/technology/products/text/htdocs/prog_relax.html
for a discussion of progressive relaxation. -
Unable to use the thesaurus in a relaxation template
I am trying to get a query relaxation template to use the thesaurus but I can't get the syntax correct. Is it possible? If so, please can someone tell me where I'm going wrong?
create table test_table(company_name varchar2(100));
insert into test_table values ('Test Limited');
insert into test_table values ('Test Ltd');
create index idx_test on test_table(company_name) indextype is ctxsys.context;
If my query looks like this:
select company_name, score(1)
from test_table
where CONTAINS (company_NAME,
'<query>
<textquery lang="ENGLISH" grammar="CONTEXT">test ltd
<progression>
<seq><rewrite>transform((TOKENS, “{”, “}”, “ ”))</rewrite></seq>
<seq><rewrite>transform((TOKENS, “!”, “%”, “ ”))</rewrite></seq>
<seq><rewrite>transform((TOKENS, “${”, “}”, “ ”))</rewrite></seq>
<seq><rewrite>transform((TOKENS, “SYN(”, “,legal_form)”, “ ”))</rewrite></seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="COUNT"/>
</query>',1)>0;
I get the matching record back
COMPANY_NAME SCORE(1)
Test Ltd 75
But if I move the SYN line to the top like this:
select company_name, score(1)
from test_table
where CONTAINS (company_NAME,
'<query>
<textquery lang="ENGLISH" grammar="CONTEXT">test ltd
<progression>
<seq><rewrite>transform((TOKENS, “SYN(”, “,legal_form)”, “ ”))</rewrite></seq>
<seq><rewrite>transform((TOKENS, “{”, “}”, “ ”))</rewrite></seq>
<seq><rewrite>transform((TOKENS, “!”, “%”, “ ”))</rewrite></seq>
<seq><rewrite>transform((TOKENS, “${”, “}”, “ ”))</rewrite></seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="COUNT"/>
</query>',1)>0;
I get an error which I think means that the XML line is not valid:
ORA-29902:error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 35
What is the correct format for the line that will apply the thesaurus synonym between Limited and LTD?
There are a lot of things that work well individually, but not in combination with one another. It looks like something goes wrong when you try to combine transform with syn. One possible workaround is to use replace to do your own transformation. Please see the reproduction and solution below.
SCOTT@10gXE> -- test environment:
SCOTT@10gXE> create table test_table(company_name varchar2(100));
Table created.
SCOTT@10gXE> insert into test_table values ('Test Limited');
1 row created.
SCOTT@10gXE> insert into test_table values ('Test Ltd');
1 row created.
SCOTT@10gXE> create index idx_test on test_table(company_name) indextype is ctxsys.context;
Index created.
SCOTT@10gXE> EXEC CTX_THES.CREATE_THESAURUS ('legal_form')
PL/SQL procedure successfully completed.
SCOTT@10gXE> EXEC CTX_THES.CREATE_RELATION ('legal_form', 'Limited', 'SYN', 'Ltd')
PL/SQL procedure successfully completed.
SCOTT@10gXE> COLUMN company_name FORMAT A30
SCOTT@10gXE> -- reproduction of problem:
SCOTT@10gXE> select company_name, score(1)
2 from test_table
3 where CONTAINS (company_NAME,
4 '<query>
5 <textquery lang="ENGLISH" grammar="CONTEXT">test ltd
6 <progression>
7 <seq><rewrite>transform((TOKENS, “SYN(”, “,legal_form)”, “ ”))</rewrite></seq>
8 <seq><rewrite>transform((TOKENS, “{”, “}”, “ ”))</rewrite></seq>
9 <seq><rewrite>transform((TOKENS, “!”, “%”, “ ”))</rewrite></seq>
10 <seq><rewrite>transform((TOKENS, “${”, “}”, “ ”))</rewrite></seq>
11 </progression>
12 </textquery>
13 <score datatype="INTEGER" algorithm="COUNT"/>
14 </query>',1)>0
15 /
select company_name, score(1)
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 7
SCOTT@10gXE> -- possible workaround:
SCOTT@10gXE> VARIABLE search_string VARCHAR2(30)
SCOTT@10gXE> EXEC :search_string := 'test ltd'
PL/SQL procedure successfully completed.
SCOTT@10gXE> select company_name, score(1)
2 from test_table
3 where CONTAINS (company_NAME,
4 '<query>
5 <textquery lang="ENGLISH" grammar="CONTEXT">
6 <progression>
7 <seq>' || 'SYN(' || REPLACE(:search_string, ' ', ',legal_form) AND SYN(') || ',legal_form)' || '</seq>
8 <seq>' || '{' || REPLACE(:search_string, ' ', '} {') || '}' || '</seq>
9 <seq>' || '!' || REPLACE(:search_string, ' ', '% !') || '%' || '</seq>
10 <seq>' || '${' || REPLACE(:search_string, ' ', '} ${') || '}' || '</seq>
11 </progression>
12 </textquery>
13 <score datatype="INTEGER" algorithm="COUNT"/>
14 </query>',1)>0
15 /
COMPANY_NAME SCORE(1)
Test Limited 75
Test Ltd 75
SCOTT@10gXE>
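To make the workaround easier to follow, here is a Python sketch that mirrors the four REPLACE expressions and shows the strings that end up inside each seq element. The function name `relaxation_steps` is illustrative; the logic is the same single-space replace as in the SQL above.

```python
def relaxation_steps(search, thesaurus="legal_form"):
    """Mirror the four REPLACE-based expressions from the workaround query."""
    syn = "SYN(" + search.replace(" ", "," + thesaurus + ") AND SYN(") + "," + thesaurus + ")"
    exact = "{" + search.replace(" ", "} {") + "}"
    prefix = "!" + search.replace(" ", "% !") + "%"
    stem = "${" + search.replace(" ", "} ${") + "}"
    return [syn, exact, prefix, stem]


for step in relaxation_steps("test ltd"):
    print(step)
# SYN(test,legal_form) AND SYN(ltd,legal_form)
# {test} {ltd}
# !test% !ltd%
# ${test} ${ltd}
```

As in the SQL version, the search string is assumed to contain single spaces between words and no characters that are special in XML or in the Oracle Text query language.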