Extracting stem words from text index

Hello all,
I am trying to categorize some records in a table, and I wonder whether Oracle Text offers any way to search inside the text index itself. What I'm trying to find is the minimum number of stem words that occur in a set of records. Basically, it's a kind of reverse search: I have a subset of records from a table that can be found using a regular query (no text query), and the table has a text index on one column. Using the text index, I want to find the smallest set of stem words in that column that would generate a hit for the whole subset if queried using only the text query.
Thanks,
Danny

Here is a method for viewing the stem word of any given word by using a function that inserts one row into a table, dynamically rebuilds the index, then selects the stem word from the domain index table. I have then added some code to use that to loop through all the words in the original domain index table, insert them and their roots into another table, and select the roots and corresponding concatenated words.
SCOTT@orcl_11gR2> SELECT banner FROM v$version
2 /
BANNER
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE     11.2.0.1.0     Production
TNS for 64-bit Windows: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
5 rows selected.
SCOTT@orcl_11gR2> CREATE TABLE test_tab (test_col VARCHAR2 (40))
2 /
Table created.
SCOTT@orcl_11gR2> INSERT ALL
2 INTO test_tab VALUES ('The cats ran quickly from the dogs.')
3 INTO test_tab VALUES ('The mice were running from the cats.')
4 INTO test_tab VALUES ('Some people walk their dogs every day.')
5 INTO test_tab VALUES ('The dogs chased the cats.')
6 SELECT * FROM DUAL
7 /
4 rows created.
SCOTT@orcl_11gR2> BEGIN
2    CTX_DDL.CREATE_PREFERENCE ('test_lex', 'AUTO_LEXER');
3    CTX_DDL.SET_ATTRIBUTE ('test_lex', 'INDEX_STEMS', 'YES');
4 END;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> CREATE INDEX test_idx ON test_tab (test_col)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS
4    ('LEXER       test_lex
5       STOPLIST CTXSYS.EMPTY_STOPLIST')
6 /
Index created.
SCOTT@orcl_11gR2> CREATE TABLE stem_tab
2    (test_word VARCHAR2 (4000))
3 /
Table created.
SCOTT@orcl_11gR2> CREATE INDEX stem_idx on stem_tab (test_word)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS
4    ('LEXER        test_lex
5       STOPLIST CTXSYS.EMPTY_STOPLIST')
6 /
Index created.
SCOTT@orcl_11gR2> CREATE OR REPLACE FUNCTION get_stem
2    (p_word IN VARCHAR2)
3    RETURN VARCHAR2
4 AS
5    v_word       VARCHAR2 (32767);
6 BEGIN
7    DELETE FROM stem_tab;
8    COMMIT;
9    INSERT INTO stem_tab (test_word) VALUES (p_word);
10    COMMIT;
11    EXECUTE IMMEDIATE 'ALTER INDEX stem_idx REBUILD';
12    SELECT MIN (token_text)
13    INTO   v_word
14    FROM   dr$stem_idx$i;
15    RETURN v_word;
16 EXCEPTION
17    WHEN NO_DATA_FOUND THEN RETURN p_word;
18 END get_stem;
19 /
Function created.
SCOTT@orcl_11gR2> SHOW ERRORS
No errors.
SCOTT@orcl_11gR2> CREATE TABLE words_and_stems
2    (word VARCHAR2 (20),
3      stem VARCHAR2 (20))
4 /
Table created.
SCOTT@orcl_11gR2> SET SERVEROUTPUT ON
SCOTT@orcl_11gR2> DECLARE
2    v_stem VARCHAR2 (32767);
3 BEGIN
4    FOR r IN
5       (SELECT DISTINCT token_text
6        FROM      dr$test_idx$i
7        WHERE token_text != '.')
8    LOOP
9       v_stem := get_stem (r.token_text);
10       INSERT INTO words_and_stems
11       VALUES (r.token_text, v_stem);
12    END LOOP;
13    COMMIT;
14 END;
15 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> COLUMN words FORMAT A45 WORD_WRAPPED
SCOTT@orcl_11gR2> SELECT stem,
2          LISTAGG (word, ', ') WITHIN GROUP (ORDER BY word)
3            AS words
4 FROM   words_and_stems
5 GROUP BY stem
6 /
STEM                 WORDS
BE                   BE, WERE
CAT                  CAT, CATS
CHASE                CHASE, CHASED
DAY                  DAY
DOG                  DOG, DOGS
EVERY                EVERY
FROM                 FROM
MICE                 MICE
MOUSE                MOUSE
PEOPLE               PEOPLE
QUICKLY              QUICKLY
RAN                  RAN
RUN                  RUN, RUNNING
SOME                 SOME
THE                  THE
THEIR                THEIR
WALK                 WALK
17 rows selected.
SCOTT@orcl_11gR2>
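
Once each record's stems are known, the original question (the smallest set of stems that hits every record in the subset) looks like a set-cover problem. The following is only a minimal sketch in Java of the usual greedy approximation, not part of the method above; it assumes the record-to-stems mapping has already been extracted (for example by joining words_and_stems back to the rows), and the class name, method name and sample data are hypothetical. The greedy result is small but not guaranteed to be the true minimum.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class StemCover {
    /**
     * Greedy set cover: repeatedly pick the stem that occurs in the most
     * still-uncovered records. Small result, but not guaranteed minimal.
     */
    static List<String> greedyCover(Map<Integer, Set<String>> stemsByRecord) {
        // Invert the input: stem -> ids of the records that contain it.
        // A TreeMap makes tie-breaking deterministic (alphabetical order).
        Map<String, Set<Integer>> recordsByStem = new TreeMap<>();
        for (Map.Entry<Integer, Set<String>> e : stemsByRecord.entrySet()) {
            for (String stem : e.getValue()) {
                recordsByStem.computeIfAbsent(stem, k -> new HashSet<>()).add(e.getKey());
            }
        }
        Set<Integer> uncovered = new HashSet<>(stemsByRecord.keySet());
        List<String> chosen = new ArrayList<>();
        while (!uncovered.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<Integer>> e : recordsByStem.entrySet()) {
                int gain = 0;
                for (Integer id : e.getValue()) {
                    if (uncovered.contains(id)) gain++;
                }
                if (gain > bestGain) {
                    bestGain = gain;
                    best = e.getKey();
                }
            }
            if (best == null) break; // the remaining records have no stems at all
            chosen.add(best);
            uncovered.removeAll(recordsByStem.get(best));
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Hypothetical stems for the four sample rows (common words left out).
        Map<Integer, Set<String>> rows = new LinkedHashMap<>();
        rows.put(1, Set.of("CAT", "RUN", "DOG"));
        rows.put(2, Set.of("MOUSE", "RUN", "CAT"));
        rows.put(3, Set.of("PEOPLE", "WALK", "DOG"));
        rows.put(4, Set.of("DOG", "CHASE", "CAT"));
        System.out.println(greedyCover(rows));
    }
}
```

The chosen stems could then be combined into a single text query; whether that query is really the smallest possible one depends on the data, since greedy selection is only an approximation.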

Similar Messages

Extracting two words from the article in english

I have an English article that is to be classified into a particular category based on keywords. There are lakhs (hundreds of thousands) of keywords stored in the database. What I have to do is obtain the keywords from the article and match them against the database; if a match is found, the article belongs to that category. I did this keyword matching for single words by using split(" "), but now I want to do it for two words: find a pair of words that is repeated many times in the article and then search for it in the database (the two words are treated as one keyword).
What should I do to get appropriate two-word keywords from the article without picking up a, am, the, is, when, etc. (that is, leaving out the generic words)?
Any help will be appreciated.

hi,
thanks for reply!
I know it's a bad algorithm to classify an article written in English based only on a few words appearing in the article.
But what I want to do is first extract the words from the article, leaving out the generic words, and then count each single word. Then I sort the words by count and take the five words from the article that have the highest counts. I have a database where millions of keywords are stored. These keywords each refer to a particular category;
i.e. if we consider a category such as sports, then under this category I have many keywords stored in the database, like cricket, football, worldcup, tennis, etc.
Now, if I pick an appropriate word from the article, it will be considered a keyword and searched for in the database. If a match is found, it means the article belongs to the sports category.
The problem is that sometimes an article can have two words that can be considered one keyword and that can be used to classify the article in a much better way.
The question is how to get such words from the article.
E.g. if "Hero's Journey" is a combined word appearing many times in the article, then this keyword can be used to classify the article much better than going for single words.
Can anybody help me in this regard?
Any help will be appreciated.
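
One possible way to find such two-word keywords is to count adjacent word pairs (bigrams) while skipping generic words. The following is only a rough sketch in Java under stated assumptions: the stop list is a tiny illustrative sample, and the class and method names are hypothetical. Pairs that occur often enough could then be looked up in the keyword database like single words.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class BigramCounter {
    // A tiny illustrative stop list; a real one would be much longer.
    private static final Set<String> STOP_WORDS =
            Set.of("a", "am", "an", "and", "the", "is", "of", "when", "in", "to");

    /** Counts adjacent word pairs in which neither word is a stop word. */
    static Map<String, Integer> countBigrams(String article) {
        // Lower-case, then split on anything that is not a letter, digit or apostrophe.
        String[] tokens = article.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+");
        Map<String, Integer> counts = new HashMap<>();
        String previous = null;
        for (String token : tokens) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                previous = null; // a stop word breaks the pair
                continue;
            }
            if (previous != null) {
                counts.merge(previous + " " + token, 1, Integer::sum);
            }
            previous = token;
        }
        return counts;
    }

    /** Returns the pairs seen at least minCount times, most frequent first. */
    static List<String> frequentBigrams(String article, int minCount) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(countBigrams(article).entrySet());
        entries.removeIf(e -> e.getValue() < minCount);
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            result.add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        String article = "The Hero's Journey is a pattern. The hero's journey appears "
                + "in many myths; the Hero's Journey is old.";
        System.out.println(frequentBigrams(article, 2));
    }
}
```

Resetting the previous word at every stop word means pairs never span a generic word, which keeps combinations such as "journey is" out of the candidate list.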

Searching word from text file

Hi...There..
I have written a search application that searches for words in simple text files.
My file contains a list of words separated by "\n" (new line).
I am using java.io.BufferedReader to read the file.
I want to find a word in the file within a few milliseconds, but when my file contains more than about 2 lakh words (200,000), reading it takes more than 5 seconds per search.
Please suggest an effective method for searching for a word in the file so that I can make the search fast.
I actually have to run the search on the "TEXT VALUE CHANGED EVENT", so even if my process takes more than one second it is not feasible for me.
Thanks in Advance.
Timir Patel.

Try this:
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class Searcher {
    /** File offsets of the lines, ordered by the text of each line. */
    private static long[] indexes;

    /** A line of text together with the offset at which it starts. */
    private static final class Entry {
        final String text;
        final long startsAt;

        Entry(String text, long startsAt) {
            this.text = text;
            this.startsAt = startsAt;
        }
    }

    /**
     * Creates the index table. You should do this once and then store the
     * index table in a file. This method has high peak memory usage, but it
     * is easy to optimize.
     */
    private static void buildIndex(RandomAccessFile file) throws IOException {
        List<Entry> entries = new ArrayList<>();
        long position = file.getFilePointer();
        String line;
        while ((line = file.readLine()) != null) {
            entries.add(new Entry(line, position));
            position = file.getFilePointer();
        }
        entries.sort(Comparator.comparing((Entry e) -> e.text));
        indexes = new long[entries.size()];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = entries.get(i).startsAt;
        }
    }

    /** Returns the position at which text starts, or -1 if not found. */
    public static long find(String text, RandomAccessFile file) throws IOException {
        // Standard binary search over the sorted offsets.
        int low = 0;
        int high = indexes.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            file.seek(indexes[mid]);
            int cmp = text.compareTo(file.readLine());
            if (cmp == 0) {
                return indexes[mid];
            } else if (cmp > 0) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(args[0], "r")) {
            buildIndex(f);
            for (int i = 1; i < args.length; i++) {
                System.out.println("searching for \"" + args[i] + "\"");
                System.out.println("found at:" + find(args[i], f));
            }
        }
    }
}
It should work, however I gave it less than five minutes testing.

Read a non english word from text file

While reading Thai characters from a text file that was sent by QAD (a different application),
we read 60 characters using the substr() function.
If the data is an English word, it reads correctly, with 60 characters.
But if it is in Thai characters, it returns more than 60 characters.
In Oracle, the NLS character set has already been set.
Can anyone help with this issue?
Thanks in advance.

Maybe you should use SUBSTRC, SUBSTR2 or SUBSTR4 depending on the character set of your database. See http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96540/functions119a.htm#87068
Message was edited by:
Pierre Forstmann
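
The symptom described is consistent with a mismatch between counting characters and counting bytes, although the thread does not confirm the cause. As a small illustration in Java, assuming the data is stored in a multibyte encoding such as UTF-8 (an assumption, as are the class and method names), the same six-character string occupies a different number of bytes depending on the script:

```java
import java.nio.charset.StandardCharsets;

final class CharsVersusBytes {
    /** Number of bytes the text occupies when encoded as UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String english = "hello!";
        // Six Thai code points, written as escapes to avoid source-encoding issues.
        String thai = "\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35";
        System.out.println(english.length() + " chars, " + utf8Length(english) + " bytes");
        System.out.println(thai.length() + " chars, " + utf8Length(thai) + " bytes");
    }
}
```

If fixed-width fields in the incoming file are defined in bytes while the reading side counts characters (or the other way round), a 60-unit slice will cover different amounts of text for English and Thai data.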

Reading first word from text file

Hello all,
I created a program which I can type in a line and store it into a file. What I need my next program to do is read and display just the first word from each line on the text file.
Here is what I have:
import java.io.*;
public class ReadFromFile
{
    public static void main (String args[])
    {
        FileInputStream firstWord;
        try
        {
            // Open an input stream
            firstWord = new FileInputStream ("FileofWords.txt");
            // Read a line of text
            System.out.println( new DataInputStream(firstWord).readLine() );
            // Close our input stream
            firstWord.close();
        }
        catch (IOException e)
        {
            System.err.println ("Unable to read from file");
            System.exit(-1);
        }
    }
}

what i would like is my program to get the first word of every line and put it in my array.
This is what i have attempted.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class testing {
    String[] tmp = new String [2];
    String str;
    public void programTest()
    {
        try
        {
            BufferedReader in2 = new BufferedReader(new FileReader("eventLog.log"));
            // Print the first space-separated token of every line
            while ((str = in2.readLine()) != null)
            {
                tmp = str.split(" ");
                System.out.println(tmp[0]);
            }
            in2.close();
        }
        catch(IOException e)
        {
            System.out.println("cannot read file");
        }
    }
    public static void main(String[] args)
    {
        testing B = new testing();
        B.programTest();
    }
}//end class
Any help is most appreciated
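
Since both posts want the first word of every line collected rather than just printed, here is a minimal sketch of one way to do that. It uses a List instead of a fixed-size array because the number of lines is not known in advance; the class name, method name and sample lines are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class FirstWords {
    /** Collects the first whitespace-separated word of every non-blank line. */
    static List<String> firstWords(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines instead of storing empty strings
            }
            words.add(trimmed.split("\\s+")[0]);
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Reads the file named on the command line, or falls back to sample lines.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : List.of("alpha beta gamma", "", "  delta epsilon", "zeta");
        System.out.println(firstWords(lines));
    }
}
```

If an array is really required, calling toArray(new String[0]) on the returned list converts it after all lines have been read.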

Read words from text file by delimiter as columns and rows

Hi All,
I need your help, as I have a problem with reading a string by delimiter. I have tried various techniques in Java, but I cannot get the expected result.
Please help me to read the string from a text file and write it into Excel. I am using a text file.
Problem:
Read the line below as a string split by the delimiter:
OI.ID||'|'||OI.IDSIEBEL||'|'||OI.IDTGU||'|'||OI.WORKTYPE||'|'||OI.UTR_FK
Read the line below as a string split by the delimiter:
"50381|TEST_7.15.2.2|09913043548|Attivazione|07-SEP-10
Now I need to write the above into an Excel file as columns and rows.
There are 5 columns, each with its corresponding value.
The output Excel file should be:
OI.ID OI.IDSIEBEL OI.IDTGU OI.WORKTYPE OI.UTR_FK
50381 TEST_7.15.2.2 09913043548 Attivazione 07-SEP-10
I tried different techniques but was not able to get the result. Please help me.
Thanks,
Jasmin
Edited by: user13836688 on Jan 22, 2011 8:13 PM

First of all, when posting code, put it between code tags.
Second of all, the pipe is a special character in regex, so you need to escape it with two backslashes before the pipe character:
.split("\\|");
Edited by: Kayaman on Jan 24, 2011 9:35 AM
Edited by: Kayaman on Jan 24, 2011 9:36 AM
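
To make the escaping advice concrete, here is a minimal sketch that splits the sample row on a literal pipe and joins the fields as a CSV line, which Excel can open. Writing a native Excel workbook would need a separate library and is not shown; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class PipeRows {
    // Pattern.quote makes the pipe a literal instead of the regex "or" operator.
    private static final Pattern PIPE = Pattern.compile(Pattern.quote("|"));

    /** Splits one pipe-delimited line into its fields, keeping empty trailing fields. */
    static List<String> fields(String line) {
        return Arrays.asList(PIPE.split(line, -1));
    }

    /** Joins fields with commas, quoting any field that contains a comma or a quote. */
    static String toCsv(List<String> fields) {
        List<String> out = new ArrayList<>();
        for (String f : fields) {
            boolean needsQuotes = f.contains(",") || f.contains("\"");
            out.add(needsQuotes ? "\"" + f.replace("\"", "\"\"") + "\"" : f);
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        List<String> header =
                List.of("OI.ID", "OI.IDSIEBEL", "OI.IDTGU", "OI.WORKTYPE", "OI.UTR_FK");
        String row = "50381|TEST_7.15.2.2|09913043548|Attivazione|07-SEP-10";
        System.out.println(toCsv(header));
        System.out.println(toCsv(fields(row)));
    }
}
```

Pattern.quote("|") and the hand-escaped "\\|" are equivalent here; the quoted form simply avoids having to count backslashes.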

Extract certain data from text

Hi,
I want to extract or crop some data from a given word. Say i
want "Mad" from "Madras - 1"
and even how do i remove/crop "03S" from "10.03.03S".
Thanks,
Ayush

Look at the String methods, particularly slice, substr, and
substring

Extract highlighted words from a pdf (Acrobat SDK, OLE)

Hello Acrobat gurus ! :-)
I'm new to the SDK, so please excuse any "stupid" question i might have.
Here is what i want to do:
I want to search for a group of words in a pdf document. According to the SDK documentation, once i search for a text using AcroExch.AVDoc.FindText(), the function "Finds the specified text, scrolls so that it is visible, and highlights it."
I was assuming that after calling this function with my string, once the string is found, I would have access to the coordinates of the rectangle containing the highlighted group of words (I presumed that those words would automatically be contained in an object of the type AcroExch.HiliteList) and to the coordinates of those words. But I'm not able to do so; I cannot find any function(s) that give me that kind of access.
So question is:
Is it possible to access the coordinates of the rectangle/words that are highlighted in a pdf after calling the FindText() function ? Can someone help me get on the right track ?
Thanks

Ok, let me give you an more elaborate example, maybe i don't ask the right question.
Let's say i have a pdf, containing the following text in the first page
--- arbitrary number of ":"
Mother's Name: Joanna
Father's Name: Josh
other text
If I call the function like this: FindText("Mother's Name:"), Acrobat is going to find the first occurrence of my string. What I want to do is get the coordinates of this WHOLE string OR the coordinates of the last character in the string (in this case ":").
The problem is that if I go for the coordinates of the colon, I cannot just look for it in the PDF, because I may have an unknown number of colons (":") before the one I'm interested in. The logical solution in this case would be to get the coordinates of the entire string ("Mother's Name:" in this case) and then get the coordinates of the colon I'm interested in.
Would that be possible ?

Extract words from JSP into text file

Hi,
I have a big problem:
I want to extract selected words from a jsp file.
Following are found in one of my jsp file:
E.g.
1) <td width="217">ORGANISATION UNIT NAME *
I want to retrieve "ORGANISATION UNIT NAME"
2)errPrompt(frm.txtDesc, "Maximum length is just 100 character only");
I want to retrieve "Maximum length is just 100 character only"
I tried string tokenising: check whether the token ends with ">", in which case the next token is the one
I want, then loop until "<" is found. BUT this does not work, as there is no spacing between some of
the tags and the words (e.g. abc). This causes the whole token to be abc, so the "abc" will
not be extracted, as it does not have a token that ends with ">" in front of it.
Even checking for ">" does not work for pop-up messages, as they do not have tags (refer
to e.g. 2).
Please reply a.s.a.p...
Really urgent!!
Thank You
Michelle

For extracting the HTML tags, you can use DOM/DHTML.
I don't remember the syntax or how to use it, but I know that it is possible; just go through the DOM/DHTML.
Regards,
Ritesh
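
As an alternative to tokenising, a couple of regular expressions can cover the two examples in the question. This is only a sketch under assumptions: it handles just the two shapes shown (text directly after a tag, and the message argument of errPrompt), the class and method names are hypothetical, and regular expressions are not a full HTML/JSP parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JspText {
    // Text that follows a '>' up to the next '<' or the end of the line.
    private static final Pattern AFTER_TAG = Pattern.compile(">([^<>]+)");
    // The string literal passed as the second argument of errPrompt(...).
    private static final Pattern ERR_PROMPT =
            Pattern.compile("errPrompt\\([^,]+,\\s*\"([^\"]*)\"\\s*\\)");

    /** Returns the visible label text and errPrompt messages found in one line. */
    static List<String> extract(String line) {
        List<String> found = new ArrayList<>();
        collect(AFTER_TAG.matcher(line), found);
        collect(ERR_PROMPT.matcher(line), found);
        return found;
    }

    private static void collect(Matcher m, List<String> found) {
        while (m.find()) {
            // Drop the "*" required-field marker and surrounding blanks.
            String text = m.group(1).replace("*", "").trim();
            if (!text.isEmpty()) {
                found.add(text);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(extract("<td width=\"217\">ORGANISATION UNIT NAME *"));
        System.out.println(extract(
                "errPrompt(frm.txtDesc, \"Maximum length is just 100 character only\");"));
    }
}
```

Because the first pattern does not rely on whitespace, it also works when there is no spacing between a tag and the word that follows it. Lines containing scriptlet delimiters or comparison operators would need extra filtering.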

Extracting specific data from multiple text files to single CSV

Hello,
Unfortunately my background is not scripting so I am struggling to piece together a powershell script to achieve the below. Hoping an experienced powershell scripter can provide the answer. Thanks in advance.
I have a folder containing approx. 2000 label type files that I need to extract certain information from to index a product catalog. Steps to be performed within the script as I see are:
1. Search folder for *.job file types
2. Search the files for certain criteria and where matched return into single CSV file
3. End result should be a single CSV with column headings:
a) DESCRIPTION
b) MODEL
c) BARCODE

Try:
# Script to extract data from .job files and report it in CSV
# Sam Boutros - 8/24/2014
# http://superwidgets.wordpress.com/category/powershell/
$CSV = ".\myfile.csv" # Change this filename\path as needed
$Folders = "d:\sandbox" # You can add multiple search folders as "c:\folder1","\\server\share\folder2"
# End Data entry section
if (-not (Test-Path -Path $CSV)) {
    Write-Output """Description"",""Model"",""Barcode""" | Out-File -FilePath $CSV -Encoding ascii
}
$Files = Get-ChildItem -Path $Folders -Include *.job -Force -Recurse
foreach ($File in $Files) {
    $FileContent = Get-Content -Path $File
    $Keyword = "viewkind4"
    if ($FileContent -match $Keyword) {
        for ($i=0; $i -lt $FileContent.Count; $i++) {
            if ($FileContent[$i] -match $Keyword) {
                $Description = $FileContent[$i].Split("\")[$FileContent[$i].Split("\").Count-1]
            }
        }
    } else {
        Write-Host "Keyword $Keyword not found in file $File" -ForegroundColor Yellow
    }
    $Keyword = "Code:"
    if ($FileContent -match $Keyword) {
        for ($i=0; $i -lt $FileContent.Count; $i++) {
            if ($FileContent[$i]-match $Keyword) {
                $Parts = $FileContent[$i].Split(" ")
                for ($j=0; $j -lt $Parts.Count; $j++) {
                    if ($Parts[$j] -match $Keyword) {
                        $Model = $Parts[$j+1].Trim()
                        $Model = $Model.Split("\")[$Model.Split("\").Count-1]
                    }
                }
            }
        }
    } else {
        Write-Host "Keyword $Keyword not found in file $File" -ForegroundColor Yellow
    }
    $Keyword = "9313"
    if ($FileContent -match $Keyword) {
        for ($i=0; $i -lt $FileContent.Count; $i++) {
            if ($FileContent[$i] -match "9313") {
                $Index = $FileContent[$i].IndexOf("9313")
                $Barcode = $null
                for ($j=0; $j -lt 12; $j++) {
                    $Barcode += $FileContent[$i][($Index+$j)]
                }
            }
        }
    } else {
        Write-Host "Keyword $Keyword not found in file $File" -ForegroundColor Yellow
    }
    Write-Output "File: '$File', Description: '$Description', Model: '$Model', Barcode: '$Barcode'"
    Write-Output """$Description"",""$Model"",""$Barcode""" | Out-File -FilePath $CSV -Append -Encoding ascii
}
Sam Boutros, Senior Consultant, Software Logic, KOP, PA http://superwidgets.wordpress.com (Please take a moment to Vote as Helpful and/or Mark as Answer, where applicable)

I have two pdf docs that I used Acrobat to convert to word docs. How do I extract one page from one doc to insert into the other doc?

I have two pdf docs that I converted to Word Docs using Acrobat Pro. How do I extract one page from the first doc and insert it into the second doc? When I "select all" it grabs the entire document. I need to take pages out, put other pages in, and edit some of the text.

HI djlarp,
Try triple-clicking in the text that you want to select--it can sometimes be tricky to select text in a converted document. If that doesn't work, it could be that the PDF document was created from a scanned document, and OCR wasn't enabled when you converted the document. (However, OCR is enabled by default when you convert via the ExportPDF website.)
If you're unable to select text by triple-clicking, let us know. I would be happy to take a closer look at your files.
Best,
Sara

Build text-index based on a given list of words or phrases.

I'm somewhat of a beginner to this text-indexing. I've been able to build and query a simple text-index and even implement my own list of stop-words. However, I'd like to be able to control the set of words that are indexed.
For example, If I have a table that contains a CLOB field filled with text documents and I also have a list of 200 words:
"TUBERCULOSIS"
"DIABETES"
"CHEMOTHERAPY"
Can I generate an index that only indexes the words on that list and ignores all the other words? (the reverse of using a stop-list)
Also, could it be done with a list of phrases instead of single words:
"CARDIAC ABLATION"
"ATRIOVENTRICULAR NODE"
"PULMONARY ABSCESS"
Thanks.

Please see if you can use any of the pieces of the following example.
SCOTT@orcl_11gR2> -- table containing list of phrases:
SCOTT@orcl_11gR2> create table phrases
2    (phrase        varchar2 (21))
3 /
Table created.
SCOTT@orcl_11gR2> insert all
2 into phrases values ('TUBERCULOSIS')
3 into phrases values ('DIABETES')
4 into phrases values ('CHEMOTHERAPY')
5 into phrases values ('CARDIAC ABLATION')
6 into phrases values ('ATRIOVENTRICULAR NODE')
7 into phrases values ('PULMONARY ABSCESS')
8 select * from dual
9 /
6 rows created.
SCOTT@orcl_11gR2> -- ctxrule index on list of phrases:
SCOTT@orcl_11gR2> create index phrases_idx on phrases (phrase)
2 indextype is ctxsys.ctxrule
3 /
Index created.
SCOTT@orcl_11gR2> -- table to hold combination of documents and matching phrases:
SCOTT@orcl_11gR2> create table classifications
2    (document clob,
3      phrase       varchar2 (60))
4 /
Table created.
SCOTT@orcl_11gR2> -- context index on classifications table:
SCOTT@orcl_11gR2> create index class_phrase_idx
2 on classifications (phrase)
3 indextype is ctxsys.context
4 parameters ('sync (on commit)')
5 /
Index created.
SCOTT@orcl_11gR2> -- regular index on classifications table:
SCOTT@orcl_11gR2> create index class_phrase_idx2
2 on classifications (phrase)
3 /
Index created.
SCOTT@orcl_11gR2> -- table for documents:
SCOTT@orcl_11gR2> create table documents
2    (document     clob)
3 /
Table created.
SCOTT@orcl_11gR2> -- trigger to populate classifications table from documents table:
SCOTT@orcl_11gR2> create or replace trigger documents_bir
2    before insert on documents
3    for each row
4 begin
5    for r in
6       (select phrase
7        from      phrases
8        where matches (phrase, :new.document) > 0)
9    loop
10       insert into classifications (document, phrase) values
11         (:new.document, r.phrase);
12    end loop;
13 end documents_bir;
14 /
Trigger created.
SCOTT@orcl_11gR2> -- inserts into documents table:
SCOTT@orcl_11gR2> insert all
2 into documents values ('word1 tuberculosis word2')
3 into documents values ('word3 diabetes word4')
4 into documents values ('word5 chemotherapy word6')
5 into documents values ('word7 cardiac ablation word8')
6 into documents values ('word9 atrioventricular node word10')
7 into documents values ('word11 pulmonary abscess word12')
8 into documents values ('word13 word14 word15')
9 select * from dual
10 /
7 rows created.
SCOTT@orcl_11gR2> commit
2 /
Commit complete.
SCOTT@orcl_11gR2> -- resulting population of classifications table:
SCOTT@orcl_11gR2> column phrase   format a21
SCOTT@orcl_11gR2> column document format a34
SCOTT@orcl_11gR2> select phrase, document from classifications
2 /
PHRASE                DOCUMENT
TUBERCULOSIS          word1 tuberculosis word2
DIABETES              word3 diabetes word4
CHEMOTHERAPY          word5 chemotherapy word6
CARDIAC ABLATION      word7 cardiac ablation word8
ATRIOVENTRICULAR NODE word9 atrioventricular node word10
PULMONARY ABSCESS     word11 pulmonary abscess word12
6 rows selected.
SCOTT@orcl_11gR2> -- tokens that are indexed:
SCOTT@orcl_11gR2> select token_text from dr$class_phrase_idx$i
2 /
TOKEN_TEXT
ABLATION
ABSCESS
ATRIOVENTRICULAR
CARDIAC
CHEMOTHERAPY
DIABETES
NODE
PULMONARY
TUBERCULOSIS
9 rows selected.
SCOTT@orcl_11gR2> -- searches using text index:
SCOTT@orcl_11gR2> set autotrace on explain
SCOTT@orcl_11gR2> select phrase, document
2 from   classifications
3 where contains (phrase, 'tuberculosis') > 0
4 /
PHRASE                DOCUMENT
TUBERCULOSIS          word1 tuberculosis word2
1 row selected.
Execution Plan
Plan hash value: 2513347404
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                  |     1 | 2046 |     4   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| CLASSIFICATIONS |     1 | 2046 |     4   (0)| 00:00:01 |
|* 2 |   DOMAIN INDEX              | CLASS_PHRASE_IDX |       |       |     4   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   2 - access("CTXSYS"."CONTAINS"("PHRASE",'tuberculosis')>0)
Note
   - dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2> select phrase, document
2 from   classifications
3 where contains (phrase, 'cardiac ablation') > 0
4 /
PHRASE                DOCUMENT
CARDIAC ABLATION      word7 cardiac ablation word8
1 row selected.
Execution Plan
Plan hash value: 2513347404
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                  |     1 | 2046 |     4   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| CLASSIFICATIONS |     1 | 2046 |     4   (0)| 00:00:01 |
|* 2 |   DOMAIN INDEX              | CLASS_PHRASE_IDX |       |       |     4   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   2 - access("CTXSYS"."CONTAINS"("PHRASE",'cardiac ablation')>0)
Note
   - dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2> select phrase, document
2 from   classifications
3 where contains (phrase, '%ab%') > 0
4 /
PHRASE                DOCUMENT
DIABETES              word3 diabetes word4
CARDIAC ABLATION      word7 cardiac ablation word8
PULMONARY ABSCESS     word11 pulmonary abscess word12
3 rows selected.
Execution Plan
Plan hash value: 2513347404
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                  |     1 | 2046 |     4   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| CLASSIFICATIONS |     1 | 2046 |     4   (0)| 00:00:01 |
|* 2 |   DOMAIN INDEX              | CLASS_PHRASE_IDX |       |       |     4   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   2 - access("CTXSYS"."CONTAINS"("PHRASE",'%ab%')>0)
Note
   - dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2> -- searches using non-text index:
SCOTT@orcl_11gR2> select phrase, document
2 from   classifications
3 where phrase = 'PULMONARY ABSCESS'
4 /
PHRASE                DOCUMENT
PULMONARY ABSCESS     word11 pulmonary abscess word12
1 row selected.
Execution Plan
Plan hash value: 4202264836
| Id | Operation                   | Name              | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                   |     1 | 2034 |     1   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| CLASSIFICATIONS   |     1 | 2034 |     1   (0)| 00:00:01 |
|* 2 |   INDEX RANGE SCAN          | CLASS_PHRASE_IDX2 |     1 |       |     1   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   2 - access("PHRASE"='PULMONARY ABSCESS')
Note
   - dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2> select phrase, document
2 from   classifications
3 where phrase like '%AB%'
4 /
PHRASE                DOCUMENT
CARDIAC ABLATION      word7 cardiac ablation word8
DIABETES              word3 diabetes word4
PULMONARY ABSCESS     word11 pulmonary abscess word12
3 rows selected.
Execution Plan
Plan hash value: 723026238
| Id | Operation                   | Name              | Rows | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                   |     3 | 6102 |     0   (0)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID| CLASSIFICATIONS   |     3 | 6102 |     0   (0)| 00:00:01 |
|* 2 |   INDEX FULL SCAN           | CLASS_PHRASE_IDX2 |     1 |       |     0   (0)| 00:00:01 |
Predicate Information (identified by operation id):
   2 - filter("PHRASE" IS NOT NULL AND "PHRASE" LIKE '%AB%')
Note
   - dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2>

Extract words from a string

select * from v$version;
BANNER
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
PL/SQL Release 11.1.0.7.0 - Production
CORE    11.1.0.7.0    Production
TNS for Linux: Version 11.1.0.7.0 - Production
NLSRTL Version 11.1.0.7.0 - Production
I want to extract xyz_xyz,abc_abc words from the string ' select * from TEMP_TABLE where trunc(xyz_xyz) = ''xyz'' and trunc(abc_abc) = ''abc'') '
i have tried with this query...
select regexp_substr('select * from TEMP_TABLE where trunc(xyz_xyz) = ''xyz'' and trunc(abc_abc) = ''abc'')', 'trunc[^'''']+''') from dual
can some one help me on this?
Thanks,
Mike

Hi, Mike,
Mike wrote:
I apologize for this..
create table TEMP_TABLE ( col1 varchar2(10),col2 varchar2(10));
insert into TEMP_TABLE values('   xyz   ','abc      ');
insert into TEMP_TABLE values('   xyzxyz ','abcabc    ');
insert into TEMP_TABLE values('   xyz123 ','abc546 ');
insert into TEMP_TABLE values('xyz','abc');
commit
select * from TEMP_TABLE where trim(col1) ='xyz' and trim(col2) ='abc'
desired output
col1,col2
So the correct output is one row, containing these 9 characters
col1,col2
which do not appear in the table. Is that right?
How is that output related to the data that is in the table? Do you want that output to appear if there are any rows where TRIM (col1) = 'xyz' and TRIM (col2) = 'abc' (for example, either of these two rows from your sample data:
insert into TEMP_TABLE values('   xyz   ','abc      ');
insert into TEMP_TABLE values('xyz','abc');
), and would you want the result set to contain no rows if the table contained no row like that?
If so,
SELECT DISTINCT
     'col1,col2'     AS desired_output
FROM     temp_table
WHERE     TRIM (col1)     = 'xyz'
AND     TRIM (col2)     = 'abc'
;
I hope this answers your question.
I have a feeling that it doesn't, or perhaps you have more than one question, since your earlier messages seemed to have something to do with locating text that was surrounded by single-quotes, and the sample data you posted doesn't contain any single-quotes.
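
For the question as first posted (pulling xyz_xyz and abc_abc out of the trunc(...) calls), a capture group around the argument is one way to express the match. The sketch below shows the idea in Java rather than in SQL; the class and method names are hypothetical, and it assumes each trunc call has a single plain identifier as its argument.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TruncArgs {
    // Captures the identifier between "trunc(" and ")", ignoring case and inner blanks.
    private static final Pattern TRUNC =
            Pattern.compile("trunc\\(\\s*(\\w+)\\s*\\)", Pattern.CASE_INSENSITIVE);

    /** Returns every identifier that appears as the sole argument of trunc(...). */
    static List<String> columns(String sql) {
        List<String> names = new ArrayList<>();
        Matcher m = TRUNC.matcher(sql);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String sql = "select * from TEMP_TABLE where trunc(xyz_xyz) = 'xyz' "
                + "and trunc(abc_abc) = 'abc')";
        System.out.println(columns(sql));
    }
}
```

The attempt in the question matched from "trunc" up to the first single quote, which returns the surrounding text as well; capturing only the group inside the parentheses avoids that.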

Extract x amount of words from a string

Hi all, I just wondered if someone can help me with this.
String myString = "This is a test String";
getInput(); // this method returns a string and works
getNoOfWordsInputted(); // returns int of words in the string
This is where I need the help:
I now need to extract getNoOfWordsInputted() words from myString,
i.e. such that if I input "this is " true will be returned.
I can implement the following:
if (getInput().equals(myString)){
// do Something
}

Still not quite sure what you're after, but if you want to see if the target string starts with a candidate substring, just use the startsWith method.
if (example.startsWith(sub)) ...
or, you can still use the indexOf method, just check for a return value of 0, meaning the index of was at the beginning of the string.
if (example.indexOf(sub) == 0) ...
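
If the comparison should be word-based rather than character-based, one option is to take as many leading words from the target as the input contains and compare those. This is a minimal sketch of that reading of the question; the class and method names are hypothetical, and ignoring case is an assumption based on the "this is " example.

```java
import java.util.Arrays;

final class LeadingWords {
    /** Returns the first n whitespace-separated words of text, joined by single spaces. */
    static String firstWords(String text, int n) {
        String trimmed = text.trim();
        if (n <= 0 || trimmed.isEmpty()) {
            return "";
        }
        String[] words = trimmed.split("\\s+");
        int count = Math.min(n, words.length);
        return String.join(" ", Arrays.copyOfRange(words, 0, count));
    }

    /** True when input equals the same number of leading words of target, ignoring case. */
    static boolean matchesLeadingWords(String target, String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        int n = trimmed.split("\\s+").length;
        return firstWords(target, n).equalsIgnoreCase(firstWords(trimmed, n));
    }

    public static void main(String[] args) {
        String myString = "This is a test String";
        System.out.println(firstWords(myString, 3));
        System.out.println(matchesLeadingWords(myString, "this is "));
        System.out.println(matchesLeadingWords(myString, "this i"));
    }
}
```

Unlike startsWith, this rejects an input that ends in the middle of a word, such as "this i".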

Text auto-correct grabbing words from gmail / etc.

I'm not sure if this is a Gingerbread issue, but this is a problem that I was introduced to when I got my Xperia PLAY.
The on-screen QWERTY keyboard lends itself to frequent typos. Since I started using the Xperia, the auto-correct dictionary has been automatically synced and populated with various words that it apparently grabbed out of my gmail, email, contacts, address book, or elsewhere. This includes names, segments of email addresses, already misspelled words, etc.
This has made the auto-correct function useless on my phone, and makes text messaging a convoluted and futile chore that I'd simply rather not do.
For example, when I'm trying to send a text to someone asking them, "what time?" -- I commonly typo "time" as "timr."
Now, auto-correct on my Xperia always wants to change "timr" into "TimeBomb4321" -- which is a segment of an email address out of my address book.
And, there's apparently no way to turn this off.
There's no way to toggle this under "Accounts & sync."
"User Dictionary" under Settings only includes words that I have manually added to it.
HOW DO I FIX THIS???

Hi chuckk,
I understand the importance of having a streamlined text experience. I would suggest deleting some of the auto-saved words from the dictionary: Settings > Language & Keyboard > User Dictionary. Press and hold the word you no longer want and then select Delete. I hope this information is helpful. Please let me know if you need further assistance.
Thank you for your contribution to our community forums,
