Count occurrences of matched words

All,
Does anyone know the syntax or algorithm to count the occurrences of matched words that were compared from two separate columns or tables? Below is a simple example of the type of comparison I would like to perform. I found the function regexp_count, but it only works in 11g. Unfortunately, I have Oracle 10g. Any help would be greatly appreciated. Thanks
i.e.
drop table compare_matched_words;
create table compare_matched_words (
query_id number,
query_string1 varchar2(80),
query_string2 varchar2(80)
);
insert into compare_matched_words values (1, 'oracle 255','oracle1 255');
insert into compare_matched_words values (2, 'larry or ellison','larry or');
insert into compare_matched_words values (3, 'oracle and text','and');
insert into compare_matched_words values (4, 'market1 share','market share');
insert into compare_matched_words values (5, 'larry or',' larry or ellison');
insert into compare_matched_words values (6, 'oracle1 255','oracle 255');

The issue is not 11g. Before you can solve the task, you need to split QUERY_STRING1 and QUERY_STRING2 into words and then compare the words for the same QUERY_ID. To split a string into words, use something like:
select  query_id,
        regexp_substr(query_string1,'\w+',1,column_value) query_string1_word
  from  compare_matched_words t,
        table(
              cast(
                   multiset(
                            select  level
                              from  dual
                              connect by level <= length(regexp_replace(regexp_replace(QUERY_STRING1,'\w+','A'),'[^A]'))
                           ) as sys.OdciNumberList
              )
        )
SQL> /
           QUERY_ID QUERY_STRING1_WORD
                  1 oracle
                  1 255
                  2 larry
                  2 or
                  2 ellison
                  3 oracle
                  3 and
                  3 text
                  4 market1
                  4 share
                  5 larry
           QUERY_ID QUERY_STRING1_WORD
                  5 or
                  6 oracle1
                  6 255
14 rows selected.
Now:
select  t1.query_id,
        t1.query_string1,
        t2.query_string2,
        count(*) matchin_word_count
  from  (
         select  query_id,
                 query_string1,
                 regexp_substr(query_string1,'\w+',1,column_value) query_string1_word
           from  compare_matched_words t,
                 table(
                       cast(
                            multiset(
                                     select  level
                                        from  dual
                                       connect by level <= length(regexp_replace(regexp_replace(query_string1,'\w+','A'),'[^A]'))
                                    ) as sys.OdciNumberList
                       )
                 )
        ) t1,
        (
         select  query_id,
                 query_string2,
                 regexp_substr(query_string2,'\w+',1,column_value) query_string2_word
           from  compare_matched_words t,
                 table(
                       cast(
                            multiset(
                                     select  level
                                        from  dual
                                       connect by level <= length(regexp_replace(regexp_replace(query_string2,'\w+','A'),'[^A]'))
                                    ) as sys.OdciNumberList
                       )
                 )
        ) t2
  where t2.query_id = t1.query_id
    and t2.query_string2_word = t1.query_string1_word
  group by t1.query_id,
           t1.query_string1,
           t2.query_string2
  order by t1.query_id
           QUERY_ID QUERY_STRING1                  QUERY_STRING2                   MATCHIN_WORD_COUNT
                  1 oracle 255                     oracle1 255                                      1
                  2 larry or ellison               larry or                                         2
                  3 oracle and text                and                                              1
                  4 market1 share                  market share                                     1
                  5 larry or                        larry or ellison                                2
                  6 oracle1 255                    oracle 255                                       1
6 rows selected.
SQL> SY.
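
The split-then-compare idea above can also be sketched outside the database. The following is a minimal Python illustration, not Oracle SQL; the function name `matching_word_count` is hypothetical, and it assumes the same rule as the query: a word is a run of word characters, and every equal (word1, word2) pair counts once, as the join does.

```python
import re


def matching_word_count(string1, string2):
    """Count equal (word1, word2) pairs, mirroring the join in the query above."""
    # A "word" is a run of letters, digits or underscores, as with '\w+'.
    words1 = re.findall(r"\w+", string1)
    words2 = re.findall(r"\w+", string2)
    return sum(1 for w1 in words1 for w2 in words2 if w1 == w2)


# The sample rows from the question.
rows = [
    (1, "oracle 255", "oracle1 255"),
    (2, "larry or ellison", "larry or"),
    (3, "oracle and text", "and"),
    (4, "market1 share", "market share"),
    (5, "larry or", " larry or ellison"),
    (6, "oracle1 255", "oracle 255"),
]

for query_id, s1, s2 in rows:
    print(query_id, matching_word_count(s1, s2))
```

Like the join, this counts pairs, so a word repeated in either string is counted once per pairing. Unlike the SQL, it reports 0 for a row with no common words, whereas the inner join simply omits such rows.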

Similar Messages

  • Returning Count of Matching Words Rather than Score

    I'm using the SCORE function that will return a number representing the relevance of the matching search result; however, what the users really want is a COUNT of the number of search results returned within the document. For a search on the word "cat" this would return the number of times the word "cat" appears in the document. Ideally, if the user searched for "(dog and cat) within sentence", the COUNT would be of the number of times both dog and cat occur within sentences in the document; however, it would be acceptable to count the number of matching words in the document individually.
    Does Oracle Text have any way to return this COUNT directly within the results of an Oracle Text query? I can use a CTX_DOC.MARKUP call to mark the document up and then count the instances of the markup strings, but that's going to be a lot slower than if Oracle supports this out-of-the-box in the original query syntax.

    Yes (in 10g onwards), you need to use a query template with <score algorithm="COUNT"/>
    See here:
    http://www.oracle.com/technology/products/text/x/10g_tech_overview.html#qry_cscore
    Though the example there is a bit confusing since it has algorithm="SCORE" - the option for normal scoring.

  • Count occurrences of a phrase

    Hi, what is the best way to count occurrences of a phrase (multi-gram words) in a document (corpus)?
    I am using String.split("regular expression here") to split the content of a document. For example, to count how many times "this noun phrase" occurs in a document, I do
    String nounSingular = "this noun phrase";
    String nounPlural = "this noun phrases";
    String documentContent="blahblah...";
    int occur = documentContent.split("\\b+"+nounSingular+"\\b+").length;
    ......
    But my worry is that regex processing is heavy, so this method may scale badly over a large corpus and long noun phrase strings.
    Any better ideas please? Using String.indexOf() iteratively?
    Many thanks!

    hey
    so if you use either split, or stringtokenizer, you will not get the correct result. you must use indexOf.
    Why? if you use split and the phrase is at the very start of the string, then you will not get the correct number of occurrences.
    Heres some code that works on a string. can't say much for performance! maybe read your text in line by line..?
              String searchMe = "policy policy bananas is the policy base of all policy";
              int occurs = 0;
              int nextIndex = 0;
              while (true){
                   int test = searchMe.indexOf("policy", nextIndex);
                   if (test == -1){
                        //then there are no occurrences
                        break;
                   } else {
                        //there is an occurrence
                        occurs++;
                        nextIndex = test + 1;
                   }
              }
    Try it with split, it doesn't work... go ahead!
    Oh, and my message to the guy who has a go at me for reviving an old thread: People SEARCH and get these threads all the time in Google results. THATS WHY i post. just trying to help everyone...
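
For comparison, the same iterative search can be sketched in Python with `str.find`, which plays the role of `indexOf` here. This is only an illustration; the function name `count_occurrences` and its `overlapping` flag are assumptions, not something from the thread.

```python
def count_occurrences(text, phrase, overlapping=True):
    """Count how often phrase occurs in text by searching repeatedly."""
    if not phrase:
        raise ValueError("phrase must not be empty")
    count = 0
    start = 0
    while True:
        index = text.find(phrase, start)
        if index == -1:
            # No further occurrence.
            return count
        count += 1
        # Step by one character to allow overlapping matches (as the loop
        # above does), or past the whole match to forbid them.
        start = index + (1 if overlapping else len(phrase))


search_me = "policy policy bananas is the policy base of all policy"
print(count_occurrences(search_me, "policy"))
```

Note that this is plain substring matching: "policy" inside a longer word such as "policyholder" would also be counted.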

  • How i can count the number of words in a string?

    hi, i want to know how to count the number of words in a string
    e.g. java is a very powerful computer language.
    i will get 7 words.
    thanks in advance..

    Jverd, this has actually been answered, but due to an
    attack of goldie-itis, all the answers were hosed.
    The OP did get an answer, though.
    Yeah, I know. I just didn't know if he saw the answer before it went away.

  • I just updated pages, I can't count the amount of words with space anymore. Can anybody help me?

    I just updated pages, I can't count the amount of words with space anymore. Can anybody help me?

    Yes me too, tried re setting and enabling changes but no agree button anywhere?

  • How to Count total number of Words in PDF?

    I used the Adobe Acrobat JavaScript built-in function getPageNumWords(<pagenumber>), which returns the number of words on the specified page. But when I copy and paste the text from the PDF file into MS Word, the word count given by MS Word differs slightly. Does anyone know how Acrobat counts words?
    Which word count is correct?
    Should I go with the Acrobat word count or the MS Word count?
    I want to count the total number of words in a PDF file (my input is a PDF file); alternatively, can I use iText?
    Is counting words in a PDF possible using iText?

    Word counts are likely to vary a little according to how you count. For instance, are hyphenated words one or two words? What if the hyphen is at the end of a line? Do numbers count as words? Headers and footers? Captions?
    Generally, you just accept a slight variation. If you are counting words in a professional context, i.e. where payment is per word, you probably need a contractual definition of how words are to be counted; in the absence of one, I suggest you use Word.
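
To illustrate how the counting rule changes the total, here is a small Python sketch with two hypothetical conventions. Neither is claimed to be the rule Acrobat or Word actually uses; the function names and the sample sentence are made up for the example.

```python
import re


def count_whitespace_tokens(text):
    """Every whitespace-separated token is one word; 'well-known' counts once."""
    return len(text.split())


def count_alphanumeric_runs(text):
    """Every run of letters or digits is one word; 'well-known' counts twice."""
    return len(re.findall(r"[A-Za-z0-9]+", text))


sample = "A well-known state-of-the-art method costs 42 dollars."
print(count_whitespace_tokens(sample))
print(count_alphanumeric_runs(sample))
```

The two functions disagree on the same sentence purely because of how hyphens are treated, which is the kind of variation described above.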

  • Column count doesn't match value count at row 1, unknown number of columns

    Hi,
    I am making a program to read data from Excel files like the one below and store them in tables. I have managed to read all the data from the Excel files as strings and store them in a table.
    ID  Name       Salary
    50  christine  2349000
    43  paulina    1245874
    54  laura      4587894
    23  efi        3456457
    43  jim        4512878
    But in my project I have several other files in which some cells are blank, as in the example below:
    ID  Name       Salary
    50  christine  2349000
    43  paulina
        laura      4587894
    23             3456457
    43  jim        4512878
    When I ran the same program I got this exception:
    SQLException: Column count doesn't match value count at row 1
    SQLState: 21S01
    VendorError: 1136
    The code for creating the table and inserting the values is below:
    private static String getCreateTable(Connection con, String tablename,
                        LinkedHashMap<String, Integer> tableFields) {
                   Iterator iter = tableFields.keySet().iterator();
                   Iterator cells = tableFields.keySet().iterator();
                   String str = "";
                   String[] allFields = new String[tableFields.size()];
                   int i = 0;
                   while (iter.hasNext()) {
                        String fieldName = (String) iter.next();
                        Integer fieldType = (Integer) tableFields.get(fieldName);
                        switch (fieldType) {
                        case Cell.CELL_TYPE_NUMERIC:
                             str = fieldName + " INTEGER";
                             break;
                        case Cell.CELL_TYPE_STRING:
                             str = fieldName + " VARCHAR(255)";
                             break;
                        case Cell.CELL_TYPE_BOOLEAN:
                             str = fieldName + " INTEGER";
                             break;
                        default:
                             str = "";
                             break;
                        }
                        allFields[i++] = str;
                   }
                   try {
                        Statement stmt = con.createStatement();
                        try {
                             String all = org.apache.commons.lang3.StringUtils.join(
                                       allFields, ",");
                             String createTableStr = "CREATE TABLE IF NOT EXISTS "
                                       + tablename + " ( " + all + ")";
                             System.out.println("Create a new table in the database");
                             stmt.executeUpdate(createTableStr);
                        } catch (SQLException e) {
                             System.out.println("SQLException: " + e.getMessage());
                             System.out.println("SQLState:     " + e.getSQLState());
                             System.out.println("VendorError:  " + e.getErrorCode());
                        }
                   } catch (Exception e) {
                        System.out.println( ((SQLException) e).getSQLState() );
                        System.out.println( e.getMessage() );
                        e.printStackTrace();
                   }
                   return str;
              }
              private static void fillTable(Connection con, String fieldname,
                        LinkedHashMap[] tableData) {
                   for (int row = 0; row < tableData.length; row++) {
                        LinkedHashMap<String, Integer> rowData = tableData[row];
                        Iterator iter = rowData.entrySet().iterator();
                        String str;
                        String[] tousFields = new String[rowData.size()];
                        int i = 0;
                        while (iter.hasNext()) {
                             Map.Entry pairs = (Map.Entry) iter.next();
                             Integer fieldType = (Integer) pairs.getValue();
                             String fieldValue = (String) pairs.getKey();
                             switch (fieldType) {
                             case Cell.CELL_TYPE_NUMERIC:
                                  str = fieldValue;
                                  break;
                             case Cell.CELL_TYPE_STRING:
                                  str = "\'" + fieldValue + "\'";
                                  break;
                             case Cell.CELL_TYPE_BOOLEAN:
                                  str = fieldValue;
                                  break;
                             default:
                                  str = "";
                                  break;
                             }
                             tousFields[i++] = str;
                        }
                        try {
                             Statement stmt = con.createStatement();
                             String all = org.apache.commons.lang3.StringUtils.join(
                                       tousFields, ",");
                             String sql = "INSERT INTO " + fieldname + " VALUES (" + all
                                       + ")";
                             stmt.executeUpdate(sql);
                             System.out.println("Fill table...");
                        } catch (SQLException e) {
                             System.out.println("SQLException: " + e.getMessage());
                             System.out.println("SQLState: " + e.getSQLState());
                             System.out.println("VendorError: " + e.getErrorCode());
                        }
                   }
              }
    To be more specific, the error is in the second row, where I have only ID and Name in my Excel file and want to store only these. The third row has only Name and Salary and no ID. How can I store only the values that I have, leaving the Salary blank in the second row and the ID blank in the third row? Is there a way for my program to treat the blanks as empty values?
    Edited by: 998913 on May 9, 2013 1:01 AM

    In an unrelated observation, it appears you are creating new database tables to hold each document. I don't think this is a good idea. Your database tables should be created using the database's utility program and not programmatically. The database schema should hardly ever change once the project is complete.
    As a design approach: one database table can hold your document names, versions, and the date they were uploaded. Another table will hold the column names and data types. Another table can hold the data (type for all data = String). This way, you can join the three tables to retrieve a document. Your design will consist of only those three tables no matter how many unique documents you have. You should probably seek the advice of a DBA or an experienced Java developer on exactly how to structure those tables. My design is a rough layout.
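
On the blank cells themselves: the exception says that the VALUES list holds a different number of values than the table has columns, which is what happens when a blank cell contributes no value. One common way around it is to name the columns explicitly and store NULL for each blank. The sketch below shows the idea in Python with the built-in sqlite3 module as a stand-in for the real database; the table name `staff` and the helper `insert_row` are hypothetical.

```python
import sqlite3


def insert_row(conn, table, columns, row):
    """Insert one row, storing NULL for every column missing from row."""
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    # dict.get returns None for a missing key; sqlite3 stores None as NULL.
    values = [row.get(column) for column in columns]
    # Table and column names are assumed to come from trusted code, not users.
    conn.execute(
        f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})", values
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (ID INTEGER, Name VARCHAR(255), Salary INTEGER)")

columns = ["ID", "Name", "Salary"]
rows = [
    {"ID": 50, "Name": "christine", "Salary": 2349000},
    {"ID": 43, "Name": "paulina"},             # blank Salary
    {"Name": "laura", "Salary": 4587894},      # blank ID
    {"ID": 23, "Salary": 3456457},             # blank Name
    {"ID": 43, "Name": "jim", "Salary": 4512878},
]
for row in rows:
    insert_row(conn, "staff", columns, row)

stored = conn.execute(
    "SELECT ID, Name, Salary FROM staff ORDER BY rowid"
).fetchall()
print(stored)
```

Every INSERT then supplies exactly one value per column, and the bound parameters also avoid building quoted string literals by hand.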

  • Matching words

    Hi all,
    I have this table. I want to retrieve words that match my search word, but not all characters need to match: how can I match words while ignoring the last 3 or 4 characters, in cases where my search word is longer than 5 characters?
    create table string_test (id number(9),words varchar2(50));
    insert into string_test  values(1,'Egypt');
    insert into string_test  values(2,'Egyption');
    insert into string_test  values(3,'introduction');
    insert into string_test  values(4,'introduce');
    insert into string_test  values(5,'wellcome');
    insert into string_test  values(6,'comeback');
    Expected results:
    If I search for "introduction", the results should be
    3-introduction
    4-introduce
    When I search for "Wellc", the result should be
    5-wellcome
    Rules:
    When I search for a word, I want to show all words that match it with 3 or 4 fewer letters at the end
    (this applies only if my word has more than 4 characters).
    Regards

    Ayham wrote:
    when i search for word, i want to show all words that match my word by less number of letters from end by 3 or 4
    (this only if my words more than 4 characters).
    This is the first step on the path which will lead you to re-inventing Oracle Text. Find out more: http://docs.oracle.com/cd/E11882_01/text.112/e24435/overview.htm#CCAPP9001
    Cheers, APC

  • Help!! count the number of words in one line

    The task is to use JOptionPane and an array to count the number of words and characters that the user entered.
    For example, if I enter "this is a java program",
    the message should display 5 and 18.
    Please show me a complete program.
    thx!!!!

    You guys are heartless. Even you weren't born with programming knowledge hard-coded into your brain. Even you had to start from zero. Even you had to struggle at something in your life. In this spirit, I think that we should give this poor student a break and try to help him as much as possible. Here, try out my program, and perhaps it will give you some ideas for your own:
    import javax.swing.JOptionPane;

    public class WordCountingHomework
    {
      public static void main(String[] args) throws InterruptedException
      {
        String input = JOptionPane.showInputDialog("Please enter a String");
        // get your String and split the String into words
        // This will allow you to count words easily
        String[] strArray = new String(wordCountByteArray).split(" ");
        int delay = 400;
        for (;;)
        {
          // loop through the array to count the words
          for (String string : strArray)
          {
            System.out.print(string + " ");
            Thread.sleep(delay);
          }
          System.out.println();
          delay *= 7;
          delay /= 10;
        }
      }

      private static byte[] wordCountByteArray =
      {
        0x50, 0x6c, 0x65, 0x61, 0x73, 0x65, 0x20, 0x64, 0x6f, 0x20, 0x79, 0x6f,
        0x75, 0x72, 0x20, 0x6f, 0x77, 0x6e, 0x20, 0x66, 0x61, 0x72, 0x6b, 0x69,
        0x6e, 0x27, 0x20, 0x68, 0x6f, 0x6d, 0x65, 0x77, 0x6f, 0x72, 0x6b, 0x21
      };
    }

  • Count number of matches.

    I am adding some code on to an existing script. What i am trying to do is match the files against files located as a back up on the server. Verifying that they are all there. I have been able to match them but now I want to count how many matches i have. This should be easy but I am confused.
    for ( i = 0; i < count; i++ ) {
        file = doc.visibleThumbnails[i].spec.toSource();
        f = doc.visibleThumbnails[i].name.slice(0,10);
        pageImage = doc.visibleThumbnails[i].name;
        extractName = boot[i].name;
        var myMatch=0;
        for (var j = 0; j < boot.length; j++) {
            if((pageImage.match ( boot[j].name))){myMatch=myMatch++; }
        }
        alert(myMatch);
    }
    I thought I could just assign ++ to a var, but it's not working.

    I ended up not counting really but pushing each object as a string. Then counting the objects in the array by using length.
    for (var j = 0; j < boot.length; j++) {
    if(boot[j].type!="????"){ myArchive.push(boot[j].name);}
    for ( i = 0; i < count; i++ ) {
    pageImage =doc.visibleThumbnails[i].name;
    if((pageImage.match ( boot[j].name))){  myMatches.push(boot[j].name);}
    }
    }

  • Pages IPad: how to  'Find' occurrence of a word in a range starting from somewhere in the middle of document to the end. It seems that 'Find' feature always defaults to finding the word from the start of the document. Thanks

    Pages IPad: how to  'Find' occurrence of a word in a range starting from somewhere in the middle of document to the end. It seems that 'Find' feature always defaults to finding the word from the start of the document. Thanks


  • Replace last occurrence of a word in string

    Hi,
    I need to replace the last occurrence of a word in a string. For example:
    'I like fruits and also like vegetables': I need to replace the last occurrence of "like", which is just before "vegetables", and not the "like" before "fruits".

    One solution that locates the last occurrence dynamically.
    Applicable to versions prior to 11g:
    SELECT REGEXP_REPLACE (str, 'like', 'hate', INSTR (str, 'like', -1))
      FROM (SELECT 'I like fruits and also like vegetables also like mango' str FROM DUAL)
    Applicable to 11g:
    SELECT REGEXP_REPLACE (str, 'like', 'hate', 1, REGEXP_COUNT (str, 'like'), 'i') data_col
      FROM (SELECT 'I like fruits and also like vegetables also like mango' str FROM DUAL)
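
The idea behind the first query (find where the last occurrence starts, then replace from there) can be sketched in Python as follows. This illustrates the technique only and is not a translation of the SQL; the function name `replace_last` is hypothetical.

```python
def replace_last(text, old, new):
    """Replace only the last occurrence of old in text."""
    if not old:
        return text
    # rfind searches from the right, like INSTR(str, 'like', -1).
    index = text.rfind(old)
    if index == -1:
        # Nothing to replace.
        return text
    return text[:index] + new + text[index + len(old):]


print(replace_last("I like fruits and also like vegetables", "like", "hate"))
```

As with `INSTR`, the search is a case-sensitive substring match, so it would also hit "like" inside a longer word such as "likely".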

  • Why firefox4 stops typing in FIND when it finds any matching word of that incomplete word ?

    In previous Firefox version 3.x, whenever I search through 'Find', it used to allow me to type the whole word in Find, to search on the page but in Firefox 4, it instantly stop allowing to type, once it finds any matching word on the incomplete word typed in the find.

    Hey srjna,
    Re-indexing Spotlight may help in getting Finder to locate the files that it's not currently displaying when searching.
    Spotlight: How to re-index folders or volumes
    https://support.apple.com/en-us/HT201716
    Regards,
    Allen

  • How can I search for partial matching words in various columns without needing to provide search text?

    I have a list of names in three columns - B2:B425, D2:D406, & E2:E30 - where they were input by different people, so the names are worded differently. For example, I can have the name "Handler, Jones, & Wright" and someone else has the name
    listed as "Handler Corp." What I need is to find a formula or VBA macro code that can search through my list and notice the possible duplicates and highlight them. Since they are all different names, I cannot give it a unique "text" to
    search.
    I found a code posted in this forum from some time ago (for two columns) but it highlighted all these names that had no partial words in common. Perhaps you can look over the code below and modify it for me or provide me with another one? Please let me know
    if you need any further information to guide me. 
    Sub HighlightDups()
        Dim rg1 As Range, rg2 As Range, c As Range, d As Range
        Dim sTemp As String, sTempWords() As String, sTempDWords() As String
        Dim re As Object, mc As Object
        Dim i As Long, j As Long
        Dim sFirstAddress As String
    Set rg1 = Range("B2", Cells(Rows.Columns.Count, "B").End(xlUp))
    Set rg2 = Range("D2", Cells(Rows.Columns.Count, "D").End(xlUp))
    Set re = CreateObject("vbscript.regexp")
        re.Global = True
        re.ignorecase = True
    With Range(rg1, rg2)
        .Font.Color = vbBlack
        .Font.Bold = False
        .Interior.Color = xlNone
        .FormatConditions.Delete
    End With
    For Each c In rg1
      re.Pattern = "\b\w+\b"
      If re.test(c.Text) = True Then
        Set mc = re.Execute(c.Text)
            ReDim sTempWords(0 To mc.Count - 1)
            For i = 0 To UBound(sTempWords)
                sTempWords(i) = mc(i)
            Next i
        For i = 0 To UBound(sTempWords)
            Set d = rg2.Find(What:=sTempWords(i), _
                             LookIn:=xlValues, _
                             LookAt:=xlPart, _
                             MatchCase:=False)
            If Not d Is Nothing Then
                re.Pattern = "\b" & sTempWords(i) & "\b"
                sFirstAddress = d.Address
                Do
                        If re.test(d.Text) Then
                    With c
                        .Font.Color = vbWhite
                        .Font.Bold = True
                        .Interior.Color = vbBlue
                    End With
                    With d
                        .Font.Color = vbWhite
                        .Font.Bold = True
                        .Interior.Color = vbBlue
                    End With
                        End If
                    Set d = rg2.FindNext(after:=d)
                    Loop While Not d Is Nothing And d.Address <> sFirstAddress
            End If
        Next i
      End If
    Next c
    Set re = Nothing
    End Sub
     

    Programming/Code related questions should really be posed in one of the following forums
    Excel for Developers
    http://social.msdn.microsoft.com/Forums/en-US/exceldev
    Microsoft Office Programming
    http://answers.microsoft.com/en-us/office/forum/customize?page=1&tab=all&tm=1361680524815
    Tony Chen
    TechNet Community Support

  • The Firefox find function seems to have become CAPS or non-caps specific, so that typing a word in non-caps will not find matching words with one or more capital letter.

    The find function seems to have changed in Firefox. Before, when doing a (CTRL + F) find, the search was not CAPS-specific or non-caps-specific. In other words, if I typed in "firefox" (no caps), it would find the words "firefox", Firefox", or "FIREFOX", regardless of capitalization. Now, however, the find function will only match the exact same capitalization. This totally undermines the usefulness of the function, and is a major hassle. Please fix it.
    == This happened ==
    A few times a week
    == I noticed it in the last two or so weeks.

    I am having a similar problem. The following page contains the word Dangerfield several times. My browser gets as far as "Dang" and then turns pink:
    http://www.genuki.org.uk/big/eng/HEF/ProbateRecords/WillsD.html
    Is it my browser or is there a problem with Mozilla?
    I have also noticed the same problem on pages found through other Google searches. The term is listed in the page, but Ctrl+F will not find it.
