Using fuzzy operator in contains

I am new to all of this and would like to get more comfortable that the approach I've taken is correct. Our problem is taking user input that may be slightly off and searching the database with it. For example, the user provides '201 Wilshire Blvd' when the database contains '201 Wiltshire Ave'. Can we find this customer even if they left the 't' out of Wiltshire, and without searching on Ave, Blvd, St, etc.? The Oracle Text fuzzy operator seems like the right solution. We are using Oracle 9i. The app is written in Java/JDBC (JDK 1.3).
I tried to keep it as simple as possible but here is what I did:
1- create a table specifically for searching containing an id, varchar2(360), varchar2(666)
2- create a CONTEXT index on the first varchar
3- create a CONTEXT index on the second varchar
4- the varchar2(666) contains the address information in the following format: 'zip state city addr1 addr2'
5- triggers are defined to keep the search table in sync with its source tables
6- the indexes are re-sync'ed nightly
I created a separate search table because I was concerned over performance if I were to create indexes on the source tables. The select statements I construct look like the following:
WHERE CONTAINS(address, '24032 & MD & Frostburg', 10) > 0 AND
CONTAINS(address, '1616', 20) > 0 AND
CONTAINS(address, 'fuzzy(Pullman, 60, 30, weight)', 30) > 0 AND
CONTAINS(customer_name, 'fuzzy(Acme, 60, 30, weight)', 40) > 0
So zip, state and city must match exactly, and so must the street number. Terms extracted from the street name and the customer name are searched for using the fuzzy operator.
My concerns are:
1- performance: My search table has over 2.1M records. What can I do to improve lookups?
2- ignorance: Like I said, I'm new to all of this; am I correct that using the fuzzy operator for numbers makes no sense? So if they type '30' but meant '300', too bad?
3- accuracy: How can I use the input parameters to improve my hit rate? Given that I am indexing varchars and not a document set, does it make sense to change the min score in the CONTAINS clause?
Sorry this is so long but I appreciate any comments/suggestions. Hope I haven't left out anything important.
David

Hi,
Here's a start - taking your questions one at a time:
(1) "Can we find this customer in the system even if they left the 't' out of Wiltshire..."
create table z_test (col1 varchar2(100));
insert into z_test values ('wiltshire');
commit;
create index z_test_idx on z_test(col1)
indextype is ctxsys.context;
-- SQL> column col1 format a20
-- SQL> select score(1), col1
-- 2 from z_test
-- 3 where contains(col1, '!wilshire', 1) > 0;
-- SCORE(1) COL1
-- 3 wiltshire
(2) "...and not searching for Ave, Blvd, St, etc"
truncate table z_test;
insert into z_test values ('wiltshire blvd');
insert into z_test values ('wiltshire ave');
insert into z_test values ('wilshire ave');
commit;
exec ctx_ddl.sync_index('Z_TEST_IDX')
-- SQL> COLUMN COL1 FORMAT A20
-- SQL> select score(1), col1
-- 2 from z_test
-- 3 where contains(col1, '!wilshire, ave', 1) > 0
-- 4 order by 1 desc;
-- SCORE(1) COL1
-- 52 wiltshire ave
-- 52 wilshire ave
-- 2 wiltshire blvd
-- wiltshire blvd is returned even though the search was
-- for ave: the comma separating the tokens is the ACCUM operator,
-- so a row matching either term is returned.
-- Note also the difference in score as a result: rows matching
-- both terms score higher.
(3) "5- triggers are defined to keep the search table in sync with its source tables"
That has to be expensive. More on this to come.
(4) "I created a separate search table because I was concerned over
performance if I were to create indexes on the source tables."
Please explain where you anticipated the performance problems that
prompted the separate table. If it is in searching, trace the search and
see where Oracle spends its time. You'd be better off creating a storage
preference and storing the DR$ tables in a separate tablespace.
(5) "CONTAINS(address, '24032 & MD & Frostburg', 10) > 0 AND
CONTAINS(address, '1616', 20) > 0 AND
CONTAINS(address, 'fuzzy(Pullman, 60, 30, weight)', 30) > 0 AND
CONTAINS(customer_name, 'fuzzy(Acme, 60, 30, weight)', 40) > 0 "
You can (and should) simplify this query. There are three
passes at the same column and it isn't necessary. Read up on
searching using contains some more.
-Ron
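
One possible way to follow the advice in (5), sketched in Java since the app uses JDBC: the three CONTAINS calls on address are folded into a single query string joined with &, and the query strings are passed as bind variables. This is only a sketch. The names TextQueryBuilder, buildAddressQuery, search_table and id are made up for illustration, each input is assumed to be a single token with no Oracle Text reserved characters, and the code uses newer Java than JDK 1.3 (generics, try-with-resources).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class TextQueryBuilder {

    // Builds one Oracle Text query string for the address column.
    // Exact terms are ANDed with &; only the street name goes through fuzzy().
    // Assumption: every argument is a single, already sanitized token.
    static String buildAddressQuery(String zip, String state, String city,
                                    String streetNumber, String streetName) {
        return zip + " & " + state + " & " + city + " & " + streetNumber
                + " & " + fuzzy(streetName);
    }

    // Same fuzzy() arguments as in the original post: 60, 30, weight.
    static String fuzzy(String term) {
        return "fuzzy(" + term + ", 60, 30, weight)";
    }

    // Hypothetical lookup: search_table and id are placeholder names.
    // One CONTAINS per indexed column, with the query strings bound as parameters.
    static List<Long> search(Connection con, String addressQuery, String nameQuery)
            throws SQLException {
        String sql = "SELECT id FROM search_table"
                + " WHERE CONTAINS(address, ?, 1) > 0"
                + " AND CONTAINS(customer_name, ?, 2) > 0";
        List<Long> ids = new ArrayList<Long>();
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, addressQuery);
            ps.setString(2, nameQuery);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // prints: 24032 & MD & Frostburg & 1616 & fuzzy(Pullman, 60, 30, weight)
        System.out.println(buildAddressQuery("24032", "MD", "Frostburg", "1616", "Pullman"));
    }
}
```

The customer_name search stays in its own CONTAINS because it is a different column with its own index; only the three passes over address are merged.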

Similar Messages

  • Problem when using About Operator in Contains Query

    Hi,
    I'm new to Oracle and to these forums. I have a problem when using the ABOUT operator in a CONTAINS query.
    I created a table with some records and then created a context index on the 'name' column.
    CREATE TABLE my_items (
      id           NUMBER(10)      NOT NULL,
      name         VARCHAR2(200)   NOT NULL,
      description  VARCHAR2(4000)  NOT NULL,
      price        NUMBER(7,2)     NOT NULL
    );
    ALTER TABLE my_items ADD (
      CONSTRAINT my_items_pk PRIMARY KEY (id)
    );
    CREATE SEQUENCE my_items_seq;
    INSERT INTO my_items VALUES(my_items_seq.nextval, 'Car', 'Car description', 1);
    INSERT INTO my_items VALUES(my_items_seq.nextval, 'Train', 'Train description', 2);
    INSERT INTO my_items VALUES(my_items_seq.nextval, 'Japan', 'Japan description', 3);
    INSERT INTO my_items VALUES(my_items_seq.nextval, 'China', 'China description', 4);
    COMMIT;
    EXEC ctx_ddl.create_preference('english_lexer','basic_lexer');
    EXEC ctx_ddl.set_attribute('english_lexer','index_themes','yes');
    EXEC ctx_ddl.set_attribute('english_lexer','theme_language','english');
    CREATE INDEX my_items_name_idx ON my_items(name) INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS('lexer english_lexer');
    EXEC ctx_ddl.sync_index('my_items_name_idx');
    Then I run a CONTAINS query to retrieve records:
    SELECT count(*) FROM my_items WHERE contains(name, 'Japan', 1) > 0;
    COUNT(*)
          1
    SELECT count(*) FROM my_items WHERE contains(name, 'about(Japan)', 1) > 0;
    COUNT(*)
          1
    But the problem is that when I use the ABOUT operator with a category from Oracle's English Knowledge Base Category Hierarchy, it returns 0:
    SELECT count(*) FROM my_items WHERE contains(name, 'about(Asia)', 1) > 0;
    COUNT(*)
          0
    SELECT count(*) FROM my_items WHERE contains(name, 'about(transportation)', 1) > 0;
    COUNT(*)
          0
    I can't figure out what's wrong in my query or in my index.
    Any help will be appreciated.
    Thanks,
    Hieu Nguyen
    Edited by: user2944391 on Jul 10, 2009 3:25 AM

    Hello (and welcome),
    You'd be best asking this question in the Oracle Text forum.
    And by the way, it will help others analyse your post if you put {code} (lowercase code in curly brackets) before and after your code snippets.
    Good luck!

  • Fuzzy Operator with Contains (Oracle Text)

    Hi !
    I want to know whether the fuzzy operator works with the French language. The documentation says you can create the CONTEXT index in French, but French is not listed among the supported languages for the fuzzy operator (Chap. 3).
    Thanks

    It was a problem with sqlplus.exe (it does not happen with sqlplusw.exe) displaying the wrong characters
    because of the code page it was using.
    I solved it by running one of these before sqlplus.exe:
    "set NLS_LANG=american_america.US8PC437" or "chcp 1252".
    Joaquin Gonzalez

  • Statements hang when using fuzzy operator

    Hi all,
    Since we rolled out an update of our application, the database sessions seem to run into deadlocks. I would really appreciate any help!
    Perhaps this is a known issue?
    In the old version we already used Oracle Text without problems.
    The only change we applied in our SQL statements was to encapsulate the search term with {}.
    Here is the relevant part:
    AND contains(column, 'fuzzy({example}, 1, 100, WEIGHT)', 1) > 0
    In the previous release we used this without problems:
    AND contains (column, 'fuzzy(example, 1, 100, WEIGHT)', 1) > 0
    The whole statement is quite large, uses unions over several similar selects on different tables.
    the hanging sessions have the message: Cursor: pin S wait on X.
    What I saw from other posts, this seems to be a common thing for deadlock situations.
    A second strange thing is that the sessions hang only when accessing certain tables. Similar statements that access other tables never hang.
    Thanks in advance,
    Michael

    not sure. Anyone knows about this board. I think its not that powerful. The AGP, PCI is locked.

  • How to use 'about' operator in Full text search query?

    Hi all,
    I have to search the following string in full text index using 'about' operator.
    'Advertisment(Cosmetics) Assets'
    If use the following query
    SELECT keyword_id
    FROM mam_keyword_languages
    WHERE contains(fts_text_uc, convert('about(Advertisment(Cosmetics) Assets)',
    'WE8MSWIN1252', 'WE8MSWIN1252')) > 0
    ORDER BY nlssort(text, 'NLS_SORT=EEC_EUROPA3')
    It gives following error.
    ERROR at line 1:
    ORA-29902: error in executing ODCIIndexStart() routine
    ORA-20000: Oracle Text error:
    DRG-50901: text query parser syntax error on line 1, column 37
    How can i do this search? Is there any other way?
    Thanx in advance.
    T.Umapathy

    Sum((postab.subtotal)*(loc.royalty)/100)
    Is there any other way to take product of two
    attributs? your help will be greatly appreciated as
    it is really stumbling block in my project. Thanks in
    advance
    Such a stumbling block should have inspired more activity on your part.
    I'd try rewriting it like this:
    sum(postab.subtotal*loc.royalty/100)

  • How to Use Dimension Operator

    Hi
    I'm trying to implement SCD Type 2. I have done so using conventional methods.
    I have read in some blogs that dimension operator can be used for SCD. Can any one provide me material on how to use Dimension Operator. I tried OWB User guide. But its not useful.
    I have seen that we need to create levels. But i dont need levels.
    Can somebody please tell me how to use it.
    Regards
    Vibhuti

    Hi Vibhuti,
    using dimensions with OWB 10g R2 isn't that difficult. You just create a number of attributes like an ID, a business key and probably a description, and then associate each of the attributes with a level. You need at least one level that all the attributes are associated with. Then you can use the slowly changing dimension (SCD) wizard to track changes. In the SCD settings you can determine for which attributes you want to trigger history and which attributes contain your effective date and your end date (if you want to use SCD type II). Obviously you would need two additional attributes in every level for that purpose.
    Regards,
    Jörg

  • MSI 6309 V2.0 Overclock using Fuzzy Logic 4

    Hi folks, I am new here and new to MSI mobos. I have an MSI 6309 V2.0 mobo with a PIII 800 CPU. I want to overclock it using Fuzzy Logic 4, but when I try I get lots of message boxes saying 'privileged instruction'. Can someone tell me whether this mobo is supported by FL4, and whether it is the same as the MSI 6309 'LITE'?
    Thanks in advance

    The ratio can't be changed while operating.
    The ratio the board booted with will be kept.

  • How can I load a .xlsx File into a SQL Server Table using a Foreach Loop Container in SSIS?

    I know I've REALLY struggled with this before. I just don't understand why this has to be soooooo difficult.
    I can very easily do a straight Data Pump of a .xlsX File into a SQL Server Table using a normal Excel Connection and a normal Excel Source...simply converting Unicode to DT_STR and then using an OLE DB Destination of the SQL Server Table.
    If I want to make the SSIS Package a little more flexible by allowing multiple .xlsX spreadsheets to be pumped in by using a Foreach Loop Container, the whole SSIS Package seems to go to hell in a hand basket. I simply do the following...
    Put the Data Flow Task within the Foreach Loop Container
    Add the Variable Mapping Variable User::FilePath that I defined as a Variable and a string within the Foreach Loop Container
    I change the Excel Connection and its Expression to be ExcelFilePath ==> @[User::FilePath]
    I then try and change the Excel Source and its Data Access Mode to Table Name or view name variable and provide the Variable Name User::FilePath
    And that's when I run into trouble...
    Exception from HRESULT: 0xC02020E8
    Error at Data Flow Task [Excel Source [56]]:SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occured. Error code: 0x80004005.
    Error at Data Flow Task [Excel Source [56]]: Opening a rowset for "...(the EXACT Path and .xlsx File Name)...". Check that the object exists in the database. (And I know it's there!!!)
    I don't understand by adding a Foreach Loop Container to try and make this as efficient as possible has caused such an error unless I'm overlooking something. I have even tried delaying my validations and that doesn't seem to help.
    I have looked hard in Google and even YouTube to try and find a solution for this but for the life of me I cannot seem to find anything on pumping a .xlsX file into SQL Server using a Foreach Loop Container.
    Can ANYONE please help me out here? I'm at the end of my rope trying to get this to work. I think the last time I was in this quandary, trying to pump a .xlsX File into a SQL Server Table using a Foreach Loop Container in SSIS, I actually wrote a C# Script to write the contents of the .xlsX File into a .csv File and then used the .csv File to pump the data into a SQL Server Table.
    Thanks for your review and am hoping and praying for a reply and solution.

    Hi ITBobbyP,
    If I understand correctly, you want to load data from multiple sheets in an .xlsx file into a SQL Server table.
    If in this scenario, please refer to the following tips:
    The Foreach Loop container should be configured as shown below:
    Enumerator: Foreach ADO.NET Schema Rowset Enumerator
    Connection String: The OLE DB Connection String for the excel file.
    Schema: Tables.
    In the Variable Mapping, map the variable to Sheet_Name, and change the Index from 0 to 2.
    The connection string for Excel Connection Manager is the original one, we needn’t make any change.
    Change Table Name or View name to the variable Sheet_Name.
    If you want to load data from multiple sheets in multiple .xlsx files into a SQL Server table, please refer to following thread:
    http://stackoverflow.com/questions/7411741/how-to-loop-through-excel-files-and-load-them-into-a-database-using-ssis-package
    Thanks,
    Katherine Xiong
    TechNet Community Support

  • Error while using between operator with sql stmts in obiee 11g analytics

    Hi All,
    when I try to use between operator with two select queries in OBIEE 11g analytics, I'm getting the below error:
    Error Codes: YQCO4T56:OPR4ONWY:U9IM8TAC:OI2DL65P
    Location: saw.views.evc.activate, saw.httpserver.processrequest, saw.rpc.server.responder, saw.rpc.server, saw.rpc.server.handleConnection, saw.rpc.server.dispatch, saw.threadpool.socketrpcserver, saw.threads
    Odbc driver returned an error (SQLExecDirectW).
    State: HY000. Code: 10058. [NQODBC] [SQL_STATE: HY000] [nQSError: 10058] A general error has occurred. [nQSError: 43113] Message returned from OBIS. [nQSError: 27002] Near <select>: Syntax error [nQSError: 26012] . (HY000)
    can anyone help me out in resolving this issue.

    Hi All,
    Thank you all for your replies, but I didn't get the exact solution I'm looking for.
    If I use the condition
    "WHERE "Workforce Budget"."Used Budget Amount" BETWEEN MAX("Workforce Budget"."Total Eligible Salaries") AND MAX("Workforce Budget"."Published Worksheet Budget Amount")",
    all the data will be grouped with the two columns which I'm considering in the condition.
    my actual requirement with this query is to get the required date from a table to generate the report either as daily or weekly or monthly report. If I use repository variables, variables are not getting refreshed until I regenerate the server(which I should not do in my project). Hence I have created a table to hold weekly start and end dates and monthly start and end dates to pass the value to the actual report using between operator.
    please could anyone help me on this, my release date is fast approaching.

  • I used a partitioned HDD for time machine, using a partition already containing other data files. I am now no longer able to view that partition in Finder. Disk Utility shows it in grey and "not mounted". Any suggestions of how to access the files?

    I used a partitioned HDD for time machine, using a partition already containing other data files. I am now no longer able to view that partition in Finder. Disk Utility shows it in grey and "not mounted". Any suggestions of how to access the files? Does using time machine mean that that partition is no longer able to be used as it used to be?
    HDD is a Toshiba 1TB, partitioned into two 500GB partitions.
    OS X version 10.9.2

    Yes, sharing a TM disk is a bad idea, and disks are cheap enough so that you don't need to.
    Now, have you tried to repair the disk yet?

  • I am new in using Mac operating system, kindly suggest ebooks , videos or audio books to me so that i can learn more about it?

    i am new in using Mac operating system, kindly suggest ebooks, videos or audio books to me so that i can learn more about it.
    Any kind of help would be appreciated. I am very eager to learn. How do I make an iOS application? How do I use Terminal effectively? Where does basic programming start on the Mac? What are the different tools that can help me make a Mac application and an iOS application?
    -Thank you
    Shailendra (India)

    Apple has got some great guides to start developing in Objective-C, used for programming OS X and iOS apps > http://developer.apple.com/library/mac/#referencelibrary/GettingStarted/RoadMapOSX/chapters/01_Introduction.html

  • How to use EQUIV operator in a query?

    I want to list products whose names are like 'Windows XP' or 'WindowsXP'. The following is my query:
    select * from product
    where contains(product_name,'Windows XP = WindowsXP',1) > 0
    But this query only returns the products whose names contain 'Windows XP'. It seems to be interpreted like this:
    select * from product
    where contains(product_name,'Windows (XP = WindowsXP)',1) > 0
    so i modified it to :
    select * from product
    where contains(product_name,'(Windows XP) = (WindowsXP)',1) > 0
    That gives an error:
    Error report:
    SQL Error: ORA-29902: error in executing ODCIIndexStart() routine
    ORA-20000: Oracle Text error:
    DRG-50900: text query parser error on line 1, column 26
    DRG-50921: EQUIV operand not a word or another EQUIV expression
    29902. 00000 - "error in executing ODCIIndexStart() routine"
    *Cause:    The execution of ODCIIndexStart routine caused an error.
    *Action:   Examine the error messages produced by the indextype code and
    take appropriate action.

    Equiv only works for individual terms. Since there is a space between Windows and XP, they are two terms. You can use synonyms for phrases containing multiple terms, as shown below.
    SCOTT@orcl_11g> CREATE TABLE product (product_name  VARCHAR2 (30))
      2  /
    Table created.
    SCOTT@orcl_11g> INSERT ALL
      2  INTO product VALUES ('Windows XP')
      3  INTO product VALUES ('WindowsXP')
      4  INTO product VALUES ('Unix')
      5  SELECT * FROM DUAL
      6  /
    3 rows created.
    SCOTT@orcl_11g> BEGIN
      2    CTX_THES.CREATE_THESAURUS ('name_thes');
      3    CTX_THES.CREATE_RELATION ('name_thes', 'Windows XP', 'SYN', 'WindowsXP');
      4  END;
      5  /
    PL/SQL procedure successfully completed.
    SCOTT@orcl_11g> CREATE INDEX product_name_idx ON product (product_name)
      2  INDEXTYPE IS CTXSYS.CONTEXT
      3  /
    Index created.
    SCOTT@orcl_11g> SELECT * FROM product
      2  WHERE  CONTAINS (product_name, 'SYN (Windows XP, name_thes)') > 0
      3  /
    PRODUCT_NAME
    Windows XP
    WindowsXP
    SCOTT@orcl_11g> SELECT * FROM product
      2  WHERE  CONTAINS (product_name, 'SYN (WindowsXP, name_thes)') > 0
      3  /
    PRODUCT_NAME
    Windows XP
    WindowsXP
    SCOTT@orcl_11g>

  • Why do we use Allowed Operations in DML Process

    Hello,
    Why do we use Allowed Operations in DML Process ??
    Can you please clear this confusion:
    I am using apex 4.1. oracle 11g R2 SOE ...
    Using the Wizard, I created a Form and IR on Dept Table...
    In the form page:
    - Create Button
    The name is "CREATE"
    NO Database Action
    - DML Process
    Allowed Operations: nothing is checked
    This will insert a new row in the Dept table
    In the form page:
    - Create Button
    The name is "CREATE2"
    Database Action : insert
    - DML Process
    Allowed Operations: nothing is checked
    This will insert a new row in the Dept table
    So, What difference does it make if INSERT check box in Allowed Operations of DML Process is TICKED OR NOT ??
    Regards,
    Fateh

    kdm7 wrote:
    Okay.
    So can we keep a web button to access the www.ni.com ? So that web site opens only when button pressed?
    P.S  I,m a newbie.
    Yes, you can also, e.g. include a help file or manual as html and open that in the browser.
    /Y
    LabVIEW 8.2 - 2014
    "Only dead fish swim downstream" - "My life for Kudos!" - "Dumb people repeat old mistakes - smart ones create new ones."
    G# - Free award winning reference based OOP for LV

  • Problem in JDBC , when using LIKE operator. - VERY URGENT

    Problem in JDBC , when using LIKE operator.
    LINE 1 : String temp = "AA";
    LINE 2 : String query = "select * from emp where EMPNAME like '*temp*' ";
    LINE 3 : Statement st = con.createStatement();
    LINE 4 : ResultSet rs = st.executeQuery(query);
    '*' character is not getting evaluated. In MS ACCESS2000 only * is accepted instead of '%'. Moreover in MS ACCESS the like operator has to be used within double quotes as a String. whereas in other databases, it accepts single quotes as a String.
    Ex:
    In MS ACCESS
         select * from emp where ename like "*aa*";
    Other Databases
         select * from emp where ename like '%aa%';
    In my situation I am passing a variable inside the LIKE operator, and '*' is used.
    For the above Scenario, Please help me out.
    If possible Kindly let me know the exact Syntax.
    Please give me the answer as LINE1,LINE2,LINE3,LINE4,
    I have verified in JDBC Spec also, it has been specified to use escape sequence.that too did not work.
    Due to this, My project is in hold for about 4 days. I could not find a suitable solution.
    Please help me out.

    I made a LIKE clause work with M$ Access, using PreparedStatement and the % wildcard:
                String escapeStr            = "%";
                String sql                  = "SELECT USERNAME, PASSWORD FROM USERS WHERE USERNAME LIKE ?";
                PreparedStatement statement = connection.prepareStatement(sql);
                statement.setString(1, ("user" + escapeStr));
                ResultSet resultSet         = statement.executeQuery();
                while (resultSet.next())
                    System.out.println("username: " + resultSet.getObject("USERNAME") + " password: " + resultSet.getObject("PASSWORD"));
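
    For the emp/EMPNAME query in the question, the same idea might look like the sketch below: the wildcards are added to the variable's value and the result is bound as a parameter, instead of writing the variable name inside the SQL literal. This is a sketch under assumptions. LikeExample, containsPattern and findNames are made-up names, and whether % is accepted for MS Access depends on the driver (the reply above reports that it worked).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class LikeExample {

    // Wraps the value of the search term in % wildcards ("contains" match).
    static String containsPattern(String term) {
        return "%" + term + "%";
    }

    // Hypothetical helper: emp and EMPNAME are the table and column from the question.
    static List<String> findNames(Connection con, String term) throws SQLException {
        List<String> names = new ArrayList<String>();
        String query = "select EMPNAME from emp where EMPNAME like ?";
        try (PreparedStatement st = con.prepareStatement(query)) {
            // The pattern is bound as a parameter, so no quoting is needed in the SQL.
            st.setString(1, containsPattern(term));
            try (ResultSet rs = st.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("EMPNAME"));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String temp = "AA";
        // prints: %AA%
        System.out.println(containsPattern(temp));
    }
}
```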

  • How can I file share with another person if both of us are using Mac operating systems?  Do we need to use a third party file sharing system or does apple have this capability?

    How can I file share with another person if both of us are using Mac operating systems (one of us using a Mac laptop and the other using an iMac)? Our intention is to have a working document that can be changed by both parties over time, with both parties able to see each other's changes.

    Use SugarSync
