Deduplication in ODI 10g

Hi All,
I was just wondering if ODI 10g alone can and is intended to perform deduplication (data cleansing)? By deduplication, I mean that from an error table (where the duplicates were stored), we'll perform a series of pattern matching checks until eventually, only one record is left among the duplicates and loaded to the target.
Example of the checks needed to be done:
Assume:
Table Columns: col0 | col1 | col2 | col3 | col4 | col5
*PK is col0
Scenario#1: between duplicate records (same PK), if their non-PK col1 value is the same and their non-PK col2 value is the same, select the record where non-PK col3 is not equal to 0; if there is no such record or if there are still 2 or more duplicate records that satisfy this, apply another rule
Scenario#2: between duplicate records (same PK), if their non-PK col1 value is the same but their non-PK col2 is different, select the record with the most number of fields that are not null; if there are records with the same number of 'not null' fields, select the one where col4 is filled over the record where col5 is filled, etc.
Scenario#3: ...
Scenario#4: ...
Based on the data integrity checks available in ODI, it seems this cannot be done without manual cleansing (setting up one interface per scenario/case in the set of pattern-matching rules and creating many temporary staging tables), which results in a more complex setup. Is this the correct approach, or is there an alternative (customizing the KM, etc.)? Would the ODQ tool be useful in this kind of situation? I hope someone knowledgeable on this can help.
Thanks a lot,
Marco

Your approach sounds feasible, but it involves a lot of data churn.
An alternative approach is to create a data cleansing procedure.
For each scenario that you have, you can create a step, so that a series of steps performs the data cleansing.
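As a rough sketch of what one such cleansing step might look like, the example below ranks the duplicates that share a PK and keeps only the top-ranked row. Python's built-in sqlite3 is used purely to make the SQL runnable. The table name err_dupes, the sample rows, and the rule ordering (prefer col3 <> 0, then the most non-null fields, then col4 filled before col5) are assumptions that fold Scenario#1 and Scenario#2 into a single ORDER BY; this is not a built-in ODI feature.

```python
import sqlite3

# Hypothetical error table holding the duplicates; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE err_dupes (col0 INTEGER, col1 TEXT, col2 TEXT,
                        col3 INTEGER, col4 TEXT, col5 TEXT);
INSERT INTO err_dupes VALUES
  (1, 'a', 'x', 0, NULL, 'p'),
  (1, 'a', 'x', 7, NULL, 'p'),
  (2, 'b', 'y', 0, NULL, 'q'),
  (2, 'b', 'z', 0, 'r',  NULL);
""")

# Portable "count of non-null fields" expression built from CASE terms.
cols = ["col1", "col2", "col3", "col4", "col5"]
not_null_count = " + ".join(
    f"CASE WHEN {c} IS NOT NULL THEN 1 ELSE 0 END" for c in cols
)

DEDUP_SQL = f"""
SELECT col0, col1, col2, col3, col4, col5
FROM (
  SELECT d.*,
         ROW_NUMBER() OVER (
           PARTITION BY col0                               -- one group per PK
           ORDER BY
             CASE WHEN col3 <> 0 THEN 0 ELSE 1 END,        -- rule 1: col3 not 0 first
             ({not_null_count}) DESC,                      -- rule 2: most filled fields
             CASE WHEN col4 IS NOT NULL THEN 0 ELSE 1 END  -- rule 3: col4 before col5
         ) AS rn
  FROM err_dupes d
)
WHERE rn = 1
ORDER BY col0
"""

survivors = conn.execute(DEDUP_SQL).fetchall()
for row in survivors:
    print(row)
```

Further tie-breaking rules (Scenario#3, Scenario#4, ...) would be appended to the ORDER BY list. In ODI the same SELECT could sit in a procedure step or behind a view on the error table, so no per-scenario staging table is needed.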

Similar Messages

  • Query regarding upgrade/migration of ODI 10g to ODI 11g

    Hi All,
    Currently I am using ODI 10g, and the repositories (Work & Master) have been installed on Oracle database version "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit".
    We are thinking of migrating from ODI 10g to ODI 11g version 11.1.1.5, and I have some queries, which are mentioned below.
    1. Can we install ODI 11g version 11.1.1.5 with repositories (Work & Master) on "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit", or do I need to upgrade my database version to 11g?
    2. If yes, can I upgrade or use the existing repositories (the 10g ones) for ODI 11g, or do I have to create new repositories and move/migrate the objects of the 10g repositories as mentioned in the Oracle installation doc?
    3. Currently I am using OBIEE 10g for reporting purposes. If I switch to ODI 11g, do I need to use OBIEE 11g?
    ODI gurus, I need your response ASAP, as I have to share it urgently.
    Thanks
    Edited by: neeraj_singh on May 15, 2013 9:58 PM

    neeraj_singh wrote:
    > 1. Can we install ODI 11g version 11.1.1.5 with repositories (Work & Master) on "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit", or do I need to upgrade my database version to 11g?
    You can install ODI 11.1.1.5, but you have to upgrade your repositories using the Upgrade Assistant. Refer to
    http://docs.oracle.com/cd/E23943_01/upgrade.1111/e12642/tasklist.htm#CIHGIDFG
    > 2. If yes, can I upgrade or use the existing repositories (the 10g ones) for ODI 11g, or do I have to create new repositories and move/migrate the objects of the 10g repositories as mentioned in the Oracle installation doc?
    No need to create new repositories. You just upgrade them. But as a 10g user you need to take care of certain things. Refer to the link below for the prerequisites:
    http://docs.oracle.com/cd/E23943_01/upgrade.1111/e12642/prevusers.htm
    > 3. Currently I am using OBIEE 10g for reporting purposes. If I switch to ODI 11g, do I need to use OBIEE 11g?
    The question is not clear.

  • Getting Error Out Of Memory while importing the work repository in ODI 10g

    I exported the work repository from the topology of ODI 10g on one DB and tried importing it into another ODI 10g topology on another DB. While importing, I got the error 'Out of Memory'.
    Can somebody suggest how to solve the heap size out-of-memory issue while importing in ODI 10g?
    Thanks in Advance.

    Hi,
    You have to post your question in the ODI forum:
    Data Integrator
    Suresh

  • Implement MAX / JOIN in ODI 10g?

    What is the approach for using MAX functions in ODI 10g? I need it to filter source data:
    SELECT SRC_TAB.*       
    FROM SRC_TAB
    INNER JOIN
    (SELECT MAX(COL1) COL1, COL2  FROM SRC_TAB GROUP BY COL2) B
    ON SRC_TAB.COL1=B.COL1 AND SRC_TAB.COL2 = B.COL2
    Luckily this issue has been addressed in ODI 11g.
    Thank you.

    That's not good, especially when you have a large volume of data.
    In this case, in 10g it's better to rely on an underlying view rather than two interfaces.
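A minimal sketch of the view-based approach suggested above: the MAX/GROUP BY subquery from the question is wrapped in a database view, which could then be reverse-engineered like any other datastore and joined to SRC_TAB in a single interface. The view name V_SRC_TAB_MAX, the extra column COL3 and the sample rows are assumptions; sqlite3 is used only to make the SQL runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source table with a few illustrative rows.
CREATE TABLE SRC_TAB (COL1 INTEGER, COL2 TEXT, COL3 TEXT);
INSERT INTO SRC_TAB VALUES (1, 'A', 'old'), (5, 'A', 'new'), (2, 'B', 'only');

-- The aggregate subquery from the question, exposed as a view.
CREATE VIEW V_SRC_TAB_MAX AS
  SELECT MAX(COL1) AS COL1, COL2 FROM SRC_TAB GROUP BY COL2;
""")

# The interface then only needs a plain join between table and view.
rows = conn.execute("""
SELECT SRC_TAB.COL1, SRC_TAB.COL2, SRC_TAB.COL3
FROM SRC_TAB
INNER JOIN V_SRC_TAB_MAX B
  ON SRC_TAB.COL1 = B.COL1 AND SRC_TAB.COL2 = B.COL2
ORDER BY SRC_TAB.COL2
""").fetchall()

for row in rows:
    print(row)
```

Because the aggregation happens inside the view, the source data is read in one statement instead of being staged by a first interface and re-read by a second.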

  • ODI 10g - session keep a table locked

    Hi,
    We have an intermittent issue where an ODI session keeps a lock on a table even after replication has finished and the session has become inactive.
    This caused deadlocks, because a trigger has to update the target table.
    What happened:
    - The user application created rows (13).
    - An ODI scenario replicated the rows (contract table).
    - A second scenario, based on the same source with another subscriber, ran a stored procedure to create records in another table (around 30, positions table).
    This second scenario locked the target table, and when the procedure finished and committed, the lock was not released.
    - ODI replicated another table (price) 30 minutes later; a trigger on the target updates the positions table with new values.
    ---> The trigger failed with a deadlock (ORA-00060).
    ---> ODI failed because the trigger raised the error back.
    This issue happened after 10 hours of the same activity without any problem, repeated many times, but suddenly the lock became persistent (more than 4 hours).
    What can I do?
    We use ODI 10g 10.1.3.5.0 - RDBMS 10.2.0.4

    Hi !
    For small tables which are mostly accessed with full table scans, you can use
    ALTER TABLE <table_name> STORAGE (BUFFER_POOL KEEP);
    The KEEP pool should be properly sized. With this setting, once the table has been read, Oracle will avoid flushing it out of the pool.
    T

  • Problem when exporting and importing project from odi 10g to odi 11g

    Hi,
    I want to migrate my project from ODI 10g to ODI 11g,
    but when I import the interface, it gives a missing references error.
    From ODI 10g I exported the project (without its child components), models (including my datastores), KMs, folder (without its child components), packages (with child components), interfaces (with child components), procedures (with child components) and variables.
    After exporting all these objects, I imported them into ODI 11g with the import type set to "Synonym mode insert", but when I imported the interface it gave the missing references error.
    The source technology is Oracle and the target technology is Postgres.
    The topologies in ODI 11g have been set up the same as in ODI 10g.
    Please help.

    You don't need to migrate the complete repository. You can migrate one project at a time into ODI 11.1.1.5.x.
    You have to be careful while importing and follow a sequence:
    Empty Project -> KMs -> Models (with datastores) -> Variables -> Empty Folders -> Interfaces -> Procedures -> Packages, all in SYNONYM mode insert (no exceptions).
    Also, your repository ID in 11g MUST be different from the one in 10g.
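If the exports are imported by a script or from a checklist, the sequence above can be encoded once and used to sort whatever was exported. The sketch below is only an illustration of that ordering: the object-kind labels and the file names are hypothetical, and it does not call any ODI tool.

```python
# Import sequence from the reply above, expressed as object kinds (labels are hypothetical).
IMPORT_ORDER = [
    "project", "km", "model", "variable",
    "folder", "interface", "procedure", "package",
]


def sort_for_import(exports):
    """Return (kind, file_name) tuples ordered by the import sequence.

    Raises KeyError for a kind that is not part of the sequence, so an
    unexpected object type is noticed instead of being silently misplaced.
    """
    rank = {kind: position for position, kind in enumerate(IMPORT_ORDER)}
    return sorted(exports, key=lambda item: rank[item[0]])


# Hypothetical export files, deliberately listed in the wrong order.
exported = [
    ("package", "load_sales_package.xml"),
    ("interface", "load_sales_interface.xml"),
    ("km", "ikm_control_append.xml"),
    ("project", "sales_project.xml"),
    ("model", "oracle_source_model.xml"),
]

ordered = sort_for_import(exported)
for kind, file_name in ordered:
    print(kind, file_name)
```

Each file would then be imported in that order in Synonym mode insert, as described above.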

  • Install ODI 10g on  64 bit Windows 2008 server

    Can we install ODI 10g on a 64-bit Windows 2008 server? If so, please provide me with the download link.
    Edited by: user1137989 on Oct 27, 2010 10:51 PM

    If you are looking for 10g, scroll down on the download page (http://www.oracle.com/technetwork/middleware/data-integrator/downloads/index.html) to Oracle Data Integrator 10g (10.1.3.5.0), select the package for Microsoft Windows (x86), then extract and install it.
    11g comes in 32-bit and 64-bit versions.
    The 32-bit version is
    Oracle Data Integrator 11g (11.1.1.3.0)
    for Microsoft Windows (x86)
    The 64-bit version is
    Oracle Data Integrator Companion 11g (11.1.1.3.0)
    for All Platforms
    Hope this helps .

  • ODI 10g configuration between Hyperion 9.3

    Hi,
    I have to pull data from Hyperion Essbase and Planning via ODI 10g. Please help me with how to configure the connection between Hyperion and ODI 10g.
    If you have notes, please send them to my email address.
    Thank You,
    Prasad

    Have a read of my blog as I have covered the steps - http://john-goodwin.blogspot.co.uk/2008/12/odi-series-extracting-data-from-essbase.html
    Cheers
    John
    http://john-goodwin.blogspot.com/

  • Nested query / Rank in ODI 10g?

    How can a nested query with an analytical function (RANK() OVER) be implemented in ODI 10g without using an external view?
    SELECT field1, field2, field3, field4, field5, field6, field7
    FROM
             (SELECT
              RANK() OVER (PARTITION BY  table1.field1, table1.field2, table1.field3 ORDER BY table1.field4 , table1.field5) alias_rank,
              field1, field2, field3, field4, field5, field6, field7
                    FROM table1
              ) subset_alias   
    WHERE subset_alias.alias_rank=10

    You should get your answers from here
    http://www.business-intelligence-quotient.com/?tag=oracle-data-integrator-subqueries

  • ODI 10g Union,Union All,Minus

    Hello,
    I have to use UNION, UNION ALL or MINUS in ODI 10g, but it doesn't have UNION, UNION ALL and MINUS datasets like ODI 11g does. How can I use a UNION statement in ODI 10g? For example, I have to union 6 tables in ODI 10g. How can this be done?
    Thanks a lot

    Thanks a lot, Jerome. It works.
    I have one more question:
    This select captures the changed data. I will delete the rows returned by this select and then insert the new data.
    I added a new step to IKM Control Append like this:
    delete from <%=odiRef.getTable("L","TARG_NAME","A")%> T WHERE (<%=odiRef.getTargetColList("", "T.[COL_NAME]", ", ", "\n", "UK")%>) in (
    SELECT PK_ID
    FROM <%=odiRef.getFrom()%>
    WHERE (1=1)
    <%=snpRef.getJoin()%>
    <%=snpRef.getFilter()%>
    <%=snpRef.getJrnFilter()%>
    <%=snpRef.getGrpBy()%>
    <%=snpRef.getHaving()%>
    )
    /*commit*/
    I selected "UK" from the target.
    But I don't know whether it works correctly. I selected PK_ID from the source, but which PK_ID should I select? When the target's UK is equal to the source UK (PK_ID), those rows will be deleted and the changed data will be inserted. How can I modify this step? Is there any way to do this? Can I get PK_ID dynamically?
    Could you help me?
    Best Regards
    Thanks
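To make the intent of that extra KM step concrete, here is a small delete-then-insert sketch outside ODI. The tables tgt and src_changes and their rows are hypothetical; src_changes stands in for whatever rows the journalized select returns, and sqlite3 is used only so the statements can be run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical target table and captured changes.
CREATE TABLE tgt (pk_id INTEGER PRIMARY KEY, val TEXT);
CREATE TABLE src_changes (pk_id INTEGER, val TEXT);
INSERT INTO tgt VALUES (1, 'old-1'), (2, 'old-2'), (3, 'old-3');
INSERT INTO src_changes VALUES (2, 'new-2'), (4, 'new-4');
""")

with conn:  # both statements are committed together
    # Step 1: remove target rows whose key appears in the changed set.
    conn.execute(
        "DELETE FROM tgt WHERE pk_id IN (SELECT pk_id FROM src_changes)"
    )
    # Step 2: insert the changed rows (replacements plus brand-new keys).
    conn.execute(
        "INSERT INTO tgt (pk_id, val) SELECT pk_id, val FROM src_changes"
    )

result = conn.execute("SELECT pk_id, val FROM tgt ORDER BY pk_id").fetchall()
for row in result:
    print(row)
```

The subquery after IN has to be parenthesized and must return the same key column(s) that the target side compares against, which is the point the question about "which PK_ID" comes down to.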

  • Exporting & Importing Contexts in ODI 10g

    Hi,
    I am using ODI 10g.
    I want to export Contexts from topology manager from my ODI test environment and import those contexts in ODI production environment.
    My requirement is that I want to export & import the contexts using some scripts and not manually.
    What steps can I follow for this?
    Thanks,
    Divya

    Hi Divya,
    Personally, I feel that rather than exporting individual components/objects, it is better to export the master repository as a whole.
    You can use the ODI utility OdiExportMaster (under <your package>->Tools->Oracle Data Integrator Objects) for exporting, and the Import Master Repository wizard (All Programs->Oracle->Oracle Data Integrator->Repository Management->Master Repository Import) for importing.
    Thanks,
    Guru.

  • Odi 10g installation on windows 7 ultimate 32 bit

    I have a Windows 7 Ultimate 32-bit OS. Can I install ODI 10g on my system?

    You may check the link below on how to install ODI 10g on the Windows 7 platform.
    http://gurcanorhan.wordpress.com/2011/02/07/installing-odi-to-windows-7
    Cheers.

  • ODI: Is ODI 10g compatible with SOA 11g ?

    hi all,
    I have a requirement where ODI should invoke web services in SOA 11g Fusion Middleware.
    Is it required that I use ODI 11g,
    or is ODI 10g compatible with SOA 11g?
    If I have to use ODI 10g to invoke web services in SOA 11g, do I have to follow a different procedure?
    Also, please let me know the version of ODI to be used in either case. Thanks in advance.

    Thanks Kiran.
    Can you provide any docs that support this feature (ODI 10g working with SOA 11g)? That would be very helpful.
    Thanks,
    Edwin.

  • Cannot Connect Agents created for ODI 10g

    Hi All,
    As part of the installation of BI Apps 7.9.5.2, I have created two ODI agents (WORKFLOW and INTERFACE). I can see them among my Windows services and have started them. But after setting the required values in Topology Manager, when I try to run a test from the topology, my ODI 10g hangs completely.
    Is there anything I should check to find out why this is happening?
    How can I check whether the connectivity to the ODI 10g agents is correct?
    Thanks and Regards,
    Krishna Prasad.

    Hi Krishna Prasad,
    Normally, when you set up agents as Windows services, a log file is generated in %odi%\bin\agentservice.log, which can be checked for troubleshooting.
    If I were you, before setting them up as Windows services, I would first try running the agents from the command prompt.
    The syntax to run the agent from the command prompt on Windows is given below:
    agent.bat -PORT=<portnumber> "-NAME=<agent name>"
    Once you have tested them successfully, you can run them as Windows services.
    Hope this helps.
    Regards,
    Rickson Philip Lewis
    http://www.linkedin.com/in/rickson
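Independently of Topology Manager, a plain TCP check can show whether anything is listening on the agent's port at all. The helper below is a generic sketch, not an ODI API: the function name is made up, the host and port are whatever was passed to -PORT when starting the agent, and a successful connection only proves the port is open, not that the agent is healthy.

```python
import socket


def agent_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `agent_port_open("localhost", 20910)` could be tried after starting the agent from the command prompt; 20910 is assumed here to be the port the agent was started on. If this returns False, the hang in the topology test is more likely a network, firewall or port problem than a repository one.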

  • ODI 10g Scheduler picking incorrect time - Issue

    When I schedule an ODI scenario at an EST time using the scheduler agent, it executes at a different time. For example, say I schedule my job at 5 PM EST daily. In the SCHEDULING INFORMATION window I see the time as 9 PM EST daily (i.e. current EST + 4 hours).
    Jobs are not triggered at 5 PM EST; they execute at 9 PM EST. Yet when I look in ODI Operator, the execution log shows 5 PM EST.
    I have been facing this issue since I moved ODI 10g from a Windows 2003 server to a Windows 2008 R2 server. Is it a bug in the 10g product or some other Java/JRE/JDK-related issue?
    What might be the exact issue? As this is really urgent, can anyone help me ASAP?

    The master repository contains connectivity to the work repository. If you just clone the master repository the work repository connectivity in it still point to the original work rep. So running upgrade on cloned copy of master would result in upgrade of original work rep instead of cloned work rep. I guess this is what happened in your case. So you should try to restore the work rep in the original schema.
