Removing Duplicate Data via cursor

Hello friends, just wondering if anyone might be able to lead me down the right path to get this script written. I have a table with millions of duplicate rows, but with luck I have a column of UUIDs that another database uses for lookups, so I have a column I can base my query on. Basically I want to run a cursor to grab one instance of each set of duplicate rows and delete the rest. I was thinking that I could take the max rowid, put it in a variable, and use that to pick the one UUID not to delete. Anyway, just wanted to bounce this off a few people who are much better at this kind of thing than I am.
Thanks in advance for any help or insight you might be able to provide.
Luke

Hey Justin, thank you for the reply. No, I'll be honest and just say we got lucky that they had put a function on this table by mistake that populated a UUID column they "thought" they were going to need. As long as one of the dupes remains, that's all that is needed. The table is over 70 million rows as it sits right now. I want to do some data cleanup before I do any kind of tweaking to it as far as partitioning and whatnot.
What happens is there aren't any primary keys or constraints on these tables at all... And there are 3rd party programs through which users can insert any data they want. So what they do is rerun a report they already ran by accident and just insert that info into the table again... Well, as soon as I can get this cleaned up I'm going to be adding a constraint so this can't happen any more.
Almost forgot... like I said there are millions of dupes, but it's not just one row that is duplicated. Here is an example of it:
Table: dups_are_cool
ID, date, name, UUID
123, FEB03, Luke, unique UUID
123, FEB03, Luke, unique UUID
123, FEB03, Luke, unique UUID
321, DEC99, John, unique UUID
321, DEC99, John, unique UUID
321, DEC99, John, unique UUID
321, DEC99, John, unique UUID
321, DEC99, John, unique UUID
999, MAY81, Don, unique UUID
999, MAY81, Don, unique UUID
So what I want to do is keep one of the UUIDs in each group and delete the rest of the duplicate rows. Since there are millions of rows with a different number of occurrences, doing it one occurrence at a time might take me a bit too long, hehe.
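
For what it's worth, a cursor may not be needed here: one set-based DELETE can keep a single UUID per group and drop the rest. Below is a minimal sketch, not a tested Oracle script. It uses Python's built-in sqlite3 as a stand-in database so it can be run anywhere. The column name report_date (in place of date) and the helper names build_sample and dedupe are made up for illustration, and it assumes the three non-UUID columns are what define a duplicate.

```python
import sqlite3
import uuid


def build_sample():
    """Create an in-memory copy of the example table from the post."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dups_are_cool "
        "(id INTEGER, report_date TEXT, name TEXT, uuid TEXT)"
    )
    sample = (
        [(123, "FEB03", "Luke")] * 3
        + [(321, "DEC99", "John")] * 5
        + [(999, "MAY81", "Don")] * 2
    )
    # Every row gets its own UUID, as described in the post.
    conn.executemany(
        "INSERT INTO dups_are_cool VALUES (?, ?, ?, ?)",
        [(i, d, n, str(uuid.uuid4())) for i, d, n in sample],
    )
    conn.commit()
    return conn


def dedupe(conn):
    """Keep the row with the smallest UUID in each group, delete the others."""
    cur = conn.execute(
        """
        DELETE FROM dups_are_cool
         WHERE uuid NOT IN (SELECT MIN(uuid)
                              FROM dups_are_cool
                             GROUP BY id, report_date, name)
        """
    )
    conn.commit()
    return cur.rowcount  # number of rows deleted
```

On the sample data from the post this removes 7 of the 10 rows and leaves one each for Luke, John and Don. The same statement shape should carry over to Oracle, but on a 70-million-row table you would want to try it on a copy first and check the execution plan; the sketch does not cover that part.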

Similar Messages

  • To remove duplicate data using connect by prior

    Hi,
    I want the details of the employees each person reports to, without duplication.
    In the table data, one employee reports to two employees, so the reporting chain comes up twice.
    Query:
    SELECT lpad(' ', (level - 1) * 2) || EMPLOYEE_NAME as EMP_NAME,SUP_BU AS BU_CODE,SUP_REP_BU,EMP_NO,EMPLOYEE_NAME,LEVEL AS THE_LEVEL
    FROM ATTD_REPORT_TO_VW
    WHERE EMP_NO IS NOT NULL
    CONNECT BY PRIOR SUP_BU = SUP_REP_BU
    START WITH SUP_BU = :p_bu
    But I get duplicate data. If I remove the duplication using the DISTINCT keyword, the hierarchical order goes wrong.
    Please provide a solution.
    Thanks ,
    Maran

    Please ask this question in the separate SQL/PLSQL forum, and also provide more information with sample data.

  • Pull DELETED / REMOVED Campaign Data Via Bing Ads API?

    Using Bing Ads API, is there a way to pull reports containing DELETED or REMOVED campaigns?
    When I look at the Status column, the description says: The current delivery status.
    For Bing Ads reporting purposes, it's really important and logical to allow users to pull DELETED campaign data because that data IS data that should be reported... not ignored simply because its status is deleted or removed.
    Is there a way via the API to get not only current campaigns but also deleted campaigns?
    When I pull the exact same report from the Bing Ads API, it DOES show me deleted campaign data.

    Deleted entities, e.g. campaigns, ad groups, ads, and keywords, cannot be retrieved using the Campaign Management or Bulk services. You can get performance history for deleted campaigns using the Reporting service if there were any ad impressions during the specified report time. For more information, see Getting Reports.
    Please reach out with any further related questions. I hope this helps!

  • Want to remove Duplicate data...

    Hi,
    I have three different queries which produce similar output.
    1) SQL Statement return
    Product Amount1
    A 100
    B 100
    C 100
    2) SQL Statement return
    Product Amount2
    D 200
    B 200
    E 200
    3) SQL Statement return
    Product Amount3
    A 300
    F 300
    D 300
    Now I want the output in the following manner:
    Product Amount1 Amount2 Amount3
    A       100             300
    B       100     200
    C       100
    D               200     300
    E               200
    F                       300
    G
    I cannot join these tables (SQL) as 1 might not have matching records in 2 & 3 respectively.
    Kindly help
    Regards,
    RSD

    Hi,
    "I cannot join these tables (SQL) as 1 might not have matching records in 2 & 3 respectively."
    Although a simpler solution has been provided in the post above, I just thought I'd show you an example using JOINs.
    SQL> ed
    Wrote file afiedt.buf
    WITH T1 as (SELECT 'A' prd, 100 amt FROM DUAL UNION ALL SELECT 'B', 100 FROM DUAL UNION ALL SELECT 'C', 100 FROM DUAL),
    T2 as (SELECT 'D' prd, 200 amt FROM DUAL UNION ALL SELECT 'B', 200 FROM DUAL UNION ALL SELECT 'E', 200 FROM DUAL),
    T3 as (SELECT 'A' prd, 300 amt FROM DUAL UNION ALL SELECT 'F', 300 FROM DUAL UNION ALL SELECT 'D', 300 FROM DUAL)
    SELECT prd,sum(amt1),sum(amt2),sum(amt3) FROM (
    SELECT NVL(NVL(t1.prd,t2.prd),t3.prd) prd,t1.amt amt1,t2.amt amt2,t3.amt amt3
    FROM T1
    FULL OUTER JOIN t2 on (t1.prd = t2.prd)
    FULL OUTER JOIN t3 on (t2.prd = t3.prd and t1.prd=t3.prd)
    ) GROUP BY prd
    ORDER bY prd
    SQL> /
    P  SUM(AMT1)  SUM(AMT2)  SUM(AMT3)
    A        100                   300
    B        100        200
    C        100
    D                   200        300
    E                   200
    F                              300
    6 rows selected.
    Cheers,
    Avinash
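
    The simpler approach mentioned above is presumably a UNION ALL with one amount column per query, grouped by product. Here is a rough sketch of that idea. It runs against Python's built-in sqlite3 rather than Oracle, and the table names t1, t2, t3 and the helper pivot_three are assumptions made up for the example.

```python
import sqlite3

# Each branch fills only its own amount column; GROUP BY then folds the
# branches into one row per product.
PIVOT_SQL = """
SELECT prd, SUM(amt1), SUM(amt2), SUM(amt3)
  FROM (SELECT prd, amt AS amt1, NULL AS amt2, NULL AS amt3 FROM t1
        UNION ALL
        SELECT prd, NULL, amt, NULL FROM t2
        UNION ALL
        SELECT prd, NULL, NULL, amt FROM t3)
 GROUP BY prd
 ORDER BY prd
"""


def pivot_three(q1, q2, q3):
    """Load three (product, amount) result sets and pivot them side by side."""
    conn = sqlite3.connect(":memory:")
    for table, rows in (("t1", q1), ("t2", q2), ("t3", q3)):
        conn.execute(f"CREATE TABLE {table} (prd TEXT, amt INTEGER)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return conn.execute(PIVOT_SQL).fetchall()
```

    Because no join is involved, a product that is missing from one or two of the queries simply ends up with an empty value in those columns.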

  • Referencing multiple cells and removing duplicate values.

    Hello.
    I have a very complicated question; to be honest I am not totally sure if Numbers is capable of handling all of this, but it's worth a shot.
    I am working on a spreadsheet for organising a film. I've had the templates for years but I'm now using Numbers to automate it as much as possible. Everything was going well until I hit the schedule/callsheet.
    On other sheets I can tell it to "look up scene two" and it will then look up the correct row and find everything I need. On the callsheet, however, I might say "we're filming scenes two, five and nine" and Numbers gets confused by the multiple values. Is there any way around this?
    Also, if there is, I have a more complex question to ask. Is it possible for Numbers to find and remove duplicate data? For example, let's say scenes two and five require Alice, but scene nine requires Bob. If Numbers just adds that info together it will display the result "Alice Alice Bob". Is there a way to get it to parse the text, recognise the duplicate value and remove the unnecessary Alice?
    I realise that Numbers has limitations so it may not be able to do everything I want; however, every bit I can automate saves me hours, so even if I can only get halfway there, it's totally worth it.
    Thanks in advance for any help you can offer, all the best.

    Ah excellent thank you.
    I've modified it so there are now multiple (well, only four for now until I get this in better shape) indexes for finding a scene, and I'm assigning each block to a new row.
    I only have one slight reservation about this. If I create 10 rows, it totally works; most of the time we'll only shoot three scenes a day so it's just blank space... However, Murphy's law will inevitably raise its ugly head and put me in a situation where we are shooting 11 scenes in a day.
    For countif, I think I get what you mean... kinda. Basically I would create a cell which combines the character strings from each scene into one long string. Then I would have 100 extra cells (let's say 100 so I'll never run out), each linked to the cast list, using the character name as a variable. These cells will each parse through the string to find their character name. If it appears, then a true value comes up. This will remove duplicates, as each cell will only respond once. So if Alice appears in every scene shooting that day, the cell will still only light up once. Is that it?
    One other question. Whenever I combine filled cells with blank cells, I usually get the data from the filled cells, with a 0 for each blank cell. Usually this isn't a problem, but if I want to keep this flexible then I'll have quite a few blanks. The actor example above could result in 98 zeroes. Is there any way to have blanks just not show up at all?
    I'll email the spreadsheet in a moment, a lot of it is still rough and under construction, but you should be able to see where it's all going hopefully.
    Thanks again, you have been extraordinarily helpful. 

  • How to avoid duplicate data while inserting from sample.dat file to table

    Hi Guys,
    We have an issue with duplicate data in a flat file while loading data from the sample.dat file into a table. How can we avoid the duplicate data using the control file?
    Can anyone help me with this?
    Thanks in advance!
    Regards,
    LKR

    No, a control file will not remove duplicate data.
    You would be better to use an external table and then remove duplicate data using SQL as you query the data to insert it to your destination table.
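
    The "stage first, then filter with SQL" idea can be sketched roughly as below. This is only an illustration: Python's built-in sqlite3 and a temporary staging table stand in for the Oracle external table, and the two-column layout (id, name), the table name target and the helper load_without_duplicates are all assumptions, since the original post does not show the file format.

```python
import csv
import sqlite3


def load_without_duplicates(conn, lines):
    """Stage every record as-is, then copy only distinct rows to target."""
    conn.execute("CREATE TEMP TABLE staging (id INTEGER, name TEXT)")
    # Skip blank lines; everything else goes into staging, duplicates included.
    conn.executemany(
        "INSERT INTO staging VALUES (?, ?)",
        (row for row in csv.reader(lines) if row),
    )
    # The duplicates are filtered out here, while querying the staged data.
    cur = conn.execute(
        "INSERT INTO target (id, name) SELECT DISTINCT id, name FROM staging"
    )
    conn.execute("DROP TABLE staging")
    conn.commit()
    return cur.rowcount  # number of rows inserted into target
```

    Note that this only removes duplicates within the file itself. Rows that already exist in the destination table would need an extra NOT EXISTS check or a unique constraint.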

  • Remove duplicates while loading data from text file

    Hi,
    Data in a text file (which sometimes has duplicates) is being loaded into an Oracle 9i database using Informatica. To improve performance, we would like to remove duplicates at the time of each load using an Oracle procedure. Could you please help me with this?
    Thanks,
    Lakshmi

    No, our table doesn't have that. Most of the functionality is managed at the informatica level. Is there any other way? Thanks,

  • HT2905 My itunes looks nothing like the examples in this tutorial.  I do not have "display exact duplicates" or "date added".  Can someone please help me remove duplicate songs?  Also, I downloaded two audio books and they are showing up in my song list.

    My itunes looks nothing like the examples in this tutorial.  I do not have "display exact duplicates" or "date added".  Can someone please help me remove duplicate songs?  Also, I downloaded two audio books and they are showing up in my song list. Why???

    'Show duplicates' is now under the 'View' menu. To see the 'Date added' column go to 'View options' from the 'View' menu and check it in the section under 'Stats'.
    Click an audiobook once to select it and hit command-i (Mac) or control-i (Windows). Go to the 'Options' tab and set 'Media Kind' to 'Audiobook'.

  • HT201365 Hi all, im from dubai. someone stolen my iphone5. i was trying to erase my data via find my iphone. in the end there was a notification that remove. i have click remove. then it never shows my device in find my iphone. how can i see my device in

    Hi all, I'm from Dubai. Someone stole my iPhone 5. I was trying to erase my data via Find My iPhone. At the end there was a notification with a Remove option, and I clicked Remove. Since then my device never shows up in Find My iPhone. How can I see my device on my PC?

    Here's an interesting bit: iTunes will let me restore my iPod touch from backup: it's still running iOS 4.3.3. I figured as much... there's backups showing in the Preferences -> Devices list for both devices (and even my old, long gone iPod touch 2G), but my iPhone (on iOS5) isn't giving me any such option. I remember I had to downgrade before to get the backup restored (something went haywire in the setup assistant, and I couldn't restore a backup from iTunes there)... If it was possible to go to iOS 5.0 rather than 5.0.1 I would just do the same thing again, but iOS 5.0.1 isn't an option. I attached a screenshot. Hopefully Apple can get this nonsense sorted out; I think iOS 5 might be the root of this, somehow.

  • Insert data into table 1 but remove the duplicate data

    Hello friends,
    I am trying to insert data into table tab0 using hints.
    The query is like this:
    INSERT INTO /*+ APPEND PARALLEL(tab0) */ tab NOLOGGING
    (select /*+ parallel(tab1)*/
    colu1,col2
    from tab1 a
    where a.rowid =(select max (b.rowid) from tab2 b))
    but this query takes too much time, around 5 hours,
    because the data is almost 40-50 lakh rows.
    I am using
    a.rowid =(select max (b.rowid) from tab2 b))
    to remove the duplicate data,
    but it takes too much time.
    So please can you suggest any other option to remove the duplicate data
    that resolves the optimization problem?
    Thanks in advance.

    In the code you posted, you're inserting two columns into the destination table. Are you saying that you are allowed to have duplicates in those two columns but you need to filter out duplicates based on additional columns that are not being inserted?
    If you've traced the session, please post your tkprof results.
    What does "table makes bulky" mean? You understand that the APPEND hint is forcing the insert to happen above the high water mark of the table, right? And you understand that this prevents the insert from reusing space that has been freed up by deletes in the table? And that this can substantially increase the cost of full scans on the table? Did you benchmark the INSERT without the APPEND hint?
    Justin

  • How do i remove duplicate songs on my ipod and in my library

    how do i remove duplicate songs on my ipod and in my library

    Apple's official advice on duplicates is here: Find and remove duplicate items in your iTunes library. It is a manual process and the article fails to explain some of the potential pitfalls such as lost ratings and playlist membership, or that sometimes the same file can be represented by multiple entries in the library and that deleting one and recycling the file will break the other.
    Use Shift > View > Show Exact Duplicate Items to display duplicates as this is normally a more useful selection. You need to manually select all but one of each group to remove. Sorting the list by Date Added may make it easier to select the appropriate tracks, however this works best when performed immediately after the dupes have been created.  If you have multiple entries in iTunes connected to the same file on the hard drive then don't send to the recycle bin.
    Use my DeDuper script if you're not sure, don't want to do it by hand, or want to preserve ratings, play counts and playlist membership. See this thread for background, this post for detailed instructions, and please take note of the warning to backup your library before deduping.
    (If you don't see the menu bar press ALT to show it temporarily or CTRL+B to keep it displayed.)
    The most recent version of the script can tidy dead links as long as there is at least one live duplicate to merge stats and playlist membership to and should cope sensibly when the same file has been added via multiple paths.
    Fix the library, then syncing should fix the device.
    tt2

  • Fnd_user_pkg.updateuser - Remove End Date from Users

    As part of an upgrade, we need to end-date the vast majority of our users.
    I've used the fnd_user_pkg.updateuser API to populate the end_date on the fnd_user table.
    However, when I've come to test removing the end-date, I can't seem to do it. In the example below, the end_date remains populated.
    DECLARE
       CURSOR usercur
       IS
          SELECT fu.user_name
            FROM apps.fnd_user fu
           WHERE user_name = 'TEST_ACCOUNT';
    BEGIN
       FOR myuser IN usercur
       LOOP
          fnd_user_pkg.updateuser(
             x_user_name      => myuser.user_name
           , x_owner          => 'CUST'
           , x_end_date       => NULL
           );
       END LOOP;
    END;
    In the example in this post:
    http://apps2fusion.com/forums/viewtopic.php?f=99&t=108
    They removed the end-date via:
    x_end_date => SYSDATE + 10000);
    However, that's not really removing the end date, it's just setting it to a long time in the future.
    I wondered if I might be missing something obvious?
    Any advice much appreciated.
    Thanks

    Only obvious thing you are missing is that you have posted this to the wrong forum, you need to be on e-business forum.
    BTW - have you tried re-setting the password at the same time?
    regards,
    Robert.

  • How can I find and remove duplicate photos in iPhoto?

    What is the best way to find and remove duplicate photos in iPhoto?

    Are you seeing these duplicates in iPhoto or via the Finder?  If it's in the iPhoto window then you can use one of these applications to identify and remove duplicate photos from an iPhoto Library:
    iPhoto Library Manager - $29.95
    Duplicate Cleaner for iPhoto - free
    Duplicate Annihilator - $7.95 - only app able to detect duplicate thumbnail files or faces files when one library has been imported into another with iPhoto 8 and earlier.
    PhotoSweeper - $9.95 - This app can search by comparing the image's bitmaps or histograms thus finding duplicates with different file names and dates.
    I also prefer iPLM as it is more than just a dup finder. It's the most versatile iPhoto utility available.
    OT

  • How do i remove duplicate songs on my iTunes?

    I goofed up and somehow I have duplicate, or multiple, songs in my iTunes. How do I remove songs without doing it manually? I am prepared to wipe my iTunes and use Music Rescue to re-populate my iTunes directly from my iPod Classic (7th Gen). If this is my last resort, how do I erase my current iTunes?
    Any other options? Thank You, in advance!

    Apple's official advice on duplicates is here... HT2905: How to find and remove duplicate items in your iTunes library. It is a manual process and the article fails to explain some of the potential pitfalls such as lost ratings and playlist membership.
    Use Shift > View > Show Exact Duplicate Items to display duplicates as this is normally a more useful selection. You need to manually select all but one of each group to remove. Sorting the list by Date Added may make it easier to select the appropriate tracks, however this works best when performed immediately after the dupes have been created.  If you have multiple entries in iTunes connected to the same file on the hard drive then don't send to the recycle bin.
    Use my DeDuper script if you're not sure, don't want to do it by hand, or want to preserve ratings, play counts and playlist membership. See this thread for background, this post for detailed instructions, and please take note of the warning to backup your library before deduping.
    (If you don't see the menu bar press ALT to show it temporarily or CTRL+B to keep it displayed.)
    The most recent version of the script can tidy dead links as long as there is at least one live duplicate to merge stats and playlist membership to and should cope sensibly when the same file has been added via multiple paths.
    tt2

  • How do I remove duplicate Outlook on my PC?

    How do I remove duplicate Outlook on my PC so I can complete installation of iCloud?

