Deleting Duplicates from a table

It's a huge table with 52 fields and 30k rows. I need to delete the duplicates based on one of the fields. GROUP BY is taking a lot of time. Is there a quicker way to delete the duplicates using SQL?
Thanks.

How many duplicates have you got? Do you have even a vague idea? 1%? 20%? 90%?
One way would be to add a unique constraint on the column in question. This will fail, of course, but you can use the EXCEPTIONS INTO clause to find all the ROWIDs which have duplicate values. You can then choose to delete those rows using a variant on the query already posted. You may need to run %ORACLE_HOME%\rdbms\admin\utlexcpt.sql to build the EXCEPTIONS table first.
This may seem like some unnecessary work, but the most effective way of deleting duplicates from a table is to have relational integrity constraints in place which prevent you having duplicates in the first place. To paraphrase Tim Gorman, you can't get faster than zero work!
Cheers, APC
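The constraint-based approach described above could look roughly like the sketch below. The names big_table, dup_col and big_table_uk are hypothetical, and the sketch assumes the EXCEPTIONS table exists and is empty before the first statement runs. Which row of each duplicate group survives (here the one with the lowest ROWID) is an arbitrary choice.

```sql
-- Hypothetical names: big_table is the table, dup_col the column that must be unique.
-- Assumes the EXCEPTIONS table already exists and holds no rows from earlier runs.

-- 1. Try to add the constraint. With duplicates present this statement fails,
--    but the ROWID of every row that violates it is logged in EXCEPTIONS.
ALTER TABLE big_table
  ADD CONSTRAINT big_table_uk UNIQUE (dup_col)
  EXCEPTIONS INTO exceptions;

-- 2. Among the logged rows, keep the lowest ROWID per value and delete the rest.
DELETE FROM big_table t
 WHERE t.rowid IN (SELECT e.row_id FROM exceptions e)
   AND t.rowid NOT IN (SELECT MIN(b.rowid)
                         FROM big_table b
                        GROUP BY b.dup_col);

-- 3. Add the constraint again; it should now succeed and block future duplicates.
ALTER TABLE big_table
  ADD CONSTRAINT big_table_uk UNIQUE (dup_col);
```

Restricting the delete to the ROWIDs in EXCEPTIONS means only rows already known to be duplicated are candidates for deletion.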

Similar Messages

How to choose in Delete Duplicates from internal table?

Now I need to delete duplicates from an internal table.
First I sort, then I delete the duplicates:
SORT itab1 BY Company_Code Asset_No Capital_Date.
DELETE ADJACENT DUPLICATES FROM itab1 COMPARING Company_Code Asset_No Capital_Date.
Company_Code  Asset_No  Capital_Date  Remark
BC35          1515593   20021225      Helen
BC35          1515593   20021225      Common Asset
BC35          1515594   20030109      Judy
BC35          1515594   20030109      Common Asset
But here comes my problem: if I want the deleted row to be the one with Common Asset in the Remark column, how do I make it choose the right one?

Hi Jack
Try the below coding..
report zsamp.
* asset_no stands in for the field name masked as ***_no in the post (assumed).
types: begin of t_tab,
        comp_code(4) type c,
        asset_no(7) type n,
        cap_date type d,
        remark type string,
        end of t_tab.
data: i_tab type table of t_tab,
       w_tab type t_tab.
* Fill the table with the four sample rows from the question.
w_tab-comp_code = 'BC35'.
w_tab-asset_no = '1515593'.
w_tab-cap_date = '20021225'.
w_tab-remark = 'Helen'.
append w_tab to i_tab.
w_tab-comp_code = 'BC35'.
w_tab-asset_no = '1515593'.
w_tab-cap_date = '20021225'.
w_tab-remark = 'Common Asset'.
append w_tab to i_tab.
w_tab-comp_code = 'BC35'.
w_tab-asset_no = '1515594'.
w_tab-cap_date = '20030109'.
w_tab-remark = 'Judy'.
append w_tab to i_tab.
w_tab-comp_code = 'BC35'.
w_tab-asset_no = '1515594'.
w_tab-cap_date = '20030109'.
w_tab-remark = 'Common Asset'.
append w_tab to i_tab.
sort i_tab by remark.
* Keeps one row per distinct remark, so a single 'Common Asset' row remains.
delete adjacent duplicates from i_tab comparing remark.

Delete Duplicates from internal table with object references

Hi
How can I delete duplicates from an internal table in ABAP OO based on the value of one of the attributes?
I have created a method, with the following code:
LOOP AT me->business_document_lines INTO l_add_line.
    CREATE OBJECT ot_line_owner
      EXPORTING
        i_user      = l_add_line->add_line_data-line_owner
        i_busdoc = me->business_document_id.
      APPEND ot_line_owner TO e_line_owners.
ENDLOOP.
e_line_owners is defined as a table containing only object references.
One of the attributes of the objects in the table is called USER, and I would like to do a "delete ADJACENT DUPLICATES FROM e_line_owners" based on that attribute.
How can I do this?
Regards,
Morten Nielsen

Hello Morten
Assuming that the instance attribute is public, you could try to use the following coding:
SORT e_line_owners BY table_line->user.
DELETE ADJACENT DUPLICATES FROM e_line_owners
    COMPARING table_line->user.
However, I am not really sure (cannot test myself) whether TABLE_LINE can be used together with SORT and DELETE.
Alternative solution:
DATA:
     ld_idx    TYPE sy-tabix.
LOOP AT e_line_owners INTO ls_line.
    ld_idx = syst-tabix + 1.
    LOOP AT e_line_owners TRANSPORTING NO FIELDS FROM ld_idx
                   WHERE ( table_line->user = ls_line->user ).
      DELETE e_line_owners INDEX syst-tabix.
    ENDLOOP.
ENDLOOP.
Regards
Uwe

Deleting duplicates from a table whose size is 386 GB

I need to delete duplicate records from the table. The table contains 33 columns, of which only PK_NUM is the primary key column. As PK_NUM is unique, we need to keep either the min or the max value within each group of duplicates.
Sample data:
PK_NUM  Name  AGE
1       ABC   20
2       PQR   25
3       ABC   20
The expected data should contain only 2 records:
PK_NUM  Name  AGE
1       ABC   20
2       PQR   25
* PK_NUM 1 can be replaced by 3, or vice versa.
Size of table : 386 GB
Total records in the table : 1766799022
Distinct records in the table : 69237983 (row distinct without primary key)
Duplicate records in the table : 1697561039 (row duplicates without primary key)
Column details :
4 : Date data type
4 : Number data type
1 : Char data type
24: Varchar2 data type
DB details : Oracle Database 11g EE::11.2.0.2.0 ::64bit Production
My plan here is to:
1. Pull the distinct records and store them in a backup table (i.e. by using insert into select).
2. Truncate the existing table and move the records from the backup table back into it.
As the data size is huge, I want to know what the optimized SQL for retrieving the distinct records is.
Is there any estimate of how long it will take to complete the insert into select and to truncate the existing table?
Please let me know if there is a better way to achieve this. My ultimate goal is to remove the duplicates.

As the data size is huge, I want to know what the optimized SQL for retrieving the distinct records is.
Is there any estimate of how long it will take to complete the insert into select and to truncate the existing table?
@ 1. - Your best chance seems to be the following (it should require only a single full table scan):
create table backup_table as
select pk,name,age,a_date,a_string,a_number, ...
from (select pk,name,age,a_date,a_string,a_number, ...,
               row_number() over (partition by name,age order by a_date) rn
          from big_table
       )
where rn = 1
@ 2. - With statistics in place and (at least nearly) up to date, explain plan should return an appropriate estimate.
Regards
Etbin
select pk,name,age,a_date,a_string,a_number
from (select pk,name,age,a_date,a_string,a_number,
               row_number() over (partition by name,age order by a_date) rn
          from big_table
       )
where rn = 1
Operation         Options           Object     Rows    Time  Cost  Bytes       Predicates (filter / access)
SELECT STATEMENT                               13,044  1     30    53,023,860
VIEW                                           13,044  1     30    53,023,860  "RN" = 1
WINDOW            SORT PUSHED RANK             13,044  1     30    495,672     ROW_NUMBER() OVER ( PARTITION BY "NAME","AGE" ORDER BY "A_DATE") <= 1
TABLE ACCESS      STORAGE FULL      BIG_TABLE  13,044  1     26    495,672
select pk,name,age,a_date,a_string,a_number
from big_table
where pk in (select min(pk) keep (dense_rank first order by a_date)
                from big_table
               group by name,age
            )
Operation         Options       Object     Rows    Time  Cost  Bytes    Predicates (filter / access)
SELECT STATEMENT                           6,000   1     52    306,000
HASH JOIN                                  6,000   1     52    306,000  "PK" = "$kkqu_col_1"
VIEW                            VW_NSO_1   6,000   1     27    78,000
HASH              UNIQUE                   6,000   1     27    126,000
SORT              GROUP BY                 6,000   1     27    126,000
TABLE ACCESS      STORAGE FULL  BIG_TABLE  13,044  1     23    273,924
TABLE ACCESS      STORAGE FULL  BIG_TABLE  13,044  1     24    495,672
Message was edited by: Etbin
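Put together, the plan from the question (copy the distinct rows, truncate, reload) might be sketched as follows. The column list is cut down to the three columns of the sample data, backup_table is a hypothetical name, and keeping the lowest PK_NUM per group is an assumption. A real run would list all 33 columns and partition by every non-key column. The APPEND hint requests a direct-path insert, which is a common choice for bulk reloads, but whether it helps here is untested.

```sql
-- Hypothetical names: big_table(pk_num, name, age) and backup_table.

-- Ask the optimizer for its estimate before running anything.
EXPLAIN PLAN FOR
SELECT pk_num, name, age
  FROM (SELECT pk_num, name, age,
               ROW_NUMBER() OVER (PARTITION BY name, age ORDER BY pk_num) rn
          FROM big_table)
 WHERE rn = 1;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- 1. Copy one row per duplicate group (the lowest pk_num wins).
CREATE TABLE backup_table AS
SELECT pk_num, name, age
  FROM (SELECT pk_num, name, age,
               ROW_NUMBER() OVER (PARTITION BY name, age ORDER BY pk_num) rn
          FROM big_table)
 WHERE rn = 1;

-- 2. Empty the original table. TRUNCATE is DDL and cannot be rolled back.
TRUNCATE TABLE big_table;

-- 3. Reload the de-duplicated rows.
INSERT /*+ APPEND */ INTO big_table (pk_num, name, age)
SELECT pk_num, name, age
  FROM backup_table;

COMMIT;
```

It is worth checking the row count of backup_table against the expected number of distinct rows before the truncate, since that step is not reversible.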

Delete duplicate from internal table

HI Abapers,
I have a query on how to remove the duplicates from an internal table.
My internal table data is as follows:
Cno  Catg1  Catg2
01   0      1000
01   2000   0
I want to get only one record:
01   2000   1000
How can I get this result?
I tried sorting by Cno and using delete duplicates, but it was not helpful.
Is there any other alternative to get this done?
Please help me.
Regards,
Priya

check it out with delete adjacent duplicate records
Deleting Adjacent Duplicate Entries
To delete adjacent duplicate entries use the following statement:
DELETE ADJACENT DUPLICATE ENTRIES FROM <itab>
                                  [COMPARING <f1> <f2> ...
                                             |ALL FIELDS].
The system deletes all adjacent duplicate entries from the internal table <itab>. Entries are duplicate if they fulfill one of the following compare criteria:
Without the COMPARING addition, the contents of the key fields of the table must be identical in both lines.
If you use the addition COMPARING <f1> <f2> ... the contents of the specified fields <f1> <f2> ... must be identical in both lines. You can also specify a field <fi> dynamically as the contents of a field <ni> in the form (<ni>). If <ni> is empty when the statement is executed, it is ignored. You can restrict the search to partial fields by specifying offset and length.
If you use the addition COMPARING ALL FIELDS the contents of all fields of both lines must be identical.
You can use this statement to delete all duplicate entries from an internal table if the table is sorted by the specified compare criterion.
If at least one line is deleted, the system sets SY-SUBRC to 0, otherwise to 4.

Deleting Duplicate from ITAB without sorting????

Hi,
A challenging and interesting problem, please help. I want to delete duplicates from an ITAB without sorting (so I can't use delete adjacent duplicates).
data: begin of dpp occurs 0,
        val type i,
      end of dpp.
dpp-val = 13.
append dpp.
dpp-val = 15.
append dpp.
dpp-val = 26.
append dpp.
dpp-val = 15.
append dpp.
dpp-val = 27.
append dpp.
dpp-val = 15.
append dpp.
As you can see, 15 is duplicated in DPP. How can the duplicate 15 entries be deleted without sorting?
VAL
13
15
26
15
27
15
Thanks
Edited by: Salman Akram on Oct 12, 2010 3:54 PM

Hi,
Loop through your DPP itab then append to another. try this:
DATA: BEGIN OF dpp OCCURS 0,
val TYPE i,
END OF dpp.
dpp-val = 13.
APPEND dpp.
dpp-val = 15.
APPEND dpp.
dpp-val = 26.
APPEND dpp.
dpp-val = 15.
APPEND dpp.
dpp-val = 27.
APPEND dpp.
dpp-val = 15.
APPEND dpp.
DATA: BEGIN OF dpp1 OCCURS 0.
        INCLUDE STRUCTURE dpp.
DATA: END OF dpp1.
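* Copy each value to dpp1 only the first time it is seen.
LOOP AT dpp.
  READ TABLE dpp1 WITH KEY val = dpp-val.
  IF sy-subrc NE 0.
    APPEND dpp TO dpp1.
  ELSE.
    CONTINUE.
  ENDIF.
ENDLOOP.
* Replace the original contents with the de-duplicated rows.
REFRESH dpp.
dpp[] = dpp1[].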
thanks.

Delete entries from the table

Hi folks,
I have a program to delete entries from a custom table that has only one field in it.
tables: ZABC
selection-screen begin of block B1 with frame title text-110.
select-options: P_KOSTL for ZABC-KOSTL.
selection-screen end of block B1.
delete from ZABC where KOSTL in P_KOSTL.
On execution I enter certain cost center IDs on the selection screen to delete them from the table. It did not work.
What am I missing?
Thanks,
SK

Hi,
Try this sample code..Replace ZABC with your table..
TABLES: ZABC.
selection-screen begin of block B1 with frame title text-110.
select-options: P_KOSTL for ZABC-KOSTL.
selection-screen end of block B1.
START-OF-SELECTION.
* Delete the records from the table.
  DELETE FROM ZABC where KOSTL IN P_KOSTL[]. " [] addresses the select-option's table body
  IF sy-subrc <> 0.
    ROLLBACK WORK.
  ELSE.
    COMMIT WORK.
  ENDIF.
Thanks,
Naren

Deleting data from EDPAR table

hi everyone,
I need to delete data from table EDPAR. There are two cases. With the first radio button on the selection screen, rows are deleted based on the KUNNR values entered in the select-options. With the second radio button, all data in table EDPAR gets deleted.
Can anyone offer me proper code to do it? Please help.

SAP delivers a standard transaction to edit the contents of this table, VOE4. This is safer than using custom code to delete rows from the table.

Deleting rows from a table

Could anyone tell me how to delete rows from a table which has millions of rows?
TIA,
Oracle user

If you are deleting all the rows, use "truncate table" in SQL*Plus.
Or, if you are deleting all but a handful of rows, copy the rows you still want to a spare table, drop the original table, and rename the spare table back to the original table's name.
Hope this helps.
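The two suggestions above might be sketched like this. The names big_table and big_table_keep and the WHERE condition are hypothetical. Note that CREATE TABLE ... AS SELECT does not carry over indexes, triggers, grants or most constraints, so those would have to be recreated after the rename.

```sql
-- Case 1: every row goes. TRUNCATE is DDL and cannot be rolled back.
TRUNCATE TABLE big_table;

-- Case 2 (instead of case 1): all but a handful of rows go.
-- Copy the rows to keep; the WHERE condition is a hypothetical placeholder.
CREATE TABLE big_table_keep AS
SELECT *
  FROM big_table
 WHERE status = 'KEEP';

-- Drop the original and give the copy its name.
DROP TABLE big_table;
RENAME big_table_keep TO big_table;
```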

Deleting row from a table binded to a matrix

Hi all
I have a form with a matrix bound to a user table, which is handled as Master Data lines by a UDO.
I want to enable deleting lines from the table by selecting a row in the matrix and clicking a delete button.
Currently I'm handling the click event by using the DeleteRow method of the matrix object.
When I press the Update button (UID = "1"), the fact that a row was deleted from the matrix does not affect the bound table.
My question is: how, in code, can I make the deletion of a row from the matrix also delete it from the database table?
appreciate the help
Yoav

Hi Yechiel
FlushToDataSource does the following:
Flushes current data from the GUI to the bound data source using the following process:
1) Cleans the data source.
2) Copies each row from the matrix to the corresponding data source record.
In other words, this method loads data from the matrix into the data source (but not into the database).
The next step is to update the database from the user data source.
Note: You might read the SDK help for more information.

Deleting Text from a Table

I want to know the shortcut for deleting text from a table on a MacBook Pro with the latest version of Microsoft Word. I want to keep the table and its formatting, but I want the current information deleted and don't want to have to do this cell by cell. I highlight all the text I want deleted and then...?
Please help, this is very frustrating!

Thanks David! You are the best!

Deleting data from a table where there are no indexes on the table

Hi
We have an interface program for a one-time process. When I was testing the process, it was taking too much time to load the data (around 1000 records).
It happens in 2 steps:
1 puts the data into a stage table
2 puts the data into a base table
In the process/package I have a delete statement that deletes data from the stage table before each run.
The stage table did not have any indexes, but the base table has (obviously).
Any idea?
Please help me on this.
Thanks,
Y

Hi,
Please post the application/database details along with the OS.
Is this interface program a seeded or custom one?
Please enable trace on this concurrent program as per (Note: 296559.1 - FAQ: Common Tracing Techniques within the Oracle Applications 11i/R12) and generate the TKPROF to find out why it takes that long to load/delete the data.
Thanks,
Hussein
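As a rough illustration of the tracing step, a session-level SQL trace around the slow statement could look like the sketch below. This is one common technique and not necessarily the procedure the referenced note describes for concurrent programs. stage_table and the tracefile identifier are hypothetical names.

```sql
-- Tag the trace file so it is easy to find on the database server.
ALTER SESSION SET tracefile_identifier = 'stage_load';

-- Switch on extended SQL trace (level 12 includes binds and wait events).
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';

-- Run the statement under investigation; stage_table is a hypothetical name.
DELETE FROM stage_table;

-- Switch tracing off again.
ALTER SESSION SET EVENTS '10046 trace name context off';

-- Then format the raw trace on the server, for example:
--   tkprof <trace_file>.trc stage_load.txt sys=no sort=exeela,fchela
```

The TKPROF output shows elapsed time and wait events per statement, which should indicate whether the time goes into the delete, the load, or something else.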

Delete records from internal table

hi all,
I want to delete records from an internal table that start with a particular digit.
E.g. internal table:
10000
20000
90000
91000
92000
88880
I want to delete the records starting with 9, i.e. 90000, 91000 and 92000.
Thanks in advance,
RAJ

You can test this piece of code.
DATA:
i_tab TYPE STANDARD TABLE OF mara,
wa_tab TYPE mara.
wa_tab-matnr = '1000'.
APPEND wa_tab TO i_tab.
CLEAR wa_tab.
wa_tab-matnr = '1001'.
APPEND wa_tab TO i_tab.
CLEAR wa_tab.
wa_tab-matnr = '1002'.
APPEND wa_tab TO i_tab.
CLEAR wa_tab.
wa_tab-matnr = '1003'.
APPEND wa_tab TO i_tab.
CLEAR wa_tab.
wa_tab-matnr = '2001'.
APPEND wa_tab TO i_tab.
CLEAR wa_tab.
wa_tab-matnr = '3001'.
APPEND wa_tab TO i_tab.
CLEAR wa_tab.
wa_tab-matnr = '4010'.
APPEND wa_tab TO i_tab.
CLEAR wa_tab.
<REMOVED BY MODERATOR>
Edited by: Alvaro Tejada Galindo on Aug 8, 2008 4:49 PM

Delete records from multiple table

Hi,
I need to delete records from multiple tables using a single delete statement. Is it possible ? If so please let me know the procedure.
Kindly Help.
Thanks,
Alexander.

Hi Tim,
The syntax of the DELETE statement does not allow multiple tables to be specified in this way. In fact, none of the DML statements allow you to specify table names like this.
Technically, there are other ways of deleting from multiple tables with one statement.
1. "Use a trigger":
What was probably meant by this is that you have a driving table on which you create an on-delete trigger. In this trigger, you write the logic for deleting from the other tables that you want to delete from.
This does mean a one-time effort of writing the trigger. But the actual DML operation of deleting from all the tables would then be triggered simply by a delete on the driving table.
2. Dynamic SQL:
Write PL/SQL code that opens a cursor over the names of the tables from which you want the data deleted. In the cursor FOR loop, build a dynamic SQL statement from each table name to delete from that table.
3. Using a foreign-key constraint with cascade delete:
This, I feel, is a cleaner way of doing it.
Having to delete data from multiple tables means that there is some kind of parent-child relationship between your tables. These relationships can be implemented in the database using foreign-key constraints. When creating the foreign-key constraint, give the 'on delete cascade' clause to ensure that whenever data is deleted from the parent table, its dependent data is deleted from the child table.
Using foreign-key constraints you can create a hierarchy of parent-child relationships and your DELETE would still be simple, as you would only have to delete from the parent table.
IMPORTANT: Implementing foreign-key constraints also affects other DML operations, which you should keep in mind.
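A minimal sketch of the third option follows. The table, column and constraint names are hypothetical, chosen only to show the 'on delete cascade' clause at work.

```sql
-- Hypothetical parent and child tables.
CREATE TABLE parent_t (
  id NUMBER PRIMARY KEY
);

CREATE TABLE child_t (
  id        NUMBER PRIMARY KEY,
  parent_id NUMBER NOT NULL,
  CONSTRAINT child_parent_fk
    FOREIGN KEY (parent_id) REFERENCES parent_t (id)
    ON DELETE CASCADE
);

INSERT INTO parent_t (id) VALUES (1);
INSERT INTO child_t (id, parent_id) VALUES (10, 1);
INSERT INTO child_t (id, parent_id) VALUES (11, 1);

-- One DELETE on the parent also removes both dependent child rows.
DELETE FROM parent_t WHERE id = 1;
```

With several levels of tables, each child-to-parent constraint needs its own ON DELETE CASCADE for the delete to propagate all the way down.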

How do I delete duplicates from my music library? After updating my iTunes the "Show Duplicates" option does not appear in the menu

How do I delete duplicates from my music library? After updating my iTunes, the "Show Duplicates" option does not appear in the menu.

If you are on version 11.0.1 rather than 11, Apple put it back in:
View > Show Duplicates
Option + View > Show Exact Duplicates
So if you are on 11, upgrade to 11.0.1 and you will have it.
