Remove duplicates without using Sort OR Script component OR Staging?

Team, can someone advise on how to remove duplicates without using any of the options listed in the subject line? My source is a huge flat file.
Thanks in advance !
Rajkumar Yelugu

I think you can do it like this:
1. Add a Data Flow Task with a Flat File Source.
2. Connect a Multicast transform to the Flat File Source.
3. Connect one Multicast output to an Aggregate transform; group by your required field(s) and take the MIN or MAX of a unique-valued column (ID, date, or primary key).
4. Add a Merge Join transform with another Multicast output and the Aggregate output as its inputs. Join on the group-by fields from the Aggregate and include the MIN/MAX column in the output.
5. Add a Conditional Split and define an output where the unique-valued column equals the MIN/MAX value from the Aggregate.
6. Connect that output to your destination so that only one record from each duplicate set is loaded.
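The data flow in the steps above can be sketched outside SSIS to show the logic. This is only an illustration in Python, not SSIS code: the `Row` fields, the `dedupe_keep_min` helper, and the sample data are all hypothetical. It assumes the row to keep in each duplicate set is the one whose unique column equals the per-group minimum.

```python
from collections import namedtuple

# Hypothetical record layout: row_id is the unique-valued column,
# (customer, amount) is the set of fields that defines a duplicate.
Row = namedtuple("Row", ["row_id", "customer", "amount"])


def dedupe_keep_min(rows, key_fn, unique_fn):
    """Keep one row per key: the row with the smallest unique value."""
    # Aggregate step: minimum unique value for each group key.
    min_per_key = {}
    for row in rows:
        key = key_fn(row)
        unique_value = unique_fn(row)
        if key not in min_per_key or unique_value < min_per_key[key]:
            min_per_key[key] = unique_value

    # Merge Join + Conditional Split step: look up each row's group
    # minimum and keep only the row that matches it.
    return [row for row in rows if unique_fn(row) == min_per_key[key_fn(row)]]


rows = [
    Row(1, "alice", 10),
    Row(2, "bob", 20),
    Row(3, "alice", 10),
    Row(4, "bob", 20),
    Row(5, "carol", 30),
]

distinct_rows = dedupe_keep_min(
    rows,
    key_fn=lambda r: (r.customer, r.amount),
    unique_fn=lambda r: r.row_id,
)

for row in distinct_rows:
    print(row)
```

The dictionary plays the role of the Aggregate output, and the final filter stands in for the join plus the split. Using the maximum instead only requires flipping the comparison.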
Please mark this as the answer if it helps to solve the issue.
Visakh
http://visakhm.blogspot.com/ | https://www.facebook.com/VmBlogs

Similar Messages

  • Removes duplicates from a sorted array without having to create a new array

    Funcation removes duplicates from a sorted array without having to create a new array

    Is a 'funcation' like a vacation but more fun or something?

    LMFAO. You just jumped with both feet on one of my biggest peeves. I've been outed!
    Actual words/phrases I've seen used which I hate with every fiber of my being:
    "Mrs. Freshly's"
    "McFlurry"
    "Freshissimo"
    "Spee-dee"
    "Moons Over My Hammy"
    One of my favorite SNL skits had Will Ferrell leading a panel of movie reviewers (they're the worst for this kind of junk). Each one had some cheesy pun to describe their respective movie. Ferrell topped the show off with his endorsement of "Fantasia 2000 is Fantasgreat!"
    LOL.
    "Come to Slippy Village - it's a FUNCATION!"
    § (runs off, laughing maniacally, head explodes)
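
For reference, the technique named in this thread's title can be sketched with two indexes over the same list. This is a minimal Python illustration, not code from the thread: the function name `dedupe_sorted_in_place` and the sample data are hypothetical, and it assumes the input is already sorted.

```python
def dedupe_sorted_in_place(values):
    """Remove duplicates from a sorted list without allocating a new list.

    Returns the number of unique elements left in the list.
    """
    if not values:
        return 0

    write = 1  # next position to fill with a not-yet-seen value
    for read in range(1, len(values)):
        # In a sorted list, a value is new only if it differs from
        # the last value that was kept.
        if values[read] != values[write - 1]:
            values[write] = values[read]
            write += 1

    # Drop the leftover tail so the same list object holds only unique values.
    del values[write:]
    return write


numbers = [1, 1, 2, 3, 3, 3, 7]
unique_count = dedupe_sorted_in_place(numbers)
print(unique_count, numbers)
```

The read index scans every element once, while the write index only advances when a new value is found, so the pass is linear and needs no second array.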

  • How does remove duplicates option in sort transformation behaves?

    Hi
    I am currently on a performance tuning task and I am in the process of replacing the existing transformations (especially Sort and Aggregate, since they are blocking) with stored procedures (SPs). I am not getting exactly the same results as the Sort transformation (remove duplicates) when I use SP methods to remove duplicates, such as the DISTINCT clause, the HAVING COUNT > 1 option, or ROW_NUMBER() with OVER (PARTITION BY ... ORDER BY ...).
    Could anyone explain exactly how the Sort transformation with the remove duplicates option works in SSIS?
    Please let me know if you need any more information.
    Thanks in Advance
    Mervyn

    There's an important thing to understand about using the "remove duplicates" feature of the Sort transform - the results are arbitrary. For example, take the following dataset:
    ID  |   Name
    3    |   Mervyn
    2    |   Bharani
    1    |   Nitesh
    2    |   Jamie
    Now, if you use the Sort transform to sort on ID and then remove duplicates then you *might* end up with the following result:
    ID  |   Name
    1    |   Nitesh
    2    |   Bharani
    3    |   Mervyn
    or, you *might* end up with this:
    ID  |   Name
    1    |   Nitesh
    2    |   Jamie
    3    |   Mervyn
    Notice how it arbitrarily picks either "Jamie" or "Bharani" because they both have the same ID. I've never seen any requirement, ever, that says "arbitrarily pick a result, I don't care which one I get".
    I guess the point I'm ultimately trying to make is this - are you *sure* that what you want to do is replicate this behaviour? There is no construct in T-SQL that replicates the arbitrary nature of the Sort transform (and nor should there be); with T-SQL you absolutely have to tell it whether (in the above example) you would want "Jamie" or "Bharani".
    Hope that helps.
    -Jamie
    http://sqlblog.com/blogs/jamie_thomson/ | http://jamiethomson.spaces.live.com/ | @jamiet
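
To make the point about stating a preference concrete, here is a small Python sketch using the dataset from the reply above. The `dedupe_by_id` helper and the tie-break rule (alphabetically first name wins) are assumptions chosen only for illustration. In T-SQL the same idea is usually written with ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name) and keeping row number 1.

```python
def dedupe_by_id(rows, prefer):
    """Keep one row per ID, choosing the row with the smallest prefer(row)."""
    chosen = {}
    for row in rows:
        row_id = row[0]
        # The explicit tie-break makes the result repeatable.
        if row_id not in chosen or prefer(row) < prefer(chosen[row_id]):
            chosen[row_id] = row
    # Return the survivors ordered by ID, as a sort on ID would.
    return [chosen[row_id] for row_id in sorted(chosen)]


rows = [(3, "Mervyn"), (2, "Bharani"), (1, "Nitesh"), (2, "Jamie")]

# Assumed rule: for duplicate IDs, the alphabetically first name wins.
result = dedupe_by_id(rows, prefer=lambda row: row[1])
print(result)
```

With this rule the row for ID 2 is always "Bharani", whatever order the input arrives in. A different rule only requires a different `prefer` function.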

  • Sorting technique without using sort statement / Comparing table fields.

    Hi,
    How can I sort a table without using the SORT statement?
    Also, how can I compare the fields of 2 custom tables and check their compatibility without using code?

    Hi,
    Refer the below program, it will be helpful.
    types: begin of t_int,
             int type i,
            end of t_int.
    data: it_int type standard table of t_int,
           wa_int type t_int,
           wa_int1 type t_int.
    wa_int-int = 70.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 50.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 20.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 30.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 23.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 23.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 32.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 77.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 99.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 1.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 11.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 90.
    append wa_int to it_int.
    clear wa_int.
    wa_int-int = 40.
    append wa_int to it_int.
    clear wa_int.
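    data: wk_line type i,   " number of rows in it_int
          cnt     type i,   " index of the left element of the current pair
          cnt1    type i,   " index of the right element of the current pair
          wk_flag type c.   " 'X' if the current pass swapped anything
    describe table it_int lines wk_line.
    clear: cnt, cnt1, wk_flag.
    " Bubble sort in descending order: repeat passes until one makes no swap.
    while wk_line gt cnt1.
      cnt = cnt + 1.
      read table it_int into wa_int index cnt.
      cnt1 = cnt + 1.
      read table it_int into wa_int1 index cnt1.
      " Swap the pair when the left value is smaller than the right one.
      if wa_int-int lt wa_int1-int.
        modify it_int from wa_int index cnt1.
        modify it_int from wa_int1 index cnt.
        wk_flag = 'X'.
      endif.
      " End of a pass: restart if anything was swapped, otherwise stop.
      if cnt1 eq wk_line.
        clear: cnt, cnt1.
        if wk_flag eq 'X'.
          clear wk_flag.
          continue.
        else.
          exit.
        endif.
      endif.
    endwhile.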
    loop at it_int into wa_int.
       write: / wa_int-int.
    endloop.

  • Remove locators without using

    Hi folks -- I'm just getting my feet wet with Logic Express 8 (I'm used to video editing). I keep pointing to the top of the timeline and accidentally inserting locators for various things. I don't want the locators, but removing them without using them doesn't seem to be covered in the manual! "Undo" doesn't work. Can anyone help me?
    thanks!
    EH

    You seem to be inserting markers. Just click and hold the marker and drag it down - this will delete it. Another way to delete markers is to use the Marker list.

  • To remove duplicate data using CONNECT BY PRIOR

    Hi,
    I want the details of who each employee reports to, without duplication.
    In the table data, one employee reports to two employees, so the reporting chain comes out twice.
    Query:
    SELECT lpad(' ', (level - 1) * 2) || EMPLOYEE_NAME as EMP_NAME,SUP_BU AS BU_CODE,SUP_REP_BU,EMP_NO,EMPLOYEE_NAME,LEVEL AS THE_LEVEL
    FROM ATTD_REPORT_TO_VW
    WHERE EMP_NO IS NOT NULL
    CONNECT BY PRIOR SUP_BU = SUP_REP_BU
    START WITH SUP_BU = :p_bu
    But I get duplicate data. If I remove the duplicates using the DISTINCT keyword, the hierarchical order goes wrong.
    Please provide a solution.
    Thanks ,
    Maran

    Please ask this question in the separate SQL/PLSQL forum, and provide more information with sample data.

  • JavaScript array: add and remove elements without using push and pop

    Hi
    I need to perform add and remove operations on a JavaScript array in the following scenarios:
    i) Add an element if it does not exist in the array
    ii) Remove an element if it exists in the array
    How can I achieve this without using the push and pop methods?
    Regards
    Siva

    Completed the Scenario.

  • How to Remove Duplicates Without Changing Album?

    Hello All,
    I see a lot of topics on how to remove duplicate songs, but there is one thing I would like to know, if it's possible...
    I would like to remove my duplicates but I still want to see the songs as part of the albums. Example: let's say the same song is present in 3 different albums. I would like to remove 2 of them but have them still showing up as part of the 3 albums. This way the albums remain "complete".
    Is there any tool available that can do that? Maybe keeping something like a symbolic link in place of the songs that were deleted?
    Any help would be appreciated.
    Thanks!

    Not possible and has been asked before.
    MJ

  • Duplicate monitors using command line\script in iron-python

    I want to use an IronPython script to extend the desktop to a second monitor. I found the hotkey for it on this site: http://www.petenetlive.com/KB/Article/0000162.htm but have no idea how to do it. Is there a way to do it using cmd? (I can run it using Python.)
    thanks a lot!

    $wshell = new-object -com wscript.shell
    $wshell.sendkeys("^{ESC}P")
    ¯\_(ツ)_/¯

  • Create test scripts using e-Tester without using visual scripts

    I'm new researching on e-Test suite.
    I have the following question. Is there any possibility to create test scripts to run automatic tests without using the visual scripts created by e-Tester?
    Today, we are using Selenium to make Web based tests and this tool is included in a custom framework developed in Java. So, we don't use record and play. Directly we create code using objects, methods and attributes provided by Selenium api.

    You can create a dummy Java Agent script that calls your module. The Java Agent script can then be automated to run in e-Load.
    1) e-tester menu ->options->New Scripts (global) -> Advanced -> Java Agent: Check on "Create Java Agent Script.." and the other 2 check boxes.
    2) menu -> File -> New Script
    3) Do nothing: no navigation, etc.
    4) menu -> File -> Save Script i.e. "empty"
    5) Download and install Eclipse IDE (other IDEs should work but you need to create your own project)
    6) Eclipse -> Menu -> File -> Import -> General -> Existing Project into workspace: Select the root directory that contains the etester generated Java Agent script prefixed with the underscore "_" i.e c:\ETS\etest\Default!\_empty
    7) Once imported I suggest you call your module from within scriptCallback.class EndScript(RunState, ScriptResult);
    7b) When you get better at this you can return errors by changing the ScriptResult() object and view stats from the RunState() object.
    8) You will have to modify the Eclipse Project to be dependent on your jars if you want to debug in eclipse. Secondly there are 2 batch files. These two batch files create the *.JWG file which is just a renamed .JAR file.
    9) You MUST modify the batch files to create the new Empty.JWG including your *.JAR files.
    10) I recommend compiling in Eclipse and only running makefileNocompile.bat (after you add the *.jar inclusion). Ignore makefile.bat unless you also change the javac.exe execution line to include all the correct classpaths.
    11) Once you are happy with running in Eclipse you should be able to execute in eload. I recommend you test the procedures steps 5 through 10 before adding your own module and code.

  • How to remove duplicate items ?

    OK, so I've moved my iTunes library from the NAS drive I bought (after finding out that wouldn't work) onto my new LaCie external drive, but I've found some of my albums have triple copies.
    I know how to show duplicates, but I'm not sure how to safely remove duplicates without deleting all copies.

    Oh boy.
    I'm sure there are better ways to do it than this and it will take time, but to avoid all possible loss of data, what I would do is first consolidate all of your libraries.
    • Open the iTunes Library you think is most correct by holding down the OPTION key when you open the iTunes application, which should bring up a dialogue like this: http://cl.ly/image/2i2Q3o0Z0Y3C
    • Then go to preferences and make sure it looks like this: http://cl.ly/image/2C2Z0u0C3T3c
    I would recommend keeping your main iTunes library on your main hard drive (for me that's my internal), unless you definitely can't fit it.
    • Now go to the File menu >Library > Import Playlist
    • Navigate to another of the libraries, and click on the iTunes Library.xml file and import it. Do this for each of your libraries, except the one you are currently using.
    • Now that you've got everything imported into the one library, the fun part starts.
    Do what I said in my previous post and remove all the duplicates.
    • Once that's done, check all the other libraries to make sure you haven't missed anything, and send them on their way to the Trash, and empty it to reclaim all that space.
    I really hope you get this sorted, I went through an ordeal like this myself recently, so it's going to take time, but it feels good when it's all cleaned up and finished!
    xeni
    PS. After writing all this I thought, hmm, why didn't I just Google it instead of figuring it out myself? ;P
    I found this, and it might help if my instructions weren't clear enough. https://bitly.com/LpqFPq
    Also, if you don't already, I urge you to use Time Machine backup. Read more here: http://www.apple.com/osx/apps/#timemachine and http://pondini.org/TM/FAQ.html

  • Does anyone know how to remove duplicates in the new iTunes?

    I just got a new laptop (Windows) and downloaded the new iTunes on it. It scanned my computer for music, and after it was done there were up to 5 copies of each song in the library. The "old" itunes used to have an option to remove duplicates but I cannot find a way to do that in the new iTunes. Anyone know?

    The show duplicates/show exact duplicates features have been left out of iTunes 11. Rumor suggests they will be restored in the next build. In the meantime I have written two Windows scripts to make playlists of Duplicates and Exact Duplicates, either from a selection of tracks or the entire library.
    If you want to manually remove duplicate tracks use shift-delete to remove selected tracks from the library as well as the playlist. Keep one of each repeated group of files and don't send to the recycle bin unless you are sure that there are multiple files on the disc as opposed to multiple entries to the same file.
    There is also my DeDuper script if you don't want to do it by hand. This can preserve ratings, play counts, playlist membership, etc. which are lost in a manual clean up. Please take note of the warning to backup your library before deduping. See this thread for background on deduping and the script.
    tt2

  • Removing Duplicates in iTunes for Windows

    Hi, I cannot find the "remove duplicates" option in the menus.  It used to be in the file menu, but I don't see it anywhere now.  Am I just missing it?  Thanks!

    The show duplicates feature is now under the View menu. Use Shift > View > Show Exact Duplicates as this is normally a more useful selection. You need to manually select all but one of each group to remove. Or use my DeDuper script if you don't want to do it by hand. Please take note of the warning to backup your library before deduping. See this thread for background.
    tt2

  • HT2905 Can't remove duplicates in iTunes library

    I just downloaded the newest version of iTunes and there is no option to "display duplicates" under the "file" tab. Any ideas? Thanks!

    The show duplicates/show exact duplicates features have been left out of iTunes 11.0. Rumor suggests they will be restored in the next build. In the meantime I have written two Windows scripts to make playlists of Duplicates and Exact Duplicates, either from a selection of tracks or the entire library. Note that, as with the iTunes feature, this list makes no distinction between "originals" and "dupes", you have to decide which is which.
    There is also my DeDuper script for automatically removing duplicate copies but keeping one remaining copy of each set. This can preserve ratings, play counts, playlist membership, etc. which are lost in a manual clean up. Please take note of the warning to backup your library before deduping. See this thread for background on deduping and the script.
    If you want to manually remove duplicate tracks use shift-delete to remove selected tracks from the library as well as the playlist. Keep one of each repeated group of files and don't send the others to the recycle bin unless you are sure that there are multiple files on the disc as opposed to multiple entries to the same file. Same advice to backup applies.
    tt2

  • Cannot remove duplicate songs

    I'm trying to remove duplicate songs in my Library. The instructions say to select a song, hold "Shift", and select "View". "Shift" shows a plus sign, but when I press it, no "View" is listed. I just updated iTunes. Help, please! Thanks.

    The show duplicates/show exact duplicates features have been left out of iTunes 11. Rumor suggests they will be restored in the next build. In the meantime I have written two Windows scripts to make playlists of Duplicates and Exact Duplicates, either from a selection of tracks or the entire library.
    If you want to manually remove duplicate tracks use shift-delete to remove selected tracks from the library as well as the playlist. Keep one of each repeated group of files and don't send to the recycle bin unless you are sure that there are multiple files on the disc as opposed to multiple entries to the same file.
    There is also my DeDuper script if you don't want to do it by hand. This can preserve ratings, play counts, playlist membership, etc. which are lost in a manual clean up. Please take note of the warning to backup your library before deduping. See this thread for background on deduping and the script.
    tt2
