Regex - removing duplicate matches

I need to remove any duplicate matches from my regex pattern matcher so that only one occurrence of each match is output, but I'm not sure how to go about it. I've tried putting the values into a Set, but duplicates were still output. It may be something as simple as amending the print statement, or perhaps I need a for-each loop somewhere, but I think I have tried almost every possibility and can't get it to work! Any advice would be much appreciated; apologies if I've missed something glaringly obvious!
import java.util.regex.*;
import java.io.*;
import java.util.*;
public class Tuesday {
    public static void main(String[] args) throws IOException {
        Pattern patt = Pattern.compile("VID_.{4}");
        BufferedReader r = new BufferedReader(new FileReader("setupapi.log"));
        String line;
        while ((line = r.readLine()) != null) {
            Matcher m = patt.matcher(line);
            ArrayList retval = new ArrayList();
            while (m.find()) {
                //System.out.println(m.group(0));
                // starting position of the text
                int start = m.start(0);
                // ending position
                int end = m.end(0);
                // CharacterIterator.substring(offset, end);
                String tmp = (line.substring(start, end));
                if (retval.indexOf(tmp) == -1)
                    retval.add(tmp);
                System.out.println(retval);
            }
        }
    }
}

Output:
[VID_0930]
[VID_0930]
[VID_0930]
[VID_090C]
[VID_090C]
[VID_090C]
[VID_0A12]
[VID_0A12]
[VID_0A12]
[VID_0A12]
[VID_0A12]
[VID_0A17]
[VID_0A17]
[VID_0A17]

Thanks for your reply. This is how I did the Set; again, apologies if it's something obvious that I have done wrong!
import java.util.Set;
import java.util.LinkedHashSet;
import java.util.HashSet;
import java.util.regex.*;
import java.io.*;
import java.util.*;
public class WithSet {
    public static void main(String[] args) throws IOException {
        Pattern patt = Pattern.compile("(VID_.{4})");
        BufferedReader r = new BufferedReader(new FileReader("setupapi.log"));
        String line;
        while ((line = r.readLine()) != null) {
            Matcher m = patt.matcher(line);
            ArrayList retval = new ArrayList();
            while (m.find()) {
                retval.add(m.group());
                HashSet set = new HashSet(retval);
                //retval.clear();
                //retval.addAll(set);
                System.out.println(set);
            }
        }
    }
}

This is a sample of the log file:
[2008/01/25 20:35:54 664.3 Driver Install]
#-019 Searching for hardware ID(s): usb\vid_0930&pid_6534&rev_0100,usb\vid_0930&pid_6534
#-018 Searching for compatible ID(s): usb\class_08&subclass_06&prot_50,usb\class_08&subclass_06,usb\class_08
#-198 Command line processed: C:\WINDOWS\system32\services.exe
#I393 Modified INF cache "C:\WINDOWS\inf\INFCACHE.1".
#I022 Found "USB\Class_08&SubClass_06&Prot_50" in C:\WINDOWS\inf\usbstor.inf; Device: "USB Mass Storage Device"; Driver: "USB Mass Storage Device"; Provider: "Microsoft"; Mfg: "Compatible USB storage device"; Section name: "USBSTOR_BULK".
#I023 Actual install section: [USBSTOR_BULK.NT]. Rank: 0x00002000. Effective driver date: 07/01/2001.
#-166 Device install function: DIF_SELECTBESTCOMPATDRV.
#I063 Selected driver installs from section [USBSTOR_BULK] in "c:\windows\inf\usbstor.inf".
#I320 Class GUID of device remains: {36FC9E60-C465-11CF-8056-444553540000}.
#I060 Set selected driver.
#I058 Selected best compatible driver.
#-166 Device install function: DIF_INSTALLDEVICEFILES.
#I124 Doing copy-only install of "USB\VID_0930&PID_6534\5&719C62D&0&6".
#-166 Device install function: DIF_REGISTER_COINSTALLERS.
#I056 Coinstallers registered.
#-166 Device install function: DIF_INSTALLINTERFACES.
#-011 Installing section [USBSTOR_BULK.NT.Interfaces] from "c:\windows\inf\usbstor.inf".
#I054 Interfaces installed.
#-166 Device install function: DIF_INSTALLDEVICE.
#I123 Doing full install of "USB\VID_0930&PID_6534\5&719C62D&0&6".
#I121 Device install of "USB\VID_0930&PID_6534\5&719C62D&0&6" finished successfully.
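
In both programs above, the collection is created inside the loop over lines and printed inside the loop over matches, so it never holds matches from more than one line and something is printed for every match found. One possible fix, sketched here on the assumption that each distinct VID should appear once in first-seen order, is to keep a single LinkedHashSet for the whole file and print only after all lines have been scanned. The class name UniqueVids, the method uniqueMatches and the in-memory sample lines are made up for illustration; with the real log you would pass in the lines read from setupapi.log.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UniqueVids {
    // Same pattern as in the question: "VID_" followed by any four characters.
    private static final Pattern VID = Pattern.compile("VID_.{4}");

    // Collects every distinct match across ALL lines, in first-seen order.
    // (Hypothetical helper, not part of the original post.)
    public static Set<String> uniqueMatches(Iterable<String> lines) {
        Set<String> seen = new LinkedHashSet<>(); // created once, outside both loops
        for (String line : lines) {
            Matcher m = VID.matcher(line);
            while (m.find()) {
                seen.add(m.group()); // add() ignores values that are already present
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Three lines copied from the log sample above.
        List<String> lines = Arrays.asList(
            "#I124 Doing copy-only install of \"USB\\VID_0930&PID_6534\\5&719C62D&0&6\".",
            "#I123 Doing full install of \"USB\\VID_0930&PID_6534\\5&719C62D&0&6\".",
            "#I121 Device install of \"USB\\VID_0930&PID_6534\\5&719C62D&0&6\" finished successfully.");
        // Print once, after every line has been scanned.
        for (String vid : uniqueMatches(lines)) {
            System.out.println(vid);
        }
    }
}
```

Note that the pattern is case-sensitive, so the lower-case "vid_0930" entries in the log are not matched by either the original programs or this sketch.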

Similar Messages

XSLT to remove duplicates while concatenating

My XML looks like the following:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<BATCHES>
  <item>
    <Material>1000000079</Material>
    <Description>330 Bulk</Description>
    <Tank>T123</Tank>
    <Batch>2013225287</Batch>
    <Quantity>510</Quantity>
  </item>
  <item>
    <Material>1000000079</Material>
    <Description>330 Bulk</Description>
    <Tank>T123</Tank>
    <Batch>2013225301</Batch>
    <Quantity>520</Quantity>
  </item>
  <item>
    <Material>1000000196</Material>
    <Description>340R Bulk</Description>
    <Tank>T700</Tank>
    <Batch>1000188378</Batch>
    <Quantity>510</Quantity>
  </item>
  <item>
    <Material>1000002754</Material>
    <Description>43 Bulk</Description>
    <Tank>T515</Tank>
    <Batch>2013180125</Batch>
    <Quantity>300</Quantity>
  </item>
  <item>
    <Material>1000002754</Material>
    <Description>43 Bulk</Description>
    <Tank>T515</Tank>
    <Batch>2013203124</Batch>
    <Quantity>200</Quantity>
  </item>
  <item>
    <Material>1000002754</Material>
    <Description>43 Bulk</Description>
    <Tank>T515</Tank>
    <Batch>2013214839</Batch>
    <Quantity>700</Quantity>
  </item>
  <item>
    <Material>1000002754</Material>
    <Description>43 Bulk</Description>
    <Tank>T517</Tank>
    <Batch>2013214342</Batch>
    <Quantity>890</Quantity>
  </item>
</BATCHES>
My original XSLT looks like this:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output encoding="UTF-8" indent="yes" method="xml" version="1.0"/>
  <xsl:template match="/">
    <Rowsets>
      <Rowset>
        <xsl:variable name="materials" select=".//item[Tank!='RECV' and Tank!='PARK'] "/>
        <xsl:for-each select="$materials">
          <xsl:if test="generate-id(.)= generate-id($materials[Material=current()/Material])">
            <Row>
              <Material>
                <xsl:value-of select="Material"/>
              </Material>
              <Description>
                <xsl:value-of select="Description"/>
              </Description>
              <Value>
                <xsl:for-each select="$materials[Material=current()/Material]/Tank">
                  <xsl:if test="node()">
                    <xsl:value-of select="concat(.,'||')"/>
                  </xsl:if>
                </xsl:for-each>
              </Value>
            </Row>
          </xsl:if>
        </xsl:for-each>
      </Rowset>
    </Rowsets>
  </xsl:template>
</xsl:stylesheet>
The result of this XSLT looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Rowsets>
  <Rowset>
    <Row>
      <Material>1000000079</Material>
      <Description>330 Bulk</Description>
      <Value>T123||T123||</Value>
    </Row>
    <Row>
      <Material>1000000196</Material>
      <Description>340R Bulk</Description>
      <Value>T700||</Value>
    </Row>
    <Row>
      <Material>1000002754</Material>
      <Description>43 Bulk</Description>
      <Value>T515||T517||</Value>
    </Row>
  </Rowset>
</Rowsets>
I wanted to remove duplicate tanks while concatenating them in the Value field, so I changed my XSLT to the following:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output encoding="UTF-8" indent="yes" method="xml" version="1.0"/>
  <xsl:template match="/">
    <Rowsets>
      <Rowset>
        <xsl:variable name="materials" select=".//item[Tank!='RECV' and Tank!='PARK' and Quantity > 500]"/>
        <xsl:for-each select="$materials">
          <xsl:if test="generate-id(.)= generate-id($materials[Material=current()/Material])">
            <Row>
              <Material>
                <xsl:value-of select="Material"/>
              </Material>
              <Description>
                <xsl:value-of select="Description"/>
              </Description>
              <Value>
                <xsl:for-each select="$materials[Material=current()/Material]/Tank[not(.=preceding::Tank)]">
                  <xsl:if test="node()">
                    <xsl:value-of select="concat(.,'||')"/>
                  </xsl:if>
                </xsl:for-each>
              </Value>
            </Row>
          </xsl:if>
        </xsl:for-each>
      </Rowset>
    </Rowsets>
  </xsl:template>
</xsl:stylesheet>
My result now looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Rowsets>
  <Rowset>
    <Row>
      <Material>1000000079</Material>
      <Description>330 Bulk</Description>
      <Value>T123||</Value>
    </Row>
    <Row>
      <Material>1000000196</Material>
      <Description>340R Bulk</Description>
      <Value>T700||</Value>
    </Row>
    <Row>
      <Material>1000002754</Material>
      <Description>43 Bulk</Description>
      <Value>T517||</Value>
    </Row>
  </Rowset>
</Rowsets>
It removed the duplicate tank T123 for material 1000000079, but for material 1000002754 it also removed T515, which should appear in the Value field because its quantity is greater than 500 in the following item:
<item>
  <Material>1000002754</Material>
  <Description>43 Bulk</Description>
  <Tank>T515</Tank>
  <Batch>2013214839</Batch>
  <Quantity>700</Quantity>
</item>
What am I doing wrong here?
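
A likely cause: the preceding:: axis looks at every earlier Tank element in the whole document, not only at those inside $materials. The T515 item with Quantity 700 is therefore dropped because two earlier T515 items (Quantity 300 and 200) precede it, even though the Quantity filter had already excluded them. The intended logic seems to be "filter first, then de-duplicate tanks within each material". The sketch below expresses that order of operations in plain Java rather than XSLT, purely to make it explicit; the Item class and the tanksByMaterial and sample methods are hypothetical names, and the data is copied from the XML above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TankGrouping {
    // Minimal stand-in for one <item> element (hypothetical class).
    public static final class Item {
        final String material;
        final String tank;
        final int quantity;

        public Item(String material, String tank, int quantity) {
            this.material = material;
            this.tank = tank;
            this.quantity = quantity;
        }
    }

    // Returns material -> "Tank||Tank||..." with each tank listed once per material.
    public static Map<String, String> tanksByMaterial(List<Item> items) {
        Map<String, Set<String>> tanks = new LinkedHashMap<>();
        for (Item item : items) {
            // Step 1: apply the same filter as the $materials variable.
            boolean keep = !item.tank.equals("RECV")
                    && !item.tank.equals("PARK")
                    && item.quantity > 500;
            if (!keep) {
                continue;
            }
            // Step 2: de-duplicate only among items that passed the filter.
            tanks.computeIfAbsent(item.material, k -> new LinkedHashSet<>()).add(item.tank);
        }
        Map<String, String> values = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : tanks.entrySet()) {
            StringBuilder sb = new StringBuilder();
            for (String tank : e.getValue()) {
                sb.append(tank).append("||");
            }
            values.put(e.getKey(), sb.toString());
        }
        return values;
    }

    // The seven items from the sample XML (Material, Tank, Quantity only).
    public static List<Item> sample() {
        return Arrays.asList(
            new Item("1000000079", "T123", 510),
            new Item("1000000079", "T123", 520),
            new Item("1000000196", "T700", 510),
            new Item("1000002754", "T515", 300),
            new Item("1000002754", "T515", 200),
            new Item("1000002754", "T515", 700),
            new Item("1000002754", "T517", 890));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : tanksByMaterial(sample()).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In XSLT terms this suggests comparing each Tank only against Tanks that belong to $materials for the same Material, rather than against everything on the preceding axis.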

And if columns are NULLable:
with sample_table as (
                      select '111' col1,'AAA' col2 from dual union all
                      select 'AAA','111' from dual union all
                      select '222','BBB' from dual union all
                      select '333','CCC' from dual union all
                      select '444',to_char(null) from dual union all
                      select to_char(null),'444' from dual union all
                      select to_char(null),to_char(null) from dual union all
                      select to_char(null),to_char(null) from dual
                     )
select least(col1,col2) col1,
        case
          when col1 is null then col2
          when col2 is null then col1
          else greatest(col1,col2)
        end
from sample_table s
group by least(col1,col2),
           case
             when col1 is null then col2
             when col2 is null then col1
             else greatest(col1,col2)
           end
order by least(s.col1,s.col2),
           case
             when s.col1 is null then s.col2
             when s.col2 is null then s.col1
             else greatest(s.col1,s.col2)
           end
COL CAS
111 AAA
222 BBB
333 CCC
    444
SQL> SY.

First attempt to remove duplicate rows from a table...

I have seen many people asking for a way to remove duplicate rows from data, so I made up a fairly simple script. It adds a column to the table that contains the selected cell, and fills each cell of that new column with the concatenation of the data to its left. Then it reads that column into a list and walks through the list to find any values that are listed twice. Any that are, it marks for DELETE.
It then walks through to find each one marked for delete and removes them (you must go from bottom to top to do this, otherwise your row markings for delete don't match up to the original rows anymore). Last is to delete the column we added.
tell application "Numbers"
activate
tell document 1
-- DETERMINE THE CURRENT SHEET
set currentsheetindex to 0
repeat with i from 1 to the count of sheets
tell sheet i
set x to the count of (tables whose selection range is not missing value)
end tell
if x is not 0 then
set the currentsheetindex to i
exit repeat
end if
end repeat
if the currentsheetindex is 0 then error "No sheet has a selected table."
-- GET THE TABLE WITH CELLS
tell sheet currentsheetindex
set the current_table to the first table whose selection range is not missing value
end tell
end tell
log current_table
tell current_table
set list1 to {}
add column after column (count of columns)
set z to (count of columns)
repeat with j from 1 to (count of rows)
set m to ""
repeat with i from 1 to (z - 1)
set m to m & value of (cell i of row j)
end repeat
set value of cell z of row j to m
end repeat
set MyRange to value of every cell of column z
repeat with i from 1 to (count of items of MyRange)
set n to item i of MyRange
if n is in list1 then
set end of list1 to "Delete"
else
set end of list1 to n
end if
end repeat
repeat with i from (count of items of list1) to 1 by -1
set n to item i of list1
if n = "Delete" then remove row i
end repeat
remove column z
end tell
end tell
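
For readers who want the same idea outside Numbers, here is a minimal Java sketch of the approach described above: build one key per row from its cells, mark rows whose key has already been seen, then delete the marked rows from the bottom up so that earlier indices stay valid. It is not a translation of the AppleScript. The names RowDedup and removeDuplicateRows are invented, and the unit-separator character used to join cells is an added assumption so that rows such as ("ab", "c") and ("a", "bc") are not treated as equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RowDedup {
    // Removes later duplicates in place, keeping the first occurrence of each row.
    // The list passed in must be mutable.
    public static void removeDuplicateRows(List<List<String>> rows) {
        Set<String> seen = new HashSet<>();
        boolean[] delete = new boolean[rows.size()];

        // Pass 1 (top to bottom): mark every row whose key was already seen.
        for (int i = 0; i < rows.size(); i++) {
            String key = String.join("\u001F", rows.get(i)); // separator is an assumption
            delete[i] = !seen.add(key); // add() returns false for a repeated key
        }

        // Pass 2 (bottom to top): removing from the end keeps the remaining
        // marks aligned with the rows they refer to.
        for (int i = rows.size() - 1; i >= 0; i--) {
            if (delete[i]) {
                rows.remove(i);
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("a", "1"));
        rows.add(Arrays.asList("b", "2"));
        rows.add(Arrays.asList("a", "1")); // duplicate of the first row
        removeDuplicateRows(rows);
        System.out.println(rows);
    }
}
```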
Let me know how it works for y'all, it worked good on my machine, but I know localization is causing errors sometimes when I post things.
Thanks,
Jason

Hi jason
I hope that it will be clear with the added comments.
Ask if anything is still unclear.
set {current_Range, current_table, current_Sheet, current_Doc} to my getSelection()
tell application "Numbers09"
tell document current_Doc to tell sheet current_Sheet to tell table current_table
set list1 to {}
add column after column (count of columns)
set z to (count of columns)
repeat with j from 1 to (count of rows)
set m to ""
tell row j
repeat with i from 1 to (z - 1)
set m to m & value of cell i
end repeat
set value of cell z to m
end tell
end repeat
set theRange to value of every cell of column z
repeat with i from (count of items of theRange) to 1 by -1
(* As I scan the table backwards (starting from the bottom row),
I may remove a row immediately when I discover that it is a duplicate *)
set n to item i of theRange
if n is in list1 then
remove row i
else
set end of list1 to n
end if
end repeat
remove column z
end tell
end tell
--=====
on getSelection()
local _, theRange, theTable, theSheet, theDoc, errMsg, errNum
tell application "Numbers09" to tell document 1
set theSheet to ""
repeat with i from 1 to the count of sheets
tell sheet i
set x to the count of tables
if x > 0 then
repeat with y from 1 to x
(* Open a trap to catch the selection range.
The structure of this item
«class
can't be coerced as text.
So, when the instruction (selection range of table y) as text
receives 'missing value', it behaves correctly and the loop continues.
But when it receives THE true selection range, it generates an error
whose message is errMsg and number is errNum.
We grab them just after the on error instruction *)
try
(selection range of table y) as text
on error errMsg number errNum (*
As we reached THE selection range, we are here.
We grab the errMsg here. In French it looks like:
"Impossible de transformer «class
The handler cuts it into pieces using quotes as delimiters.
item 1 (_) "Impossible de transformer «class » "
item 2 (theRange) "A2:M25"
item 3 (_) " of «class NmTb» "
item 4 (theTable) "Tableau 1"
item 5 (_) " of «class NmSh» "
item 6 (theSheet) "Feuille 1"
item 7 (_) " of document "
item 8 (theDoc) "Sans titre"
item 9 ( I drop it ) " of application "
item 10 ( I drop it ) "Numbers"
item 11 (I drop it ) " en type string."
I grab these items in the list
{_, theRange, _, theTable, _, theSheet, _, theDoc}
Yes, underscore is a valid variable name.
I often use it when I want to drop something.
An alternate way would be to code:
set ll to my decoupe(errMsg, quote)
set theRange to item 2 of ll
set theTable to item 4 of ll
set theSheet to item 6 of ll
set theDoc to item 8 of ll
It works exactly the same but it's not so elegant. *)
set {_, theRange, _, theTable, _, theSheet, _, theDoc} to my decoupe(errMsg, quote)
exit repeat (*
As we grabbed the interesting data, we exit the loop indexed by y. *)
end try
end repeat -- y
if theSheet > "" then exit repeat (*
If we are here after grabbing the data, theSheet is not "" so we exit the loop indexed by i *)
end if
end tell -- sheet
end repeat -- i
(* We may arrive here with two kinds of results.
If we grabbed a selection, theSheet is something like "Feuille 1".
If we didn't grab a selection, theSheet is the "" defined on entry
and we generate an error which is not trapped, so it stops the program *)
if theSheet = "" then error "No sheet has a selected table."
end tell -- document
(* Now, we send the interesting data to the caller:
theRange "A2:M25"
theTable "Tableau 1"
theSheet "Feuille 1"
theDoc "Sans titre" *)
return {theRange, theTable, theSheet, theDoc}
end getSelection
--=====
on decoupe(t, d)
local l
set AppleScript's text item delimiters to d (*
Cut the text t in pieces using d as delimiter *)
set l to text items of t
set AppleScript's text item delimiters to "" (*
Resets the delimiters to the standard value. *)
(* Send the list to the caller *)
return l
end decoupe
--=====
Have fun
And if it's not clear enough, you may ask for more explanations.
Yvan KOENIG (from FRANCE, Tuesday 27 January 2009 21:49:19)

Removing duplicates finally fixed!

Why doesn't Apple incorporate the ability to remove duplicates automatically? There is no way I am going to hand-select over 300 songs to remove the duplicates. Make a feature (a code-based query) that will remove all duplicates, minus one.

I'm sure that's possible (and there are scripted solutions that will do exactly what you want). The bigger issue, though, is what do you mean by a "duplicate"? iTunes currently has two functions:
View > Show Duplicate Items will show all cases where a song exists in your library with the same Artist and Name fields - therefore you may get different edits, mixes, live vs. studio versions, of the same song. I understand that some people may want to eliminate at least some of these, but how would a fully-automated version know which songs you want to keep and which to get rid of in this example?
In extreme cases, you could have many songs that would be shown in the results of the View > Show Duplicate Items query yet this would represent an absolutely correct library with nothing that the user (me, in this case) would regard as a "duplicate":
The highlight tracks here show a case where the same "song" (Artist and Name matched) can occur within the same album - quite correctly.
SHIFT View > Show Exact Duplicate Items is much more restrictive, in that as well as Artist and Name it will only show songs that also have identical values for Album, Disc Number and Track Number. Unlike the first case, where duplicates may be entirely valid, anything shown by this second function is likely to be an error. This is an area in which iTunes could maybe offer an automated function, though there are still questions that would need to be resolved. For example, such a duplicate may be reported if:
There are two entries in the iTunes database that point to the same media file
There are two entries in the iTunes database that point to different media files - still the same song/recording but they could have different filenames or be in different locations
The biggest barrier to an automated de-duplication function within iTunes, though, is that unless it offered a host of user options there's a very significant risk that it would delete duplicates but not the ones that you want to remove. Where the de-duplication process also involves file deletion, it is not easy to support a robust Undo function. The other factor is that the second case of duplication (exactly the same song occurring more than once in your library) is almost always the result of some kind of user error, or user misunderstanding of how iTunes works and manages the content of its library. iTunes is complex enough (far too complex, in some people's opinion) without adding functionality that addresses the consequence of misuse.
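
The difference between the two menu commands comes down to which fields form the comparison key. As an illustration only (this is not how iTunes is implemented, and the Track class, the method names and the placeholder track data are all hypothetical), the Java sketch below groups tracks first by Artist and Name, and then by Artist, Name, Album, Disc Number and Track Number:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class DuplicateKeys {
    // Hypothetical record of the five fields mentioned in the post.
    public static final class Track {
        final String artist;
        final String name;
        final String album;
        final int disc;
        final int trackNo;

        public Track(String artist, String name, String album, int disc, int trackNo) {
            this.artist = artist;
            this.name = name;
            this.album = album;
            this.disc = disc;
            this.trackNo = trackNo;
        }
    }

    // Looser key: Artist and Name only.
    public static String looseKey(Track t) {
        return t.artist + "\u001F" + t.name;
    }

    // Stricter key: Artist, Name, Album, Disc Number and Track Number.
    public static String exactKey(Track t) {
        return looseKey(t) + "\u001F" + t.album + "\u001F" + t.disc + "\u001F" + t.trackNo;
    }

    // Returns only the groups that contain more than one track for the given key.
    public static List<List<Track>> duplicateGroups(List<Track> tracks, Function<Track, String> key) {
        Map<String, List<Track>> groups = new LinkedHashMap<>();
        for (Track t : tracks) {
            groups.computeIfAbsent(key.apply(t), k -> new ArrayList<>()).add(t);
        }
        List<List<Track>> result = new ArrayList<>();
        for (List<Track> group : groups.values()) {
            if (group.size() > 1) {
                result.add(group);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder data: a studio version, a live version, and a true repeat.
        List<Track> tracks = Arrays.asList(
            new Track("Artist", "Song", "Studio Album", 1, 1),
            new Track("Artist", "Song", "Live Album", 1, 5),
            new Track("Artist", "Song", "Studio Album", 1, 1));
        System.out.println("Loose groups: " + duplicateGroups(tracks, DuplicateKeys::looseKey).size());
        System.out.println("Exact groups: " + duplicateGroups(tracks, DuplicateKeys::exactKey).size());
    }
}
```

With the looser key the live version is grouped with the studio versions, which is the situation described above where an automated delete could remove a track the user wants to keep.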

Best app to remove duplicates from iTunes 2014

Hi All,
I've been trying to research the best application to sort and remove duplicates from my iTunes library. I have over 7000 songs, and iTunes' built-in duplicate finder doesn't look at the track fingerprint, which would be useful for those songs which are labelled "Track_1" etc.
Has anyone reviewed any recent products? I was looking at TuneUp, but after reading so many negative comments, I've decided not to go down that path. I would prefer a program that did most of the work for me, due to the number of songs. Happy to pay for a good product...
I do have MusicBrainz Picard, which has done a great job of tagging, but it doesn't remove duplicates.
Thanks in advance :-)

TuneUp is a great app. It was when they moved from version 2 to version 3 that it went to crap and all heck broke loose. They shut their doors, but they have since reopened and gone back to developing version 2. I use that version and I am pretty happy with it as an overall cleanup utility. I also use MusicBrainz and a couple of other utilities, but in the end, if you have an enormous library (20k plus), you are going to have a few slip through. I would probably go with TuneUp if I were you, plus a thorough third-party duplicate finder. dupeGuru's Music Edition seems to do a pretty good job.

Removing duplicate values from selectOneChoice bound to List Iterator

I'm trying to remove duplicate values from a selectOneChoice that I have. The component binds back to a List Iterator on the pageDefinition.
I have a table on a JSF page with 5 columns; the table is bound to a method iterator on the pageDef. Above the table, there are 5 separate selectOneChoice components, each of which is bound to the result set of the table's iterator. This means that each selectOneChoice only contains values corresponding to the table column which it represents.
The selectOneChoice components are part of a search facility and allow the user to select values from them and restrict the results that are returned. The concept is fine and it works. However, if I have repeating values in the selectOneChoice (which is inevitable given it's bound to the table column result set), then I need to remove them. I can remove null values or empty strings using expression language in the rendered attribute as shown:
<af:forEach var="item"
items="#{bindings.XXXX.items}">
<af:selectItem label="#{item.label}" value="#{item.label}"
rendered="#{item.label != ''}"/>
</af:forEach>
But I don't know how I can remove duplicate values easily. I know I can do it programmatically in a backing bean etc., but I want to know if there is perhaps some EL that might do it, or another setting that ADF provides which can overcome this.
Any help would be appreciated.
Kind Regards

Hi,
It will be a little difficult to remove duplicates while keeping the context as it is using the existing standard functions. Removing duplicates irrespective of context changes can be done with the available functions. Please try this UDF code, which may help you...
source>sort>UDF-->Target
execution type of UDF is Allvalues of a context.
public void UDF(String[] var1, ResultList result, Container container) throws StreamTransformationException{
    // Values already written to the result
    ArrayList aList = new ArrayList();
    aList.add(var1[0]);
    result.addValue(var1[0]);
    for(int i=1; i<var1.length; i++){
        // Skip any value that has already been added
        if(aList.contains(var1[i]))
            continue;
        else{
            aList.add(var1[i]);
            result.addValue(var1[i]);
        }
    }
}
Regards,
Priyanka

How to remove duplicate items ?

OK, so I've moved my iTunes library from the NAS drive I bought (after finding out that wouldn't work) onto my new LaCie external drive, but I've found some of my albums have triple copies?
I know how to show duplicates, but I'm not sure how to safely remove duplicates without deleting all copies?

Oh boy.
I'm sure there are better ways to do it than this and it will take time, but to avoid all possible loss of data, what I would do is first consolidate all of your libraries.
• Open the iTunes Library you think is most correct by holding down the OPTION key when you open the iTunes application, which should bring up a dialogue like this: http://cl.ly/image/2i2Q3o0Z0Y3C
• Then go to preferences and make sure it looks like this: http://cl.ly/image/2C2Z0u0C3T3c
I would recommend keeping your main iTunes library on your main hard drive (for me that's my internal), unless you definitely can't fit it.
• Now go to the File menu >Library > Import Playlist
• Navigate to another of the libraries, and click on the iTunes Library.xml file and import it. Do this for each of your libraries, except the one you are currently using.
• Now that you've got everything imported into the one library, the fun part starts.
Do what I said in my previous post and remove all the duplicates.
• Once that's done, check all the other libraries to make sure you haven't missed anything, and send them on their way to the Trash, and empty it to reclaim all that space.
I really hope you get this sorted, I went through an ordeal like this myself recently, so it's going to take time, but it feels good when it's all cleaned up and finished!
xeni
PS. After writing all this I thought, hmm, why didn't I just Google it instead of figuring it out myself? ;P
I found this, and it might help if my instructions weren't clear enough. https://bitly.com/LpqFPq
Also, if you don't already, I urge you to use Time Machine backup. Read more here: http://www.apple.com/osx/apps/#timemachine and http://pondini.org/TM/FAQ.html

How to remove duplicate POP messages in Mail under Mavericks

Hello
I'm having some significant problems with Apple Mail, and despite having researched widely, I haven't been able to find a solution yet. Here is my tale of woe...
My set up is that I use Mail with 5 separate accounts, downloading all messages using POP. I keep ALL my mail (even the trash) so that I have a permanent record of all conversations, going back more than 10 years - this is about 250,000 emails in around 250 subfolders. I'm running the latest version of Mavericks on a 2010 MBP. I have no intention of moving to IMAP - I need to keep a local archive of all my mail.
A few months ago I was having problems with Mail being very slow to load and search emails. I decided to rebuild the mailboxes, but afterwards discovered that thousands of messages in the inbox of one of my accounts (and some of the sub-folders) had disappeared. Once I realised I tried to go back to a recent Time Machine back up, but, for whatever reason, that part of the back up hadn't worked properly and I was unable to recover the previous status. I left it while I decided what to do, and had time to deal with it...
The last 2 years worth of emails on that account were still archived on gmail, so last week I decided to try redownloading the whole lot (something like 72,000 messages), and then planned to use one of the scripts I was aware of to remove the duplicates to the trash, and then remove any duplicates in the trash forever. That ought to mean I at least had an archive of the last two years of missing messages, even if I had no record of which ones had been flagged or replied to, etc.
The downloading took about 24 hours to complete, and I now have almost 70,000 messages in the inbox of that account. I also rebuilt all the folders again, which doesn't seem to have resulted in any further loss of messages as far as I can tell. However, I hadn't figured on two things:
1. Mail 'hides' duplicate messages, meaning that all of the messages I've downloaded are showing as unread, and I can't differentiate the ones I already had in my Inbox, and the new downloads. When I click on those apparently unread messages I can see if they have been replied to or forwarded, etc, but there's no obvious way for me to remove the ones I have already dealt with to the trash, without going through them all manually, which is clearly impossible.
2. The scripts I'd found for removing duplicates don't work. Andreas Amann's Remove Duplicates script doesn't work under Mavericks, and he has abandoned the project. I've also tried the remove-duplicate-messages.scpt made by Jolly Roger (http://jollyroger.kicks-***.org/software/), and while it *sometimes* works on individual subfolders on my Mac (but as far as I can tell removes the duplicate in that folder, rather than the newly downloaded version), mostly it doesn't work at all - it creates a 'Remove Duplicate Messages' folder on my desktop, a log inside it and a folder for removed messages, but nothing appears in the duplicates folder.
So, I'm left in a position where I have 70,000 apparently unread messages in my Inbox, a massively bloated Mail library (which has pretty much doubled in size, because of the 'hidden' messages), a slow and unresponsive Mail program. I've come to the conclusion that there must be some corrupted email somewhere, which probably caused the original email haemorrhage, and may still be causing the inability to remove duplicates. Mail is so slow as to be almost unuseable.
I figure I have a number of options:
1. I could live with the situation and just archive most of the Mail in my inbox, with the side effect that there will be a bunch of messages I have never replied to that are missed.
2. I could abandon the last week's efforts and revert to the version of Mail I was using a week ago, and then redownload the recent emails from my various accounts. That would still leave me without those thousands of emails I lost on my local machine.
3. I could find another way to deal with this. Can I get the remove duplicates script working? Should I revert to the version of Mail from a week ago, download ALL the messages again, but do it in a way that allows me to find the duplicates and remove them? Should I move to another email program altogether (which would presumably be massively disruptive to my work!)
Anyway, apologies for the essay length of this request, and thanks in advance for any help you may be able to offer!

hi Eric
thanks for the tip, but I don't think that solves my problem, which is that Mail apparently hides the duplicates, so they're still there, you just can't differentiate or remove them. I actually *need* the duplicates there so I know I've got everything, but then I need to be able to move the ones I know I've dealt with from my inbox to an archive/trash, and/or remove them completely so Mail isn't totally bloated

I have more than one photo backup on my computer now itunes has downloaded three of each photo. How do I remove duplicates

I have more than one photo backup on my computer now itunes has downloaded three of each photo. How do I remove duplicates?

Which version of iPhoto do you have? You need iPhoto '09 or '11 to order prints.
Make an album of all photos you want to order in the same order.
Then select all photos at once, when you use Share > order prints.
You should see a panel like this, where you can mark the quantities. The screenshot is from iPhoto 9.6:

HT2905 how do you remove duplicate libraries from itunes

I recently purchased a new window9.1 computer. I tried to move the library to the new PC by using an external hard drive. The songs and other items all transferred, but the music remained on the external media information files, so I had to connect the drive to listen to music. I worked with tech support and copied the music files to iTunes. When I reopened iTunes, both the media files and the copied music were on the system. iTunes recognized there was missing information in the media data file and corrected it, so now I have 20,000 songs in a 10,000-song library. How do you delete massive duplicates without individually deleting 10,000 songs?

Apple's official advice is here... HT2905 - How to find and remove duplicate items in your iTunes library. It is a manual process and the article fails to explain some of the potential pitfalls.
Use Shift > View > Show Exact Duplicate Items to display duplicates as this is normally a more useful selection. You need to manually select all but one of each group to remove. Sorting the list by Date Added may make it easier to select the appropriate tracks, however this works best when performed immediately after the dupes have been created. If you have multiple entries in iTunes connected to the same file on the hard drive then don't send to the recycle bin.
Use my DeDuper script if you're not sure, don't want to do it by hand, or want to preserve ratings, play counts and playlist membership. See this thread for background and please take note of the warning to backup your library before deduping.
(If you don't see the menu bar press ALT to show it temporarily or CTRL+B to keep it displayed)
tt2

How to remove duplicate songs from Itunes

I've consolidated my music in iTunes 11.01. Playlists are just getting crazy with duplicates.
I want to eliminate duplicate music files, as well as eliminate the references in the interface.
Then sync my new IPad.
When I manually erase a duplicate in Windows Explorer, I'm left with a "!" indication in Itunes that the file cannot be found... duh.
I want to find either an easier way to eliminate duplicates without Windows Explorer, or
a way to select all the "!" references in Itunes interface and delete.
Help please from a completely organized person who hates Apple's "intelligence" at thinking it knows what I want, and not providing
adequate help under "remove duplicates" in its help index.
Thanks.

How to find and remove duplicate items in your iTunes library - http://support.apple.com/kb/HT2905
Posts by turingtest2 about different types of duplicates and techniques - https://discussions.apple.com/thread/3555601 and https://discussions.apple.com/message/16042406 (Note: The DeDuper script is for Windows).
May 2014 post on iCloud duplicates - https://discussions.apple.com/message/25867873
Show exact duplicates (Mac and Windows) - https://discussions.apple.com/message/16951281
http://dougscripts.com/itunes/itinfo/dupin.php (commercial)
iTunes Duplicate Song Manager - http://sourceforge.net/projects/itunesdsm/
http://www.hardcoded.net/dupeguru_me/
http://www.wideanglesoftware.com/tunesweeper/index.php
http://www.araxis.com/find-duplicate-files (commercial, free trial) - finds duplicate files on computer, not specifically iTunes
I would do it manually. As you can see from turingtest2's post, some duplicates are not really duplicates.
I would also review how you add media to iTunes. It should not be producing all that many duplicates.

How to remove duplicates in iphoto 7.1.5 and aperture 2.1.4 on same hard drive

How to remove duplicates from iPhoto 7.1.5 and Aperture 2.1.4 on same hard drive?

For iPhoto, Duplicate Annihilator is a good solution.
For Aperture, it is best to ask in the Aperture forum.
LN

How do I find and remove duplicates on the 11.2 itunes?

How do I find and remove duplicates on the 11.2 itunes?

Duplicate Annihilator
But often duplicates are a symptom and you really need to address the cause. More information is needed to help out with that.
LN

How to remove duplicate messages from Mail in Mavericks (POP)

Hello
I'm having some significant problems with Apple Mail, and despite having researched widely, I haven't been able to find a solution yet. Here is my tale of woe...
My set up is that I use Mail with 5 separate accounts, downloading all messages using POP. I keep all my mail so that I have a permanent record of all conversations, going back more than 10 years - this is about 250,000 emails. I'm running the latest version of Mavericks on a 2010 MBP. I have no intention of moving to IMAP - I need to keep a local archive of all my mail.
A few months ago I was having problems with Mail being very slow to load and search emails. I decided to rebuild the mailboxes, but afterwards discovered that thousands of messages in the inbox of one of my accounts (and some of the sub-folders) had disappeared. Once I realised, I tried to go back to a recent Time Machine backup, but, for whatever reason, that part of the backup hadn't worked properly and I was unable to recover the previous status. I left it while I decided what to do, and until I had time to deal with it.
The last 2 years' worth of emails on that account were still archived on Gmail, so last week I decided to try redownloading the whole lot (something like 72,000 messages), and then planned to use one of the scripts I was aware of to move the duplicates to the trash, and then remove any duplicates in the trash forever. That ought to mean I at least had an archive of the last two years of missing messages, even if I had no record of which ones had been flagged or replied to, etc.
The downloading took about 24 hours to complete, and I now have almost 70,000 messages in the inbox of that account. However, I hadn't figured on two things:
1. Mail 'hides' duplicate messages, meaning that all of the messages I've downloaded are showing as unread, and I can't differentiate the ones I already had in my Inbox, and the new downloads. When I click on those apparently unread messages I can see if they have been replied to or forwarded, etc, but there's no obvious way for me to remove the ones I have already dealt with to the trash, without going through them all manually, which is clearly impossible.
2. The scripts I'd found for removing duplicates don't work. Andreas Amann's Remove Duplicates script doesn't work under Mavericks, and he has abandoned the project. I've also tried the remove-duplicate-messages.scpt made by Jolly Roger (http://jollyroger.kicks-***.org/software/), and while it *sometimes* works on individual subfolders on my Mac (but as far as I can tell removes the duplicate in that folder, rather than the newly downloaded version), mostly it doesn't work at all - it creates a 'Remove Duplicate Messages' folder on my desktop, a log inside it and a folder for removed messages, but nothing appears in the duplicates folder.
So, I'm left in a position where I have 70,000 apparently unread messages in my Inbox, a massively bloated Mail library (which has pretty much doubled in size), a slow and unresponsive Mail program. I figure I have a number of options. Either I could abandon the last week's efforts and revert to the version of

Apologies, this was posted here in error. Please find updated post here: https://discussions.apple.com/message/27261965#27261965

Search for records in the event viewer after the last run (not the entire event log), remove duplicate - Output Logon type for a specific OU users

Hi,
The following code works perfectly for me and gives me a list of users for a specific OU and their respective logon types:
$logFile = 'c:\test\test.txt'
$_myOU = "OU=ABC,dc=contosso,DC=com"
# LogonType as per technet
$_logontype = @{
    2 = "Interactive"
    3 = "Network"
    4 = "Batch"
    5 = "Service"
    7 = "Unlock"
    8 = "NetworkCleartext"
    9 = "NewCredentials"
    10 = "RemoteInteractive"
    11 = "CachedInteractive"
}
# Logon events (4624), suppressing system and Window Manager logons
Get-WinEvent -FilterXml "<QueryList><Query Id=""0"" Path=""Security""><Select Path=""Security"">*[System[(EventID=4624)]]</Select><Suppress Path=""Security"">*[EventData[Data[@Name=""SubjectLogonId""]=""0x0"" or Data[@Name=""TargetDomainName""]=""NT AUTHORITY"" or Data[@Name=""TargetDomainName""]=""Window Manager""]]</Suppress></Query></QueryList>" -ComputerName "XYZ" | ForEach-Object {
    #TargetUserSid
    $_cur_OU = ([ADSI]"LDAP://<SID=$(($_.Properties[4]).Value.Value)>").distinguishedName
    If ( $_cur_OU -like "*$_myOU" ) {
        $_cur_OU
        #LogonType
        $_logontype[ [int] $_.Properties[8].Value ]
        #Time-created
        $_.TimeCreated
        #IpAddress
        $_.Properties[18].Value
    }
} >> $logFile
I am able to pipe the results to a file; however, I would like to convert them to CSV/HTML. When I try the ConvertTo-Html cmdlet, it converts certain values. Also,
a) I would like to remove duplicate entries within a single execution of the script.
b) Each time the script runs, it should search only the records logged since the last run, not the records it has already looked at.
PLEASE HELP !

If you just want to look for the new events since the last run, I suggest recording the EventRecordID of the last event you parsed and using it as a reference in your filter. For example:
<QueryList>
<Query Id="0" Path="Security">
<Select Path="Security">*[System[(EventID=4624 and EventRecordID>46452302)]]</Select>
<Suppress Path="Security">*[EventData[Data[@Name="SubjectLogonId"]="0x0" or Data[@Name="TargetDomainName"]="NT AUTHORITY" or Data[@Name="TargetDomainName"]="Window Manager"]]</Suppress>
</Query>
</QueryList>
This is the logic that Server Manager in Windows Server 2012 uses to save time, CPU and bandwidth. The problem is how to get that number and provide it to your next run. You can store it in a file and read it at the beginning. If it is not found, you can go through the entire event list.
Let's say you store it in a simple text file, ref.txt
1234
At the beginning just read it.
Try {
    # -ErrorAction Stop turns a missing file into a terminating error so that Catch runs
    $_intMyRef = [int] (Get-Content .\ref.txt -ErrorAction Stop)
}
Catch {
    Write-Host "The reference EventRecordID cannot be found." -ForegroundColor Red
    $_intMyRef = 0
}
This is a very lazy check. You can do proper parsing etc... This is the quick and dirty way. If I can read it and parse it as an integer, I use it. Otherwise, I just set it to 0, meaning I'll collect all the info.
Then include it in your filter. Your Get-WinEvent call becomes:
Get-WinEvent -FilterXml "<QueryList><Query Id=""0"" Path=""Security""><Select Path=""Security"">*[System[(EventID=4624 and EventRecordID>$_intMyRef)]]</Select><Suppress Path=""Security"">*[EventData[Data[@Name=""SubjectLogonId""]=""0x0"" or Data[@Name=""TargetDomainName""]=""NT AUTHORITY"" or Data[@Name=""TargetDomainName""]=""Window Manager""]]</Suppress></Query></QueryList>"
At the end of your script, store the last value you got in your ref.txt file. You can, for example, get that info in the loop. Because Get-WinEvent returns the newest events first by default, make sure the value you keep is the highest RecordId you saw rather than whichever one the loop happened to process last. Like:
$Result += $LogonRecord
$_intLastId = $Event.RecordId
And at the end:
Write-Output $_intLastId | Out-File .\ref.txt
The next time you run it, it scans only the delta. Note that I prefer this to a date filter in case the machine wasn't active for a long time, or in case of a time sync issue, which can sometimes mess up date-based filters.
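The read, filter and write-back cycle described above can be sketched independently of the PowerShell cmdlets. The Java below (the language of the main thread on this page) only illustrates the bookkeeping: the class RecordIdCheckpoint, its method names and the list of IDs standing in for events are all hypothetical; only the ref.txt file name comes from the post. It also assumes you want the highest ID kept, whatever order the events arrive in.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: remembers the highest event record ID handled so far.
final class RecordIdCheckpoint {

    // Returns the stored ID, or 0 when the file is missing or unparsable
    // (0 means "scan everything", as in the post).
    static long read(Path refFile) {
        try {
            return Long.parseLong(Files.readString(refFile).trim());
        } catch (IOException | NumberFormatException e) {
            return 0L;
        }
    }

    // Keeps only IDs newer than the checkpoint; the input order does not matter.
    static List<Long> newerThan(List<Long> recordIds, long checkpoint) {
        return recordIds.stream()
                .filter(id -> id > checkpoint)
                .collect(Collectors.toList());
    }

    // Stores the highest ID seen, never moving the checkpoint backwards.
    static long write(Path refFile, List<Long> processedIds, long previous) throws IOException {
        long highest = processedIds.stream()
                .mapToLong(Long::longValue)
                .max()
                .orElse(previous);
        highest = Math.max(highest, previous);
        Files.writeString(refFile, Long.toString(highest));
        return highest;
    }

    public static void main(String[] args) throws IOException {
        Path ref = Path.of("ref.txt");
        long checkpoint = read(ref);
        // Stand-in for the event log, listed newest first.
        List<Long> log = List.of(1240L, 1237L, 1234L, 1230L);
        List<Long> delta = newerThan(log, checkpoint);
        System.out.println("New records: " + delta);
        write(ref, delta, checkpoint);
    }
}
```

Using a 64-bit long for the ID is a deliberate choice in this sketch, so that a long-lived log does not overflow the stored value.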
If you want to go for date filtering, do it at the Get-WinEvent level, not in Where-Object. If the query is local, it doesn't change much. But against a remote system, the filter runs on the remote side, so you save time and resources on your side. For example, for the last 30 days (2592000000 milliseconds), and if you want to use the -FilterXml parameter, you can use:
<QueryList>
<Query Id="0" Path="Security">
<Select Path="Security">*[System[TimeCreated[timediff(@SystemTime) &lt;= 2592000000]]]</Select>
</Query>
</QueryList>
Then you can combine it, etc...
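The constant 2592000000 in that filter is 30 days expressed in milliseconds (30 x 24 x 60 x 60 x 1000). Here is a small sketch of deriving it for any window, again in Java for illustration only; TimeWindowFilter and its method names are hypothetical, and the XPath text follows the query above, with the "<" escaped because the expression sits inside XML.

```java
import java.time.Duration;

// Hypothetical helper that builds the TimeCreated XPath for a window in days.
final class TimeWindowFilter {

    // timediff() compares in milliseconds, so convert the window first.
    static long windowMillis(int days) {
        return Duration.ofDays(days).toMillis();
    }

    // The "<" is written as "&lt;" because the expression is embedded in XML.
    static String lastDays(int days) {
        return "*[System[TimeCreated[timediff(@SystemTime) &lt;= "
                + windowMillis(days) + "]]]";
    }

    public static void main(String[] args) {
        System.out.println(lastDays(30));
    }
}
```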
PS, I used the confusing underscores because I like it ;)
Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
