How does SharePoint determine files are duplicates in search results?

In the search results, some files are grouped as duplicates (a "View duplicates" hyperlink appears under the search result).
How does SharePoint determine that two files are duplicates?
How does SharePoint determine which one is shown in the search result (the 'main' file)?
Can we influence both?
Patrik | My Blog

I don't know if this helps, but I've been looking into the same problem that's come to light a few times during troubleshooting customised deployments of SharePoint recently.  This is my understanding so far (paraphrased from http://blogs.technet.com/harikumh/archive/2008/11/14/some-interesting-facts-about-sharepoint-2007-search.aspx):
Document similarity or matching for the purposes of identifying duplicates is based only on a hash of the content of the document.  None of the file properties are used in calculating the hash (i.e. things like filename, author, and create and modify dates are not used).  The SQL table MSSDuplicateHashes in the SSP’s search database holds the 64-bit hashes for each indexed document that are needed to determine whether one document is a near-duplicate of another.  This table is read during a search to determine duplicates if removal of duplicates is enabled.
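To illustrate the idea of matching on content only, here is a minimal Python sketch. It is not SharePoint's algorithm: SharePoint computes near-duplicate hashes, while this sketch assumes an exact-match hash (SHA-256 truncated to 64 bits) as a stand-in. The URLs, metadata, and function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def content_hash(content: bytes) -> int:
    """Return a 64-bit integer derived from the document body only."""
    digest = hashlib.sha256(content).digest()
    return int.from_bytes(digest[:8], "big")


def group_duplicates(documents):
    """Group documents whose content hashes match.

    `documents` is an iterable of (url, metadata, content) tuples.
    The metadata (file name, author, dates) is deliberately ignored.
    """
    groups = defaultdict(list)
    for url, _metadata, content in documents:
        groups[content_hash(content)].append(url)
    # Only hashes shared by more than one document form a duplicate group.
    return [urls for urls in groups.values() if len(urls) > 1]


# Hypothetical documents: the first two differ in name and author only.
docs = [
    ("http://server/a/report.docx", {"author": "Ann"}, b"quarterly figures"),
    ("http://server/b/copy of report.docx", {"author": "Bob"}, b"quarterly figures"),
    ("http://server/c/other.docx", {"author": "Ann"}, b"something else"),
]
duplicate_groups = group_duplicates(docs)
```

The first two documents end up in one group even though their names and authors differ, which matches the behaviour described above.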
Steve

Similar Messages

  • How to set collapsespecification to avoid duplicates in Search Results

    Can someone help me set CollapseSpecification to avoid duplicates on the OOTB search results page?
    Suresh Kumar Udatha.

    Export the Content Search web part. This downloads a .webpart file, in which you will find the property below:
    <property name="DataProviderJSON" type="string">{"QueryGroupName":"Default","QueryPropertiesTemplateUrl":"sitesearch://webroot","IgnoreQueryPropertiesTemplateUrl":false,"SourceID":"8413cd39-2156-4e00-b54d-11efd9abdb89","SourceName":"Local SharePoint Results","SourceLevel":"Ssa","CollapseSpecification":"","QueryTemplate":"(contentclass:STS_ListItem OR IsDocument:True)","FallbackSort":[{"d":1,"p":"ViewsLifeTime"}],"FallbackSortJson":"[{\"d\":1,\"p\":\"ViewsLifeTime\"}]","RankRules":null,"RankRulesJson":"null","AsynchronousResultRetrieval":false,"SendContentBeforeQuery":true,"BatchClientQuery":true,"FallbackLanguage":-1,"FallbackRankingModelID":"","EnableStemming":true,"EnablePhonetic":false,"EnableNicknames":false,"EnableInterleaving":false,"EnableQueryRules":true,"EnableOrderingHitHighlightedProperty":false,"HitHighlightedMultivaluePropertyLimit":-1,"IgnoreContextualScope":true,"ScopeResultsToCurrentSite":false,"TrimDuplicates":false,"Properties":{"TryCache":true,"Scope":"{Site.URL}","ListId":"dd8de50c-d533-4667-9cad-79e4402d5435","ListItemId":1,"UpdateLinksForCatalogItems":true,"EnableStacking":true},"PropertiesJson":"{\"TryCache\":true,\"Scope\":\"{Site.URL}\",\"ListId\":\"dd8de50c-d533-4667-9cad-79e4402d5435\",\"ListItemId\":1,\"UpdateLinksForCatalogItems\":true,\"EnableStacking\":true}","ClientType":"ContentSearchRegular","UpdateAjaxNavigate":true,"SummaryLength":180,"DesiredSnippetLength":90,"PersonalizedQuery":false,"FallbackRefinementFilters":null,"IgnoreStaleServerQuery":false,"RenderTemplateId":"DefaultDataProvider","AlternateErrorMessage":null,"Title":""}</property>
    Update the CollapseSpecification value (currently an empty string), then upload the web part again.
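    If you would rather script the edit than change the JSON by hand, the following Python sketch shows one way to do it. It assumes the exported file is well-formed XML with a DataProviderJSON property like the one above. The function name is hypothetical, and the value "Title:1" is only a placeholder, so check the CollapseSpecification syntax in Microsoft's documentation before using it.

```python
import json
import xml.etree.ElementTree as ET


def set_collapse_specification(webpart_xml: str, spec: str) -> str:
    """Return the .webpart XML with CollapseSpecification set to `spec`."""
    root = ET.fromstring(webpart_xml)
    for element in root.iter():
        # Match on the local tag name so an XML namespace does not matter.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "property" and element.get("name") == "DataProviderJSON":
            settings = json.loads(element.text)
            settings["CollapseSpecification"] = spec
            element.text = json.dumps(settings)
    return ET.tostring(root, encoding="unicode")


# Cut-down stand-in for an exported .webpart file (assumption, not a real export).
sample = (
    '<webPart><data><properties>'
    '<property name="DataProviderJSON" type="string">'
    '{"CollapseSpecification":"","TrimDuplicates":false}'
    '</property></properties></data></webPart>'
)
updated = set_collapse_specification(sample, "Title:1")
```

    Note that ElementTree may rewrite namespace prefixes when it serialises a real export, so compare the result with the original file before uploading.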
    Hope that helps|Amr Fouad|MCTS,MCPD sharePoint 2010

  • How is the vendor determined if no info record is maintained for the material?

    How is the vendor determined if no info record is maintained for the material?

    Hi
    If you want the vendor to be determined automatically, the minimum requirement is an info record. Beyond that, you can of course maintain source lists and quota arrangements, but an info record is the bare minimum for automatic vendor determination.
    Otherwise, you have to maintain the vendor manually in the purchasing documents.
    Tcodes for Info record are ME11, ME12, and ME13.
    Hope this clarifies.
    Thanks

  • In InDesign, how does one determine the pixel size of a text box? Specifically, we need to write text to specifications of 600 pixel width, and have no idea a) how to scale a text box to specific pixel width, b) how to

    This may be a basic question... but in InDesign, how does one determine the pixel size of a text box? Specifically, we need to write text to specifications of 600 pixel width, and have no idea a) how to scale a text box to specific pixel width, b) how to determine what word count we can fit in, and c) how to do it in a table? Thanks!

    Set your ruler increments to pixels in Preferences > Units & Increments. You can fill the text box with placeholder text (Type > Fill with Placeholder Text) and get a word count from the Info panel with Show Options turned on from the flyout menu.
    From the Transform panel you can set a text box's width and height.

  • How does the backup file that I would like to use for restore now require a password

    How does the backup file that I would like to use for restore now require a password, when I just Backed it up on 10/30/14

    It requires a password to restore from it because at some point, you checked the box in iTunes to use encrypted backups, at which point you were prompted for, and set a password.
    If you now can't remember what that was, then you can not restore from that backup.

  • How are advance tax payments handled in SAP?

    How are advance tax payments handled in SAP?
    Is there any method to handle them through app?

    Andres,
    Thanks for your feedback - my question was probably a little vague (at best).
    The ASCII characters are hidden in the barcode - they do not form part of the numeric string when scanning into a normal char field.
    If I open notepad and scan I do not see the special character. If I open word and scan, I still do not see the special character but the font size changes where I know the special character exists - obviously it is a command to word.
    If I open CMD DOS prompt and scan there, I CAN see the special character.
    This led me to ask: how can I format an SAP field to behave the same way?
    I've subsequently found configuration settings in the scanning device itself which allow me to swap these hidden characters (called Function Codes) with actual characters of my choice. For now this is how I will have to do it. The only issue with this approach is that I have to make sure all scanners are set up properly in each country, which is a bit of a pain.
    Regards,
    Mark.

  • How does the SharePoint Designer "create item in list" action work?

    I'm using SharePoint 365 and the free SharePoint Designer.
    I'm trying to set things up so that when List A gets an update, other lists get new records too.
    There is a SharePoint Designer action that lets you create an item in another list.
    I have tested it a lot but haven't got any of my tests to work; I've been trying for a week now.
    I'm not sure exactly what it is called in English, since our SharePoint is all in Dutch :(
    I think in English it looks like:
    Create item in Yourlistname (save result in variable Succes)
    When you click on Yourlistname, you need to supply the values of all required fields in the list.
    Now assume such a workflow rule is set to start after an entry is made in List A, and we want to update List B.
    This is where I think my problems start.
    So you select List B as your list name, but then...
    How does one refer to a required link title from List A? Use the current list item?
    How do I link to a 4th table entry of List C as the value to create in List B?
    If List A has a value that is a clickable link, how do I link that to Value B?
    If someone knows of problems with this (permissions, for example), knows of a YouTube video explaining it, or can explain it themselves, it would be of great help to me.
    Upon further investigation, I ran into problems where the workflow has to fill in an item
    in a column whose type is a lookup (another list). With string columns it works, but I need the current "name" as a lookup link connected into List B.

    Hi PGT,
    I have seen a similar thread from you. The solution is to work with item IDs when you define the workflow for lookup columns. Only at creation time do you need to specify which field you would like to include in your list; based on the ID,
    it will then show the values. The thread is:
    https://social.msdn.microsoft.com/Forums/office/en-US/e342e18a-3f68-4b49-9d80-41f11b76cb4e/how-does-sharepoint-designer-create-list-item-work-?forum=sharepointcustomization
    I will mark this reply as answer to close this case. If you have any question about this issue, please feel free to reply.
    Best Regards,
    Wendy
    Forum Support

  • What exactly are Upcoming Songs & how does iTunes determine them?

    I've never really bothered with Party Shuffle before but it always bugged me when it gave the message about Upcoming Songs. I mean, what is an upcoming song? How does iTunes decide it's upcoming?
    Regards,
    spriter

    Upcoming Songs are tracks in Party Shuffle that have not yet been played. You can set how many upcoming songs are visible, but this setting does not determine how many upcoming songs there are.
    Party Shuffle is like selecting Shuffle for the Library or any playlist except that you can alter the results. You can add or delete tracks and you can manually reorder the upcoming tracks in Party Shuffle. You can also tell Party Shuffle to play higher rated songs more often.
    Regular shuffle stops once every track in the playlist has been played, but Party Shuffle will continue playing, adding repeat tracks when needed.

  • Static Files with same name. How does apex determine which files to serve?

    Hello, I'm using APEX 4.2.1.00.08 and I cannot figure out how APEX manages static files, and I cannot find any help in the docs (other than some high-level UI description).
    The application is serving some file, and I cannot easily find out which one it is.
    I have a workspace where there are several files that have the same name, and I cannot understand how APEX figures out which one to serve. I also don't understand what the value of associating a file with an application is.
    There are files associated with application 0, which don't appear to show up in the "shared components", but can be seen as
    SELECT *
    FROM wwv_flow_files
    WHERE flow_id = 0;
    and can apparently only be deleted using "SQL Commands" inside APEX.
    The URL called is something like
    wwv_flow_file_mgr.get_file?p_security_group_id=13498126233076320&p_fname=myfile.css
    so apparently the only parameters that matter are the workspace and the file name. The associated application is irrelevant.
    Apparently files linked to flow_id 0 take precedence over all the other files ...
    Thanks for clearing up a bit of "fog" on this issue.
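    To make the observation about the URL concrete, this small Python sketch splits the example URL from the post into its query parameters. It only inspects the URL; it does not show how APEX resolves the file on the server.

```python
from urllib.parse import parse_qs, urlsplit

# The URL quoted in the post (relative, so it has no scheme or host).
url = (
    "wwv_flow_file_mgr.get_file"
    "?p_security_group_id=13498126233076320&p_fname=myfile.css"
)

# parse_qs returns a list per parameter; each one appears once here.
params = {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}
```

    The request carries only a workspace identifier (p_security_group_id) and a file name (p_fname), with no application ID, which is consistent with the behaviour described in the post.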

    VC wrote:
    "Go to that application > shared components > Static files you should see the file."
    The problem is that there are multiple files with that name, and that sometimes the file being served is linked to application "0" and it doesn't show in the "shared components" interface ...
    I have a workspace where there are several files that have the same name, and I cannot understand how apex figures out which one to serve, and also don't understand what is value of associating a file with an application.
    "Static files can be uploaded to apex with optionally associating with an application within that workspace."
    "Files associated with an application are referenced using #APP_IMAGES#"
    "Files not associated with an application are referenced using #WORKSPACE_IMAGES#"
    I referenced with #WORKSPACE_IMAGES#, but now I see that if I use #APP_IMAGES# the URL generated will also contain the application ID. This would help to discriminate between different files with the same name but linked to different applications ...
    There are files associated with application 0, which don't appear to show up in the "shared components", but can be seen as
    "Why are you particularly interested in application 0?"
    Because somehow APEX puts the files that I uploaded there ... they can be seen selecting from wwv_flow_files. Do they take precedence over all other files with the same name?
    "Filename is unique for the given workspace[and application]"
    I disagree. I have multiple files with the same name ... the root of this problem ...
    so apparently the only parameters that matter are the workspace and the file name. The associated application is irrelevant.
    "How is your static file referenced??"
    "But not always, try associating an static file with and application and reference it using #APP_IMAGES# instead of #WORKSPACE_IMAGES#"
    "See"
    http://docs.oracle.com/cd/E37097_01/doc/doc.42/e35125/concept_sub.htm#BEIDCGAJ
    http://docs.oracle.com/cd/E37097_01/doc/doc.42/e35125/ui_file_manage.htm#HTMDB06011
    Thanks, but the documentation doesn't give much details ... apparently files referenced with #WORKSPACE_IMAGES# can still resolve to files linked to specific applications ... I would like to understand the actual workflow for the various cases (file references with APP_IMAGES, referenced with WORKSPACE_IMAGES, file associated with the application, with another application, with no application, with application "0" ...).
    Also, I find it somewhat misleading that you can have files associated with applications that don't exist anymore (e.g. have been deleted).
    Edited by: GChierico on Apr 11, 2013 2:16 PM

  • How does Sharepoint save documents?

    Hi,
    I want to know how SharePoint saves files from the point you hit the save button on, e.g., an Excel document.
    Does SharePoint save the file directly to its database in the cloud, or does it save the file locally on the PC first and then upload it?
    Thank You in advance.
    Regards Vukani Khumalo

    Hi Vukani,
    Files are uploaded from your local system to the SharePoint content database using the SharePoint API.
    If you are referring to Office Web Apps and working within the browser, then the file is not downloaded to your PC first.
    You can use the REST, CSOM, or server-side APIs to upload documents. These links explain how it works:
    http://blogs.msdn.com/b/uksharepoint/archive/2013/04/20/uploading-files-using-the-rest-api-and-client-side-techniques.aspx
    http://msdn.microsoft.com/en-us/library/office/dn292553(v=office.15).aspx
    http://msdn.microsoft.com/en-us/library/office/dn450841(v=office.15).aspx
    http://msdn.microsoft.com/en-us/magazine/dn198245.aspx
    There is also a new concept called shredded storage:
    http://blogs.technet.com/b/wbaer/archive/2012/11/12/introduction-to-shredded-storage-in-sharepoint-2013.aspx
    http://blogs.technet.com/b/wbaer/archive/2013/09/17/overview-of-shredded-storage-in-sharepoint-2013.aspx
    Hope this helps!
    Ram - SharePoint Architect
    Blog - SharePointDeveloper.in

  • How does AppLocker classify files?

    For example, when a user double-clicks 'actually-a-malicious-exe.txt' - does AppLocker classify by content actions ("Wait a minute, this is trying to launch a process"), or solely by file extension? I've seen SRP catch such deception, but I haven't found anything detailing exactly how AppLocker responds to this scenario.
    How does AppLocker evaluate child processes for applications that do NOT specify LOAD_IGNORE_CODE_AUTHZ_LEVEL or SANDBOX_INERT?

    Hi,
    Based on my research, AppLocker classifies files by file extension. As you mentioned, if a user opens 'actually-a-malicious-exe.txt', the Notepad process will be used to open this text file for the user to read, instead of running it as an exe file.
    If applications don’t specify LOAD_IGNORE_CODE_AUTHZ_LEVEL or SANDBOX_INERT, then their child processes will be examined against AppLocker rules to determine whether they are allowed to run.
    In addition, quoted from the article below: “AppLocker rules either allow or prevent an application from launching. AppLocker does not control the behavior of applications after they are launched. Applications could contain flags passed to functions that signal AppLocker to circumvent the rules and allow another .exe or .dll to be loaded. In practice, an application that is allowed by AppLocker could use these flags to bypass AppLocker rules and launch child processes. You must thoroughly examine each application before allowing them to run by using AppLocker rules.”
    Security Considerations for AppLocker
    http://technet.microsoft.com/en-us/library/ee844118(WS.10).aspx
    More information for you:
    LoadLibraryEx function
    http://msdn.microsoft.com/en-us/library/windows/desktop/ms684179(v=vs.85).aspx
    Understanding AppLocker Rule Behavior
    http://technet.microsoft.com/en-us/library/ee460942.aspx
    Best Regards,
    Amy Wang

  • How does the system determine the pricing procedure of a billing type?

    How does the system determine the pricing procedure of F2?
    By doc.pric.proc? But in the IMG it is empty!
    And I don't think it takes the pricing procedure of the referenced sales order,
    because if you use OR + TAN, the reference document of the billing document is the delivery order!

    Hi zhang
    In pricing procedure determination (OVKK), whatever DuPP you maintain is linked to the billing document.
    In VOV8 we can see the CuPP of the document, so if the DuPP is linked to the CuPP in OVKK, then the same pricing procedure will flow to the billing document as well. Apart from that, in VOV8 also make sure that you maintain the billing type in the billing data.
    Regards
    Srinath

  • How does Oracle determine a table as key-preserved or not?

    I tried joining employees and departments in the HR schema. Normally, departments is not key-preserved in the join operation. But I've arranged the view so that each department has exactly one employee, so that dept_no could become the key for the join. Still, it said "cannot modify non key-preserved table". Any hints? Does the join type (left, right, inner, or outer) affect how Oracle determines which tables are key-preserved and which are not? Thanks.

    Hi,
    You can achieve this in many ways... demo:
    Microsoft Windows [Version 6.1.7600]
    Copyright (c) 2009 Microsoft Corporation.  All rights reserved.
    C:\Users\Pavan>sqlplus scott/tiger@orcl
    SQL*Plus: Release 11.2.0.1.0 Production on Sun Dec 19 14:19:36 2010
    Copyright (c) 1982, 2010, Oracle.  All rights reserved.
    Connected to:
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    SQL> create table t_parent(
      2  code varchar2(10)
      3  ,description varchar2(50)
      4  )
      5  ;
    Table created.
    SQL> create table t_detail(
      2  the_code varchar2(10)
      3  ,the_date date
      4  );
    Table created.
    SQL> insert into t_parent values('a','first letter');
    1 row created.
    SQL> insert into t_detail values ('a',sysdate);
    1 row created.
    SQL> commit;
    Commit complete.
    SQL> select * from t_parent;
    CODE       DESCRIPTION
    a          first letter
    SQL> select * from t_detail;
    THE_CODE   THE_DATE
    a          19-DEC-10
    SQL> select * from t_parent join t_detail on the_code=code;
    CODE       DESCRIPTION                                        THE_CODE
    THE_DATE
    a          first letter                                       a
    19-DEC-10
    SQL> create or replace view test_v
      2  as
      3  select *
      4  from t_parent join t_detail on the_code=code;
    View created.
    SQL> select * from test_v;
    CODE       DESCRIPTION                                        THE_CODE
    THE_DATE
    a          first letter                                       a
    19-DEC-10
    SQL> update test_v set description='x';
    update test_v set description='x'
    ERROR at line 1:
    ORA-01779: cannot modify a column which maps to a non key-preserved table
    SQL> create or replace trigger trig1
      2   instead of update on test_v
      3      for each row
      4      begin
      5      if :old.description <> :new.description then
      6      update t_parent
      7      set description = :new.description
      8      where code = :old.code;
      9    end if;
     10   end;
     11  /
    Trigger created.
    SQL> update test_v set description='x';
    1 row updated.
    SQL> commit;
    Commit complete.
    SQL> select * from test_v;
    CODE       DESCRIPTION                                        THE_CODE
    THE_DATE
    a          x                                                  a
    19-DEC-10
    - Pavan Kumar N
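    The INSTEAD OF trigger technique from the demo can also be tried without an Oracle instance. The sketch below is an assumption-laden analogue using Python's built-in sqlite3 module: SQLite has no notion of key-preserved tables (its views are never directly updatable), so it only illustrates how the trigger redirects a view update to the base table. Table, view, and trigger names follow the demo; the second row and the WHERE clause restricting the update to the matching parent row are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t_parent (code TEXT, description TEXT);
    CREATE TABLE t_detail (the_code TEXT, the_date TEXT);
    INSERT INTO t_parent VALUES ('a', 'first letter');
    INSERT INTO t_parent VALUES ('b', 'second letter');
    INSERT INTO t_detail VALUES ('a', '2010-12-19');
    INSERT INTO t_detail VALUES ('b', '2010-12-19');

    CREATE VIEW test_v AS
        SELECT code, description, the_code, the_date
        FROM t_parent JOIN t_detail ON the_code = code;

    -- Redirect updates on the view to the base table, one row at a time.
    CREATE TRIGGER trig1 INSTEAD OF UPDATE ON test_v
    FOR EACH ROW
    WHEN OLD.description <> NEW.description
    BEGIN
        UPDATE t_parent
        SET description = NEW.description
        WHERE code = OLD.code;
    END;
    """
)

# Update one row through the view; the trigger applies it to t_parent.
conn.execute("UPDATE test_v SET description = 'x' WHERE code = 'a'")
rows = conn.execute(
    "SELECT code, description FROM t_parent ORDER BY code"
).fetchall()
```

    Restricting the base-table update to the row identified by OLD.code keeps the trigger from overwriting every parent row when only one view row is changed.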

  • How does TimeMachine determine which backup to link to?

    We have 4 Macs, each having a Time Machine backup file on an external Time Capsule.
    How does the Time Machine on each Mac know which file on that Time Capsule it links to?
    Is it done via the file name, meta-data in the backup file, some other method?
    Ideas?
    Thanks
    -Mike

    Before "you" start an instance you set env variable ORACLE_SID. This identifies an instance. When you go into sqlplus and issue STARTUP, Oracle starts the instance named by the sid. Thus the instance running on the server is controlled by you. This changes as noted below.
    If you were using a non-Oracle tool to start instances, such as Veritas, then you would see it start the instance you coded into the tool. It would not randomly pick an instance. It looks in the Veritas config file and sees that you always want instance 1 on this node and instance 2 on that node.
    That said, you can make Oracle more random or "grid" like. 10g RAC done Oracle's way likes to bounce around between primary and secondary nodes. To see which instances are running on a node you can "ps -ef | grep pmon". Alternatively, use sqlplus to look in the database: view gv$instance gives you each instance name paired with the name of the host it is currently running on. There is one line of output per instance currently running.
    -Mark

  • How does EWA determine...?

    Hello my friends!
    I would appreciate your help with the following question.
    How does the EWA report determine the following values? In other words, what does it check on the monitored system to get values for the following:
    - Avg. Availability per week
    - Avg. DB request time in dialog task
    - Avg. DB request time in update task
    Thanks in advance and best regards.

    Hi,
    - Avg. Availability per week
    I'm not sure, but I think this is from the availability file stored in work directory of the instance. It is not using CCMSPING or other CCMS functions, this is why availability in EWA is often not correct (e.g. if you have a network failure). Also not sure, but I think the KPI Report within solman uses CCMSPING for availability.
    - Avg. DB request time in dialog task
    From workload stats - same value as in transaction ST03N
    - Avg. DB request time in update task
    From workload stats - same value as in transaction ST03N
    Regards,
    Frank
