Indexing PDFs with legacy Indexing Service

I have a .NET application that uses the legacy Indexing Service on Windows Server 2003 SP2 (32-bit). What do I need to install to make it index PDFs? I've installed Adobe Reader XI (as I understood that the iFilter is now built in), restarted Indexing Service, and done a full re-sync, but it still doesn't find any PDFs (I also tried a reboot).
Thanks

It appears that the answer was to uninstall XI and install 9.5. I believe Adobe made changes to the iFilter from 10.x onwards so that it no longer supports an old method that is used by Indexing Service. This thread was useful: http://forums.adobe.com/message/5115337#5115337

Similar Messages

  • Exporting Large Pdfs with Link Indexes - not working

    I have a large PDF of the Early Church Fathers, 1080 pages with indexes to about 200 chapters. Acrobat will not export past the index pages: it gets about 40 to 50 pages in, then stops and saves the file. It will not export past 50 pages to Doc, HTML, or RTF.
    Am I doing something wrong? I need to export to HTML to convert to Palm Plucker output.

    Yes, both functions work the same: Export under File gives the same window as
    Save As...
    Thanks for the help. I would send the PDF, but it is 230K over the limit.

  • Convert dotx or docx to pdf with Word Automation Service failed

    Hello everybody,
    After searching the internet, I'm still looking for a solution to this issue.
    I wrote this code for a document conversion in a Visual Studio 2010 workflow:
    string wordAutomationServiceName = "Word Automation Service";
    ConversionJobSettings jobSettings = new ConversionJobSettings();
    jobSettings.OutputFormat = SaveFormat.PDF;
    ConversionJob job = new ConversionJob(wordAutomationServiceName, jobSettings);
    job.UserToken = workflowProperties.Site.UserToken;
    // Source URL first, then the destination URL with the extension swapped to .pdf
    job.AddFile(workflowProperties.WebUrl + "/" + file.Url,
        workflowProperties.WebUrl + "/" + file.Url.Replace(".docx", ".pdf"));
    job.Start();
    The URLs are correct and the Word document exists.
    The problem is that when the job is executed, I get errors in the SharePoint logs:
    11/18/2011 09:24:15.87     w3wp.exe (0x1BC4)                           0x1CA0    Word Automation Services     
        Office Viewing Architecture       9rte    Medium      Request received for document 00000001-0001-10e2-80af-d08c970b9892, format: , numberInQueue: 0, request id ba03fb58-55b2-4c6c-b1ca-20fad3b11585   
    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.87     w3wp.exe (0x1BC4)                           0x1CA0    Word Automation Services     
        Office Viewing Architecture       c7ld    Medium      AppManager.BeginProcessRequest adding request to queue    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.88     w3wp.exe (0x1BC4)                           0x1CA0    Word Automation Services     
        Timer Job                         g27p    Medium      Local Controller '71cf62b9-c34c-46c4-9828-55de2d5f5ac0':
    In Progress: <http://site/Contracts/docsettest/contracttest.dotx> downloaded and queued locally    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.88     w3wp.exe (0x1BC4)                           0x17C0    Word Automation Services     
        Configuration                     g6xc    Medium      Item 00000001-0001-10e2-80af-d08c970b9892: Assigned to
    local worker process: 1D64 (7524; worker id = cce33245-48b9-4b0d-afcd-e3218845d81a)    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.88     w3wp.exe (0x1BC4)                           0x1CA0    SharePoint Foundation        
        Monitoring                        b4ly    Medium      Leaving Monitored Scope (ExecuteWcfServerOperation).
    Execution Time=23.6994391735768    2fd2393d-f36d-49a1-bfdf-737aefc8659a
    11/18/2011 09:24:15.88     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       vipp    Medium      AppWorker:cce33245-48b9-4b0d-afcd-e3218845d81a initializing for request ba03fb58-55b2-4c6c-b1ca-20fad3b11585   
    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.88     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       vipr    Monitorable    AppWorker:cce33245-48b9-4b0d-afcd-e3218845d81a worker call failed System.ServiceModel.CommunicationObjectAbortedException: The
    communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it has been Aborted.    Server stack trace:      at System.ServiceModel.Channels.CommunicationObject.ThrowIfDisposedOrNotOpen()    
    at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)     at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage
    methodCall, ProxyOperationRuntime operation)     at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)    Exception rethrown at [0]:      at System.Runtime.Re...   
    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.88*    w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       vipr    Monitorable    ...moting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)     at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData&
    msgData, Int32 type)     at Microsoft.Office.Web.Conversion.Framework.Remoting.IAppChannelCallback.Initialize(WorkerRequest request, FileItem fileItem)     at Microsoft.Office.Web.Conversion.Framework.AppWorker.ProcessRequest(ConversionRequest
    request). Worker name WordAutomationServices, Document 00000001-0001-10e2-80af-d08c970b9892    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.88     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Service                           g281    Medium      Local Controller '71cf62b9-c34c-46c4-9828-55de2d5f5ac0':
    Failure: <http://site/Contracts/docsettest/contracttest.dotx> not uploaded to <http://site/Contracts/docsettest/contracttest.pdf> (65543)    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.90     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       c78j    Unexpected    AppWorker:cce33245-48b9-4b0d-afcd-e3218845d81a ProcessRequestDone() received error response WorkerException, restarting the worker   
    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.90     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       b1qa    Medium      Shutting down process with force processId: 7524 belonging to AppWorker cce33245-48b9-4b0d-afcd-e3218845d81a   
    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.91     w3wp.exe (0x1BC4)                           0x1CA0    Word Automation Services     
        Configuration                     g6xb    Medium      Local Controller '71cf62b9-c34c-46c4-9828-55de2d5f5ac0':
    Local worker process exited: 1D64 (7524); exit time = 11/18/2011 09:24:15     
    11/18/2011 09:24:15.91     w3wp.exe (0x1BC4)                           0x1CA0    Word Automation Services     
        Configuration                     d0md    Medium      App 'Word Automation Service': Deleting temp directory
    'C:\Windows\TEMP\wdsrv\21659d2e-c634-46a2-9585-b4cd1398f64c\odsibdmm.cmv\1D64'     
    11/18/2011 09:24:15.92     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       xpre    Medium      Removing worker cce33245-48b9-4b0d-afcd-e3218845d81a, thread: 216    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.92     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       f2yg    Medium      CreateSandBoxedProcessWorker() is called    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       b10e    Medium      Created desktop: Service-0x0-3eaf55d$\Microsoft Office Isolated Environment     00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       2brt    Medium      AppWorker:89d80fff-43ec-459e-9d95-5ed8b67f20bb worker process is started Exe: WordServerWorker.exe Args: /id 89d80fff-43ec-459e-9d95-5ed8b67f20bb
    /convertingService net.pipe://127.0.0.1/WordServer71cf62b9-c34c-46c4-9828-55de2d5f5ac0 /assembly WdsrvWorker.dll /type WACWS /IsBatchedTracing True /LogQuota 100 WorkerType: WorkerType1 Directory: c:\windows\system32\inetsrv, pid : 3700, IsSandBoxed: True,
    UniqueSandBoxSid: S-1-5-26473-19571-45394-48    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       vioz    Medium      RemoveWorker isRemoved: True session id : uuid:c9cce13b-5285-47d6-a666-29da19e57c67;id=47, Guid: cce33245-48b9-4b0d-afcd-e3218845d81a   
    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       b4em    Monitorable    AppWorker:cce33245-48b9-4b0d-afcd-e3218845d81a recycle worker process because the conversion failed with result WorkerException.
    Worker is WordAutomationServices    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       xpre    Medium      Removing worker cce33245-48b9-4b0d-afcd-e3218845d81a, thread: 216    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       vioz    Medium      RemoveWorker isRemoved: False session id : uuid:c9cce13b-5285-47d6-a666-29da19e57c67;id=47, Guid: cce33245-48b9-4b0d-afcd-e3218845d81a   
    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x211C    Word Automation Services     
        Office Viewing Architecture       a2oj    Medium      PreProcessTime = 0; InConversionQueueTime = 0.0019142; ResponseTime = 0.0066997; TotalConversionTime = 0.0535976; AvgPreProcessTime
    = 0; AvgInConversionQueueTime = 0; AvgResponseTime = 0; AvgTotalConversionTime = 0; historyCount = 0; result = WorkerException; format = n/a    00000001-0001-10e2-80af-d08c970b9892
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x144C    Word Automation Services     
        Office Viewing Architecture       4sig    Medium      ChildProcess WordServerWorker.exe is launched inside worker 89d80fff-43ec-459e-9d95-5ed8b67f20bb. Pid 3700   
    11/18/2011 09:24:15.93     w3wp.exe (0x1BC4)                           0x144C    Word Automation Services     
        Office Viewing Architecture       d9hn    Medium      NotifyNewChildProcessInWorker has seen WordServerWorker.exe in worker 89d80fff-43ec-459e-9d95-5ed8b67f20bb   
    11/18/2011 09:24:16.45     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       viou    Medium      ... registering worker 89d80fff-43ec-459e-9d95-5ed8b67f20bb     
    11/18/2011 09:24:16.48     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       viox    Medium      Worker 89d80fff-43ec-459e-9d95-5ed8b67f20bb is now initialized.     
    11/18/2011 09:24:16.55     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       vipx    Monitorable    AppWorker:89d80fff-43ec-459e-9d95-5ed8b67f20bb application server host exited unexpectedly  (thread: 6)   
    11/18/2011 09:24:16.55     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       c78j    Unexpected    AppWorker:89d80fff-43ec-459e-9d95-5ed8b67f20bb ProcessRequestDone() received error response WorkerCrashed, restarting the worker   
    11/18/2011 09:24:16.57     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       xpre    Medium      Removing worker 89d80fff-43ec-459e-9d95-5ed8b67f20bb, thread: 6     
    11/18/2011 09:24:16.57     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       f2yg    Medium      CreateSandBoxedProcessWorker() is called     
    11/18/2011 09:24:16.57     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       b10e    Medium      Created desktop: Service-0x0-3eb1722$\Microsoft Office Isolated Environment      
    11/18/2011 09:24:16.57     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       2brt    Medium      AppWorker:59168d75-7086-4318-8d12-633affa7b783 worker process is started Exe: WordServerWorker.exe Args: /id 59168d75-7086-4318-8d12-633affa7b783
    /convertingService net.pipe://127.0.0.1/WordServer71cf62b9-c34c-46c4-9828-55de2d5f5ac0 /assembly WdsrvWorker.dll /type WACWS /IsBatchedTracing True /LogQuota 100 WorkerType: WorkerType1 Directory: c:\windows\system32\inetsrv, pid : 6752, IsSandBoxed: True,
    UniqueSandBoxSid: S-1-5-26473-19571-45394-49     
    11/18/2011 09:24:16.57     w3wp.exe (0x1BC4)                           0x18CC    Word Automation Services     
        Office Viewing Architecture       vioz    Medium      RemoveWorker isRemoved: True session id : uuid:c9cce13b-5285-47d6-a666-29da19e57c67;id=48, Guid: 89d80fff-43ec-459e-9d95-5ed8b67f20bb   
    11/18/2011 09:24:16.57     w3wp.exe (0x1BC4)                           0x144C    Word Automation Services     
        Office Viewing Architecture       4sig    Medium      ChildProcess WordServerWorker.exe is launched inside worker 59168d75-7086-4318-8d12-633affa7b783. Pid 6752   
    11/18/2011 09:24:16.57     w3wp.exe (0x1BC4)                           0x144C    Word Automation Services     
        Office Viewing Architecture       d9hn    Medium      NotifyNewChildProcessInWorker has seen WordServerWorker.exe in worker 59168d75-7086-4318-8d12-633affa7b783   
    11/18/2011 09:24:17.10     w3wp.exe (0x1BC4)                           0x1CA0    Word Automation Services     
        Office Viewing Architecture       viou    Medium      ... registering worker 59168d75-7086-4318-8d12-633affa7b783     
    11/18/2011 09:24:17.13     w3wp.exe (0x1BC4)                           0x1CA0    Word Automation Services     
        Office Viewing Architecture       viox    Medium      Worker 59168d75-7086-4318-8d12-633affa7b783 is now initialized.   
    Thank you for your help.

    Hi Jean,
    Were you able to resolve this?  I am coming across the exact same error on a SharePoint 2010 development machine.  I don't see any other posts on the web about it.  Here is the entry from my ULS logs:
    Local Controller 'fc8b8704-f0f1-4e85-a69a-dc5686c27e39': Failure: <http://ip-0a6ee272/Shared%20Documents/Word/hello.docx> not uploaded to <http://ip-0a6ee272/Shared%20Documents/PDF/hello.pdf>
    (65543)
    Do we share any of the following configuration points?  I'm trying to narrow down the potential root cause ...
    MSDN subscriber EXE install media "SharePoint Server 2010 with Service Pack 1 (x64) - (English)"
    SP1 slipstream patch level.  No cumulative updates.
    http://autospinstaller.codeplex.com/  PowerShell scripted install
    SQL 2008 R2 installed on same box as SharePoint
    Active Directory domain controller on same box as SharePoint
    c:\Windows\System32\drivers\etc\HOSTS file 127.0.0.1 entry for both machine and domain name
    Thanks in advance for the research. 
    I've actually tried re-installing SharePoint several times on brand new virtual machines.  That did not resolve the issue.  Strangely enough, the RTM version of SharePoint appears to work just fine.  With all other configuration points the
    same, I loaded RTM ... ran a Word Automation PowerShell script ... and received the expected PDF output.  Then when I apply the SP1 patch ... it stops working and I get error 65543.
    Best,
      @SPJeff
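
    One detail in the conversion code in this thread is worth guarding against, although nothing here shows it is the cause of error 65543: `Replace(".docx", ".pdf")` leaves a `.dotx` URL unchanged, so the source and destination URLs would be identical for a template. The following is a minimal Python sketch of deriving the destination URL from the extension instead. The function name and the list of extensions are assumptions made for illustration, not part of the Word Automation Services API.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Extensions treated as Word sources (an assumption for this sketch).
WORD_EXTENSIONS = {".doc", ".docx", ".docm", ".dot", ".dotx", ".dotm"}


def pdf_target_url(source_url: str) -> str:
    """Return source_url with its Word extension swapped for .pdf."""
    parts = urlsplit(source_url)
    # splitext only looks at the last path segment, so folder names are untouched.
    stem, ext = posixpath.splitext(parts.path)
    if ext.lower() not in WORD_EXTENSIONS:
        raise ValueError(f"not a Word document URL: {source_url}")
    return urlunsplit(parts._replace(path=stem + ".pdf"))


if __name__ == "__main__":
    print(pdf_target_url("http://site/Contracts/docsettest/contracttest.dotx"))
```

    Rejecting unknown extensions makes it obvious when a job would otherwise be queued with the same URL as both input and output.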

  • PDF Generation with LiveCycle Data Services

    Hi everybody!
    I am using LiveCycle Data Services to generate a dynamic PDF. I worked through this tutorial: livedocs.adobe.com/livecycle/es/sdkHelp/programmer/lcds/pdfgen_1.html and adapted the code to my own example.
    I created a PDF template with LiveCycle Designer and succeeded in generating the PDF with LiveCycle Data Services.
    Here is my problem:
    In LiveCycle Designer I create a table, bind my data connection (from an XML source) to this table, and bind subforms to the repeating data.
    It works when I preview the PDF in LiveCycle Designer.
    But when I generate the PDF with LiveCycle Data Services, my data is not repeated. There is only the number of items corresponding to the minimum repeat count I set in the LiveCycle Designer binding window...
    Is it possible to generate repeating data with LiveCycle DS?
    An example of my XML source:
    <item id="1">
         <data>blabla</data>
    </item>
    <item id="2">
         <data>blabla</data>
    </item>
    In LiveCycle Designer, if I set the minimum repeat count to 1, LiveCycle DS generates a PDF with only one item.
    If I set the minimum repeat count to 2, LiveCycle DS generates a PDF with only 2 items, etc.
    I don't know how to generate an indeterminate number of items...
    Thanks in advance for your help.
    Bye
    Guillaume

    Hi Guillaume,
    there is no limitation. Dynamic PDF files can be generated with Livecycle Data Services.
    You should have a look at the XML file generated by your Flex code. Try to save it and see how the XML file behaves when you generate a PDF preview with Designer. You can go to the menu:  File >  Properties > Preview > Use XML test data...
    With the XFAHelper class, you can load either a PDF or an XDP file. Have you tried with an XDP?
    I've attached a dynamic PDF file that I've created for a customer. I generate a dynamic PDF file using LiveCycle Data Services. Maybe you'll find some clues within the file.
    Michael
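
    Before comparing Designer's preview with the generated PDF, it can help to confirm how many repeating elements the saved XML actually contains. The sketch below is a minimal Python example that counts them. The sample in the question has no single root element, so the code wraps the fragment in a hypothetical `<items>` root. That wrapper and the function name are assumptions for illustration, and the fragment must not start with an XML declaration.

```python
import xml.etree.ElementTree as ET

# The XML sample from the question, which has no enclosing root element.
SAMPLE = """
<item id="1">
     <data>blabla</data>
</item>
<item id="2">
     <data>blabla</data>
</item>
"""


def count_repeating(xml_fragment: str, tag: str = "item", root_tag: str = "items") -> int:
    """Count direct child elements named `tag` in a rootless XML fragment."""
    # A well-formed XML document needs exactly one root, so wrap the fragment.
    root = ET.fromstring(f"<{root_tag}>{xml_fragment}</{root_tag}>")
    return len(root.findall(tag))


if __name__ == "__main__":
    print(count_repeating(SAMPLE))
```

    If the count matches the data but the PDF still shows only the minimum number of rows, the XML itself is probably not the problem and the binding or template type is the next thing to check.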

  • TREX indexing problem with PDF files

    Hi all,
    I use KM to access DMS with the "DMS Connector for KM".
    I created an index on my DMS repository.
    I have more than 8000 documents. Most of them are PDF files.
    Only Word documents are indexed.
    I have read and applied OSS Notes 1008299 and 1031193.
    I get these error messages in the TrexPreprocessor trc file:
    [4648] 2009-08-27 17:12:21.969 e preprocessor Preprocessor.cpp(00963) : HTTPGET failed for URL http://rixsapfps.sbbio.be:52400/irj/go/km/docs/DS/EDIPUBLICROOTFOLDER%23ZFL%23000%2300/DMS_030%23ZFL%23000%2300/DMS_030_SOP%23ZFL%23000%2300/DMS_030_SOP_50%23ZFL%23000%2300/0000000000000009000005363%23SOP%23000%230249871722A5F41746E1000000C14A8425.pdf with Httpstatus 500
    [4648] 2009-08-27 17:12:21.969 e preprocessor Preprocessor.cpp(03553) : HANDLE: DISPATCH - Processing Document with key '/DS/EDIPUBLICROOTFOLDER#ZFL#000#00/DMS_030#ZFL#000#00/DMS_030_SOP#ZFL#000#00/DMS_030_SOP_50#ZFL#000#00/0000000000000009000005363#SOP#000#0249871722A5F41746E1000000C14A8425.pdf' failed, returning PREPROCESSOR_ACTIVITY_ERROR (Code 6500)
    Any help is welcome.
    Pascal

    Dear
    Please refer to:
    https://forums.sdn.sap.com/thread.jspa?threadID=1058626
    https://forums.sdn.sap.com/thread.jspa?threadID=403393&messageID=3429730#3429730
    Regards,
    Tushar
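
    The trace in the question shows the preprocessor fetching each document over HTTP, with the `#` characters of the document key percent-encoded as `%23`. A minimal Python sketch of rebuilding that URL from a key, so that the failing request can be retried outside TREX, could look like this. The function names are hypothetical, the base URL has to be taken from your own trace, and authentication against the portal is not handled here.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def km_document_url(base_url: str, document_key: str) -> str:
    """Join the KM docs base URL and a document key, encoding '#' as %23."""
    # Keep '/' as the path separator; everything else unsafe is percent-encoded.
    return base_url.rstrip("/") + quote(document_key, safe="/")


def fetch_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code for url (error codes are returned, not raised)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


if __name__ == "__main__":
    # Placeholder host and key; substitute the values from your own trace file.
    print(km_document_url("http://host:52400/irj/go/km/docs", "/DS/A#ZFL#000#00/x.pdf"))
```

    If the same URL also returns status 500 in a browser or with `fetch_status`, the failure lies in the KM or DMS connector side rather than in the TREX preprocessor.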

  • Create PDF document from Word with hyperlink index entries

    Hello,
    I have an MS Word 2010 document with a table of contents and an index. Both were created with Word's built-in functions, and their page numbers are updated automatically. If I convert this document to a PDF file with Acrobat 9 Pro, the entries in the table of contents are hyperlinks (if I click on a chapter, the corresponding page opens).
    But this doesn't work for the index at the end of the document. Where can I activate the hyperlink functionality for the index?
    Thanks for your help,
    Devid

    Hi,
    thanks for this info.
    On another computer I have Acrobat X Pro installed, but the result is the same. Or did I miss something?

  • Plz help: how can i index multiple directories including pdfs with oracle text??

    problem:
    I have several subdirectories with PDF files that must be indexed by a full-text index:
    .../dir/
        sub_dir1/
            1.pdf
            2.pdf
        sub_dir2/
            3.pdf
            4.pdf
    Other users may create new subdirectories.
    Try #1:
    I tried to update the FILE_DATASTORE parameter PATH with the concatenated directory list,
    i.e. (.../dir/subdir1:.../dir/subdir2:...), and then updated the index.
    That fails because the directory string is too long (1637 chars).
    Try #2:
    I set the FILE_DATASTORE PATH parameter to the base directory,
    i.e. ('.../dir').
    Then I generated a list of all PDFs, including the subdirectories, and stored it in
    a new table,
    i.e.: '12345', 'subdir1/1.pdf'
          '23456', 'subdir1/2.pdf'
    This one fails because the database seems to apply some kind of basename() function to
    get the filename-only part of the table entry: 'subdir1/1.pdf' => '1.pdf'.
    So the database fails to open (and therefore to index) the file.
    How can I solve this problem?
    thanks in advance!!!
    best regards.
    /achim

    If you need to use multiple directories, you'll need to put the full directory and filename into the table, and not use the PATH attribute at all. PATH only works where all files are in the same directory (though you MAY find you can use more than one directory on certain OS's).
    - Roger
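
    Following that suggestion, the table needs one row per file with the full path. A minimal Python sketch that walks a base directory and produces such rows is shown below. The function name, the numeric ids, and the idea of loading the rows with a separate insert step are assumptions for illustration, and rerunning the walk is what would pick up subdirectories created later by other users.

```python
import os
import tempfile


def list_pdf_rows(base_dir, first_id=1):
    """Return (id, absolute_path) rows for every PDF below base_dir."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                # Store the full path so no PATH attribute is needed on the datastore.
                paths.append(os.path.abspath(os.path.join(dirpath, name)))
    paths.sort()
    return [(first_id + i, path) for i, path in enumerate(paths)]


if __name__ == "__main__":
    # Demo on a throwaway tree that mirrors the layout from the question.
    with tempfile.TemporaryDirectory() as base:
        for sub, name in [("sub_dir1", "1.pdf"), ("sub_dir1", "2.pdf"), ("sub_dir2", "3.pdf")]:
            os.makedirs(os.path.join(base, sub), exist_ok=True)
            open(os.path.join(base, sub, name), "wb").close()
        for row in list_pdf_rows(base):
            print(row)
```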

  • Indexing pdf documents with indextype ctxsys.context

    I have an application that stores the contents of uploaded documents in BLOB data fields. We provide web pages which search through the uploaded documents based on text entered by the user. We currently upload both MS Word .doc and HTML documents. For the HTML documents, which are made available to the public, we index the table with the following procedure:
    CREATE OR REPLACE procedure WEBADMIN.index_redacted_docs is
    begin
      declare
        cur       PLS_INTEGER;
        exec_int  PLS_INTEGER;
        counter   number;
      begin
        select count(*) into counter
        from user_indexes
        where index_name = 'DOCS_CTX_REDACTED_IDX';
        if (counter = 1) then
          ctx_ddl.sync_index (idx_name => 'docs_ctx_redacted_idx');
        else
          cur := DBMS_SQL.OPEN_CURSOR;
          DBMS_SQL.PARSE (cur, 'create index docs_ctx_redacted_idx on documents_ctx_redacted (blob_content) ' ||
            'indextype is ctxsys.context parameters (''filter ctxsys.null_filter'')', DBMS_SQL.NATIVE);
          exec_int := DBMS_SQL.EXECUTE (cur);
          DBMS_SQL.CLOSE_CURSOR (cur);
        end if;
      exception
        when others then
          DBMS_SQL.CLOSE_CURSOR (cur);
          raise;
      end;
    end;
    We run this process after every uploaded HTML file and are able to locate documents which contain any text entered by the user. The portion of the command we use to query the documents_ctx_redacted table (blob_content is the BLOB field in this table) is (using "corn" as a sample query text):
    WHERE (contains (BLOB_CONTENT, 'corn', 10) > 0)
    Our customer is now asking that PDF files be uploaded as well and searched in the same manner. After the PDF files are uploaded (into the same table as the HTML files) and the index updated, with the above command ctx_ddl.sync_index (idx_name => 'docs_ctx_redacted_idx'), since the index already exists, we cannot get any rows returned with the above WHERE (contains .... ) clause. We know the text we're looking for (such as "corn") is contained in the PDF files, but the search does not find them, although it finds the HTML documents just fine. I've also tried dropping the index entirely and recreating it, but that also only finds the HTML documents but not the PDF's.
    What are we doing incorrectly with the PDF files? Thanks.

    We are using Oracle version 10.2 . I looked at the relevant Oracle Text documentation for that version, and the best I could glean was that PDF files are supported by the filter ctxsys.auto_filter (rather than null_filter) when creating the index. I dropped the existing null_filter index and created a new index with the auto_filter parameter, but the end result was the same. I still get no PDF records found when issuing the command (using "corn" as the text query)
    WHERE (contains (BLOB_CONTENT, 'corn', 10) > 0)
    although the HTML records show up fine again.
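
    One thing that may be worth trying when HTML and PDF share a BLOB column is Oracle Text's `format column` index parameter, which lets each row say whether it is `TEXT` (not filtered) or `BINARY` (passed through the filter). Nothing in this thread confirms that it resolves the problem. The sketch below is a minimal Python example of choosing that value at upload time by sniffing the file header. The function name and the idea of adding such a column to documents_ctx_redacted are assumptions for illustration.

```python
# Signature of OLE2 compound files, the container used by legacy .doc files.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def oracle_text_format(content: bytes) -> str:
    """Return 'BINARY' for PDF or legacy Word content, otherwise 'TEXT'."""
    # PDF readers accept the "%PDF-" header anywhere in the first 1024 bytes.
    if b"%PDF-" in content[:1024]:
        return "BINARY"
    if content.startswith(OLE2_MAGIC):
        return "BINARY"
    # Anything else (HTML, plain text) is treated as already-readable text.
    return "TEXT"


if __name__ == "__main__":
    print(oracle_text_format(b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"))
    print(oracle_text_format(b"<html><body>corn</body></html>"))
```

    Checking the stored bytes this way also reveals uploads that are not real PDFs at all, for example files that were truncated or saved as text on the way into the BLOB.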

  • Indexing Problem with FILE_DATASTORE and .pdf files

    Hello all,
    Do any of you have an example showing how to index .pdf files through FILE_DATASTORE? I am able to successfully index text and .doc files but not a .pdf file. Below is the script that I use to index my files:
    create index myindex on mytable(docs)
    indextype is ctxsys.context
    parameters ('datastore COMMON_DIR filter ctxsys.null_filter');
    I am using Oracle 8.1.6
    Thanks you!!!
    -garrett

    I don't think you can index anything other than plain ASCII text, because you are not using the INSO filter.
    Use preferences like this:
    exec ctx_ddl.drop_preference('NO_PATH');
    exec ctx_ddl.create_preference('NO_PATH','FILE_DATASTORE');
    exec ctx_ddl.drop_preference('MY_LEXER');
    exec ctx_ddl.create_preference('MY_LEXER','BASIC_LEXER');
    exec ctx_ddl.set_attribute('MY_LEXER','MIXED_CASE', 'NO');
    exec ctx_ddl.set_attribute('MY_LEXER','INDEX_THEMES','NO');
    exec ctx_ddl.set_attribute('MY_LEXER','INDEX_TEXT', 'YES');
    exec ctx_ddl.drop_Preference ('MY_FILTER');
    exec ctx_ddl.create_Preference ('MY_FILTER','INSO_FILTER');
    exec ctx_ddl.drop_section_group ('MY_SECTION');
    exec ctx_ddl.create_section_group ('MY_SECTION','NULL_SECTION_GROUP');
    drop index i_filenames;
    create index i_filenames on filenames (filename)
    indextype is ctxsys.context
    parameters ('datastore NO_PATH
    section group MY_SECTION
    lexer MY_LEXER
    filter MY_FILTER
    memory 10M');
    The important part is the INSO_FILTER preference.
    Thomas

  • Install 3rd party PDF iFilter for index PDF file as attachment in e-mail (msg)

    I called Microsoft Premium Support. According to their reply, SharePoint 2013 does not support indexing a PDF file attached to an e-mail (.msg) unless a 3rd-party iFilter is installed. They eventually told me how to edit the Windows Registry to install the Adobe iFilter.
    But the Adobe iFilter is too weak to handle large PDF files, so I would like to install and try the Foxit PDF iFilter. However, I cannot find an installation guide for this 3rd-party iFilter with SharePoint 2013.
    Does anyone here have experience with the Foxit PDF iFilter on SharePoint 2013 and can help me?
    I am not sure whether this is a bug or a feature in SharePoint 2013, but either way I still have to install a 3rd-party iFilter to index these PDF files. I have no idea what the out-of-the-box PDF indexing support is for.

    You can plan to use Foxit.
    The steps are nearly the same as those we use in SharePoint 2013:
    1. Update the registry for .pdf. The registry value is {987f8d1a-26e6-4554-b007-6b20e2680632}
    2. Install the Foxit iFilter.
    The steps are described here:
    http://support.microsoft.com/kb/2293357
    3. Run the commands below:
    net stop spsearch4
    net start spsearch4
    net stop osearch14
    net start osearch14
    Check below:
    http://bjarnegram.wordpress.com/2011/07/13/installing-foxit-pdf-ifilter-on-sharepoint-server-2010/

  • Creation of  rules index failing with ORA-01652 exception

    I am trying to create a rules index in the following way,
    BEGIN
         SEM_APIS.CREATE_RULES_INDEX(
         'APPS_RDF_IDX',
         SEM_Models('SEMANTIC_SEARCH_MODEL'),
         SEM_Rulebases('OWLPRIME','SEMANTIC_SEARCH_RULEBASE'));
    END;
    with semantic_search_rulebase having about 5 rules and with 28839 triples in the model.
    When I run the create index call, it fails after a long time with the exception
    ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
    even though TEMP is allocated 5GB.
    Please help me with the following questions:
    1. How much TEMP space should be allocated if the triples are going to number in the millions with about 10 to 100 rules, and why does indexing take so much TEMP space with so few triples?
    2. How long would creating a rules index normally take with thousands to millions of triples?
    3. How can I make the create rules index run faster?
    Thanks,
    Phani

    First of all, please start using create_entailment API instead of that create_rules_index API.
    Regarding 1), 5GB temp space is not a whole lot.
    It is hard to say exactly how much you need because you have user defined rules.
    Regarding 2) and 3), please check out the following inference best practice paper.
    http://www.oracle.com/technology/tech/semantic_technologies/pdf/semantic_infer_bestprac_wp.pdf
    Also, if you like, please post your rules and I may be able to help you model
    some of your rules using native OWL constructs.

  • BizTalk 2006 Event Log Warnings - Cannot insert duplicate key row in object 'dta_MessageFieldValues' with unique index 'IX_MessageFieldValues'.

    We have been seeing the following 'warnings' in the event log of our BizTalk machine since upgrading to BTS 2006. They seem to occur randomly 6 or 8 times per day.
    Does anyone know what this means and what needs to be done to clear it up? we have only one BizTalk server which is running on only one machine.
    I am new to BizTalk, so I am unable to find how many tracking host instances running for BizTalk server. Also, can you please let me know that we can configure only one instance for one server/machine?
    Source: BAM EventBus Service
    Event: 5
    Warning Details: Execute batch error. Exception information: TDDS failed to batch execution of streams. SQLServer: bizprod, Database: BizTalkDTADb.Cannot insert duplicate key row in object 'dta_MessageFieldValues'
    with unique index 'IX_MessageFieldValues'. The statement has been terminated..

    Other than ensuring that there exists a separate and single tracking host instance, you're getting an error about duplicate keys.. which implies that you're trying to Create a BAM Activity twice with the same data.
    I suggest you do an in-depth examination of the BAM (TPE or API) associated with the orchestration. In TPE, ensure that the first binding you select is the "Instance Id" or "Message Id" before going ahead to map the ports or other items.
    Regards.

  • Oracle XE 10.2.0.1.0  - Problem indexing PDF

    I am using Oracle XE 10.2.0.1.0 with Czech national settings.
    I need to get PDFs with Czech national characters working. Indexing TXT, HTML and DOC2003 documents with the same content works fine.
    Below is my configuration.
    h1. National Language Support
    NLS_CALENDAR GREGORIAN
    NLS_CHARACTERSET AL32UTF8
    NLS_COMP BINARY
    NLS_CURRENCY Kč
    NLS_DATE_FORMAT DD.MM.RR
    NLS_DATE_LANGUAGE CZECH
    NLS_DUAL_CURRENCY Kč
    NLS_ISO_CURRENCY CZECH REPUBLIC
    NLS_LANGUAGE CZECH
    NLS_LENGTH_SEMANTICS BYTE
    NLS_NCHAR_CHARACTERSET AL16UTF16
    NLS_NCHAR_CONV_EXCP FALSE
    NLS_NUMERIC_CHARACTERS ,.
    NLS_SORT CZECH
    NLS_TERRITORY CZECH REPUBLIC
    NLS_TIME_FORMAT HH24:MI:SSXFF
    NLS_TIMESTAMP_FORMAT DD.MM.RR HH24:MI:SSXFF
    NLS_TIMESTAMP_TZ_FORMAT DD.MM.RR HH24:MI:SSXFF TZR
    NLS_TIME_TZ_FORMAT HH24:MI:SSXFF TZR
    Datastore
    PDF url: http://www.mpsv.cz/files/clanky/6981/tiskove_avizo_CJ.pdf
    I renamed the pdf to SummitKZamestnanosti_PDF-40.
    CREATE TABLE file_datastore
    (
    id NUMBER PRIMARY KEY,
    fmt VARCHAR2(10),
    docs VARCHAR2(2000)
    );
    -- INSERT data INTO File Datastore TABLE
    INSERT INTO file_datastore VALUES
    (111560,'binary','C:\Docs\SummitKZamestnanosti_PDF-40');
    -- Configure DATASTORE
    EXEC ctx_ddl.drop_preference('NO_PATH');
    EXEC ctx_ddl.create_preference('NO_PATH','FILE_DATASTORE');
    -- Configure LEXER
    EXEC ctx_ddl.drop_preference('LEXER');
    EXEC ctx_ddl.create_preference('LEXER','BASIC_LEXER');
    -- CREATE CONTEXT INDEX
    DROP INDEX idx_file_datastore_text FORCE;
    CREATE INDEX idx_file_datastore_text ON file_datastore
    ( docs )
    indextype IS ctxsys.context parameters
    ( 'format column fmt Datastore NO_PATH filter ctxsys.AUTO_FILTER lexer LEXER' );
    -- QUERIES that should work.
    -- SELECT id, docs, score(1) FROM file_datastore WHERE contains ( docs,'Bližší',1 ) > 0;
    SELECT id, docs, score(1) FROM file_datastore WHERE contains ( docs,'dopadů ',1 ) > 0;
    -- SELECT id, docs, score(1) FROM file_datastore WHERE contains ( docs,'sociálních',1 ) > 0;
    SELECT id, docs, score(1) FROM file_datastore WHERE contains ( docs,'Španělsko',1 ) > 0;
    -- SELECT id, docs, score(1) FROM file_datastore WHERE contains ( docs,'Švédska',1 ) > 0;
    SELECT id, docs, score(1) FROM file_datastore WHERE contains ( docs,'Summit',1 ) > 0;
    My Output
    ID DOCS
    0 rows selected
    ID DOCS
    0 rows selected
    ID DOCS
    111560 C:\Docs\SummitKZamestnanosti_PDF-40
    1 rows selected
    Regards

    I used exactly the same configuration as above, but on Oracle Database 11g R1 11.1.0.7.0 - Production instead of Oracle 10g XE.
    The result is that AUTO_FILTER in Oracle 11g is able to parse the Czech characters in the sample PDF file without any problems.
    My guess is that the problem with Oracle Text 10g R2 lies in one of the following:
    1. Embedded fonts, as mentioned in the documentation (http://download-west.oracle.com/docs/cd/B12037_01/text.101/b10730/afilsupt.htm). I tried embedding all fonts and the whole character set, but it did not help.
    2. The character encoding of the text within the PDF documents.
    I would like to add that other third-party PDF-to-text converters have similar issues with the Czech characters in PDF documents: after text extraction, the Czech national characters were displayed incorrectly.
    If you have any other remarks, ideas or conclusions please reply :-)
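
    One way to narrow this down is to look at what the filter actually produced for the document. The following is only a diagnostic sketch, assuming the index name and key from the question above (idx_file_datastore_text, id 111560) and the default primary-key textkey type; it does not fix the extraction itself.

    ```sql
    -- Documents that failed during indexing, if any
    SELECT err_textkey, err_text
    FROM   ctx_user_index_errors
    WHERE  err_index_name = 'IDX_FILE_DATASTORE_TEXT';

    -- Tokens stored in the index; mangled Czech characters would show up here
    SELECT token_text
    FROM   dr$idx_file_datastore_text$i
    ORDER  BY token_text;

    -- Plain-text output of the filter for the sample document
    SET SERVEROUTPUT ON
    DECLARE
      filtered CLOB;
    BEGIN
      ctx_doc.filter('idx_file_datastore_text', '111560', filtered, TRUE);
      dbms_output.put_line(dbms_lob.substr(filtered, 1000, 1));
      dbms_lob.freetemporary(filtered);
    END;
    /
    ```

    If the tokens or the filtered text already contain garbled characters, the problem is in the filtering step rather than in the lexer or the query.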

  • How to configure one TREX host with multiple index servers ?

    Hi All,
    Does anyone know how to configure TREX on one host with multiple index servers?
    The reason is to make better use of the resources available on the host server (4 GB, 4 processors, Windows 2003) and so improve the search performance of our KM content for portal users.
    I am using TREX 7 and have not been able to do this, despite reading the Single and Distributed install documentation.
    Any help would be appreciated.
    Regards,
    Andres

    Hi Andres,
    To make use of the RAM the server provides, you have to run two index server processes (each can then consume 2 GB).
    Proceed like this:
    1. Go to TREXDaemon.ini and check whether the section [indexserver2] is there (it is already provided, but not active in a standard installation).
    2. In TREXDaemon.ini go to
    [daemon]
    references sections below
    programs=nameserver,preprocessor1,indexserver1,queueserver,alertserver
    and add indexserver2 here. Restart TREX; the second process is then started, which can also be checked in the TREX monitor in the portal.
    3. To distribute existing indexes to the new process, start TREXadmintool and go to Index: Landscape.
    Go to the last two columns and move the indexes (secondary mouse click, "move master here").
    If you don't distribute the indexes, the new index server process will only be used when a new index is created.
    Hope this helps!
    cheers
    Bettina
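
    For reference, after step 2 the [daemon] section would look roughly like this. Only the programs line is taken from the steps above; the position of indexserver2 within the list is an assumption, and the [indexserver2] section itself is left as shipped.

    ```ini
    [daemon]
    programs=nameserver,preprocessor1,indexserver1,indexserver2,queueserver,alertserver
    ```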

  • Index usage with nls_sort and nls_comp

    Hi,
    I have created a logon trigger
    CREATE OR REPLACE TRIGGER "SYS"."ON_LOGON_SET__SCHEMA"
    AFTER LOGON ON DATABASE
    BEGIN
      EXECUTE IMMEDIATE 'alter session set NLS_SORT=BINARY_CI';
      EXECUTE IMMEDIATE 'alter session set NLS_COMP=LINGUISTIC';
    EXCEPTION
      WHEN OTHERS THEN NULL;
    END;
    because the user does not want case-sensitive searches in the database.
    However, with these settings, indexes on text fields are no longer used. What should I do with those indexes?
    Regards

    Possible answers are explained in the MOS notes "GENERIC_BASELETTER Linguistic Definition" [ID 109118.1] and "Linguistic Sorting - Frequently Asked Questions" [ID 227335.1], depending on what your users want to happen.
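
    As a minimal sketch of one common approach (the notes above may suggest others): with NLS_COMP=LINGUISTIC, comparisons are evaluated on NLSSORT of the column, so a plain index on the column is bypassed, while a function-based index on the same NLSSORT expression can be used. The table, column and index names here (customers, last_name, customers_last_name_ci) are hypothetical.

    ```sql
    -- Linguistic (function-based) index matching the session's NLS_SORT setting
    CREATE INDEX customers_last_name_ci
      ON customers (NLSSORT(last_name, 'NLS_SORT=BINARY_CI'));

    -- With NLS_SORT=BINARY_CI and NLS_COMP=LINGUISTIC set by the logon trigger,
    -- an equality predicate like this one can use the index above
    SELECT *
    FROM   customers
    WHERE  last_name = 'smith';
    ```

    Whether the optimizer actually picks the index still depends on statistics and on the predicate; LIKE predicates in particular may not use it on older releases.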

Maybe you are looking for

  • Skipping mapping execution in process flow

    I have a process flow that calls multiple mappings. Based on some condition I want a mapping not to execute, E.g. when rerunning the process flow in case of failure. Currently I keep track of mapping execution and store the status in a control table.

  • My mac mini unexpectedly died!

    Hello all, I bought a mac mini 10 days ago. It was all so nice until today, when I looked at the mac (I had left it alone for about 10 minutes) and found a strange screen telling me to restart the machine. Later I found this screen is an analog to the Windows

  • Hidden page PDF file

    Hello, I have a pdf file in which I can read almost every page, but for some of them a blank page appears with the text "Hidden page". With the Adobe Acrobat Pro tools I could search for hidden information and there is hidden text in those "Hidde

  • Formula to subtract current month from previous month

    Hi, I would like to know how to subtract current month from previous month in Crystal 10. Basically I am looking at the variations between the two months. I would also like to know the 12-month rolling data formula.  Using the same formula, you need

  • Workitem ID  / Workitem through event Question

    Hi, I have a requirement to send a few details from an ABAP report as a workitem to certain users. I am using the module 'SWE_EVENT_CREATE' to trigger an event linked to a task which sends the workitem. This is working fine. I need to send an