Searching pdf files

It is apparent that the search engine (find) does not "find" keywords in some files.  Why is this occurring?

You would have to share one of the affected PDFs with others for a good look-see to get a more meaningful answer.
Some possibilities:
Characters not mapping to Unicode.
PDF is a scanned image with no OCR output.
PDF is a scanned image processed with ClearScan - some characters not recognized and left as a bitmap image.
Be well...

Similar Messages

  • Indexing and Searching PDF Files

    Hi All,
    I am trying to store and search PDF files in the oracle database.
    I can insert and index the PDF files just fine but cannot get any result. I always get No Rows.
    Here's what I am doing and the issues I am facing.
    I created a Table with fields
    ID (VARCHAR)
    NAME (VARCHAR)
    DOC (BLOB)
    I inserted the PDF file into the BLOB field through a Java program, and the insert worked fine: I verified it by retrieving the PDF and writing it to a file.
    I created index using following SQL:
    create index my_index on PDF_TABLE(PDF_FLD) indextype is ctxsys.context
    parameters ('datastore ctxsys.default_datastore
    filter ctxsys.inso_filter');
    The index was created successfully without any problems.
    I ran query as follows and got no rows although the searched text is in PDF
    SELECT SCORE(1), PDF_FLD from PDF_TABLE WHERE CONTAINS (PDF_FLD, 'Table of Cotents',
    1) > 0;
    I tried alternate queries as well with no luck.
    Any ideas ??
    Thanks

    After creating the index, you need to perform the following checks.
    First, check that your index table contains indexed terms. Execute
    select token_text from dr$YOUR_INDEX$i;
    Second, check the index errors table CTX_INDEX_ERRORS. It is owned by the user CTXSYS, and most users do NOT have SELECT privilege on it by default.
    If that is OK, then check that your PDF documents are supported by the INSO filter.
    Citation:
    "PDF - Portable Document Format
    Acrobat Versions 2.1, 3.0, 4.0, and 5.0 including Japanese PDF"
    (Appendix B. Supported Document Formats in Oracle Text Reference 9.2)
    For Oracle 9i you could install the 9.2.0.4 patchset (it included INSO FILTER 7.5).
    P.S.
    To begin with, you can find answers to your questions about Oracle Text here:
    http://otn.oracle.com/products/text
    Sorry for my English.
    Best regards, Victor Zogin.

  • How to search pdf files in another language?

    I have been trying to search large Arabic documents, but no results come up.

    Yes, it happens in all PDF files that display Arabic writing; see the
    attached file as an example. When I search for any Arabic word, I get no results
    in either the basic or the advanced search box.
    Also, I am using Windows 7 Home Premium as the operating system on my computer.
    Note: Select any Arabic word from the enclosed text, search for it, and see
    if you get a result.

  • Cannot search PDF file contents - Windows 7 32 bit - Adobe Acrobat X

    Hello,
    If this is in the wrong forum please move it.
    I work in an enterprise environment and our systems are having trouble searching file contents in Windows Explorer using Acrobat X and Windows 7 32 bit. The files are on a mapped network location.
    After removing all Adobe products from a test machine and reinstalling the Acrobat 10.0.0 software, the Windows Explorer search function seems to work locally, but once I install any Acrobat 10.1.x update, it fails. It never worked on a networked location.
    I have also tried installing Adobe Acrobat 11.0.00 after removing 10.0. Then I made sure my indexing settings were set up to index files and contents, and made sure the .pdf extension was selected under file types.
    I then created a mapped drive to the network location and set up my indexing to the folder on that network drive. I was able to do this by installing this Microsoft add-in, which allows the use of UNC paths in the indexing.
    http://www.microsoft.com/en-us/download/confirmation.aspx?id=3383
    Once I set this up, I rebuilt my index and restarted the computer. This is where it gets weird. I can now search the contents of PDF files in this indexed network location, but only by one letter. Searching "c type:pdf" will produce results, but "co type:pdf" will not. I know for sure that some of the documents have the word Comment in them, so this should show up.
    Does anyone have experience getting this to work correctly with the latest versions of Adobe Acrobat X or XI and Windows 7 32 bit? It would be greatly appreciated.
    Thank you.

    I will never understand why, but in the end I rebuilt my 32-bit Dell laptop from scratch and the PDF files can now be searched.
    I cannot search them on a mapped drive as I was able to with Windows XP, because now they must be indexed to be searchable, and Windows 7 seems not to allow a mapped location to be indexed. So I have had to move the files to the local drive.
    My Windows 7 64-bit systems can search the mapped drives just fine without needing them to be indexed. Again, I will never understand why this works and the 32-bit machine does not.

  • Searching PDF File hindered by inserted spaces

    In my PDF files there are numbers like "H123456789" that I would like to search for. When I search for H123456789 in Adobe 8, the number is not found. If I cut and paste the number from the document, the number that looks like "H123456789" in the document looks like "H 1 2 3 45 6 7 8 9" in the search field. Is there a way to stop the inserted spaces, or to force the search to find that number?

    Sorry for being unable to reply in the forum. I cannot find the button to reply.
    Yes, I can select words, but the selected contents are all in strange
    characters.
    I also tried to convert the file into a Word file, and the resulting file was in
    strange characters.
    I attach a snapshot of the PDF document properties.
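For the inserted-space problem described in this thread ("H123456789" copied out as "H 1 2 3 45 6 7 8 9"), one workaround is to search the extracted text outside the viewer with a pattern that tolerates whitespace between characters. The following is only a minimal sketch: it assumes the PDF text has already been copied or exported to a plain string, and the function name `spaced_pattern` is made up for illustration. It does not change how the Acrobat search box itself behaves.

```python
import re

def spaced_pattern(term: str) -> re.Pattern:
    """Build a regex matching `term` even if whitespace separates its characters."""
    # Drop whitespace from the query, escape each remaining character,
    # and allow optional whitespace between consecutive characters.
    chars = [re.escape(c) for c in term if not c.isspace()]
    return re.compile(r"\s*".join(chars))

# Text as it looks after being copied out of the PDF (taken from the post above).
extracted = "Part H 1 2 3 45 6 7 8 9 shipped"
match = spaced_pattern("H123456789").search(extracted)
print(match.group(0) if match else "not found")  # prints "H 1 2 3 45 6 7 8 9"
```

The same pattern still matches the number when no spaces were inserted, so it can be used for both kinds of files.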

  • Search pdf files with Windows Desktop Search

    I recently installed Windows 7 and have attempted to use Windows Desktop Search to search for text contained in PDF files, without success.  Today I found a reference to Adobe PDF IFilter v6.0 at http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611.  I have Acrobat 8.2.1 installed on my computer.  Since I couldn't figure out whether the IFilter had already been installed, I downloaded the ifilter60.exe file.  However, when I ran this exe file I didn't see any evidence that the filter was installed.  For example, the above web page mentions a ReadMe file which is "included with the download", but I don't know where it or the plugin can be found.  I'm still unable to search within PDF files.  Is there anything special I need to do to install this IFilter?
    Peter

    Can't help directly with the question. However, the generic problem in Windows is that the search will only delve into a list of file types designated by Microsoft. I spent a year looking for a file I had misplaced and whose name I did not remember. The following link resolved my issue, and now I can search for anything in any file. I followed the registry edit in the response. Editing the registry can be dangerous, and you should at a minimum back up the registry unless you know what you are doing. I also do not know if the fix is available in Windows 7. As I recall, I did find it in Vista. Good luck if you use this, but also be careful. Bill
    http://www.winvistatips.com/search-inside-bas-and-frm-files-t570174.html

  • Searching PDF files viewed thru Safari

    I know I can view PDF files in Safari on my 3GS but is there a way to search them without having to download them onto the phone?
    Thanks,
    Scott

    No. All you have is a PDF viewer, not something that can interpret the content and create a searchable index.
    And you cannot save to your iPhone in any case: it cannot store files other than those attached to emails or those in the Photo library.

  • How Best to Search pdf Files Over the Web

    I have a large number of PDF files (about 73,000). These are scans of old newspapers that have been OCR'd and saved as PDF. I work in a library and need to find the best way to make these text-searchable through my library's website.
    Do I need to create an index?
    What I would like to do is have a search box on my web page where a user can enter a keyword, and pull up the pages that contain that word.
    Any suggestions on how best to do this are greatly appreciated!!
    Thanks!

    Google can do it for you if you use their search engine as the basis of your library search capabilities.
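On the "Do I need to create an index?" question in this thread: a keyword search box over many pages is normally backed by an inverted index, which maps each word to the pages that contain it. The sketch below only illustrates that idea under stated assumptions: the OCR text is assumed to be already extracted into plain strings (PDF text extraction is not shown), and the page IDs and sample text are invented. For 73,000 files a hosted or dedicated search engine, as the reply suggests, is likely the more practical route.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page IDs whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the sorted IDs of pages that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical page IDs and OCR text, for illustration only.
pages = {
    "1923-05-01_p1.pdf": "Mayor opens new bridge",
    "1923-05-02_p3.pdf": "Bridge toll debate continues",
}
index = build_index(pages)
print(search(index, "bridge toll"))  # prints ['1923-05-02_p3.pdf']
```

Building the index once and querying it is what makes the search fast; scanning every PDF on each request would not scale to a collection of this size.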

  • Searching PDF files in a folder

    I am trying to search a folder containing several thousand PDF files using the search facility in Adobe Reader, but I keep getting nil results. I get the same result if I open a PDF and search within that document. Can anyone help?

    Hi,
    Yes, the PDFs were scanned using a Canon ir1020 scanner. I cannot find text in any of the PDFs.
    Cheers

  • TREX does not search PDF files

    Hi,
    we have another problem with TREX 6.0.
    Our file repository is working fine and search also works for .txt files, but it doesn't work for PDF files. Our PDF files are indexed correctly, but there are no results for this kind of file when we do a search.
    What can we do?
    Kind regards
    Thomas

    Your situation may already be solved.  However, some things I did not see in the details: How many PDFs were being indexed?  What was the size of the files?  Did you check the TREX Monitor to ensure all the PDFs had been sent through the entire system?  In the crawler monitor, did it state that it found the number of files you believe to be in the index?  By default, TREX holds documents in a queue for 30 minutes between processes unless you either reset this property or flush the queue.
    There is a document, TREXRecomenations, which gives some very good tips regarding file size and other common settings.  For PDF it states:
    You want to index very large documents in PDF format from Adobe. These documents are not being indexed because they fail to pass the preprocessing stage.
    Limitation: PDF is a complicated file format to preprocess. Typically PDF files larger than 15 MB cause problems. The time taken for preprocessing and filtering rises to over an hour and the process delivers bad results.
    Recommendation: You should avoid the indexing and processing of PDF files that are larger than 15 MB.
    If you cannot find this document, let me know and I can forward it to you.
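To see whether the 15 MB recommendation quoted above applies to a repository, the PDFs over that size can be listed before indexing. This is a rough sketch, not part of TREX: the function name `oversized_pdfs` is made up, and treating "15 MB" as 15 * 1024 * 1024 bytes is an assumption about how the recommendation counts megabytes.

```python
from pathlib import Path

# Assumption: "15 MB" is read as 15 * 1024 * 1024 bytes.
LIMIT_BYTES = 15 * 1024 * 1024

def oversized_pdfs(root, limit=LIMIT_BYTES):
    """Return (path, size) pairs for PDFs under root larger than limit, largest first."""
    found = [
        (p, p.stat().st_size)
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf" and p.stat().st_size > limit
    ]
    return sorted(found, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Scan the current directory tree and report anything over the threshold.
    for path, size in oversized_pdfs("."):
        print(f"{size / (1024 * 1024):8.1f} MB  {path}")
```

Files that show up in this list are candidates for splitting or for exclusion from the crawl.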

  • Secure & Search PDF Files

    Hi,
    I am using CF10 with SQL Server 2012. I am building a site for my client that includes catalogues of PDF files. What is the best way to achieve this with ColdFusion? In the past, the links to the PDF files were bookmarked and remained accessible to users after their subscription had expired.
    I used Verity 7 years ago (CF MX, I think) to do the search. If the user is subscribed to a certain catalogue, a link is activated within the search results and the user is able to view the entire file. If not, only a highlight will come up, asking the user to subscribe.
    What is the best way to achieve this:
    1. Using SQL Server BLOB fields? Will Verity be able to search PDFs inside BLOB fields?
    2. SQL Server Full Text Search? How will we return the results to ColdFusion?
    3. Keeping the PDF files outside the database? What is the best technique to secure and search the files?
    Any other suggestion is appreciated.
    Thanks So Much


  • Windows 7 x64, can't search PDF files

    I am running Windows 7 x64. I have installed Reader version 9.3.1. It seems to work OK, but Windows search does not index PDF files (this works fine on my Vista x86 box).
    I checked the Indexing options and I notice that under Advanced, File Types "pdf" it says: "Registered IFilter is not found".
    How can I fix this?

    There may be an answer here:
    http://blogs.adobe.com/acrobat/2008/12/adobe_pdf_ifilter_9_for_64bit.html

  • Indexing and Searching PDF files which are used as attachments in an Announcement list item

    Hi all,
    I am using a SharePoint 2013 online environment and trying to search for PDF files which are attached to an announcement list item. However, it does not find anything when I search for the name of the PDF file or for its content.
    When I attach a Word document to the list item, it gets indexed and the search finds the file.
    Thanks; I appreciate any kind of advice.

    Are you able to search for pdfs in other locations? SharePoint 2013 comes with an iFilter out of the box unlike 2010 which needed configuration.

  • Search pdf files from excel or text edit list

    I have a folder called "Photos"
    in that folder i have many say (100+) files like
    1234567_001
    1234567-001.01
    1234567_001.02
    1234567_123456
    11111111_001
    11111111-001.01
    11111111_001.02
    11111111_123456
    2222222_001
    2222222-001.01
    2222222_001.02
    2222222_123456
    6666666_001
    6666666-001.01
    6666666_001.02
    6666666_123456
    and so on
    I have a list in Excel that contains a selected few of the numbers (say 20), such as 11111111 and 2222222, in separate cells; I can also put the list in a text file.
    My requirement is to find all the files for those numbers, irrespective of the 001, 002 suffixes, and copy them into a new folder,
    so that I get the 20 main files plus their associated files (001, 002, etc.).
    Kindly help! Thanks in advance!

    Is your Excel file on a location outside your machine's disk? If so, try copying to local disk.
    Does the Excel file have any protection applied to it? If so, try removing the protection
    If these don't work, can you post a sample file that demonstrates the problem.
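The copy-by-list requirement in this thread can be scripted. The sketch below is one possible approach under stated assumptions: the Excel column is assumed to have been saved as a plain text file with one number per line, the leading number of each file name is assumed to be everything before the first "_" or "-", and the names `copy_matching`, `Selected` and `list.txt` are made up for illustration.

```python
import shutil
from pathlib import Path

def copy_matching(source, dest, numbers):
    """Copy files whose leading number (before the first '_' or '-') is listed."""
    source, dest = Path(source), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    wanted = {n.strip() for n in numbers if n.strip()}
    copied = []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue
        # Treat '-' like '_' so "11111111-001.01" also yields "11111111".
        leading = path.name.replace("-", "_").split("_", 1)[0]
        if leading in wanted:
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied

if __name__ == "__main__":
    # Hypothetical names: "list.txt" holds one number per line.
    numbers = Path("list.txt").read_text().splitlines()
    print(copy_matching("Photos", "Selected", numbers))
```

Matching on the whole leading number rather than on a simple "starts with" test keeps 11111111 from also picking up a file such as 111111110_001.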

  • How to highlight searched words in a PDF file opened from the web?

    Hello fellows
    I need to create a web page that holds a couple of controls, including a "search" button. When the client clicks the button, the program will search a PDF file and open it in the client's reader with the searched words highlighted.
    Is there any possibility of doing this using JavaScript on the client, or C# or VB.NET on the server?
    I'm very new to PDF development, so any ideas or conclusions would be very useful!

    Thank you for the answer.
    But maybe you have some idea how I can do it?
    This is a very important feature for our project.
    Some additional information: our project will be hosted in SharePoint 2007.
    When I use the Adobe IFilter for MOSS and set #search="searchword" in the query string, I get highlights, but only in a browser.
    If the reader was configured to open in the client application, this feature did not work.
    Any suggestions?
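The #search="searchword" form mentioned in this thread can be generated on the server side before the link is sent to the client. The sketch below only builds such a link: the function name `pdf_search_url` and the example.com address are made up, and the character check is a precaution chosen for this sketch rather than a documented rule. As the post reports, the highlighting was seen only when the PDF opened inside the browser, so this does not address the standalone reader case.

```python
def pdf_search_url(pdf_url, words):
    """Append a #search="..." fragment in the form quoted in the post above."""
    for word in words:
        # Precaution for this sketch: refuse empty words and characters that
        # would break the quoted, space-separated list or start a new fragment.
        if not word or any(c in word for c in '"#& '):
            raise ValueError(f"unsupported search word: {word!r}")
    return pdf_url + '#search="' + " ".join(words) + '"'

# Hypothetical document address, for illustration only.
print(pdf_search_url("http://example.com/docs/manual.pdf", ["invoice", "total"]))
# prints http://example.com/docs/manual.pdf#search="invoice total"
```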
