Some PDFs contain text (e.g. because they have been OCR'd) and some do not.

I have a document management system that full-text indexes all of our documents, but if a PDF is simply a picture, the full-text indexing is useless unless I run OCR on the PDF, in essence adding text.
To make PDFs full-text searchable I run an OCR process on an entire folder.  I'd rather not run that process on PDFs that ALREADY have embedded text.  Is there a way to identify whether a PDF has embedded text without opening it?

You can use a Preflight to check the PDFs via an Action (Acrobat X) or Batch Sequence (pre-Acrobat X).
The Preflight would use a Custom check.
For the check you could use "Can be mapped to Unicode".
To be searchable the PDF pages' glyphs must map to Unicode.
Similarly, you can create a Preflight Custom check that evaluates for "Invisible text objects".
"Invisible text objects" are text objects using text rendering mode 3 (invisible text).
Text rendering mode 3 (no glyph/font fill or stroke) is used for the output of OCR (Searchable Image and Searchable Image (Exact)).
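
If you would rather pre-filter the folder outside Acrobat, the same two ideas (does the page show any text at all, and is that text in rendering mode 3) can be roughly approximated with a script. What follows is only a minimal sketch in Python, not a Preflight equivalent. The function names (classify_pdf, pdfs_needing_ocr) are made up for illustration. It assumes content streams are either uncompressed or Flate-compressed, and it will miss encrypted files, object streams and other filters. It also does not check whether the glyphs actually map to Unicode.

```python
import re
import zlib
from pathlib import Path

# Everything between a "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A string or array operand followed by a text-showing operator (Tj, TJ, ', ").
SHOW_TEXT_RE = re.compile(rb"[)\]>]\s*(?:Tj|TJ|'|\")(?![A-Za-z])")
# "3 Tr" sets text rendering mode 3 (invisible text), as written by OCR.
INVISIBLE_RE = re.compile(rb"(?<![\d.])3\s+Tr\b")


def decoded_streams(pdf_bytes):
    """Yield stream bodies, Flate-decoded where possible (heuristic)."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes before "endstream".
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate data. Keep it only if it looks like plain text, so that
            # binary image data is not scanned for operators by accident.
            if raw.isascii():
                yield raw


def classify_pdf(pdf_bytes):
    """Report whether any stream shows text, and whether any uses mode 3."""
    has_text = False
    has_invisible_text = False
    for data in decoded_streams(pdf_bytes):
        # Text operators only appear inside BT ... ET text objects.
        if b"BT" not in data or b"ET" not in data:
            continue
        if SHOW_TEXT_RE.search(data):
            has_text = True
            if INVISIBLE_RE.search(data):
                has_invisible_text = True
    return {"has_text": has_text, "has_invisible_text": has_invisible_text}


def pdfs_needing_ocr(folder):
    """List the PDFs in a folder in which no text-showing operator was found."""
    return [
        path
        for path in sorted(Path(folder).glob("*.pdf"))
        if not classify_pdf(path.read_bytes())["has_text"]
    ]
```

Calling pdfs_needing_ocr on the folder would give the candidates to hand to the OCR step. Treat the result as a first pass only; for anything the script cannot decode, the Preflight checks remain the more reliable test.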
Acrobat Pro has two out-of-the-box Preflights that may also be of interest.
One uses a fix-up to embed fonts; the other embeds fonts as well, including for text in rendering mode 3 (invisible text).
Be well...
