OCR in Acrobat 9

Hi Guys
I've invested in CS5 Master Collection, with Acrobat Pro 9. I have a bunch of manuscript notes that I've scanned and keep on my laptop for business meetings. Will Acrobat Pro 9 be able to convert manuscript into text, like one of the IRIS programmes? If so, how accurate is it?
Regards,
Graham

Whenever I open a new PDF in Acrobat 9.0 Pro, the program seems to scan it to create an OCR version. It is annoying when time is short - I usually hit <Cancel>, but how can I permanently turn this feature off, or can I?
Frank

Similar Messages

  • BUG in OCR in Acrobat 9.3 on MRC (Mixed Raster Content) PDFs

    I have lots of (non-searchable) PDFs, that were generated from scanned images (1 image per page @ 400 dpi) using LuraDocument from www.luratech.de.
    The images are stored internally in the PDFs as MRC (Mixed Raster Content), that means, the PDF contains a foreground and a background layer for every image/page. These layers have a low resolution, are highly compressed and merged together by Acrobat (while displaying) using a high resolution mask layer. This results in very low file sizes, about 50 kB / page.
    I'd like to make these PDFs searchable, but WITHOUT manipulating or changing the original image layers in any way. OCR software like FineReader or OmniPage always seems to re-encode images with its own algorithms, so the image quality would suffer from the conversion and the size of the PDFs would rise significantly. Acrobat, by contrast, offers to keep the original image layers by using the output style "Searchable Image (Exact)" in the OCR window. Now the problem:
    After starting OCR, Acrobat applies OCR only to the first page (with good results) and deletes (!) all content (the image layers) on all other pages. To me this looks like a bug.
    I tried a workaround: in Acrobat's Layers panel I chose the menu option "Flatten Layers". When I start OCR now, Acrobat does OCR on all pages of the PDF, but the OCR result is a disaster, less than 10% correct. Presumably Acrobat does not take the resulting (actually displayed) page content as input for its OCR, nor the high-resolution mask layer, but instead one of the (low-resolution, highly compressed) image layers described above.
    Has anyone had similar experiences with MRC-compressed PDFs, e.g. PDFs generated by other MRC generators like JRAPublish? Is there any workaround or bugfix?
    Thank you in advance!
    L.Benic

    I'm having the same issues, using the latest version, 9.3.3. Is this a bug? I tried calling Adobe, but their customer service seems to be outsourced overseas only. Can anyone shed light on this issue?
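
    The MRC layer model described at the top of this thread (low-resolution foreground and background layers, selected per pixel by a high-resolution mask) can be sketched roughly as below. This is only an illustration of the general idea, not LuraDocument's or Acrobat's actual implementation; the function names and the nearest-neighbour upsampling are assumptions made for the example.

```python
def upsample(layer, factor):
    """Nearest-neighbour upsampling of a 2-D list of pixels by an integer factor."""
    out = []
    for row in layer:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def composite_mrc(foreground, background, mask, factor):
    """Merge two low-resolution layers using a high-resolution binary mask.

    Where the mask is 1 the foreground colour is shown, elsewhere the
    background. The layers are assumed to be `factor` times smaller than
    the mask in both dimensions (hypothetical layout for this sketch).
    """
    fg = upsample(foreground, factor)
    bg = upsample(background, factor)
    return [
        [fg[y][x] if mask[y][x] else bg[y][x] for x in range(len(mask[y]))]
        for y in range(len(mask))
    ]


# One-pixel layers (black text colour, white paper) and a 2x2 mask.
page = composite_mrc([[0]], [[255]], [[1, 0], [0, 1]], factor=2)
```

    In this model the letter shapes live in the mask, not in the colour layers, which would be consistent with the poor results reported when OCR appears to read only one of the low-resolution layers.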

  • Disable Auto-Rotation in OCR Process (Acrobat 9 Pro)

    Is there a way to disable the auto-rotation step as part of the OCR process? I have documents which I assemble from images of tables. (If I had access to the text of the tables, I wouldn't be using images of them.) I place each screenshot in the correct orientation, but when I have Acrobat 9 Pro OCR the document, it auto-rotates many pages into the wrong orientation (some 90° clockwise, others 90° counterclockwise and some 180°). Since Acrobat is horribly confused by what it is seeing, I need to disable the auto-rotation part of the process so that it will OCR what it finds exactly as it finds it.

    Very likely the pages of interest have at least one character of renderable text. This might not be readily seen by you or me but it'd be there.
    Acrobat's OCR will not process a page that has any renderable text.
    Be well...

  • Applescript auto-OCR with Acrobat 9

    on adding folder items to thisFolder after receiving addedItems
         tell application "Adobe Acrobat Pro" to activate
         tell application "System Events" to tell process "Acrobat"
              click the menu item "Recognize Text in Multiple Files Using OCR..." of menu 1 of menu item "OCR Text Recognition" of the menu "Document" of menu bar 1
              -- Here I need help. The following commands won't work.
              -- click button "Add Folders..." of window "Paper Capture Multiple Files" isn't working.
              -- After that I have no idea how to select a folder
              click button "Choose" of window "All supported files under selected folder and subfolders will be added"
              click button "OK" of window "Paper Capture Multiple Files"
              click button "OK" of window "Output Options"
              click button "OK" of window "Recognize Text - Settings"
         end tell
    end adding folder items to

  • ClearScan OCR in Acrobat 9 deletes portions of text

    I am experimenting with book scanning using a digital camera and various software including Acrobat 9.  I discovered that in some cases, when I perform OCR with ClearScan, apparently random portions of the text in the scanned PDF image are deleted.  Sample1.jpg shows a page before ClearScan OCR, and Sample2.jpg shows it after.  As you can see some of the text has been deleted.  How could this happen?  And, how can it be prevented?

    I'm having the same issue on a document I have a scan of and am trying to convert to a ClearScan PDF. I have found a workaround to the problem, which involves selecting "Optimize Scanned PDF" and using the default setting with the dial set to "high quality" before performing the OCR. This seems to stop chunks of words from disappearing (at least I haven't found any cases), but it has the horrible consequence of dramatically increasing the file size when performing OCR with ClearScan. In a small file this may not matter much, but in my case it makes a 10 MB PDF increase to 70 MB.
    Can anyone confirm if this also happens on Acrobat X?
    From what I can tell, there are no preferences to control the OCR besides the downsampling. Specifically, I would really appreciate it if anyone knew a way to reduce the size of the fonts that are generated by the ClearScan OCR as they account for over 95% of the size of the file.

  • OCR on Acrobat 9 Pro not working - will not explain why.

    I ran the OCR function on multiple 500-page PDF documents for an upcoming project. For the most part, the OCR worked. However, OCR DID NOT work on specific pages within the documents (would usually happen once or twice every few thousand pages). I went back through the PDF documents to try and run the OCR on the individual pages that did not work the first time, and the screen showed the following error: "Acrobat could not perform recognition (OCR) on this page because:". There was nothing else in the error box - no explanation as to why OCR would not work on those individual pages.
    Has anyone else had luck with this problem? I would really appreciate any help I can get.

    Very likely the pages of interest have at least one character of renderable text. This might not be readily seen by you or me but it'd be there.
    Acrobat's OCR will not process a page that has any renderable text.
    Be well...
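
    To look for the kind of stray renderable text this reply describes, one rough approach is to scan a page's decoded content stream for the PDF text-showing operators (Tj, TJ, ' and "). The sketch below assumes you have already extracted and decompressed the content stream with some other tool; `has_text_operators` is a hypothetical helper name, and the regular expression is only a heuristic that can be fooled by operator-like bytes inside strings or inline images.

```python
import re

# A text-showing operator follows a string "(...)", a hex string "<...>"
# or an array "[...]" operand, so look for the closing delimiter first.
_TEXT_SHOW = re.compile(rb"[)\]>]\s*(?:Tj|TJ|'|\")(?=\s|$)")


def has_text_operators(content: bytes) -> bool:
    """Return True if the decoded content stream appears to draw any text."""
    return _TEXT_SHOW.search(content) is not None


# A page that only paints a scanned image versus one that also shows text.
image_only = b"q 612 0 0 792 0 0 cm /Im0 Do Q"
with_text = b"q 612 0 0 792 0 0 cm /Im0 Do Q BT /F1 12 Tf (x) Tj ET"
```

    A page for which this returns True may be one that Acrobat's OCR declines to process, though the check does not tell you whether the text is visible.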

  • A way to undo Formatted Text & Graphics OCR from Acrobat 7?

    Over the course of a few months, my company received a large number of PDF files for a project for which the internal policy was that every file should be text searchable.  Unfortunately, we did not save the native files in any sort of convenient way, having at that time not realized that failing to do so was a very bad idea.  We ran OCR on every one of the files that we received, which total approximately 4,000.  At the time that we received the majority of these files, my company was still using Acrobat 7; we've since upgraded to version 8.
    Recently we discovered that there were discrepancies between our electronic copies and the hard copy printouts from which our electronic copies had been generated: in the electronic copies, uppercase F had changed to P, S had changed to 8, etc. We eventually worked out that at some point a computer must have been mistakenly set to run OCR using the Formatted Text & Graphics setting, as opposed to either Searchable Image or Searchable Image (Exact). This was absolutely not what we wanted, as for our purposes using a type of OCR that causes the original images to change essentially renders the files useless. My questions, then, are the following:
    1)  As I asked in the title, is there any way of undoing Formatted Text & Graphics OCR that was performed in Acrobat 7?
    2)  Is there a way of identifying files that have had Formatted Text & Graphics OCR performed on them (something stored in the metadata)?
    Rebuilding these files from scratch is going to require a gargantuan effort, so any help would be much appreciated.

    Hi,
    Bernd's been across the mountain and seen the bear; so, you can bank on what he posted.
    But, just because, I'll second his "no".
    Formatted Text and Graphics (Acrobat 7, 8) and ClearScan (Acrobat 9, X) effectively replace the image of textual characters.
    If a character is not recognized as 'something', a bitmap of the thing is left behind.
    Now, while Acrobat or other OCR engines (ABBYY FineReader, AdLib, Adobe Capture, etc.) are really rather impressive, no OCR engine has 100% accuracy 100% of the time. Other variables come into play: scan lamp age/brightness, platen cleanliness, scanner mechanicals cleanliness, calibration of scanner, and hard copy 'quality' (characters' darkness density, contrast between characters and background, presence or lack thereof of boxed-in text, text in or adjacent to line arcs/circles, etc.).
    All of that is for semantic content that is "textual". Semantic content that is not textual (but, coincidentally, may contain text) provides little to no useful OCR output (e.g., graphs, drawings, etc.). Validate this by performing OCR on such a PDF, then Export to a plain text file. Print this file out and compare that to the source paper or the scanned image.
    There is no metadata info that identifies the OCR mode used.
    Perhaps something is buried in the bowels of the PDF page description content; if so, it is not intrinsically easy to obtain.
    My suggestion (fwiw) - move forward with re-scan.
    A server product would help to move it along, but a high speed scanner hooked to a local machine (with ample resources) and Acrobat Pro 8 or 9 will get it done. With Acrobat 8 or 9 use Searchable Image (Exact). In Preferences check the category Create PDF or TIFF to assure it is what you desire. Check Acrobat's scan presets to assure you have what you want vis-a-vis Compression and Filtering. Do avoid "Automatic".
    Be well...
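
    The comparison suggested in this reply (export the OCR result to plain text and check it against the source) can be partly automated when a trusted transcription of a page exists. The sketch below uses Python's standard `difflib` to count one-for-one character substitutions such as the F/P and S/8 swaps reported in the question; `substitution_counts` is a hypothetical helper, and it assumes both texts are already available as strings.

```python
from collections import Counter
from difflib import SequenceMatcher


def substitution_counts(reference: str, ocr_text: str) -> Counter:
    """Count (expected, recognized) character pairs where OCR swapped one
    character for another. Insertions and deletions are ignored."""
    counts = Counter()
    matcher = SequenceMatcher(None, reference, ocr_text, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only equal-length replacements can be paired up character by character.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for expected, got in zip(reference[i1:i2], ocr_text[j1:j2]):
                counts[(expected, got)] += 1
    return counts


swaps = substitution_counts("FILE SIZE", "PILE 8IZE")
```

    Frequent pairs in the result point to systematic misrecognitions; this does not identify which OCR mode was used, it only measures how far the text has drifted from the original.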

  • OCR'd Acrobat 8 pdf's not compatible with Preview? (s p a c e s in words)

    I recently came across an odd problem. When I use Acrobat 8's OCR function to turn an image pdf into a readable pdf, I get results that, at first, look fine: selecting text works, copy/pasting text from acrobat to other apps also works fine.
    However, when I open this PDF in Apple's Preview, then select text, then copy/paste it, I get garbled text: words with spaces between each character, etc.
    This problem makes PDFs almost completely unusable to me. I cannot copy/paste bits of a PDF, and I can't index the documents (either through Spotlight, or another app I use, DEVONthink).
    Now, saving as Acrobat 7 does not solve this problem. Actually using Acrobat 7 does work, however. I don't want to resort to that because Acrobat 7 is quite a bit slower (due to it being for PowerPC) and this is particularly noticeable during something as memory-intensive as OCR.
    Anyway, I can imagine this being a problem that needs to be fixed on the Apple side rather than the Adobe side. Furthermore, the more people start producing OCR'd PDF files in Acrobat 8, the bigger this problem will become.
    If anyone can help me find a way to save 'Preview-compatible' PDFs from within Acrobat 8, that would at least be a temporary workable solution for me!

    I use PDFs I've created on my macs (both have 10.5.1) on several windows XP machines at work. Apple's PDF solutions are different from Adobe, so the compatibility does not have to do with the availability of Adobe software. Still, one option would be to get "Adobe PDF" installed as a printer (not sure where it comes from -- maybe Reader, or with the CS3 suite) and then use that to print to PDF instead of using Apple's system technology. Still, it should be concerning that the system's PDFs are not being created properly. Do the PDFs open properly on other Macs? Have you tried multiple PCs?

  • Why so big files in the OCRs from Acrobat X?

    I used Acrobat 8 for many years. Now I thought it might be time to upgrade, and I'm testing Acrobat X. My main use for Acrobat is to scan my own books (photocopy + ADF scan + PDF) and run OCR on them (usually as an exact copy). The reason is that I manage maybe 4,000 or 5,000 books and articles, and in my work (history and genealogy) it is useful to search the content directly.
    With Acrobat 8, this was done one by one, with a certain accuracy.
    With Acrobat X, I tried OCR and noticed three big differences:
    1st: The OCR is far better than in version 8. In a document where a search after OCR used to find, e.g., 50 occurrences of a word, Acrobat X OCR now gives me 150 occurrences of the same word. Nice!!
    2nd: Now I can do batch OCR. This means that I can leave OCR running on a list of 10 documents the whole night, or the whole day while I am at the office, working through them one after the other... Nice!!
    I thought I had touched the sky... At last I could OCR my whole library, and all the books downloaded from archive.org and similar sites!
    But all this is nothing compared to:
    3rd: The resulting size of the OCRed files is impossible to manage. A book of 82 MB went to 538 MB... At a minimum, files double in size. With Acrobat 8, they went up maybe 10 or 20%, but never more than double as with the new Acrobat X. I still have 400 GB to OCR... I can't let this skyrocket to 1 or 2 TB...
    I tried Searchable Image, Searchable Image (Exact) and ClearScan (no difference). I tried 600, 300 and 75 dpi. Even the 75 dpi output was bigger, and illegible, although at 75 dpi viewed at 100% it should at least appear clear on screen.
    I'm not sure whether I should upgrade to Acrobat X. Can someone help me? Could someone recommend a specific commercial program for OCR on PDFs that doesn't bloat the resulting documents this way? I can't believe that adding only text can do this.
    Thank you very much,
    Martin

  • Is there any API for OCR in the Acrobat SDK?

    I want to use OCR to handle a PDF file in my application, but I can't find a useful API in the Acrobat SDK. Can anybody help me?
    Thanks

    This should answer your question : http://forums.adobe.com/thread/1096057
    Thanks
    Varinder

  • OCR in Acrobat

    Hello,
      I need to be able to OCR PDFs. Can I use the Acrobat engine or any other DLL? What are your experiences or suggestions?
    Thanks,
    Sow

    Acrobat certainly supports OCR and can be used ON THE DESKTOP in an automated fashion.

  • OCR with Acrobat 8.2.3 occasionally stops with error (5) cannot access paper capture service

    After I updated to Acrobat 8.2.3 (no other changes in my computer software), OCR recognition occasionally stops with an error that reports the paper capture service is not accessible/available. This is not a fatal error because OCR can continue in the document, but the page is not fully processed. The error appears to occur after decomposing and before recognizing text. This problem did not happen with 8.2.2. I have two computers with 8.2.3 and this happens on both computers. I would be happy to downgrade to 8.2.2 until Adobe fixes this problem, but apparently Adobe does not allow removing the 8.2.3 update.
    I tried opening the PDF directly from the Acrobat Open menu as someone had suggested in another forum, but the problem persists.
    I use this OCR capability almost daily on large (>100 page) files so I need at least a temporary solution right away.  Any suggestions?

  • How to OCR using Acrobat SDK

    Hi All,
    I am a C# .NET developer, new to the Acrobat Pro X SDK.
    I need to know how to achieve OCR extraction using the Acrobat SDK.
    Please provide any link or samples to start with.
    I searched the web and was not able to find any specific samples or documentation on how to use the SDK.
    Thanks in Advance
    SRR

    Read this:
    http://forums.adobe.com/thread/676594?tstart=0

  • OCR in Acrobat 7 errors out. What about OCR in 8?

    We publish a magazine in InDesign CS2. Our pages consist of text generated in-house and mixed advertising (PDF, TIFF, PSD, etc.) on the same pages.
    We are attempting a digital edition with searchable text. Acrobat 7's OCR does not work with mixed text and graphics. When we convert the files to TIFF or JPEG format, it errors out with this message: "Unable to process the page because the Paper Capture recognition service unexpectedly terminated."
    We do not want to switch all our machines to Acrobat 8, although we might consider a dedicated machine.
    Does anyone have strong successful experience making searchable text from mixed graphic and text pages with Acrobat 8? Any other suggestions with either 7 or 8?

    > if I "save as" to a tiff of 600ppi shouldn't I be able to bring those tiff files into a new pdf and ocr them?
    You should be able to save as TIFF files, make a new PDF from that.
    But if the resolution was too low, or the scans too poor, this won't
    improve anything. OCR needs images of a certain quality and detail:
    scaling up doesn't add detail.
    >
    >I know that if I print all 170 pages onto paper, I will be able to scan them into Acrobat and simultaneously ocr them.
    If this works, then something else is preventing it from working (not
    quality issues). Really, it seems to me that your Acrobat is broken,
    and it needs to be fixed. Did you try a repair?
    Aandi Inston

  • How can users who have Acrobat Reader only save scanned PDF files so that the text in them is searchable using Ctrl-F? I just use the Recognize Text with OCR feature in the full version of Acrobat and this seems to do the trick. Reader doesn't work!

    Our users have scanned PDF files they want to be able to search using Ctrl-F. I got them to be searchable by running Recognize Text Using OCR with Acrobat Professional version 8. They want to know if they can make the files searchable with Acrobat Reader only, or if they need the full Acrobat Professional software to make the files searchable.
    Thanks for the help!!
    Ken K. - 2191

    To clarify a bit they need to have Adobe Acrobat, not Adobe Reader. Reader has not been associated with the Acrobat name for 3 or more versions. The process you are asking about is a creation process - the purpose of Acrobat - and NOT a reading feature.
