Acrobat Pro 9 - ClearScan OCR

Hi folks,
A colleague has Acrobat Pro 9.3.3 (I am on Pro 8, so no issue), and when he runs the ClearScan OCR engine it creates the hidden text but for some reason deletes the scanned text from the image... certainly a "clear scan"! Is this a known bug with Pro 9, or is there a fix to get the text back in the image?
I look forward to some thoughts on this issue.
John

Bill,
I am attaching a page from the document. I have marked where the text
has been deleted. If I examine the document, the hidden text can be
seen. If I copy all and paste into TextPad, the text can also be seen!
Lastly, I used the TouchUp Text tool to edit the PDF on the blank bits, and when
I changed the font to Times Roman the text appeared.
Any thoughts? There is hidden text... I think.
John

Similar Messages

  • Acrobat Pro ClearScan OCR Crash

    I have been using Acrobat Pro 9 to scan and OCR books. It has worked very well for thousands of pages. I have now come across a set of images that it will not OCR. Visually I don't see any difference between them and hundreds of others that I have OCRed. If I load the image into Acrobat Pro, select the option to OCR it and create a ClearScan document, Acrobat just crashes. No error message. It is a 600 dpi image, produced in the same run as hundreds of others that work. I am scanning at 600 dpi because it actually does make a difference in the quality of the output...
    There is a related issue. I have experimented with producing the ClearScan PDFs in two ways: (1) create ClearScan PDFs of each image and then combine the PDFs; (2) load all the images into Acrobat and create a single ClearScan PDF from them. When I use procedure 2, the final PDF is more than half the size of that produced by procedure 1, but the image quality is better with procedure 1. The main difference is character spacing, which is off significantly on many words with procedure 2. I am curious as to why.
    Any help, especially on the first question, would be appreciated.
    Thanks,
    Mike

    Is there any way that I can view and correct ALL of the text from the text recognition process?
    Select the text with the TouchUp Text tool.
    Change the color of the text.
    Edit the text.
    Change the color of the text.

  • Acrobat Pro 9.5.0 crashes on ClearScan OCR but not Searchable Image OCR

    I have Windows and Mac copies of Acrobat Pro 9.5 running on my computer (both version 9.5.0). My computer is an iMac running Mac OS X 10.7.3 (Lion) with the Parallels hypervisor, version 7.0.15055, installed, in which Windows XP is running. As of 11 April 2012, these are all the most recent versions of the respective software.
    When I try to apply OCR to a scanned image, and select ClearScan as the option, Acrobat Pro 9.5 (Mac) crashes but Acrobat Pro 9.5 (Windows) does not. When it crashes, Acrobat Pro 9.5 (Mac) produces a VERY long problem report which is sent back to Adobe.
    In order to apply OCR to a scanned image in Acrobat Pro 9.5 (Mac), I am forced to change the OCR option from ClearScan to Searchable Image, but this results in a larger file.
    I am aware of the thread at http://forums.adobe.com/thread/920032#4000947 but this has not helped, because I have reformatted the hard drive of my Mac (wiping out all remnants of previous versions of Acrobat), and reinstalled Acrobat Pro 9.5, but the problem persists.
    Can anyone explain the different behaviour and offer a solution?

    Acrobat 9 is not supported on Mac OS X 10.7, so there is not much we can do about it.
    To use OCR on Mac OS X 10.7, please use Searchable Image, or else upgrade to Acrobat 10.

  • Acrobat Pro 10.1.9 stopped being able to OCR scans

    Acrobat Pro 10.1.9 stopped being able to OCR scans.
    I am running Windows 7 (64-bit) and I have OCRed several hundred pages of text over the last 5 months, but recently I get "Some of the pages couldn't be OCRed" and "Acrobat could not perform recognition (OCR) on this page because: Unknown error".
    I've tried "Repair Acrobat Installation" 3 times with no improvement.
    Any ideas on how to fix?

    Hi,
    Can you please share the PDF file? Also, can you please try with the latest version, 10.1.10?
    Thanks & Regards
    Sachin Soni
    Adobe Systems

  • ClearScan OCR in Acrobat 9 deletes portions of text

    I am experimenting with book scanning using a digital camera and various software including Acrobat 9.  I discovered that in some cases, when I perform OCR with ClearScan, apparently random portions of the text in the scanned PDF image are deleted.  Sample1.jpg shows a page before ClearScan OCR, and Sample2.jpg shows it after.  As you can see some of the text has been deleted.  How could this happen?  And, how can it be prevented?

    I'm having the same issue with a scanned document that I am trying to convert to a ClearScan PDF. I have found a workaround, which involves selecting "Optimize Scanned PDF" and using the default setting with the dial set to "high quality" before performing the OCR. This seems to stop chunks of words from disappearing (at least I haven't found any cases), but it has the horrible consequence of dramatically increasing the file size when performing OCR with ClearScan. In a small file this may not matter much, but in my case it makes a 10MB PDF grow to 70MB.
    Can anyone confirm if this also happens on Acrobat X?
    From what I can tell, there are no preferences to control the OCR besides the downsampling. Specifically, I would really appreciate it if anyone knew a way to reduce the size of the fonts that are generated by the ClearScan OCR as they account for over 95% of the size of the file.

  • Why can't I interact with Acrobat Pro XI while it is performing OCR on a set of large files?

    I am running Acrobat Pro XI on a beefy MacBook Pro with lots of RAM and CPU to spare. I need to perform OCR on hundreds of pages of text spread across dozens of files. That understandably is time consuming and is not perfectly parallelizable.
    However, why am I prevented from doing *anything* with Acrobat while the OCR process runs?
    Two suggestions:
    1. Allow the user to perform OCR on discrete files in parallel. (I realize this will consume additional CPU and memory.)
    2. More urgently, allow the user to access non-OCR features of the application while OCR is in progress.

    Maybe it should be 64-bit and have multi-threading capabilities. I understand the problems associated with that, though, and I won't hold my breath.

  • I have just installed Adobe Acrobat Pro XI for Mac. I am not finding where to OCR a document.

    How do I OCR in Adobe Acrobat Pro XI for Mac?

    Hello Treacie,
    You might need to launch Acrobat, go to Tools > Text Recognition, and choose the 'In This File' option to run OCR within the document.
    Regards,
    Anubha

  • When I OCR two versions of the same document and then compare the documents in Acrobat Pro XI, I usually get the message that there are no changes to mark.  However, I know there are quite a few changes.  I raised this question more than a year ago

    When I OCR two versions of the same document and then compare the documents in Acrobat Pro XI, I usually get the message that there are no changes to mark.  However, I know there are quite a few changes.  I raised this question more than a year ago, and the response I received had to do with the quality of the OCR and the scans of the documents.  However, if I use Acrobat Pro XI to save the same documents in Word and then run a comparison in Word, all of the changes are marked.  When a PDF is saved as a Word document in Acrobat Pro XI, is a different OCR module being used than the one used in Acrobat Pro XI for text recognition?

    OCR is only for recognition of the image/picture of text provided by a scanner.
    Content typed into a Word file which is converted to a PDF is (in Word and in PDF) *not* an image or picture of text; it is digital text. So no OCR is involved.
    When the "digital" (renderable) text of a PDF's page content is exported to Word, no OCR is involved.
    When a PDF's content comes from the image output of a scanner, and that image is a picture of text, then OCR comes into play.
    If this content is exported to Word before OCR is done, then it is the image that is exported to the Word file.
    Once OCR is performed, it is the OCR output that is exported.
    OCR output is (and always will be) affected by "the quality of the OCR and the scans of the documents".
    Regardless, "Compare" is based on a Word file output to PDF1, then edits to the Word file, followed by an output to PDF2. You then use Acrobat Pro to compare PDF1 and PDF2.
    Paper 1 scanned to image 1, placed in PDF1 and given OCR 1, and
    Paper 2 scanned to image 2, placed in PDF2 and given OCR 2,
    can certainly be processed with Acrobat Pro's Compare.
    But, well, you've described what can be observed.
    Be well...
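
For anyone who wants to check the two OCR results outside Acrobat, here is a minimal sketch of a word-level comparison. It assumes the recognized text of each version has already been exported to plain text (for example via copy and paste); the function name `word_changes` is hypothetical, and this is not how Acrobat's Compare or Word's comparison works internally.

```python
import difflib


def word_changes(old_text, new_text):
    """Return (operation, old_words, new_words) for every span that differs.

    Splitting on whitespace means line breaks and extra spaces introduced
    by OCR are ignored; only the words themselves are compared.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


if __name__ == "__main__":
    # Illustrative strings only, not taken from any real document.
    version_1 = "The fee is 100\ndollars per page."
    version_2 = "The  fee is 150 dollars per page."
    for change in word_changes(version_1, version_2):
        print(change)
```

OCR misreads will show up as changes too, so the result is only as reliable as the scans.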

  • Recognize / OCR Thai pdf in Acrobat Pro 10.1.9

    I have wasted an hour trying to install a language pack and change my installation but have failed.
    I am not even sure it is possible on this version of Acrobat but I want Thai as a Primary OCR Language setting in order to recognize the text in a scanned PDF.  Please help?
    Kind regards
    John

    Thanks for your response.
    So Acrobat Pro can recognize certain languages (Hebrew, Chinese etc) and these are configured during the installation process but there are no 'language packs' I can install to OCR Thai?
    Any suggestions how one might recognise the text in order for it to be copied and pasted etc?

  • OCR Support for Vietnamese language in Acrobat Pro 9.1

    I am using Acrobat Pro 9.1 and need to OCR a previously scanned Vietnamese document.
    It seems that the default setup of Acrobat 9.1 doesn't support this. The question is: does Acrobat 9.1 support it at all?
    I have noticed the font pack (including Vietnamese support) for Acrobat Pro 8.x, but failed to find equivalent support for Pro 9+?
    Appreciate any ideas on this.

    Try printing to the Adobe PDF printer, the more fundamental process (PDF Maker is a preprocessor for the printer). If that does not work, then try with print-to-file selected. Open the file in Distiller and see if the PDF is created. If the latter happens, then check for AcroTray running in the background. It is required to automate the process and is needed by PDF Maker.

  • I created a PDF from hard copy using my scanner, then ran OCR on it in Acrobat Pro 9, but Read Out Loud reports that the page is empty

    I created a PDF from hard copy using my scanner, and then ran the OCR option on it in Acrobat Pro 9. But when I turn on the Read Out Loud option in Acrobat, an error message comes up saying the page is empty. Could you please help me sort out this problem?

    So I tried generating the same PDFs on two other computers that have Acrobat 9 Pro.  Results were reproduced.  The verdict is:
    - complex PDF files (that is, containing cross-references, tables of contents, and bookmarks) generated by Acrobat 9.x Pro are roughly 2-5x larger than the identical file generated with Acrobat 8.x Pro.
    - different PDF conversion settings make a negligible difference (less than 10% rather than 70-80%).
    - using the "Reduce File Size" or "Optimize PDF" option cuts the file size roughly in half, almost always resulting in an "image downsampling mask" warning message, which requires acknowledgement (that is a problem for batch processing or automation).
    - adding an Acrobat watermark to the file cuts the file size roughly in half.
    - just using Save As to another filename has no effect on file size.
    - generating the PDF in Acrobat 9 with links but no PDF bookmarks still results in the inflated file size.
    - generating the PDF in Acrobat 9 without any links or bookmarks results in approximately the same file size as the Acrobat 8 PDF with full links and bookmarks.
    It appears that Acrobat 9's manner of adding links is what's bloating the files, and in my case it's probably not related to images or image resolution/print quality. It's a shame, because Acrobat 9 seems to have made some improvements to the Review Tracker interface, and a few other bells and whistles which I haven't really gotten around to exploring yet. But unless I find a way to keep my links and the PDF file sizes comparable to what I was getting with Acrobat 8 Pro, it looks like I'm going to stay with Acrobat 8.

  • OCR not working with Acrobat Pro X

    I have Acrobat Pro X ver 10.1.9 on Windows 7 Pro (64-bit), running in a virtual environment (Parallels ver 9.0) on a MacBook Pro. I have 16 GB of system memory and 6 CPUs dedicated to the virtual environment. For some reason: 1. I cannot select renderable text with my cursor in a PDF containing such text (I just see the cursor turn into a hand symbol, as if to move the image around); and 2. I can run Acrobat's OCR on a recently scanned PDF, but once again I cannot select the text (the same hand symbol appears). I went to Control Panel - Programs and selected "Repair" for the Acrobat program, but this was no help. What I don't understand is that before installing Acrobat Pro X on this computer I had it installed on a PC running Windows 7 Pro (32-bit) and everything worked fine. Any ideas?  WR
    02/10/14 Update: Spoke with Adobe Tech Support and they informed me the problem I am having with the windows version of Acrobat Pro X above is caused by running the program in the Parallels virtual environment on a Mac. This is odd as the purpose of running Parallels on the Mac is so that one can operate the Windows OS and programs on a Mac. They suggested I purchase the Pro XI version for Mac.

    "Copy C:\ProgramFiles (x86)\adobe\acrobat 9.0\acrobat\plug-ins\PaperCapture\* to the parent directory C:\ProgramFiles (x86)\adobe\acrobat 9.0\acrobat\plug-ins
    For Acrobat X, the path would be acrobat 10.0."
    IT WORKS!!! Hey, thanks dodland, now all I have to do is get my CS6 MC installer to take less than X hours to install just a version of Acrobat X that will recognize my MC serial number! I installed a version of Acrobat 10 from the MC extracted files and ran it as a trial; no go on the OCR. Then I did your copy suggestion and all is well. I've copied this fix as a text file to live with my MC install files. Now I just have to get an activated version running.
    Thanks again, as I was considering wiping the entire drive just to fix this single issue and that would not be fun...
    TLL
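
The copy step quoted above can also be scripted. This is a minimal sketch, assuming the folder layout given in the post (written here with the standard Windows "Program Files (x86)" spelling); the helper name `copy_to_parent` is hypothetical. It overwrites files of the same name in the parent folder and will likely need administrator rights, so back up the plug-ins folder first.

```python
import shutil
from pathlib import Path


def copy_to_parent(subfolder):
    """Copy every file in `subfolder` into its parent directory.

    Subdirectories are skipped. Returns the names of the copied files.
    """
    source = Path(subfolder)
    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            shutil.copy2(item, source.parent / item.name)
            copied.append(item.name)
    return copied


if __name__ == "__main__":
    # Assumed path, based on the post above; use "Acrobat 10.0" for Acrobat X.
    paper_capture = Path(
        r"C:\Program Files (x86)\Adobe\Acrobat 9.0\Acrobat\plug-ins\PaperCapture"
    )
    if paper_capture.is_dir():
        print(copy_to_parent(paper_capture))
    else:
        print("PaperCapture folder not found; nothing copied.")
```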

  • OCR error in Acrobat Pro 8

    Hello,
    I am getting an error in Acrobat Pro 8 when I scan w/OCR option enabled. The software basically crashes and closes.
    Here is the scenario:
    We have Fujitsu color scanners running TWAIN drivers. The 200+ page documents seem to scan okay, but when the OCR process starts it hangs on what look like charts or graphs in the document. This is consistent across several computers, but it does not happen on all charts or graphs. The strange part is that if I deselect the OCR option, scan the document, save it to disk, and then run the OCR from the Document menu, it works fine. Any clues?
    Thanks in advance,
    -Robert

    Probably Acrobat and/or the scanning software is running out of resources. It could be RAM. It could be hard disk space. Doing it in two steps seems like the way to go, even if you would prefer it to be one step.
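
One way to rule out the disk-space guess before a long scan-and-OCR run is a quick free-space check. This is only a sketch: the 2 GB threshold is an arbitrary assumption, not a documented Acrobat requirement, and the function name is hypothetical. It says nothing about RAM.

```python
import shutil


def has_free_space(path, required_bytes):
    """Return True if the volume holding `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Assumed threshold for illustration only.
    needed = 2 * 1024**3
    if has_free_space(".", needed):
        print("Enough free disk space for the job.")
    else:
        print("Low on disk space; free some up before scanning.")
```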

  • OCR of mixed content with Acrobat Pro 9

    Hi,
    I've heard that it should be possible to perform OCR on documents consisting of mixed content using Acrobat Pro 9. By mixed content I mean, e.g., a header made up of text and a body which is a scanned document (e.g. a JPEG file). I tried to carry out this process in Acrobat Pro 8, but it just returned an error message. Thus, I decided to upgrade to Acrobat Pro 9, but this did not solve the problem. The same error message appears, and the body of the document is therefore not OCRed.
    The error message is the following: "Acrobat could not perform optical character recognition on this page because: " .. 
    Has anybody tried to OCR mixed content and found a solution to this problem/challenge?
    Best regards,
    Andreas

    Successful OCR depends upon the quality of the scan. A nice 300 dpi TIFF file has a good chance of being OCRed by any half-decent OCR program. When Acrobat imports a 300 dpi TIFF, it will be able to OCR it if the quality is good. Take the same TIFF file and convert it to a JPG and all bets are off. JPG is a lossy format. It is designed for photographs, not line graphics or text. Upon conversion to JPG, many of the edges of the lines and boundaries of the letters will become fuzzy. It is the boundaries of the characters that an OCR program uses to figure out which letter is which. Hence, the JPG will not easily be OCRed. It's not impossible, but it is a much harder job.

  • Acrobat Pro always on top during batch OCR ! ;(

    Hello,
    Do you know a way to change the default behaviour of Acrobat Pro 11? It always comes back on top each time there is a new PDF file to OCR during a batch OCR process.
    This is very annoying, as each time I have to click on the other window I was working in!
    Example: I was reading something in Firefox, then the Acrobat Pro window appears (too bad!) and each time I need to click on Firefox to get back to what I was reading!
    Thanks in advance
    Win7 64bits Acrobat Pro 11, tools/action wizard using only : OCR and Save .

    Hi,
    There are 2 resolutions for this issue:
    1. Uncheck the "Prompt User" checkbox while creating the Action.
    2. Use the Tools -> Text Recognition -> In Multiple Files option.
    Please use either of them.
    Thanks,
    Ankit
