OCR: Acrobat 9 or Readiris?

Hi all,
I have to OCR many PDFs and make them searchable. I have both Acrobat 9 and Readiris 11. Does anyone know of any important differences between the OCR processes of the two programs, and which one is better to use?
Many thanks!

I think this is a matter of opinion. The drawback of using Designer is that you can no longer edit in Acrobat at all. Those of us who learned in Acrobat often favor the Acrobat route. In Designer you can import Word forms, and that might be useful at times; however, you cannot then edit the form in Acrobat. You can still open a form made in either one using Reader or Acrobat. If you want backward compatibility, you need to be careful about saving it as such and not adding capabilities that are new with Acrobat 9. Those capabilities will be lost in earlier versions.
The best way is to try it both ways and see what you think. When you buy Acrobat, Designer is part of the package, so you have the choice and can try both. Make a sample PDF for creating a form and then make two copies. You might even start out with a Word form. Discover the pros and cons yourself. I have never felt the need to go to Designer, and it seems that a lot of folks go there only because that is where the Forms menu sends them, not realizing they could have used the tools in Acrobat.

Similar Messages

  • OCR Acrobat 9

    I’d like to ask some questions about using Acrobat 9 Professional:
    - When OCR is performed, is it possible to choose a reference font, in the same way that you choose the language?
    - After running OCR on an image-only PDF, how can the hidden text be corrected?
    - Why is it impossible to run OCR on some pages, which instead give the message "the page contains text which can be submitted to rendering"?
    Can you point me to some sources I can consult? I haven’t found any answers in the guide.
    Many thanks

    It's strange that while I posted to this Adobe forum, there is a response over at objectmix.com. As contributing to this topic from 2 locations seems confusing, I'll carry on here.
    Amannagpal76 responded, saying in part that ClearScan in 9 Pro replaces Formatted Text & Graphics. Good to know this. ClearScan does, however, continue the mix. If ocr doesn't work on a character graphic, that graphic will continue to be displayed as such, amidst ClearScan's synthesized type 3 font imitation of the original font. This is most obvious when using the marquee zoom tool.
    Aman suggests using the Touchup Text Tool and changing the font to any font installed on one's system. This doesn't work for ClearScan. Selecting a different font in Touchup for a PDF that came via a wordprocessor works fine, but not for a PDF that came via a scan. That, unfortunately, is the only time that ClearScan is used. The error message when I try this states that there's no system font to match the one in ClearScan, and text can't be added or deleted.
    ClearScan is remarkable for the small size file it produces. That size can be reduced considerably even further by converting it to the Adobe 7 file format. ClearScan's synthesized font is also remarkable when enlarging the page on screen. Then you can see its true outlines -- rather chewed up in high magnification, but that's OK. It would be nice to extract the font in question and use it on one's system. One downside to ClearScan is that its ocr fails to retain italics when output to RTF and Word.
    I have never found an OCR suspect in 9 Pro.
    The conclusion from the above is that the hidden text produced by any OCR in 9 Pro can't be corrected.

  • BUG in OCR in Acrobat 9.3 on MRC (Mixed Raster Content) PDFs

    I have lots of (non-searchable) PDFs, that were generated from scanned images (1 image per page @ 400 dpi) using LuraDocument from www.luratech.de.
    The images are stored internally in the PDFs as MRC (Mixed Raster Content), that means, the PDF contains a foreground and a background layer for every image/page. These layers have a low resolution, are highly compressed and merged together by Acrobat (while displaying) using a high resolution mask layer. This results in very low file sizes, about 50 kB / page.
    I'd like to make these PDFs searchable, but WITHOUT manipulating or changing the original image layers in any way. OCR software like FineReader or Omnipage seems to store images always with own algorithms, so that the image quality would suffer from the conversion and the size of the PDFs would rise significantly. Acrobat on the contrary offers to maintain the original image(layer)s by using the output style "Searchable Image (Exact)" in the OCR window. Now the problem:
    After starting OCR, Acrobat applies OCR only at the first page (with good results) and deletes (!) all content (the image layers) on all other pages. For my eyes this seems to be a bug.
    I tried a workaround: In Acrobat's Layers Panel I choose the menu option "Flatten Layers". Starting an OCR now, Acrobat does OCR on all pages of the PDF, but the OCR result is a disaster, less than 10% correct. Presumably Acrobat does not take the resulting (actually displayed) page content as input for its OCR, nor the high resolution mask layer, but instead one of the (low resolution, highly compressed) image layers described above.
    Has anyone made similar experiences with MRC-compressed PDFs, e.g. PDFs generated by other MRC-Generators like JRAPublish ? Is there any workaround or bugfix ?
    Thank You in advance !
    L.Benic
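To put the roughly 50 kB per page quoted above in context, here is a rough back-of-the-envelope sketch of what one scanned page would occupy uncompressed. The page size (A4), the colour depth (24-bit) and 1 kB = 1000 bytes are assumptions for illustration; the post only states 400 dpi and about 50 kB per page.

```python
def uncompressed_bytes(width_mm, height_mm, dpi, bytes_per_pixel):
    """Size of an uncompressed raster for a page scanned at the given resolution."""
    width_px = round(width_mm / 25.4 * dpi)
    height_px = round(height_mm / 25.4 * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed: A4 page (210 x 297 mm), 24-bit colour (3 bytes per pixel).
raw = uncompressed_bytes(210, 297, 400, 3)

# Stated in the post: about 50 kB per page after MRC compression.
ratio = raw / 50_000

print(f"uncompressed: {raw / 1e6:.1f} MB per page, ratio about {ratio:.0f}:1")
```

Under these assumptions a page is around 46 MB raw, so the MRC files are compressed by a factor in the high hundreds. That helps explain why any tool that re-encodes the images with its own settings is likely to inflate the file.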

    I'm having the same issue, using the latest version, 9.3.3. Is this a bug? I tried calling Adobe, but their customer support sounds like an overseas call center only. Can anyone shed light on this issue?

  • Acrobat 9 crashes on OCR

    I've been trying to convert a batch of large PDF files to PDF searchable files by using the OCR of Acrobat. In the middle of a batch, a large (1000+ page) document crashes acrobat. I have narrowed it down to this image:
    http://img90.imageshack.us/img90/2418/badke2.png
    59,520 bytes
    When I convert it to PDF (File->Create PDF->From Single File) and then use Acrobat to "Document->OCR Text Recognize->Recognize Text using OCR", Acrobat always crashes.
    Is this true for anyone else that could try it?
    It kills my batch processing and is making this large conversion quite painful. Is there a way around it?
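A general way to keep one bad file from killing a whole batch, sketched in Python: run the OCR step for each file in its own process, so that a crash or hang on one file is recorded and the remaining files are still processed. This does not fix the crash itself, and it assumes some command-line OCR tool is available; `build_command` is a hypothetical hook that returns that tool's command line, and `demo_command` is only a stand-in that fails for file names containing "bad".

```python
import subprocess
import sys
from pathlib import Path


def run_batch(files, build_command, timeout_s=600):
    """Run one external command per file; collect failures instead of stopping."""
    succeeded, failed = [], []
    for path in files:
        try:
            result = subprocess.run(build_command(path), timeout=timeout_s)
            ok = result.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            # A hang (timeout) or a command that cannot be started counts as a failure.
            ok = False
        (succeeded if ok else failed).append(Path(path))
    return succeeded, failed


def demo_command(path):
    # Hypothetical stand-in for a real OCR command line:
    # the child process exits non-zero when the file name contains "bad".
    code = "import sys; sys.exit(1 if 'bad' in sys.argv[1] else 0)"
    return [sys.executable, "-c", code, str(path)]


ok, bad = run_batch(["a.pdf", "badke2.pdf", "c.pdf"], demo_command)
print("processed:", [p.name for p in ok])
print("needs manual attention:", [p.name for p in bad])
```

The files listed as failed can then be handled separately, for example with a different output style.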

    MacBook Pro (1997)
    - Mac OS X 10.7.2
    - 2.6GHz Core 2 Duo
    - 4GB RAM
    Acrobat 9 Pro
    - version 9.4.6
    Acrobat 9 Pro OCR always crashes when using ClearScan but not when using "Searchable Image" or "Searchable Image (Exact)." I scanned several journal pages at 300 dpi (color, grayscale, bitmap) in .tiff and .png as well as screen selecting text from a browser. The results were consistent across all variations.
    The last time I used Acrobat's OCR function was last summer, before upgrading from Snow Leopard to Lion. Under Snow Leopard, Acrobat did not crash during OCR (it did crash, just not while processing text for OCR). I did not attempt Acrobat OCR under Lion 10.7 or 10.7.1.
    Repeatable test.
    1. Open wikipedia "Crash (Computing)" page
    http://en.wikipedia.org/wiki/Crash_(computing)
    2. Enlarge text size, if desired.
    I tried several text sizes from the default to much, much larger. Text size has no impact on the results.
    3. Create a PDF
    File >> Create PDF >> From Selection Capture
    I selected the first paragraph:
    A crash (or system crash) in computing is a condition where a computer or a program, either an application or part of the operating system, ceases to function properly, often exiting after encountering errors. Often the offending program may appear to freeze or hang until a crash reporting service documents details of the crash. If the program is a critical part of the operating system kernel, the entire computer may crash. This is different from a hang or freeze where the application or OS continues to run without obvious response to input.
    4. OCR
    Document >> OCR Text Recognition >> Recognize Text Using OCR
    4.1 Searchable Image (Exact)
    Primary OCR Language: English (US)
    PDF Output Style: Searchable Image (Exact)
    Downsample: None
    Result: No crash — OCR successful
    4.2 Searchable Image (tested for each downsample option)
    Primary OCR Language: English (US)
    PDF Output Style: Searchable Image
    - Downsample: Lowest (600 dpi)
    - Downsample: Low (300 dpi)
    - Downsample: Medium (150 dpi)
    - Downsample: High (72 dpi)
    Result: No crash — OCR successful
    4.3 ClearScan (tested for each downsample option)
    Primary OCR Language: English (US)
    PDF Output Style: ClearScan
    - Downsample: Lowest (600 dpi)
    - Downsample: Low (300 dpi)
    - Downsample: Medium (150 dpi)
    - Downsample: High (72 dpi)
    Result: Crash — OCR not successful

  • How To Automate OCR after documents are scanned?

    I am using Acrobat 8 Professional
    1. Is there a way to have Acrobat 8 Professional run OCR automatically after the documents are scanned?
    2. Is there a way to create a button to OCR the scanned documents? Currently I have to go through: Document > OCR Text Recognition > Pages options > OK
    Any advice would be appreciated! Joe

    I'm sure the javascript experts will be able to create a button that runs a script for your task, but have you considered using other OCR software (Finereader, ReadIris etc)? Corporate versions of these programs allow you to have a 'Watched Folder' where any file sent to the folder will automatically be OCRd with no user intervention.
    Although the OCR quality may be no different to Acrobat's plug-in, there will also be far more in the way of options to view and edit the recognised text.
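The "Watched Folder" idea mentioned above can be sketched as a simple polling loop. This is only an illustration of the concept, not how FineReader or Readiris implement it: the folder names, the polling interval and the `ocr(src, dst)` callable are all assumptions, and the callable stands in for whatever OCR program is actually invoked.

```python
import time
from pathlib import Path


def pending_pdfs(inbox, done):
    """Return PDFs in `inbox` that have no same-named file in `done` yet."""
    inbox, done = Path(inbox), Path(done)
    return sorted(p for p in inbox.glob("*.pdf") if not (done / p.name).exists())


def watch(inbox, done, ocr, poll_s=5.0, max_polls=None):
    """Poll `inbox` and call ocr(src, dst) for every PDF not yet processed.

    `ocr` is a hypothetical callable that writes the searchable PDF to `dst`.
    With max_polls=None the loop runs until interrupted.
    """
    Path(done).mkdir(parents=True, exist_ok=True)
    polls = 0
    while max_polls is None or polls < max_polls:
        for src in pending_pdfs(inbox, done):
            ocr(src, Path(done) / src.name)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_s)
```

A file counts as processed once a file of the same name exists in the output folder, so an `ocr` call that fails to write its output is simply retried on the next poll.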

  • OCR renderable text error

    Someone was having the problem below with an older version of Acrobat.
    Is there now a solution in Acrobat X for Mac?
    I note that exporting to an image file loses quality and increases file size.
    Thanks
    Well, since this is the digital age, it makes sense that I ought to read the PDFs in digital form (this is a stretch for me, I really like paper), which is facilitated by a tablet since I can actually see the page when it’s in the portrait configuration. It also makes sense that I ought to mark up the file in Acrobat, using the native highlighting and searching tools, which is also facilitated by the tablet for obvious reasons.
    Here’s the problem. Apparently *every* PDF file, in every digital library, is tagged with headers, or footers, or Bates numbers, or some other tag that halts the OCR recognition of the PDF file. If you google “This page contains renderable text”, you’ll see that this has been a complaint since Acrobat 6 at least. So you can’t just OCR the document and get a nice, mark-up-able document.
    Now, I know what you’re thinking. There has to be a workaround, right? Of course, there is. You can manually remove the headers and try again. Oh, now there’s a footer; you can take that out too (manually) and try again. Oh, now there’s a Bates number, okay, take that out too. There’s STILL some renderable text in there somewhere, well, now you can either try and edit out the blocks of renderable text (again, manually, made more entertaining by the fact that you can’t just right click on the page and say “remove renderable text”), or you can export the entire document to a graphics file (say, a TIFF), re-convert it to a PDF file (which turns the entire document into a rasterized image), and THEN run the OCR tool to get an actual mark-up-able document. This process is made more enjoyable by the fact that Acrobat will turn that 300 page dissertation you’re reading as part of your research into 300 distinct TIFF files, which you then need to recombine into a PDF file. Multiply this by 100, and you’ll see what sort of a barrier to productivity this is for me to get started organizing my existing document collection.
    This is CLOSE TO THE DUMBEST THING I HAVE EVER SEEN. And I’ve seen a LOT of bad design. Rather than prompting me “This document has renderable text” and giving me “Cancel” as the only option, any feature-driven developer would say, “Gosh, people get really frustrated by this. I know, because I can read the results of a simple google search. We need to change this right away! Here, I’ll make it so that you can just click ‘Treat existing renderable text as white space’ or even prompt the user to rasterize the renderable text and embed it in the document, then OCR the resulting file!”
    The only conceivable reason I can imagine that this hasn’t taken place is because your lovable electronic document vendor wants to make it a colossally, enormously painful process for someone to actually do anything to the document they’re providing you to use. Thank you, electronic document vendor. You’re going to be wasting about 20% of the time that you’re saving me by giving me electronic access to this document in the first place.
    Progress is grand. Collide it with self-interest, progress seems to lose out more often than not.
    Now, if you’ll pardon me, I’m going to go get some sleep. Then I’m going to get up in the morning and go to work. Then I’m going to come home, and instead of enjoying some family time with my kids, I’m going to fart around with manual document conversion.

    Elias,
    I completely agree with your anger. I ran into the same problem and I think I have figured out a workaround. I wrote up a blog post about it.
    http://www.ideationizing.com/2011/03/ocr-acrobat-pdf-with-renderable-text.html
    I hope this works for you.

  • Acrobat pro 8 scanner error 200010

    Acrobat Pro 8 scanner error 200010. I am using a Canon all-in-one fax/scanner/laser printer. Acrobat prints correctly but does not seem to recognize the scanner. I am using an Intel Mac Pro with 9 GB RAM, OS X 10.6.8, and a Canon imageCLASS MF4350d printer/fax/copier/scanner. Acrobat shows "Canon MF 4350 USB" as the only device in the scan menu. I believe it should show "Canon MF 4350 FAX", but there is no option to change the device in the scan menu. Help would be appreciated.

    Moving this discussion to the Scanning & OCR Acrobat sub-forum.

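  • Error during text recognition in Acrobat 8.1 Pro

    Hello, and thank you in advance.
    I updated Acrobat 8 Professional to 8.1 just the other day. Text recognition after scanning had been working normally until then, but now Acrobat itself force-quits immediately after it runs.
    If I clear the "Make Searchable (Run OCR)" checkbox under "Text Recognition and Metadata" and then scan, no error occurs.
    What could be causing this? I would appreciate any advice.

    OS unknown.
    The following support document was already available at the 8.0 stage.
    It deals with garbled Japanese characters in OCR and offers two solutions.
    One is to check and reset the OCR settings. The other is to scan with the [Make Searchable (Run OCR)] option cleared and run OCR afterwards.
    The latter is presumably the method the poster has already tried.
    Document number: 230601
    Last updated: 2007/01/25
    Japanese text is garbled when scanning with the OCR function (Acrobat 8)
    Below is general troubleshooting for scanning.
    Is the TWAIN driver being recognized properly? Can you try with the WIA driver? Have you tried reinstalling the TWAIN driver?
    Document number: 230367
    Troubleshooting scanning problems (Acrobat 8 for Windows)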

  • I have downloaded the converter program and the conversion from PDF to Word is terrible. Is there a better solution or setting?

    I am trying to convert a PDF to Word without re-typing. I downloaded the converter program onto my Mac, but the conversion was terrible. Unusable. Is there a better way? I also have a PC, so I could try that as well if it would make any difference.

    "Scanned": anything scanned starts you with an image (a picture) of the page content that was on the source paper.
    There is no "renderable" text. OCR can provide text output that can be exported. Without it, all that is exported is the image.
    As to OCR, Acrobat's ClearScan lends itself to repurposing via the export process.
    Regardless, anything sourced from scanner output is the "pig's ear", and that will not yield the "silk purse".
    Export is dictated by the input (GIGO). The quality of the export reflects the quality of the input.
    As a C Student stated, a PDF sourced from a Word file (a well-built one helps) will yield a workable export back to Word.
    PDFs sourced from FrameMaker export rather nicely.
    For optimal export, always start with a well-formed tagged PDF (ISO 14289-1, PDF/UA-1 compliant).
    Two core design considerations of tagged PDF are (1) to support accessible PDF and (2) to support export (repurposing) of PDF content.
    Be well...

  • Convert pdf to text in adobe pro x

    I would like to convert a PDF document into text.  Can someone advise?

    If the PDF was originally scanned as a picture (JPG) there will be no "renderable" text in it to save from. You'll need to run OCR on it in order to save it as text. Reader CANNOT do OCR, Acrobat Pro can, under Tools.

  • Reduce scanned pdf size

    Hello All,
    I have a document of about 317 pages, text only. I scanned this document and made a PDF out of it using Acrobat 8 Pro. The file size was 65 MB. I OCRed it and the file size was 135 MB. I selected the common "Reduce File Size" option for Acrobat 8.0 and later. The file size was 134 MB. I checked the space usage with Audit Space Usage, and 98.8% of the space is used by "image".
    I was under the impression that once I OCR the document, it's more like text than an image. Can anyone please elaborate on that?
    Finally, I want to reduce the file size. What settings do you recommend so that I can get rid of the space occupied by the "image"?
    Many thanks in advance.

    The Searchable Image output style leaves the image intact and simply adds searchable text behind the image. To reduce the size with OCR you have to use the Formatted Text & Graphics output style, as Bernd indicates. This replaces the things Acrobat recognizes with text and deletes the image.
    At this point you will have to scan again or save to TIFF and import to a new PDF. Once you have done an OCR, Acrobat will not do it again.
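The figures quoted in the question show why OCR did not shrink the file. A small worked calculation, using only the numbers from the post (134 MB total, 98.8% of it images, 317 pages) and assuming 1 MB = 1024 KB:

```python
def size_breakdown(total_mb, image_share, pages):
    """Split a PDF's size into image data and everything else."""
    image_mb = total_mb * image_share
    other_mb = total_mb - image_mb
    other_kb_per_page = other_mb * 1024 / pages
    return image_mb, other_mb, other_kb_per_page


image_mb, other_mb, other_kb_per_page = size_breakdown(134, 0.988, 317)
print(f"images: {image_mb:.1f} MB")
print(f"everything else (fonts, hidden text, structure): {other_mb:.1f} MB")
print(f"that is about {other_kb_per_page:.1f} KB per page")
```

Roughly 132 MB is page images and under 2 MB is everything else, so the hidden text layer is a negligible part of the file. Only changing or removing the images will reduce the size meaningfully.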

  • Need to record my professor's powerpoint--without him seeing it!

    (let me know where else this could be posted if this isn't the best discussion section)
    Hey guys, my professor is basically a giant toolbag and doesn't distribute his powerpoint slides (even though all the other professors at my grad school do give out the PPTs prior to class). It would be fantastic if I could somehow discreetly take snapshots of each slide as he goes along with a camera of some sort plugged into my MBP. Here are the requirements for this job:
    1. Must be high enough resolution to read reasonably-sized font (28pt) on a 20 ft X 20ft screen, about 20 feet away, up at about a 30 degree angle from the laptop.
    2. must take the shots without making a cheeseball sound
    3. must take the shots without flashing or anything like what photobooth does right before snapping shot
    4. the laptop must (obviously) be facing me and not the professor
    Any ideas of a hardware/software solution that would do this? Currently my MBP's built-in isight + photobooth is a no go because (a) i bet it's too low res (I could be wrong, i'll test this), (b) photobooth flashes the screen white before snapping the shot, (c) you have to turn the laptop towards the professor side to take the photo.
    Any help is greatly appreciated, and believe me I'll consider even the most ridiculous ideas, so throw them at me without caution.
    P.S. I really don't want to hear anyone lecture me on respecting his intellectual property and whatnot, because the school requires that all lectures be audio-recorded anyway, so this is a foregone conclusion. And I pinkyswear I won't publish his content online or write a book based off his lectures. I'm a student with a learning disability and I don't have the time/ability to furiously write/listen/pay attention all at the same time. In a nutshell: let's keep this thread on topic, kthnx.

    You're welcome, bcapple123
    Thanks for letting me know that your question is answered.
    Here are my ideas in response to your new questions:
    (1) I don't know any way to keep photo-booth (PB) hidden while still taking photos.
    Photo Booth does not work unless it is the foreground application, and, as you note, it only makes one photo while minimized. After first picture, I find I need to bring PB back to foreground, click "Effects", and then "Normal", to unfreeze PB for the next capture.
    (2) iSight is not a zoom camera. Zoom is one of the reasons I suggested a separate digital camera in my previous response.
    iGlasses may let you zoom your iSight image in the way you need, but it does so digitally, so image quality is affected. You can try the demo to see if it does what you want before you buy it.
    (3.a) To transform your trapezoidal images back to the square shape of the original slides, you can use the perspective correction feature of any full-featured image editor like PhotoShop or PhotoShop Elements. The process is simple, but, unless you know a way to AppleScript the workflow, I cannot suggest a way to automate it. I do not know a way to do it with my PhotoShop Elements File > Process Multiple Files... menu command.
    This Google search may give you some ideas or other web locations to search or post about this subject.
    (3.b) I have never heard of OCR software doing the kinds of things you mention. Some well-known Mac OCR apps are ReadIris, OmniPage Pro X, and VueScan. Thoroughly examine their features, system requirements, and compatibility before you buy. You may want to contact the developers to ask your specific questions relative to their individual applications before spending your money.
    This Google search may give you some ideas or other web locations to search or post about this subject.
    EZ Jim
    PowerBook 1.67 GHz w/Mac OS X (10.4.11) G5 DP 1.8 w/Mac OS X (10.5.2)  External iSight

  • How do I use the OCR feature in Acrobat 9?

    Does Acrobat 9 have the OCR feature? If so, where do I find it? In Acrobat 8, one would go to Document, OCR Text Recognition, Recognize Text Using OCR, OK. I specifically want to know how to apply OCR to a scanned document in order to use the Find feature.

    Simple. go to DOCUMENT, and then select OCR TEXT RECOGNITION.

  • How can I use applescript for OCR of a bunch of files - with Acrobat XI?

    Hi there,
    I want to write a script (e.g. AppleScript) that can be used as a droplet, or that has a menu to open a folder of scanned PDF files, to run OCR on them. I want to use Acrobat XI (as this is my version), German version.
    As Acrobat XI is not recordable with the AppleScript Editor and I cannot find a manual of its objects and methods, I googled a script that worked with Acrobat 9 but does not work with Acrobat XI. There you define an "Action Assistant" script, e.g. called "OCR this", and "click" the item. But as this item is no longer in the main submenu in Acrobat XI, it does not seem to work.
    Here is a screenshot:
    And this is the AppleScript that opens the "Aktionsassistent":
    click the menu item "Aktionsassistent" of menu "Werkzeuge" of menu item "Werkzeuge" of menu "Anzeige" of menu bar item "Anzeige" of menu bar 1
    But then I cannot reach "OCR this".
    If anybody has a hint, either for clicking "OCR this" or for scripting OCR on an opened PDF file with AppleScript, that would be great.
    Thanks,
    Maritn

    AppleScript is documented in the Acrobat SDK. But there is no method for this.

  • Acrobat V9 Pro OCR can't produce a file

    I am trying to perform OCR on a credit card statement. The statement has 3 PDF pages and, except for the non-regular header info at the top of the page, everything is in nice columns, five of them. I specify the output file to be an Excel spreadsheet. The OCR engine works OK on pages 1 and 3. It chokes on page 2 with an error that it cannot recognize any table, or sometimes produces this message: Acrobat could not perform recognition (OCR) on this page because: This page contains renderable text.
    I tried the technote solution of converting to .tiff, but that did not work. (Actually the instructions are not clear: do you rerun OCR on the .tif file or on the newly created .pdf that was made from the .tif file? No matter, I did both, and both failed.) http://kb2.adobe.com/cps/333/333110.html
    I have also separated the .pdf doc into three individual files and OCR'ed page two, with the same results.
    I took page2.pdf, scanned it (not with Acrobat) at 600 DPI, and tried to OCR it again, with the same results.
    The page contains a bar code in the margin. Could this be killing the OCR process? I tried to edit out some of the noise but can't figure out how to delete parts of the .pdf doc.
    Also, I highlight only the columns and select Document -> OCR Text Recognition -> Recognize Text Using OCR. It does its thing and says it generates an output document, but WHERE? It does not ask me where it should be placed, and I have no clue where it puts it.
    Any help is appreciated....
    JOhn
    sample is below:

    Really unresolved, but OK.
