Garbled text after OCR

I'm scanning old law books on a Minolta PS-7000 using IrfanView, then opening the scans in Adobe Acrobat 8 Standard to make PDF files. I then run text recognition (OCR). Some of the pages come out OK, but on other pages the text is all garbled, with symbols and numbers instead of letters. I have an HP Pavilion 732 computer running Windows XP. Can someone please give me some direction on this problem? I have a lot of old statute books to scan.
Thanks,

It's part of the process. I recommend you check out the help pages
about OCR in Acrobat, there's more to it than you might imagine (or
than most people want).
Aandi Inston
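
One rough way to find which pages came out garbled, so that only those need to be rescanned or run through OCR again, is to check how much of each page's recognized text actually consists of letters. The sketch below assumes the text of each page has already been exported from the PDF as a plain string; the function names and the 0.6 threshold are illustrative assumptions, not anything Acrobat provides.

```python
def letter_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are letters."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 1.0  # blank page: nothing to flag
    return sum(c.isalpha() for c in chars) / len(chars)


def garbled_pages(pages, threshold=0.6):
    """Return 1-based numbers of pages whose text is mostly non-letters."""
    return [
        number
        for number, text in enumerate(pages, start=1)
        if letter_ratio(text) < threshold
    ]


# Made-up sample text standing in for two exported pages.
pages = [
    "Every person who shall wilfully set fire to any dwelling house",
    "3#@ 77)( ~1|4 %% 0^8 &*2 <<9 }{ 55",
]
print(garbled_pages(pages))
```

Statute pages full of section numbers and citations will score lower than plain prose, so the threshold would need tuning on a few known-good pages first.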

Similar Messages

  • OCR question - after OCR, text still displays the way it was before - can the letters be changed?

    Hi!
    I have a question regarding OCR.
    The function works perfectly, but after OCR the letters are still displayed the way they looked in the scan.
    Is there a way to replace the (recognised) letters with a vector font?
    It shouldn't be a problem, as the letter recognition worked.
    cheers
    frank

    It would be helpful if you'd tell us what version of Acrobat you're using. The names and locations of the commands move with different versions.
    The option you want is called ClearScan.
    This method synthesizes a new Type 3 font that closely approximates the original, and preserves the page background using a low-resolution copy.

  • PDF file size increases severalfold after OCR

    Hello,
    I've been using Acrobat Pro 9.  Whenever I receive a scanned PDF from someone, I always try to OCR it.  I thought this would reduce the file size because the content would be saved as text characters instead of images.  However, the file size actually increases severalfold.  For example, I just had a scanned PDF with an original size of 3M, but after OCR (settings:  searchable image, downsample to 300dpi), the file size is > 19M.  This is a legal document with no pictures or diagrams.  Even ClearScan gives a 15M file.  Could you help explain why, and how I can change this?  Thank you.
    Hung

    I just did an OCR with Searchable Image (Exact); here are all the results:
    Original:  3,046 KB
    Searchable Image (300dpi):  19,408 KB
    Searchable Image (Exact) (600dpi):  3,377 KB
    ClearScan (300dpi):  14,813 KB
    So in this example, Searchable Image (exact) only causes about 10% increase in size.  I used Save As "PDF" (not "Reduced Size PDF" or "Optimized PDF"...).  Does this make sense?  Thank you.
    Hung
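
The percentages behind those figures can be checked with a few lines of Python. The sizes are the ones listed above; the helper name is just for illustration.

```python
original_kb = 3046
sizes_kb = {
    "Searchable Image (300dpi)": 19408,
    "Searchable Image (Exact) (600dpi)": 3377,
    "ClearScan (300dpi)": 14813,
}


def percent_increase(new_kb, old_kb):
    """Size growth relative to the original file, in percent."""
    return (new_kb - old_kb) / old_kb * 100


for name, kb in sizes_kb.items():
    print(f"{name}: {percent_increase(kb, original_kb):+.0f}%")
```

Searchable Image (Exact) comes out at roughly an 11% increase, while the two 300dpi modes grow the file several times over.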

  • How do I scan to text using OCR on the Envy 5660?

    Hello,
    Prior to ordering an HP Envy 5660 printer, I confirmed that OCR text recognition is expressly included in the Printer Specifications for the HP ENVY 5640, 5660, 7640, and Officejet 5740 and 8040 e-All-in-One Printer Series document here.
    As you can see, under Scanning Specifications, which apply to all models listed in the above document’s title, it says: "Scan to text: Integrated OCR software automatically converts scanned text to editable text."
    I have now received and set up the HP Envy 5660 printer that I ordered. It is connected via USB to a MacBook Pro running Mavericks (OS X 10.9.5). After clicking the Download HP Software link on the accompanying CD, I was automatically connected to HP's Product Setup area, from where I obtained the latest driver package for my operating system, "HP-ENVY-5660-series_v12.39.0.dmg." Using the "Custom Install" option, I installed “Essential Software,” “HP Scan,” and “Product Help."
    The print and scan functions on my HP Envy 5660 are working, but regardless of whether I scan a page of text via the printer’s control panel, or the installed “HP Scan” application, or the installed “Image Capture” application, I can find no evidence of integrated OCR software, and no option to convert scanned text to editable text.
    Please tell me where to locate the specified OCR software, and how to enable its operation on the Envy 5660.
    Thank you.

    Greetings, @TeaMasterLing , welcome to the community!
    I read through your post about how you are attempting to use OCR software that was to be included with your printer software installation. I was unable to recreate this situation here on my lab computer to see what you are seeing on your end.
    For that reason, I cannot provide you with a possible solution and would suggest calling in to phone support, as they can log on to your computer if need be to see how the issue could be resolved to have the OCR software working for you.
    Here is HP's contact info:
    If you are calling within North America, the number is 1-800-474-6836 and if you are calling outside of the US/Canada: click here.
    I hope you soon have a solution!
    Have a great day
    Rainbow7000 - I work on behalf of HP

  • How to keep the same image quality after OCR ?

    Hello, I have scanned a page of a book; it is mostly black text on a white background. To keep good visual quality and a small file size I chose the GIF format. The GIF is 153.4KB; when I save it as PDF the file is 119KB, and after OCR it is 224.2KB (however, the resulting rich text is only 3.6KB).
    How come the PDF is smaller than the GIF? Is the GIF converted into a JPEG?
    How come the PDF is twice as big after OCR when the added text is only a few KB? Is the image converted again?
    I only want to keep my GIF as it is and OCR it.
    Is that possible, even though the "Convert to PDF" settings say "There are no settings that can be edited in CompuServe GIF"?
    If not, what other software could I use? I have DEVONthink Pro, but it also converts my GIF against my will.

    niuza wrote:
    Why are you talking about DPI? The problem is not with acquisition but with conversion.
    I don't want more detail; I want to keep the same quality in the PDF that I had in the source file.
    PjonesCET wrote: you can always use PDF Optimizer (in the Advanced menu) to reduce the size of the PDF without diminishing the quality of the text.
    Well, I'd like to see that, because before optimizing a PDF you must save it, and when you save it the image is converted.
    Here is what I get with a 600DPI TIF converted to PDF. Notice the difference.
    Because DPI (dots per inch) density affects the quality of the OCR. The higher the DPI, the better looking and more reliable the OCR. The lower the DPI, the poorer the quality and the less reliable the scan. There are different settings under Create PDF Using Scanner. Those settings affect the quality of the OCR.
    Once the document has been OCR'ed, have you tried saving it as a Word document or as an RTF document? Then select all the text, choose a font (Arial, for example), save it as a Word document, and create a new PDF. It might clean up the look of the text.
    I will leave the answer at this and let someone else try. I don't wish to get anyone upset.

  • Thin Horizontal Lines visible within PDF file after OCR run

    I print to PDF (Acrobat 9.1 Pro) from an imaging application where the images are stored and managed as 300dpi TIFF (Group 4) B&W.
    The PDF file is produced with no issues.  At this point I run the OCR process to capture the typewritten data.  As the OCR process works its way through the file, the PDF is rewritten with both the OCR data and thin black horizontal lines every 2/3 of an inch down the page.
    These lines will be saved with the file and print out.
    How do I eliminate these lines?

    I had this problem too.  I'm using Acrobat Pro 9.4 and after OCRing a doc it had horizontal lines running across the page.  We were able to fix it by first Optimizing the document and then OCRing it.
    Document > Optimize Scanned PDF
    When finished, Document > OCR Text Recognition > Recognize text using OCR
    Hope this helps

  • Bullets and Numbering "Text After" attribute

    Hi Everyone,
    I need to grab the "Text After" attribute of an existing bullet list in an InDesign document and change it to my own text. I think I need to get access to "kBNBulletTextAfterBoss", but I don't know which boss aggregates it.
    Thanks

    I've found it a much better practice, especially in heavily numbered documents, to look for a paragraph style that can be set to start at zero. I usually apply it to any headings or subheadings where the numbering below remains consistent. But that's just my experience.
    You can also right-click and restart or continue numbering.

  • How can users who have Acrobat Reader only save scanned pdf files so that the text on them is searchable using ctrl-F?  I just use the recognize text with ocr feature in the full version of Acrobat and this seems to do the trick. Reader doesn't work!

    Our users have scanned PDF files they want to be able to search using ctrl-F.  I made them searchable by running Recognize Text Using OCR in Acrobat Professional version 8.  They want to know whether they can make the files searchable with Acrobat Reader only, or whether they need the full Acrobat Professional software to do it.
    Thanks for the help!!
    Ken K. - 2191

    To clarify a bit they need to have Adobe Acrobat, not Adobe Reader. Reader has not been associated with the Acrobat name for 3 or more versions. The process you are asking about is a creation process - the purpose of Acrobat - and NOT a reading feature.

  • "Recognize Text Using OCR" Option Grayed Out in Acrobat 9 Pro (9.5.1)

    Running Adobe Acrobat 9 Pro.  I'm working with electronically filed court documents.  I regularly use the OCR tool (Document -> OCR Text Recognition -> Recognize Text Using OCR...) on these court documents.
    Problem is, every once in a while, I'll run into a document where the "Recognize Text Using OCR" option is inexplicably grayed out.  I have no idea what is causing this.  I have checked the Document Properties and confirmed there are no security restrictions for the document.  It happens inconsistently, in that OCR will work with a document filed by an attorney in one case, but it won't work with the same kind of document filed by the same attorney in a different case.
    Any help getting OCR to work on these few rogue documents is appreciated!

    Forms created with LiveCycle Designer are XML forms in a PDF wrapper, and many of the usual PDF properties are not available. This is like embedded rich media in a PDF. If you want to research this, Adobe and ISO have the PDF Reference manual available as a free download.

  • How to add text after number in the same cell? "200 units", "3Kg", "17 sqm"

    Dear Sirs,
    I have a problem with adding text after a number in the same cell.
    For instance, in Excel I am able to do this:
    200 units -----> this is in the same cell. Whatever number I type, "units" will follow automatically.
    I know that we can just type "200 units", but then it is in text format, so it cannot be used in a calculation if, say, I want to multiply it by other numbers.
    I know that we can do this by splitting it into two cells, one on the left for the number "200" and another on the right for the text "units".
    It would be helpful if we could do this for values such as "20 years", "3Kg", etc.
    Sorry to trouble you all.
    Thank you
    Q

    This is a case of uneven implementation in Numbers. This sort of functionality is available to format numbers used to label axis ticks on various charts, but not to format numbers in a cell. This feature should be requested.
    As an aside, perhaps a spreadsheet that actually managed numeric units as part of the calculation would be powerful and useful in avoiding formula bugs, something like what is available in Google's calculator. If you are unfamiliar with this, try one of the following examples typed into any Google search entry field:
    150 miles per gallon in liters per 100 kilometers
    100 * 20 yards / 40 minutes in mph
    Read more at http://www.google.com/intl/en/help/features.html#calculator
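
For reference, the arithmetic behind those two queries can be reproduced by hand. This is only a sketch of the unit conversions, assuming US gallons and international miles and yards; the function names are made up for the example and have nothing to do with Numbers or Google.

```python
LITERS_PER_US_GALLON = 3.785411784  # assumption: US gallon, not imperial
KM_PER_MILE = 1.609344
YARDS_PER_MILE = 1760


def mpg_to_liters_per_100km(mpg):
    """Convert miles per gallon to liters per 100 kilometers."""
    return 100 * LITERS_PER_US_GALLON / (mpg * KM_PER_MILE)


def yards_per_minute_to_mph(yards, minutes):
    """Convert a distance in yards covered in some minutes to miles per hour."""
    miles = yards / YARDS_PER_MILE
    hours = minutes / 60
    return miles / hours


print(round(mpg_to_liters_per_100km(150), 3))
print(round(yards_per_minute_to_mph(100 * 20, 40), 3))
```

Keeping the unit in the function name or in a separate column is the manual version of what a unit-aware spreadsheet would do automatically.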

  • A way to undo Formatted Text & Graphics OCR from Acrobat 7?

    Over the course of a few months, my company received a large number of PDF files for a project for which the internal policy was that every file should be text searchable.  Unfortunately, we did not save the native files in any sort of convenient way, having at that time not realized that failing to do so was a very bad idea.  We ran OCR on every one of the files that we received, which total approximately 4,000.  At the time that we received the majority of these files, my company was still using Acrobat 7; we've since upgraded to version 8.
    Recently we discovered that there were discrepancies between our electronic copies and the hard copy printouts from which our electronic copies had been generated:  in the electronic copies, uppercase F had changed to P, S had changed to 8, etc.  We eventually worked out that at some point a computer must have been mistakenly set to run OCR using the Formatted Text & Graphics setting, as opposed to either Searchable Image or Searchable Image (Exact).  This was absolutely not what we wanted, as for our purposes using a type of OCR that causes the original images to change essentially renders the files useless.  My questions, then, are the following:
    1)  As I asked in the title, is there any way of undoing Formatted Text & Graphics OCR that was performed in Acrobat 7?
    2)  Is there a way of identifying files that have had Formatted Text & Graphics OCR performed on them (something stored in the metadata)?
    Rebuilding these files from scratch is going to require a gargantuan effort, so any help would be much appreciated.

    Hi,
    Bernd's been across the mountain and seen the bear; so, you can bank on what he posted.
    But, just because, I'll second his "no".
    Formatted Text and Graphics (Acrobat 7, 8) and ClearScan (Acrobat 9, X) effectively replace the image of textual characters.
    If a character is not recognized as 'something', a bitmap of the thing is left behind.
    Now, while Acrobat and other OCR engines (ABBYY FineReader, AdLib, Adobe Capture, etc.) are really rather impressive, no OCR engine has 100% accuracy 100% of the time. Other variables come into play: scan lamp age/brightness, platen cleanliness, cleanliness of the scanner mechanicals, scanner calibration, and hard copy 'quality' (darkness and density of the characters, contrast between characters and background, presence or lack thereof of boxed-in text, text in or adjacent to line arcs/circles, etc.).
    All of that is for semantic content that is "textual". Semantic content that is not textual (but, coincidentally, may contain text) provides little to no useful OCR output (e.g., graphs, drawings, etc.). Validate this by performing OCR on such a PDF, then Export to a plain text file. Print this file out and compare that to the source paper or the scanned image.
    There is no metadata info that identifies the OCR mode used.
    Perhaps there is something buried in the bowels of the PDF page description content; if so, it is not intrinsically easy to obtain.
    My suggestion (fwiw) - move forward with re-scan.
    A server product would help to move it along, but a high-speed scanner hooked to a local machine (with ample resources) and Acrobat Pro 8 or 9 will get it done. With Acrobat 8 or 9 use Searchable Image (Exact).  In Preferences check the category Create PDF or TIFF to assure it is what you desire. Check Acrobat's scan presets to assure you have what you want vis-a-vis Compression and Filtering. Do avoid "Automatic".
    Be well...
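
The comparison suggested above (export the OCR result to plain text and check it against the source) can be partly automated where a trusted transcription of a sample page exists. A minimal sketch using Python's standard difflib module; the sample strings are made up, and the function name is an assumption for illustration only.

```python
import difflib


def char_substitutions(reference: str, ocr_text: str):
    """List (expected, got) pairs where OCR swapped characters one-for-one."""
    matcher = difflib.SequenceMatcher(None, reference, ocr_text, autojunk=False)
    subs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only equal-length replacements map cleanly to character swaps.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            subs.extend(zip(reference[i1:i2], ocr_text[j1:j2]))
    return subs


reference = "FILED IN SUPERIOR COURT"  # hypothetical trusted transcription
ocr_text = "PILED IN 8UPERIOR COURT"   # hypothetical text exported after OCR
print(char_substitutions(reference, ocr_text))
```

This only reports what the recognized text layer says; it cannot tell which OCR mode produced a file, and it cannot restore the original page image.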

  • I am experiencing delayed typing with texting after upgrading to iOS 7 - anyone seeing this?

    Both my wife and I are experiencing delayed "typing" with texting after the upgrade to iOS 7.  Concisely, you can keep typing, but there is a noticeable delay before the text appears in the message section.  I have also noticed delays when entering my 4-digit passcode.

    Mine was so bad that it was nearly impossible to type in my passcode.  I finally restored the phone (reinstalled the software) and then restored from backup.  All is working fine now.  This is on my iPhone4.  The real trick was getting past the passcode and then turning off Find my iPhone (which is required to restore) when it was nearly impossible to type my password in. 
    It solved my problem.  After I did it I read somewhere here to try resetting all settings.  That would be quicker but you may lose some settings you had.  In any case restoring completely solved my problem.
    My iPad did not experience any of these problems.

  • Want to add Custom text after Product Branding Image for application

    Hi
    Can we add any text after the Product Branding Image for an application?
    By default it shows the responsibility function name.
    My requirement is to add text after the company logo, at the top of the application,
    center aligned.
    I need suggestions.
    Regards
    Krishna

    Hi Anoop
    If I do what you suggested, then the function responsibility name will also appear
    and the application will be disturbed.
    Is there a way to edit the text at the top?
    Regards
    Krishna

  • Error with PS text after Unicode conversion

    Hi, we are having a problem after a Unicode conversion: special
    characters are not displayed correctly in a Web interface. In SAP
    (transaction CN04) everything is OK, but if the PS text is displayed
    through a Web interface (BSP, for instance) some characters are
    displayed wrong. One of these characters is the apostrophe used in
    French ('). Is there a tool available to convert existing PS text
    after performing the Unicode upgrade?

    Hima ... try this
    http://<server>/Lighthammer/JCOProxy?Mode=Reset

  • Recognize Text Using OCR from DLL

    Hi:
    We are a service company working on a project in which we need to run OCR on PDF files: convert a PDF to a searchable PDF.
    The customer has licenses for Adobe Acrobat Pro Extended.
    The problem we have to solve is this: from a JSP page, run an applet that has access to Adobe Acrobat Pro Extended in order to use the "Recognize Text Using OCR" functionality on a PDF file.
    Ideally, we would be able to access a DLL and invoke this functionality. Is that possible?
    If not, what would be the way to access this functionality: IAC? A plug-in?
    Would greatly appreciate any help.
    Thank you very much.
    Raimundo Carlos
    www.base100.com

    [lrosenth:]
    > LiveCycle ES includes lots of PDF functionality that you can use from various APIs.
    I tended to associate the term "LiveCycle" with the newfangled (XML-based) way to handle forms, but it has become clear that LiveCycle is much more than a new Forms paradigm.
    It sounds like the LiveCycle SDK/Library can be used as a (full?) replacement for the original APDFL.
    Is there a table somewhere with the differences between those two SDKs?
    TIA,
    -Ramon
