OCR and hidden text in PDF scans of historic documents

I need to edit the hidden text behind a scanned PDF image of a document.  The image must remain as an “exact” copy of the original scanned document.
I used Acrobat Pro (versions 7 and 9) to make PDF images of old typed documents from the 1940s.  When I open those images and run OCR in version 9, then examine the hidden (invisible) text layer behind the image, there are errors.  For example, the word “book” has been picked up by the OCR as the word “look.”  I need to change the “l” to a “b” in order to make the PDF accurate when it is searched at a later date.
I have checked many user forums.  Most people imply that hidden text can be viewed, but NOT edited in Acrobat Pro 7 and 9.  (Hidden text can be viewed in Version 9 by selecting “Document” “Examine Document” and then clicking on the “+” symbol next to “Hidden Text,” then clicking “Show preview.”)  Some say to use Adobe Capture 3.0 to edit hidden text.  Others say to use Photoshop or Illustrator to edit hidden text (I think these folks may have been confused, because Photoshop and Illustrator would be used, logically, to edit the image ON TOP OF the hidden text).  Yet another person seemed to say that a hidden text editor was added to Acrobat 8, but was taken away in Acrobat 9.  (I can’t verify that because I don’t have version 8.)
The closest answer I was able to find involved using the Text Touch Up Tool on top of the image to edit hidden text behind it, but when you do that you are typing “blind.”  In other words, you highlight a spot on the image (top layer) where you THINK the error MIGHT be, and you type the correction without being able to see what you are typing over.  Then, you go back to the “Examine Document” procedure (described above) to see if you “hit” your mark, and if not, you redo it until you do “hit” your mark.  With the number of documents and corrections that we have, that procedure would be too labor intensive and thus a budget breaker.
If we have to buy more software, my preference would be to buy a genuine Adobe product because I have experienced problems in the past switching back and forth between Adobe products and other PDF manipulation software.
Can anyone answer any of these questions: 
(1) Is there a way in Acrobat versions 7, 8 or 9 to edit hidden text, and if so, how? 
(2) What Adobe software (other than Acrobat) will edit hidden text behind a PDF image? 
(3) Assuming no Adobe product will edit hidden text behind a PDF image, are there any non-Adobe products that will do that?
Thank you!

Hi,
Unless you use Acrobat 8 Pro's "Formatted Text & Graphics" or Acrobat 9 Pro's ClearScan, you will find that there is no practicable means of editing the OCR "hidden text" in a PDF.
The TouchUp Text tool (Advanced Editing toolbar) relies on the selected text having an available system font to use during touchup. However, both Searchable Image and Searchable Image (Exact) OCR output use text rendering mode 3 (invisible text), and the font for that text is provided from within Acrobat rather than by any font installed on the system or by another application.
With Searchable Image (Exact) you have the untouched image augmented by the invisible text, which is provided as a user aid for search or find with Adobe Reader or Acrobat. The invisible text is not intended to support word-processor-like editing.
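To make "text rendering mode 3" concrete: in a PDF page's content stream, the `Tr` operator sets the rendering mode, and mode 3 means the glyphs are neither filled nor stroked, so the text is searchable but never painted. The following is only a minimal Python sketch of that idea, not something Acrobat provides. It assumes an already decompressed content stream that uses simple literal strings with the `Tj` operator; real OCR output is usually compressed and may use `TJ` arrays or hex strings, which this sketch ignores. The function name `invisible_strings` and the sample stream are hypothetical.

```python
import re

# Matches either "<digit> Tr" (set text rendering mode)
# or "(literal string) Tj" (show a text string).
TOKEN = re.compile(r"(\d)\s+Tr\b|\(((?:\\.|[^\\()])*)\)\s*Tj\b")


def invisible_strings(content: str) -> list[str]:
    """Return the strings shown while the rendering mode is 3 (invisible)."""
    mode = 0  # the default text rendering mode in PDF is 0 (fill)
    hidden = []
    for match in TOKEN.finditer(content):
        if match.group(1) is not None:
            mode = int(match.group(1))      # a "Tr" operator changes the mode
        elif mode == 3:
            hidden.append(match.group(2))   # text drawn invisibly
    return hidden


# Hypothetical, hand-written content stream: one invisible word, one visible.
sample = (
    "BT /F1 12 Tf 3 Tr 72 700 Td (look) Tj ET "
    "BT 0 Tr 72 680 Td (visible) Tj ET"
)
print(invisible_strings(sample))
```

On the sample, only the misread word "look" is reported, because it is the only string drawn under mode 3. Finding the hidden text this way is easy; changing it safely inside a real PDF is the hard part, which is the point made above.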
To your questions:
#1. There is no practicable way to edit invisible text (text rendering mode 3) with Acrobat (any past or current release).
#2. None.
#3. A good question. Perhaps a specialty program. Keep in mind, many products make promises, but those that actually deliver tend to be expensive.
Something to play with. Using Acrobat 9 Pro or Pro Extended, try the Preflight Fixup to embed hidden text.
Then try using the TouchUp Text tool. You may also want to see if you can change the font type of this newly embedded font.
(use copies of the "real" files - just in case <g>).
Be well...

Similar Messages

  • Editing Hidden Text with PDF

    We are currently in the process of scanning our historic documents. We had them scanned with OCR, and now we are using an indexer that looks at the hidden text for indexing.  We will be using this to search for documents.  However, some of these documents are older, which means OCR did not work as well as expected.  I want to see if there is a way to edit the hidden text from the OCR to change minor things in order to ensure that the documents can be found.  Any hints would be greatly appreciated.
    Thank you,
    Jeff

    Jeff,
    Historical documents: that implies you would not want to adversely affect the scanned image. If so, then OCR with Searchable Image (Exact) (SIE) is what you want.
    As for editing the OCR output, neither Searchable Image (Exact) nor Searchable Image lends itself to this.
    Yes, there are workarounds, but they are labor-intensive and awkward from within Acrobat.
    If you use SIE, consider exporting out the OCR of each PDF to a text file. Referencing the PDF or the source paper you can edit this text (migrated to a word processor perhaps). Output a second PDF. Use these for the catalog index. Link each second PDF to the first PDF.
    Search gets you to PDF 2, the link gets you to the scanned PDF.
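    The two-file arrangement above can be sketched in a few lines of Python. This is only an illustration of the idea (search the corrected text, then follow the link to the untouched scan), not an Acrobat catalog index. The `Record` class, the `find_scans` function, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    scan_pdf: str        # hypothetical path to the untouched scanned PDF
    corrected_text: str  # OCR text exported from that PDF and fixed by hand


def find_scans(records, term):
    """Search the corrected text and return the linked scanned PDFs."""
    needle = term.lower()
    return [r.scan_pdf for r in records if needle in r.corrected_text.lower()]


# Hypothetical records; the text stands in for the hand-corrected export.
records = [
    Record("scans/letter_1943.pdf", "Please return the book to the library."),
    Record("scans/memo_1947.pdf", "Take a look at the attached report."),
]
print(find_scans(records, "book"))
```

    The search never touches the hidden text in the scanned PDF, so the scan stays an exact copy while the corrected text carries the search.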
    Alternatively, hold off for Acrobat X.
    Today's Adobe Acrobat X: First Look eSeminar demonstrates how Acrobat X can export an OCR'd scanned image directly to Word with impressive retention of layout and format (without coming from a Tagged PDF). This process would permit getting cleaned up text back into PDF(s) to serve as the source of a Cataloged index.
    The eSeminar will be presented again Thurs., Oct. 21.
    See: http://acrobatusers.com/events/49361/adobe-acrobat-x-first-look
    Be well...

  • I have a MacBook Pro and can't print pdf files or word documents using my HP Wireless Photosmart Printer.  It will print files from the internet.  Any suggestions or ideas will be greatly appreciated.  Thank you.

    Hello, SoonerAnesthetist. 
    Thank you for visiting Apple Support Communities.
    Here is an article that I would recommend going through when experiencing this issue.
    OS X Mavericks: Solve printing problems
    http://support.apple.com/kb/PH14142
    Cheers,
    Jason H.

  • How to remove a hidden text in pdf file with Acrobat Pro 9. How to save pdf file and remove hidden text?

    I made this file in InDesign. The highlighted empty spaces indicate that there is hidden text, and it pops up when searching for some words in the PDF file. How can I save the PDF file so that it keeps only the visible text?

    Dear lrosenth,
    I went through some code and suggestions on the internet and found that I need a CMap file and a CID font file for the respective font, since PDF doesn't support Unicode fonts directly.
    Can you tell me where I can get a CMap file and a CID font file for the Tamil-language font Latha (a Microsoft TrueType font)?
    Regards,
    Safiq

  • Editing hidden text in pdf?

    Scanning 19th-century and early 20th-century documents as TIFFs, creating PDF using the original images as pages. OCR can't recognize the text well, so the hidden text needs extensive editing. Using Adobe Acrobat 8.0 Professional on Windows 2000 Professional.
    Can see the hidden text using Examine Document; can only edit using the Text TouchUp Tool on the page, where the hidden text is not visible.
    * Is there any way to see both the hidden text and the page at the same time?
    * Is there a better way to edit the text?
    * Is there any way to import text to use in the hidden text?
    * Is there any way to apply hidden text to an image where none was created in the OCR conversion?

    The answer to most (maybe all) of your questions is probably to use 'proper' OCR software like ABBYY FineReader or Readiris. Trial downloads are available from their web pages. Output can be in the same PDF format that you need.

  • Activate OCR and Enable Comment in PDF document On an Unix platform

    Hi everybody,
    I have a number of PDF documents stored on a Unix server, and I want to enable the "Add comment" feature for all of them, so that I can open each document in Adobe Reader and add comments (sticky notes, underlines, etc.). This feature is available in Acrobat Reader Pro 9; I tested it and it works fine. But I need to do the same thing from the command line: that is, install a library or something similar and perform the operation by typing a command in a shell terminal.
    I have the same problem with the OCR feature.
    Thanks for your help

  • Hidden Text in PDF file generated from Ai

    One of my clients (an ad agency) has a problem with a PDF file.
    They make the layout in Adobe Illustrator, then (to send the file to the newspaper) use the "Save As" menu with the prepress setting.
    The designer uses "Helvetica Neue", the TrueType font that came with Mac OS X.
    But for some weird reason one letter in the headline disappears: "É".
    When I check the file in Acrobat 9 and X, it reports "Hidden Text".
    Any idea what happened there?
    Thanks a lot

    "Save as PDF" occasionally writes internal links as external links (pointing to a file with the current PDF file name). Such links won't work after the PDF is renamed, even if the PDF is a stand-alone PDF.
    Try printing the book to a .ps file and distilling, instead of "Save as PDF".
    Also see: http://www.microtype.com/Hmmms.html#0702
    Shlomo Perets
    MicroType, FrameMaker/Acrobat training & consulting
    "24 easy ways to improve your PDFs with FrameMaker-to-Acrobat TimeSavers/Assistants",
    http://www.microtype.com/ImprovePDF.html

  • Preview - Cannot copy and paste text from pdf

    Hi there ...
    I usually have no problem copying and pasting text from a pdf using Preview ...
    However .. I recently received a pdf ... and there is no way it will highlight the text to copy ...
    The older pdf's are not affected ...
    I did a get info on the pdf ... it's not locked ... and it was made by Adobe InDesign CS2
    I tried opening it with Adobe Reader 9 ...... but still no luck
    In my search .. I noticed one other person had the same problem ... but he didn't get any responses..
    Thanks for any info ....

    If it's behind something else then you can open it in Illustrator and select the specific object you want. If it's rasterized then no.
    If this is a document you can share with the world then I'm sure we could tell you what specifically is going on if you posted it here.

  • Embedded fonts and hidden text

    I've just discovered that any font used in hidden text only isn't embedded. Has anyone else found this?
    My hidden text includes Times New Roman italic. This text becomes visible on the preprint event. TNR italic isn't used for any non-hidden text. If the user fills out the form and prints it, fonts are perfect. However, if the user fills out the form and saves it, TNR italic isn't embedded in the form. Later when the user opens the form to print, the previously hidden text actually prints in Arial!

    Yes, you can dynamically load the needed fonts at run time.
    Doc is here:
    http://livedocs.adobe.com/flex/3/html/styles_10.html
    hth,
    matt horn
    flex docs

  • How to quickly switch between speech bubble and highlight text in the PDF viewer in Mavericks.  Can now only use the menu to change to speech bubble in Mavericks (tool icon gone since Lion and no shortcut listed)

    I use Preview to mark up student assignments, switching between the speech bubble and highlight markup tools.  The speech bubble has not been on the toolbar since the Mavericks upgrade.  It also has no shortcut, which means navigating the menu every time, since the Text shortcut, under which the speech bubble is now located, does not remember the previously selected item and always reverts to plain text.
    From a usability point of view, for me this is a massive step backward since the Lion version...
    Or is there some hidden magic that will let me switch between these tools without having to navigate the menu every time...
    Neal

    100% agreed with this. The new interface is far too bulky. Even on my gaming computer's massive 21-inch monitor, it's nearly impossible to keep up with my chat group of 23+ people. I constantly have to scroll back up to read messages, which is frustrating and inconvenient. And now that the update is being forced on all of them, they're complaining of the same issues. As said previously, these are full computers, not bloody tablets or cell phones, but even then the new Skype is even more useless on my Venue8 Pro tablet, where I get to see a whole 3 messages at a time (and don't get me started on the Win8 Skype app). What made it worse is that MS has removed all previous versions of Skype from the internet. Then, even when I managed to get my hands on an old version, I could only log in once before it kicked me off and gave me a worthless error message, "Skype Can't Connect". (But seriously, what happened to giving us error codes, guys? Advanced users like myself want to be able to fix the issue ourselves!) MS, I'm sorry, but you need to stop fixing things that are not broken. I'm wasting my time trying to correct this now, and I really don't appreciate that. What happened to quality control and testing?

  • Dreamweaver CS5.5 - Header, Named Anchor and Hidden text

    Hello,
    I am a relative novice when using Dreamweaver. I don't use it in an official capacity, only as a hobby.
    I have created a webpage with a position fixed header, displaying a Jump Menu/Drop Down Menu linking to named anchors on the page. I have a css linked to the page using:
    #header {
        text-align: center;
        width: 100%;
        background-color: #CFC;
        position:fixed;
        top:0;
        left:0;
        z-index:1000;
    }
    When I use the menu to jump to the named anchor (#a04x01), the page displays the text, but the "Chapter One" text is hidden behind the header.
    My question is how to either offset the named anchor so that, when it is jumped to, the page lands 200px above it (so that "Chapter One" is displayed), or alter the header so that the text cannot scroll "beneath" it and the page treats the bottom of the header as the top of the page.
    Any suggestions or solutions would be gratefully accepted.
    PS: If needed, I can supply the .html page, & .css
    Thank you
    Regards
    TjStorm

    Remove position:fixed from your <header> and you won't have that problem.  Fixed positioned elements are not part of the normal document flow as they are at fixed coordinates at all times.
    This example uses fixed elements with ample top- and bottom-padding between sections.  View source to see the code.
    jQuery Smooth Scrolling with Fixed Layout
    Nancy O.

  • Copy and paste text from pdf tablet to indesign cc 9.1

    Is it possible to copy/paste from a PDF (created from an Excel file) into an InDesign document without getting all the text in the first cell?
    (Paste without formatting is not available.)

  • Some text in PDF fails to print -- document originally created with InDesign

    Hello. Thanks in advance for your help.
    I joined Creative Cloud a few days ago, and downloaded several CS6 programs -- InDesign, Illustrator, Photoshop and Acrobat XI Pro. My operating system is Windows 7.
    I created a simple order form using InDesign, and it prints perfectly directly from the InDesign platform. After creating a PDF by selecting Adobe PDF as the printer, the resulting PDF looks perfect on screen when viewed in Acrobat XI Pro. However, when printing from Acrobat to a paper-and-ink printer, chunks of the text go missing. Two other people who were emailed the PDF of this document have experienced the same problem, with the same chunks of text failing to print.
    Since it prints perfectly from InDesign, I am guessing the problem lies somewhere on the Acrobat side of the equation. Since the same problem has occurred with several different printers, I am assuming it's not a problem with my particular printer.
    I have already tried unchecking the "Rely on system fonts only; do not use document fonts" option in the Adobe PDF printing preferences, and that did not fix it. I didn't expect that was the problem, since the rest of the type prints perfectly.
    I would like to attach two PDFs and/or the original InDesign document, but I cannot find an attachment option on the interface for this discussion page. I would be happy to email them to anyone who would like to take a look at them.
    As usual, I am working on a deadline and would very much appreciate any help you can give me.
    Thank you!

    YES! A SOLUTION!
    Thanks to an Adobe user who goes by "Test Screen Name" for this solution!
    DO NOT print to PDF from InDesign to create a PDF. Instead, use the Export function.
    I followed that advice, and it solved the problem. Hope this helps you, JuleeBruce!

  • Download and placing a remote pdf in an Illustrator Document.

    I'm trying to make an Illustrator that will download a PDF from a remote URL and then allow the user to place it on the page.
    I can download the pdf object no problem, but the PlacedItem needs a file handle.
    Is this possible or is it restricted due to security reasons?

    I managed to do it using CarlosCanto's suggestion.  I downloaded it as content and saved that to a file.  I was hoping to avoid the step of saving it as a file, but it looks to me as if Illustrator needs an actual file because of the way it manages assets.

  • I get little squares when I copy and paste text from Safari to an MS Word document. Help

    Does Mavericks have anything to do with this?

    This sometimes indicates some missing fonts, or some other font problem.  What happens if you paste the text, say, into a TextEdit document?
