Editing hidden text in pdf?

Scanning 19th-century and early 20th-century documents as TIFFs, creating PDF using the original images as pages. OCR can't recognize the text well, so the hidden text needs extensive editing. Using Adobe Acrobat 8.0 Professional on Windows 2000 Professional.
Can see the hidden text using Examine Document; can only edit using the Text TouchUp Tool on the page, where the hidden text is not visible.
* Is there any way to see both the hidden text and the page at the same time?
* Is there a better way to edit the text?
* Is there any way to import text to use in the hidden text?
* Is there any way to apply hidden text to an image where none was created in the OCR conversion?

The answer to most (maybe all) of your questions is probably to use 'proper' OCR software like ABBYY FineReader or Readiris. Trial downloads are available from their web pages. Output can be in the same PDF format that you need.

Similar Messages

  • Editing Hidden Text with PDF

    We are currently in the process of scanning our historic documents. We had them scanned with OCR, and now we are using an indexer that looks at the hidden text for indexing. We will be using this to search for documents. However, some of these documents are older, which means OCR did not work as well as expected. I want to see if there is a way to edit the hidden text from the OCR to change minor things so that the documents can be found. Any hints would be greatly appreciated.
    Thank you,
    Jeff

    Jeff,
    Historical documents - implies you'd not want to adversely affect the scanned image. If so, then OCR Searchable Image (Exact) (SIE) is desired.
    As to editing the OCR output. Neither Searchable Image (Exact) nor Searchable Image lend themselves to this.
    Yes, there are workarounds; but... labor intensive and awkward from within Acrobat.
    If you use SIE, consider exporting out the OCR of each PDF to a text file. Referencing the PDF or the source paper you can edit this text (migrated to a word processor perhaps). Output a second PDF. Use these for the catalog index. Link each second PDF to the first PDF.
    Search gets you to PDF 2, the link gets you to the scanned PDF.
    Alternatively, hold off for Acrobat X.
    Today's Adobe Acrobat X: First Look eSeminar demonstrates how Acrobat X can export an OCR'd scanned image directly to Word with impressive retention of layout and format (without coming from a Tagged PDF). This process would permit getting cleaned up text back into PDF(s) to serve as the source of a Cataloged index.
    The eSeminar will be presented again Thurs., Oct. 21.
    See: http://acrobatusers.com/events/49361/adobe-acrobat-x-first-look
    Be well...
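
The export-and-edit workflow described above could be partly automated before the manual pass. The following is a minimal sketch, assuming the OCR output has already been saved as plain text; the function name `apply_corrections` and the sample misreadings are hypothetical and not taken from the thread.

```python
import re
from typing import Dict


def apply_corrections(text: str, corrections: Dict[str, str]) -> str:
    """Replace known OCR misreadings, matching whole words only."""
    for wrong, right in corrections.items():
        pattern = rf"\b{re.escape(wrong)}\b"
        # A function replacement keeps backslashes in `right` literal.
        text = re.sub(pattern, lambda _match, r=right: r, text)
    return text


# Hypothetical misreadings; a real list would come from reviewing the scans.
known_misreads = {"tbe": "the", "Wasbington": "Washington"}
cleaned = apply_corrections("tbe city of Wasbington", known_misreads)
print(cleaned)
```

Only strings that are never valid words should go into such a table; a misreading like "look" for "book" still needs a human to check it against the page image.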

  • OCR and hidden text in PDF scans of historic documents

    I need to edit the hidden text behind a scanned PDF image of a document.  The image must remain as an “exact” copy of the original scanned document.
    I used Acrobat Pro (versions 7 and 9) to make PDF images of old typed documents from the 1940’s.  When I open those images and run OCR in version 9, then examine the hidden (invisible) text layer behind the image, there are errors.  For example, the word “book” has been picked-up by the OCR as the word “look.”  I need to change the “l” to a “b” in order to make the PDF accurate when it is searched at a later date. 
    I have checked many user forums.  Most people imply that hidden text can be viewed, but NOT edited in Acrobat Pro 7 and 9.  (Hidden text can be viewed in Version 9 by selecting “Document” “Examine Document” and then clicking on the “+” symbol next to “Hidden Text,” then clicking “Show preview.”)  Some say to use Adobe Capture 3.0 to edit hidden text.  Others say to use Photoshop or Illustrator to edit hidden text (I think these folks may have been confused, because Photoshop and Illustrator would be used, logically, to edit the image ON TOP OF the hidden text).  Yet another person seemed to say that a hidden text editor was added to Acrobat 8, but was taken away in Acrobat 9.  (I can’t verify that because I don’t have version 8.)
    The closest answer I was able to find involved using the Text Touch Up Tool on top of the image to edit hidden text behind it, but when you do that you are typing “blind.”  In other words, you highlight a spot on the image (top layer) where you THINK the error MIGHT be, and you type the correction without being able to see what you are typing over.  Then, you go back to the “Examine Document” procedure (described above) to see if you “hit” your mark, and if not, you redo it until you do “hit” your mark.  With the number of documents and corrections that we have, that procedure would be too labor intensive and thus a budget breaker.
    If we have to buy more software, my preference would be to buy a genuine Adobe product because I have experienced problems in the past switching back and forth between Adobe products and other PDF manipulation software.
    Can anyone answer any of these questions: 
    (1) Is there a way in Acrobat versions 7, 8 or 9 to edit hidden text, and if so, how? 
    (2) What Adobe software (other than Acrobat) will edit hidden text behind a PDF image? 
    (3) Assuming no Adobe product will edit hidden text behind a PDF image, are there any non-Adobe products that will do that?
    Thank you!

    Hi,
    Unless you use Acrobat 8 Pro's "Formatted Text & Graphics" or Acrobat 9 Pro's ClearScan, you will find that there is no practicable means of editing the OCR "hidden text" in a PDF.
    The TouchUp Text tool (Advanced Editing toolbar) relies on the selected text having an available system font to use during touchup. However, both Searchable Image and Searchable Image (Exact) OCR output uses text rendering mode 3 (invisible text) in a font that is supplied from within Acrobat, not a font installed on the system or by another application.
    With Searchable Image (Exact) you have the untouched image augmented by the invisible text which is provided as a user aid for search or find with Adobe Reader or Acrobat. The invisible text is not intended to support word processor like editing.
    To your questions:
    #1. There is no practicable way to edit invisible text (text rendering mode 3) with Acrobat (any past or current release).
    #2. None.
    #3. A good question. Perhaps a specialty program. Keep in mind, many products provide a promise but those that actually deliver tend to be expensive.
    Something to play with. Using Acrobat 9 Pro or Pro Extended, try the Preflight Fixup to embed hidden text.
    Then try using the TouchUp Text tool. You may also want to see if you can change the font type of this newly embedded font.
    (use copies of the "real" files - just in case <g>).
    Be well...
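
To make "text rendering mode 3" in the reply above concrete: in a page content stream the mode is set with the `Tr` operator, so invisible OCR text is preceded by `3 Tr`. The following is a minimal sketch, assuming the content stream has already been extracted and decompressed by some other tool; the function names are hypothetical, and the regular expression is a rough scan that could be fooled by `3 Tr` appearing inside a string literal.

```python
import re
from typing import List

# Matches a single-digit operand followed by the Tr operator, e.g. b"3 Tr".
TR_OPERATOR = re.compile(rb"(?<![\w.])(\d)\s+Tr\b")


def rendering_modes(content_stream: bytes) -> List[int]:
    """Return every text rendering mode set in the stream, in order."""
    return [int(m.group(1)) for m in TR_OPERATOR.finditer(content_stream)]


def has_invisible_text(content_stream: bytes) -> bool:
    """True if the stream switches to rendering mode 3 (invisible)."""
    return 3 in rendering_modes(content_stream)


# Hand-written sample stream, not taken from a real file.
sample = b"BT /F1 10 Tf 3 Tr 72 700 Td (look) Tj ET"
print(rendering_modes(sample), has_invisible_text(sample))
```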

  • How do you edit document text in pdf in Acrobat Pro 11 Mac?

    How do you edit document text in pdf in Acrobat Pro 11 Mac? I know I can do it in the Windows version, but can't find same tool in Mac.

    Should be the same but it isn't. I have included a screenshot of my tools choices in both my Mac and PC versions - they are totally different.
    [Screenshot: Mac version]
    [Screenshot: PC version]

  • How to remove a hidden text in pdf file with Acrobat Pro 9. How to save pdf file and remove hidden text?

    I made this file in InDesign. The highlighted empty spaces indicate that there is hidden text, and it pops up when searching for some words in the PDF file. So how can I save the PDF file to keep only the visible text?

    Dear lrosenth,
    I went through some code and suggestions on the internet and found that I need a CMap file and a CID font file for the respective font, since PDF doesn't support Unicode fonts directly.
    Can you help me find out where I can get the CMap file and CID font file for the Tamil-language font Latha (TrueType), a Microsoft font?
    Regards,
    Safiq

  • Hidden Text in PDF file generated from Ai

    One of my clients (an Ad Agency) has a problem with a PDF file.
    They make the layout in Adobe Illustrator, then (to send the file to the newspaper) use the "Save as" menu with the prepress setting.
    The designer uses "Helvetica Neue", the TrueType font that came with Mac OS X.
    But for some weird reason one letter in the headline disappears... this one>>>> "É"
    When I check the file in Acrobat 9 and X, it reports "Hidden Text".
    Any idea what happened there???
    Thanks a lot

    "Save as PDF" occasionally writes internal links as external links (pointing to a file with the current PDF file name). Such links won't work after the PDF is renamed, even if the PDF is a stand-alone PDF.
    Try printing the book to a .ps file and distilling, instead of "Save as PDF".
    Also see: http://www.microtype.com/Hmmms.html#0702
    Shlomo Perets
    MicroType, FrameMaker/Acrobat training & consulting
    "24 easy ways to improve your PDFs with FrameMaker-to-Acrobat TimeSavers/Assistants",
    http://www.microtype.com/ImprovePDF.html

  • Editing Actual Text in PDF File

    Is there any way or a third party software that will allow me to actually edit text in a PDF???
    Right now, I have a PDF file that was emailed to me for printing and some text in the file needs to be modified. I would just like to edit the actual text instead of waiting for the sender to edit the text in whatever program they used to create the PDF from.
    I presently have Adobe Acrobat 7.0 Standard.
    Any help would be greatly appreciated!!!
    Thanks!
    Mike

    The Adobe Reader for SymbianOS isn't a PDF editor. You'd need Acrobat
    for that, using the Text Touch-up Tool.
    Aandi Inston

  • Tools-protection-remove hidden information = how to create report where is hidden text???

    Acrobat can show a preview of the hidden text in a PDF. Is it possible to export this preview window as an HTML-like report or something similar?

    I found a workaround. I take screenshots using a combination of Mouse Recorder Pro and Screenpresso: the mouse recorder clicks the arrow (next page number) in Acrobat's hidden text window and then presses Print Screen, which triggers Screenpresso to save a picture to a folder. Not comfortable, but it works.

  • How can I correct "hidden" text in a searchable PDF file?

    This seems like a simple question. However, the answers are invariably complex, do not yield the desired result, and often answer a different question entirely. I say all that just to warn people up front that the "problem" is easier than how many people and PDF application developers, including Adobe, typically understand it while the proposed "solutions" are invariably a total...well, botch is a reasonable word if a bit understated.
    Here is the actual problem:
    I have "searchable" PDF files created by scanning documents and running them through an OCR process. I create "searchable" PDF files in order to archive, index, and eventually enable searching for the documents scanned. A "searchable" PDF satisfies those criteria better than any other commonly used, "portable" archive format -- though I would be happy if someone could point out an obvious alternative I may have overlooked.
    I do not need perfect OCR results. If I need a document to edit or perhaps feed into a spreadsheet or database, I expect to be able to reprocess the page images in a given "searchable" PDF file to OCR and convert the contents to Word, RTF, Excel, or another file format as necessary with more care for the results than for the archived document itself. Therefore, the "searchable" PDF document is the scanned page images which compose it while the OCR generated "searchable" text is secondary, but still important.
    Therefore, each file must contain scanned page images of sufficient detail to be efficiently converted by OCR if possible and legible enough for whoever views the images to be able to work out what an OCR process may fail to understand. Once scanned, those pages are the "document" and therefore "immutable."
    However, OCR is imperfect. For a searchable document archive, it does not have to be, but some errors are significant in that they may prevent the document from being found by a search. Therefore, there must be a way to view and, if necessary, edit the "hidden" text in a "searchable" PDF without altering the visual display of a document or how it is printed.
    No strike-throughs. No visible "corrections." None of the stuff PDF editors want to insert into a PDF file when editing it. I do not want to edit the document without exporting it to a format appropriate for an editable document. I just want adequately "correct" hidden text in a "searchable" PDF file.
    I apologize for the length and redundancy in my description of the problem. However, past attempts to explain my problem and objectives as well as what I have seen in reply to similar queries across the Internet indicate that most people trying to answer this question come at it from the same point of view shared by most, if not all, PDF tool or application vendors. They seem to think that any desire to edit a PDF file is a desire to have a PDF word processor of some sort. Or, they assume that the OCR process employed may need tweaking of the means by which people apply it and then a process like "find suspects" is adequate to deal with any errors. But no, those are not what I am trying to accomplish and answers which address those topics do not answer this question.
    In short, which tool or application from any vendor will reveal the "searchable" hidden text in a PDF produced by any OCR or other process and then enable corrections to the hidden text without changing any document display parameters at all? Note, hidden text typically includes bounding box information denoting the portion of the image from which the text was recognized. That information must not be lost or changed when editing the "searchable" text.
    So, any tools or applications capable of doing this? If Adobe Acrobat XI Pro can (use of a trial copy demonstrated that the hidden text content can be reviewed, but editing did not work by any straight-forward means I could work out while trying out the application), fine. However, $500.00 list or even a $200.00 possible upgrade from a copy of Adobe Acrobat X Standard which came with my scanner is a lot of money for personal use when review and edit of the OCR generated hidden text in a "searchable" PDF file is the only function I require. Therefore, other suggested tools or applications which do what I need for less would be greatly appreciated.
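
The requirement stated above (correct the recognized text, leave its bounding box alone) can be pictured as a small data model. This is a minimal sketch under stated assumptions: `HiddenWord` and `correct_word` are hypothetical names, the coordinates are invented, and no particular PDF tool stores its OCR layer this way.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class HiddenWord:
    text: str
    # (x0, y0, x1, y1) of the image region the word was recognized from.
    bbox: Tuple[float, float, float, float]


def correct_word(words: List[HiddenWord], index: int, new_text: str) -> List[HiddenWord]:
    """Return a copy of `words` with one word's text fixed and its bbox untouched."""
    fixed = list(words)
    fixed[index] = replace(words[index], text=new_text)
    return fixed


# Invented sample data for illustration.
words = [
    HiddenWord("the", (72.0, 700.0, 90.0, 712.0)),
    HiddenWord("look", (94.0, 700.0, 120.0, 712.0)),
]
fixed = correct_word(words, 1, "book")
print(fixed[1])
```

Because the dataclass is frozen, the only way to change a word is to build a new record, which makes it harder to disturb the position data by accident.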

    My "claim"? Actually I've made no "claim" such as you've mentioned.
    Simply stated your OP has foundational premises that presume as factual what is not.
    Here, we're in Adobe's hosted user forum for Acrobat.
    Any other application use is not material. 
    Acrobat XI provides 3 OCR methods.
    Searchable Image, Searchable Image (Exact) & ClearScan.
    Only the first two provide the "hidden" text output.
    (Glyphs have no stroke, no fill)
    From back to the Acrobat 3 product family the design functionality of Searchable Image and Searchable Image (Exact) has been to facilitate the use of Find / Search.
    The "hidden" text can be touched up. Acrobat Pro provides the facility to view the hidden text,
    so you can see the OCR output that correlates to the bit-map images of the characters that are present.
    With Acrobat XI Pro use Tools - Protection -Remove Hidden Information
    In the Remove Hidden Information pane select "Hidden text" then "Show preview".
    The default for the preview is "Show Only Hidden Text".
    Back in the PDF --
    You'd select some of the hidden text and retype what you suspect is the correct string of characters.
    Save and return to the preview of the hidden text.
    If you got it right, good. Continue.
    If not, darn - try again.
    Plug 'n chug -- somewhere over the rainbow it'll be done eh.
    Full disclosure -- this is something I've done (enquiring minds don't you know).
    I've found it to be a rather Sisyphean undertaking.
    So, "doable" but not practicable.
    This is to be expected because such touchups are not the concern / focus of the output from Searchable Image or Searchable Image (Exact) - (the names tell it all).
    To have touchup "editability" of an OCR output using Acrobat, make use of ClearScan.
    ClearScan replaces recognized character bit-maps with a character from an Acrobat internal font.
    The character strings can be selected to change to a generic, system available font.
    Something that is good to know when embarking on the "tweak the PDF" journey is that PDF (the file format / technology as defined by its ISO Standard, ISO 32000-1) does not tolerate "editing". PDF is decidedly not a word processor file format and "editing" can quickly render a PDF unusable.
    Minor touchups can be made and your best "tool" for this is still Acrobat Pro. (Save As often and periodically "bank" the PDF via some file rename scheme.) 
    Be well...
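
The retype-then-check loop described above can be made less blind by comparing text outside Acrobat. The following is a minimal sketch, assuming both the OCR output and a hand-checked transcription are available as plain text; `word_differences` is a hypothetical helper built on the standard `difflib` module and has nothing to do with Acrobat itself.

```python
import difflib
from typing import List, Tuple


def word_differences(ocr_text: str, checked_text: str) -> List[Tuple[str, str]]:
    """List (ocr, checked) word runs where the two texts disagree."""
    a, b = ocr_text.split(), checked_text.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Invented sample sentence for illustration.
diffs = word_differences("the look was old", "the book was old")
print(diffs)
```

A list like this tells you which words to hunt for on the page before touching them up, rather than discovering them one preview at a time.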

  • Adding text as hidden layer in PDF's

    Hi, I have some hand written documents (Old genealogy letters) which I would like to be made searchable. Can I scan the documents as PDF’s, then manually word process the documents and add this text as a hidden text layer? Thanks Doctor Keo

    About Acrobat OCR.
    Three methods.
    #1 Searchable Image
    #2 Searchable Image (Exact)
    #3 -
    (a) Formatted Text and Graphics (prior to Acrobat 9)
    (b) ClearScan (Acrobat 9)
    #1 - Provides OCR output as a hidden text layer. Will perform some "adjustment" to the image.
    #2 - Provides OCR output as a hidden text layer. Will not perform any "adjustment" to the image.
    #3. a & b
    If process thinks it "knows" what the character is then it replaces the image of the character.
    If process is not sure what the character is then it flags the character(s) as "suspects".
    End-user can edit "suspects".
    If process does not know what the character is the character's image is left alone as a bit-mapped image.
    Note that "ICR" vice "OCR" is meant for handwritten material that has been scanned.
    Acrobat does not provide "ICR".
    However, text from a typewriter typically yields accurate OCR, provided the scan is at a high enough resolution (typically 300 ppi).
    If #1 or #2 is used you can always Save As to a *.txt file.
    This can be brought into a text editor, word processor, page layout application, etc.
    There, you can create a "clean" copy from which a PDF can be made.
    Provide the Scan of the original and use a PDF Bookmark or a Button Field having a link action to go to the copy having the corrected content with renderable text. Make a Catalog index of the cleaned up text PDFs to support advanced search.
    For all practicable purposes, there is no manipulation/edits/etc. to the hidden layer of OCR output.
    Be well...
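
For readers wondering what a manually added hidden text layer would look like at the file level, the following is a minimal sketch of the content-stream fragment involved. The operators (`BT`, `Tf`, `Tr`, `Td`, `Tj`, `ET`) are standard PDF text operators; the font resource name `F1`, the coordinates, and the function names are assumptions for illustration. This builds only a text fragment, not a complete PDF, and it handles only simple Latin text in a standard-encoded font.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_fragment(text: str, x: float, y: float,
                            font: str = "F1", size: float = 10.0) -> str:
    """Build a text object that places `text` at (x, y) in rendering mode 3."""
    return (
        "BT\n"
        f"/{font} {size:g} Tf\n"   # select font resource and size
        "3 Tr\n"                   # rendering mode 3: neither fill nor stroke
        f"{x:g} {y:g} Td\n"        # move to the start of the line
        f"({escape_pdf_string(text)}) Tj\n"
        "ET\n"
    )


# Invented sample line from a hypothetical letter.
fragment = invisible_text_fragment("Dear (Aunt) May", 72, 700)
print(fragment)
```

A PDF library would still be needed to append such a fragment to a page and register the font resource; that step is outside this sketch.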

  • Can you edit the text of a PDF that is placed in inDesign

    I have placed a PDF in inDesign and am curious if I can edit the text of that PDF while in inDesign.

    Edit it with Adobe Acrobat Pro; InDesign can't. The option in InDesign is, of course, to overlay the text with new text.
    There is specialized software for editing PDF files, though the choices are much more limited and often more expensive than creating and editing standard editable document formats. Version 0.46 and later of Inkscape allows PDF editing through an intermediate translation step involving Poppler.
    Serif PagePlus can open, edit and save existing PDF documents, as well as publishing of documents created in the package.
    Enfocus PitStop Pro, a plugin for Acrobat, allows manual and automatic editing of PDF files, while the free Enfocus Browser makes it possible to edit the low-level structure of a PDF.
    In Acrobat you should use the TouchUp Text Tool.
    But in the end, PDF-files are not made for editing.

  • Paid to allow edit text in PDF, and not working

    I wanted to 'edit' text in a PDF and followed the instructions, which told me to subscribe and pay to do this.  I did subscribe and pay, but the functionality is still not working; it keeps directing me to subscribe and pay.

    Hi rrobati,
    I checked your account; your Export PDF subscription is not confirmed yet at our end.
    Once it gets confirmed you will be able to use it hassle free.
    Regards,
    Florence

  • How do I edit the text in a pdf file that has been converted to a Word doc

    How do I edit the text in a pdf file that has been converted to a Word doc?

    Hi BridgetteJean,
    Please go through this video, which explains how to edit text in a PDF document.
    http://tv.adobe.com/watch/acrobat-tips-and-tricks/editing-text-with-the-typewriter-tool/

  • Can't make Photoshop PDF with editable / vector text.

    Hi,
    I'm trying to File > Save As an Adobe Photoshop CS6 PDF and then be able to open it and edit the text in Adobe Acrobat X.
    Whenever I attempt to edit the text, it is a raster image, and it doesn't matter what I do in the Photoshop PDF settings.
    I want to be able to do this so that the text is able to be searched by google / search engines when I make the PDF available online.
    -Steve.

    This is what I did using CS6, and it worked for me. Hope it helps you too.
    Step1) Moved All Graphics (Images/Backgrounds) in one folder (Folder-Layers)
    Step2) Moved All texts (title, Headings, main text etc.) in another folder (Folder-Text)
    Step3) Merge the first folder (Folder-Layers) and made a single layer by right click & Merge Group
                               OR Select folder > Layer > Rasterize > Layer
                               Now I have only one Background Layer (Graphics) and a text folder
    Step4) Go to - File > Save As > Choose Photoshop PDF –
    Check: Use Proof Setup: Working CMYK, then SAVE (if you want to print).
    You will get a message: “The settings you choose in the Save Adobe PDF dialog can override your current settings in the Save As dialog box.” Click OK.
    Step 5) Save Adobe PDF Dialogue Box
                  Choose settings- 
    Adobe PDF Preset: Adobe PDF Preset 1
    Standard:        PDF/X-4:2010
    Compatibility:    Acrobat 7 (PDF 1.6)
    General
    Check- Optimize for fast Web Preview
    Check- View PDF after Saving
    Compression
    Just set the Compression box to None (no ZIP, no JPEG, no JPEG2000).
    Don’t touch any other settings, and then SAVE PDF.
    Then open in Acrobat Reader and do the text changes.

  • Need to edit text in pdf file created in Illustrator CS6

    I need someone else to be able to edit text in a PDF I created from Illustrator. They have full Acrobat and should be able to edit it. I have tried it myself to test before sending it, but it's not working. The original file was created in Illustrator CS6 and saved as a PDF (text has not been converted to outlines). I opened the file in Acrobat and saved it out as an extended file to be able to edit the text, but no luck. I need to be able to highlight existing text and change it. I have tried the edit text tool, but instead of a selection box around each line of text, it puts a selection box around all of the text. I can highlight the text and delete it but am unable to change it. Each individual text line needs to be editable. Is there something I need to do differently in Illustrator to get it to work correctly in Acrobat?
    If it helps- the file is a calendar and they need to be able to change the dates and events that are already there.

    This really isn't a suitable use of Acrobat. You should all be using the same editing tool (i.e. Illustrator).
    Extending files reduces the amount of editing Acrobat can do, it does not increase it.
    The tool to edit text from Illustrator would be Edit Document Text. There is no guarantee it will break up nicely into lines, it's trying to help by making paragraph text.
    Conceivably you could use a form for this (fill in the blank for dates) but really, you should be using the same tool. Not necessarily Illustrator, more Excel or specialist Calendar software.
