Preview corrupts PDF documents when saving

There appears to be a serious bug in Preview and/or OS X's PDF creation libraries, that causes PDFs to be "invisibly" corrupted.
My situation is: I have a number of PDFs that have been created using Windows-based OCR software. They are standard PDF/A documents, and when I open them in Preview they display fine. More importantly, I can "copy" the text from the document to the clipboard and it works as you would expect.
However, if the PDF document is edited in any way (page order changed, a page from another document moved into the document, etc.) and then saved, the resulting PDF is corrupt. Although the text appears on the screen normally, any text copied to the clipboard is garbled. For example, the displayed text:
AUTUMN SPECIAL!
appears in one of my documents. However, highlighting it and copying it results in
*)﴿*%&(﴾'"!# $ 
in the clipboard.
I now have hundreds of documents that are effectively useless, as I cannot accurately copy the document text. I know others have had the same issue (see the posts in this Superuser.com thread for examples). The issue would appear to go as far back as Lion and possibly before then.
Is this a "known issue"? And has anyone come up with workarounds, other than re-OCRing the files (usually by exporting them as TIFFs and running OCR again)?

This issue still happens on Mavericks, and it also happens with Adobe Acrobat Reader. If you highlight some lines of text in a PDF and then save it, the OCR text becomes unreadable. Redoing the OCR in Acrobat Pro is impossible because the pages 'contain renderable text'. The only solution I can fathom is not to annotate any PDFs. The strange thing is that I cannot find any solutions to this, or any bug reports on Acrobat's forums, where they should also be, because this is not just a problem with Apple.
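
For anyone with hundreds of files to triage, a rough check like the one below might help separate affected documents from intact ones. It is only a sketch under assumptions: the text is assumed to have already been extracted from the PDF with whatever tool you use (that step is not shown), the function names and the 0.5 threshold are made up for illustration, and the heuristic only catches garbling that turns letters into punctuation and symbols, as in the AUTUMN SPECIAL! example above.

```python
import unicodedata


def readable_ratio(text: str) -> float:
    """Fraction of non-space characters that are letters or digits."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    # Unicode general categories starting with "L" are letters, "N" are numbers.
    good = sum(1 for c in chars if unicodedata.category(c)[0] in ("L", "N"))
    return good / len(chars)


def looks_garbled(text: str, threshold: float = 0.5) -> bool:
    """Heuristic: mostly punctuation/symbols suggests a broken text layer.

    The threshold is an arbitrary assumption, not a tested value.
    """
    if not text.strip():
        return False  # nothing to judge, e.g. a page without a text layer
    return readable_ratio(text) < threshold


if __name__ == "__main__":
    displayed = "AUTUMN SPECIAL!"
    copied = "*)\ufd3f*%&(\ufd3e'\"!# $ "  # the garbled clipboard text from the post
    print(looks_garbled(displayed))
    print(looks_garbled(copied))
```

Text that is garbled into other letters rather than symbols would pass this check, so it can only flag suspects, not clear a file.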

Similar Messages

  • Why can't I save text I've entered into a PDF file?  When I hit "Save As", only the PDF document is saved, but not the text i typed into the document.  I'm using Windows 8.

    Why can't I save text I've entered into a PDF file?  When I hit "Save As", only the PDF document is saved, but not the text i typed into the document.  I'm using Windows 8.

    THANK YOU!
    Jan Whitfield
    The College Planning Center
    250 Palladio Parkway, Suite 1311
    Folsom, CA 95630
    (916) 985-0453
    www.TheCollegePlanningCenter.com

  • How can I rotate a PDF document when viewing in adobe reader?

    How can I rotate a PDF document when viewing in adobe reader?

    That hardware button can actually be toggled to work either as a mute button, or a lock orientation button. This can be edited at the settings screen.

  • Firefox doesn't include PDF extension when saving pdf doc

    Problem .. FireFox strips PDF file extension from file name when application option "Portable Document Format (PDF)" is set to "Save File" and when downloading a Fidelity Investment's statement or trade confirmation pdf document.
    When I attempt to download and save a PDF doc on the Fidelity Investments website without previewing it, Firefox does not save the file with the PDF extension, which results in the file losing its association with its appropriate viewing app.
    However if the firefox \tools\options\applications\"Portable Document Format (PDF)" is changed to either "Always Ask" or "Preview in Firefox" the problem doesn't occur.
    Or if the user right clicks the PDF link on the source html page and selects "save link as" the dialog box will ask to save the doc as a PDF and correctly add the PDF extension to the saved file name.
    This problem occurs with firefox 28.0 for windows and firefox 28.01 for android.
    The problem doesn't appear to occur with chrome or IE. However they don't offer a means to do a single click save.

    Yes I noted that method works already..
    I was trying to get the "single click save" to function as designed.

  • Unable to open PDF document when run through task/workflow

    Hi all,
    I am currently using a background method to send a smartform as a PDF document to all the agents who approve.
    When I send the smartform by mail as a PDF document through the task, I get an error message when opening the PDF document which says: 'The file could not be opened because it was either not a supported file type or because the file was damaged (for example, it was sent as an email attachment and wasn't decoded correctly)'.
    Below is the code I am using to send the smartform:
      CALL FUNCTION 'SSF_FUNCTION_MODULE_NAME'
        EXPORTING
          formname           = 'ZFI_ASSET_PDA_NOTIF'
        IMPORTING
          fm_name            = lv_form_name
        EXCEPTIONS
          no_form            = 1
          no_function_module = 2
          OTHERS             = 3.
      IF sy-subrc NE 0.
        MESSAGE ID sy-msgid TYPE 'S' NUMBER sy-msgno
        WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
      ENDIF.
      ls_ctrlop-getotf    = 'X'.
      ls_ctrlop-no_dialog = ''.
      ls_compop-tdnoprev  = 'X'.
      ls_ctrlop-device    = 'LOCL'.
      ls_ctrlop-preview   = 'X'.
      ls_ctrlop-no_dialog = 'X'.
      CALL FUNCTION lv_form_name
        EXPORTING
          control_parameters = ls_ctrlop
          output_options     = ls_compop
          user_settings      = 'X'
          gv_prctr_txt       = gv_prctr_txt
          gv_cc_text         = gv_cc_text
          gv_class_txt       = gv_class_txt
          gv_lname           = gv_lname
          gv_fname           = gv_fname
          gv_position        = gv_position
          gv_initiator       = gv_initiator
          gv_bookcost        = gv_bookcost
          gv_bookvalue       = gv_bookvalue
          gv_date            = gv_date
          gv_prctr           = gv_prctr
          gs_anla            = gs_anla
          gv_emv             = gv_emv
          gv_cond            = gv_cond
          gv_operable        = gv_operable
          gv_action          = gv_action
          gv_justify         = gv_justify
          gv_replace         = gv_replace
          gv_replace_comment = gv_replace_comment
          gv_process         = gv_process
        IMPORTING
          job_output_info    = lt_job_output
        EXCEPTIONS
          usage_error        = 1
          system_error       = 2
          internal_error     = 3
          OTHERS             = 4.
      IF sy-subrc <> 0.
        MESSAGE ID sy-msgid TYPE 'S'  NUMBER sy-msgno
                WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
      ENDIF.
      LOOP AT lt_job_output-otfdata INTO ls_otf.
        lt_otf = ls_otf.
        APPEND lt_otf.
        CLEAR lt_otf.
      ENDLOOP.
      CALL FUNCTION 'CONVERT_OTF'
        EXPORTING
          format                = 'PDF'
          max_linewidth         = 132
        IMPORTING
          bin_filesize          = lv_len_in
        TABLES
          otf                   = lt_otf
          lines                 = lt_tline
        EXCEPTIONS
          err_max_linewidth     = 1
          err_format            = 2
          err_conv_not_possible = 3
          OTHERS                = 4.
    " Error handling
      IF sy-subrc EQ 0.
      ENDIF.
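      LOOP AT lt_tline.
        " Protect blanks as '~' so CONCATENATE does not drop them
        TRANSLATE lt_tline USING ' ~'.
        CONCATENATE ls_buffer lt_tline INTO ls_buffer.
      ENDLOOP.
      " Turn the '~' placeholders back into blanks
      TRANSLATE ls_buffer USING '~ '.
      APPEND ls_buffer TO lt_objbin.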
    DATA  lv_counter         TYPE i.
      DO.
        lt_record = ls_buffer.
        APPEND lt_record.
        SHIFT ls_buffer LEFT BY 255 PLACES.
        IF ls_buffer IS INITIAL.
          EXIT.
        ENDIF.
      ENDDO.
    " Attachment
      REFRESH:
      lt_reclist,
      lt_objtxt,
      lt_objbin,
      lt_objpack.
      CLEAR ls_objhead.
      lt_objbin[] = lt_record[].
    lt_objhex[] = lt_record[].
      CLEAR lv_descp.
      CONCATENATE 'Asset'
                  gs_anla-anln1
                  'Approval details'
             INTO lv_descp
    SEPARATED BY space.
    " Create Message Body
    " Title and Description
      CONCATENATE  'Please find attached the approval details of asset : '
                   gs_anla-anln1
              INTO lv_mail_descp
             SEPARATED BY space .
      lt_objtxt = lv_mail_descp.
      APPEND lt_objtxt.
      CLEAR  lv_mail_descp .
    " Append blank lines to the text of the mail.
    DO 3 TIMES.
      CLEAR lt_objtxt.
      APPEND lt_objtxt.
    ENDDO.
    CONCATENATE 'This is an automatically generated message.'
                 'Please do not reply to this mail.'
           INTO lv_mail_descp
         SEPARATED BY space .
    lt_objtxt = lv_mail_descp.
    APPEND lt_objtxt.
      DESCRIBE TABLE lt_objtxt LINES lv_lines_txt.
      READ     TABLE lt_objtxt INDEX lv_lines_txt.
      ls_doc_chng-obj_name = 'smartform'.
      ls_doc_chng-expiry_dat = sy-datum + 10.
      ls_doc_chng-obj_descr = lv_descp.
      ls_doc_chng-sensitivty = 'F'.
      ls_doc_chng-doc_size = ( lv_lines_txt - 1 ) * 255  + STRLEN( lt_objtxt ) .
    " Main Text
    " wa_doc_chng-doc_size = ( v_lines_txt - 1 ) * 255 + strlen( i_objtxt ).
      CLEAR lt_objpack-transf_bin.
      lt_objpack-head_start = 1.
      lt_objpack-head_num = 0.
      lt_objpack-body_start = 1.
      lt_objpack-body_num = lv_lines_txt.
    lt_objpack-doc_type = 'RAW'.
        lt_objpack-doc_type = 'TXT'.
      APPEND lt_objpack.
    " Attachment
    " (PDF attachment)
      lt_objpack-transf_bin = 'X'.
      lt_objpack-head_start = 1.
      lt_objpack-head_num   = 0.
      lt_objpack-body_start = 1.
      DESCRIBE TABLE lt_objbin LINES lv_lines_bin.
      READ TABLE lt_objbin INDEX lv_lines_bin.
      lt_objpack-doc_size = ( lv_lines_bin - 1 ) * 255 + STRLEN( lt_objbin ) .
      lt_objpack-body_num = lv_lines_bin.
      lt_objpack-doc_type = 'PDF'.
      lt_objpack-obj_name = 'ATTACHMENT'.
      lt_objpack-obj_descr = lv_descp.
      APPEND lt_objpack.
      LOOP AT t_mail_addr.
        CLEAR lt_reclist.
        lt_reclist-receiver = t_mail_addr-mailid.
        lt_reclist-rec_type = 'U'.
        APPEND lt_reclist.
      ENDLOOP.
      IF t_mail_addr[] IS INITIAL.
        CLEAR lt_reclist.
        lt_reclist-receiver = mail id.
        lt_reclist-rec_type = 'U'.
        APPEND lt_reclist.
      ENDIF.
      CALL FUNCTION 'SO_NEW_DOCUMENT_ATT_SEND_API1'
        EXPORTING
          document_data              = ls_doc_chng
          put_in_outbox              = 'X'
          commit_work                = 'X'
        TABLES
          packing_list               = lt_objpack
          object_header              = ls_objhead
          contents_bin               = lt_objbin
          contents_txt               = lt_objtxt
          receivers                  = lt_reclist
         CONTENTS_HEX                = lt_objbin
        EXCEPTIONS
          too_many_receivers         = 1
          document_not_sent          = 2
          document_type_not_exist    = 3
          operation_no_authorization = 4
          parameter_error            = 5
          x_error                    = 6
          enqueue_error              = 7
          OTHERS                     = 8.
      DATA: l TYPE sy-subrc.
      l = sy-subrc.
    When I run the above code in SE38, I get an email with an attachment that I am able to open. The problem seems to be with user WF_BATCH.
    I have already checked all the existing threads on this. I did this coding as per one of the SDN threads itself.
    I am able to open the PDF attachment even when I test the method. The problem occurs only when it is run through a task/workflow.
    Please advise.
    Thanks and Regards,
    Soumya Gayatri.
    Edited by: Soumya Gayatri on Apr 14, 2009 8:40 PM

    Hello,
    You should see an "Edit" in the top right of your post, you can use that to edit it.
    As I said, use { code } before and after the coding, but without the spaces (after "{" and before "}" ).
    Try it!
    Testing:
    this is code
    regards
    Rick Bakker
    Hanabi Technology
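
    The size arithmetic in the posted code, ( lines - 1 ) * 255 + STRLEN( last line ), can be checked with a small sketch. It is written in Python purely for illustration and is not part of the original ABAP; the names split_into_lines and doc_size are hypothetical, and the 255-character line width is taken from the post.

```python
LINE_WIDTH = 255  # line length used for the content tables in the post


def split_into_lines(buffer: str, width: int = LINE_WIDTH) -> list:
    """Cut one long buffer into fixed-width lines; the last line may be shorter."""
    return [buffer[i:i + width] for i in range(0, len(buffer), width)]


def doc_size(lines: list, width: int = LINE_WIDTH) -> int:
    """Mirror of ( lines - 1 ) * 255 + STRLEN( last line ) from the post."""
    if not lines:
        return 0
    return (len(lines) - 1) * width + len(lines[-1])


if __name__ == "__main__":
    data = "x" * 600
    lines = split_into_lines(data)
    # Three lines of 255, 255 and 90 characters add back up to 600.
    print(len(lines), doc_size(lines))
```

    The formula only gives the true size when every line except the last is completely full, so any step that shortens a line (for example dropped trailing blanks) would make the declared size and the real content disagree. Whether that is what happens under WF_BATCH is not something the thread establishes.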

  • Modifying pages of a pdf document and saving so keeps separate pages

    I could use some advice on how to do this. I have a pdf document that has 8 pages. I want to change some things on 4 of the pages. I open in Photoshop CS4 and it shows me the 8 pages. I select the first page I want to modify, change it but when I go to save the changes, my pdf document now saves as just that single page.
    I need to be able to modify and save the individual pages but keep the concept of the original pdf file that has 8 separate pages. How do I change a page and "reinsert" it back into the original document and maintain the pages so when I send it as a pdf to a client they can open in Acrobat and scroll thru the pages.
    Thanks.

    Mylenium’s assessment that Photoshop is not the tool of choice for pdf-editing is correct.
    One exception is if the pages consist of no vector-data but (best only one) image/s.
    In which case alt-double-clicking the image in Acrobat with the TouchUp Object Tool should open it in Photoshop and on saving there it should be updated in the pdf in Acrobat.

  • Trouble with flash preview of PDF documents, embedding.

    Yesterday, I noticed that the embedded PDF documents that I have posted on my blog weren't loading. The message in the embed window reads: 'Unable to Create Flash Preview. Please download to view in the native application.' In Acrobat.com, when I try to open the file for viewing it has a message that says: 'This file cannot be previewed. Please download instead.'
    Thinking that there might be a problem with my PDF files, I uploaded a Word .doc, and used Acrobat.com to create the PDF. That PDF wouldn't preview either. I signed into my wife's Acrobat.com account, and had the same problem.
    Any help would be appreciated! Cheers, Chris. I will attach a copy of the PDF I am trying to upload and embed, in case there are problems with the file itself (?)

    Michelle-
    Thanks for getting back to me so quickly. It's weird, eh? I tried the flash update, tried opening everything in FireFox... Same problems. I am attaching a few screencaps so that you can see what it is doing.
    #1 - This is what it looks like when I click on the file of Acrobat.com - http://i78.photobucket.com/albums/j94/nilscrasher/Picture1.jpg
    #2 - Here's a working PDF that I uploaded two days ago - http://i78.photobucket.com/albums/j94/nilscrasher/Picture2.jpg
    #3 - Uploading a new PDF, Acrobat.com displays this before going to #1 - http://i78.photobucket.com/albums/j94/nilscrasher/Picture3.jpg
    #4 - Here is a PDF that I uploaded two days ago, embedded (and working) on my website http://i78.photobucket.com/albums/j94/nilscrasher/Picture4.jpg
    #5 - Here's the failed embed in iWeb. (Looks the same on the page.) - http://i78.photobucket.com/albums/j94/nilscrasher/Picture5.jpg
    I attached the PDF file to the previous post. It's strange. I also logged in to my wife's Acrobat.com account and got the same problem.
    Cheers,
    -Chris.

  • Reset PDF bookmarks when saving as PDF from structured FM - Book 11.0 with fm components (*.book).

    Hi,
    I've read the forum discussions/solutions on setting PDF bookmarks, but I'm afraid the various solutions appear to only work if you are consistently working in .fm book files, not if your source files are in structured .fm format.
    For example, I have to constantly reset the bookmark settings in FrameMaker when I follow our PDF process of saving the structured FM files via the File -> Save Ditamap As -> Book 11.0 with fm components (*.book) route. I can set the bookmarks in the first file of the book, and/or set them using the Format -> Document -> PDF Setup menu options. But an hour later, if I discover I need to make a change in our source material (.ditamap/.xml) and create a new PDF again, I must make the same exact bookmark settings at the .fm level in this process (i.e. creating new .fm files from the .ditamap/xml files overwrites the previous .fm files, requiring bookmark setting, again).
    I think the only possible solution for a short-cut in this situation is to write a script to set those bookmarks each time we go from .xml to .fm. Does anyone see another way around this?
    Thanks!
    Diana

    Hi Diana...
    You're right that only FM binary files can store PDF setup information (in theory XML files could store this data, but that's not the way it's currently set up). In order to have this data available in files generated from XML, you'll need to set it up in the structure application template(s). You also need to make sure that all files in the book use the same tag names. The following topic was written for DITA-FMx users, but the concept should apply to regular FM-DITA as well ..
    http://docs.leximation.com/dita-fmx/2.0/?ditafmx_setuppdfbookmarks.html
    I hope that helps.
    …scott

  • 4500 envy cuts off right hand side of pdf document when I print to letter size paper

    This is a new printer for me.  It is connected to my laptop with a USB cord.  I am attempting to print an e-mailed PDF document .  When I preview, I see that the right-hand side of the document is cut off  by more than an inch.  How do I get the printer to format the document to fit on letter-sized paper? Reducing the document size to under 100 % does not help.

    Thank you for your response!  Try the steps below in order, and try printing your email again. Let me know what happens!
    1. Reset the printing system
    2. Repair disk permissions
    3. Restart the Mac
    4. Download and install the printer's Full Driver: HP ENVY 4500 e-All-in-One Printer
    5. Try printing your email and also from Text Edit.
    Good luck!

  • Border getting added to PDF document when printing using ePrint

    We have an HP LASERJET PRO 200 COLOR MFP M275nw printer. We are using Version 2.3.1 of the HP ePrint Android application to print a PDF document from an Android tablet. When the document is printed, it has borders all around it, so the document appears "shrunk". If we use the Adobe Reader application on a PC and print the document at "Actual Size", then no borders are added and the document appears to be printed at exact scale. Is this a limitation of the printer or a bug in the ePrint application? Is there an HP printer (other than the one we are using) on which the ePrint application can print "borderless" PDF documents from an Android tablet?
    Sanjay

    SanjayDandekar wrote:
    ..  When we print from Android, it adds borders.
    Hi,
    You have to wait for Android (Google) or HP to produce an app which can print borderless.
    Regards.
    BH

  • Change Field of FI Document when saving MIRO or MIR7

    Hi.
    I want to change a field of an FI document when saving the MM document.
    I use a BAdI for this, but I can't find the BKPF structure.
    What must I do?
    thanks
    marcus

    Hi Marcus,
    For SD to FI documents, I once checked the BAdI AC_DOCUMENT. Check if this is triggered for your transactions.
    But there is a limitation on the accounting header fields that can be changed.
    According to the note below, only the field BKTXT is released for change at header level:
    Note 1025810 - Field BKPF-BKTXT (doc header text) empty in the FI document
    What is the BADI which you are using ??
    Regards

  • Officejet L7680, corrupt PDF file when scanning

    When scanning on my L7680, I have just started to have issues with a "General Error" message (on and off) and also a corrupt PDF file error.  What can I do to fix this? 
    HELP!!!!

    Hi 1193,
    Welcome to the HP Forums!
    I see that you cannot scan with your HP Officejet L7680, and I am happy to help you with this scanning issue!
    For further assistance, I will need to know the following:
    If you are using a Windows or Mac Operating System, and the version number. To find the exact version, visit Whatsmyos.
    If the printer is connected, Wireless, Ethernet, or USB.
    If the power cable is plugged into a surge protector, or directly to the wall outlet. Issues when Connected to an Uninterruptible Power Supply/Power Strip/Surge Protector. This applies to Inkjet printers as well.
    If the printer is able to make copies by itself.
    If you are using Windows, please try our HP Print and Scan Doctor, and let me know what happens!
    Hope to hear from you, and have a great day!
    RnRMusicMan
    I work on behalf of HP

  • MIGO printing material document when saving.

    Hi,
    How do I enable automatic printing of the material document in MIGO when saving?
    Best regards

    The steps are the following:
    trx M706 or SPRO->Matl Mgmt->Inv Mgmt and Phy Inv->Output Determination->Maintain Output Types, for the Output types WE01, WE02 and WE03;
    Default Values: Dispatch Time is 3 or 4 as per reqmt. and Tr medium is 1, Print Parameter is 7;
    trx MN21, for output type WE01, WE02, WE03 select Tr Type WE, Print Version 1, 2, 3, maintain Print Item       as 1.
    To enable printing when a user posts a goods receipt, set the parameter NDR to X for the users in trx SU01.
    This will set the tick in the field XNAPR in MIGO.
    At every logon the system will propose the field as checked.
    If the user removes the tick or sets it again during a session, the system will remember the last setting.
    At the following logon the system will again propose the field as checked.
    The system will also remember whether the user has set "GR Note Vers1", "GR Note Vers2" or "GR Note Vers3" in MIGO, for the entire session and further logons.

  • Any way to password protect pdf document when emailing?

    Is there any way to password protect an adobe document when emailing like you would a word document?

    Hi bcanino,
    You can apply security to a PDF using Acrobat (apply an Open Document password, a Document Permissions password, or both). For more information, see PDF passwords, protected PDF, file permissions | Adobe Acrobat XI
    I hope that helps.
    Best,
    Sara

  • Signing digitally changes fonts in document when saved

    I have just been sent a couple of documents to sign - something I have done many times in the past few weeks.
    When I save these documents after signing them, much of the text in the documents changes to something difficult to read (I suspect it may be Courier).
    This seems like a serious error to me, as a lot of the point of digital signatures is that you sign and return a document that you can't otherwise have made changes to.
    Is there anything I can do to stop this happening?
    I'm using Acrobat Reader 10.1.9 on Mac OSX 10.9.2 (Mavericks). The documents were created on a PC.
    I've attached a limited screen shot - although the document is fairly confidential so I can't upload most of it.

    Hi Steve,
    Thanks for the reply. Unfortunately, my question was posted so long ago that I no longer have access to the documents. I printed and signed paper copies and deleted them. So the problem has gone away for me. However, I DO think this is a very serious issue.
    I am a bit rusty now but I used to work a lot with Postscript and PDF. In my day, you could only realistically modify the content of a pdf document by appending new content to the file and then appending  a new index. That ensured that, although the visual appearance changed, the underlying electronic document did not and even deleted items were still present - just no longer indexed and hence not displayed. That was a great system as it maintained an audit trail. I  assumed that was how signatures worked which is why I was surprised to see the symptom I sent you.
    I fully understand all you say about font substitution and multi-master fonts but I fail to see why it is relevant. If adding a signature is coded in a “well behaved” manner, adding it could not affect the rest of the document, including embedded or non-embedded fonts. If it is not coded in that way, and some sort of active editing of the original content is going on, I am concerned that the legality of signed documents is seriously compromised: if you can modify one bit of the original to add a signature, you can edit any other bit of the document and subtly change its meaning WITHOUT AN AUDIT TRAIL. If that is what is happening, I for one would be wary of accepting important documents as signed PDFs instead of paper documents without thoroughly scrutinising the whole thing to see if any subtle changes have been introduced.
    All the best
    Dave
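
    The append-only scheme Dave describes corresponds to what the PDF format calls incremental updates: each saved revision appends new objects, a new cross-reference section and a trailer that ends with an %%EOF marker. A rough way to see whether a file was saved that way is to count those markers. The sketch below is only a heuristic under assumptions: the function names are made up, linearized files or a marker that happens to sit inside a stream can throw the count off, and it says nothing about what a particular version of Reader does when signing.

```python
EOF_MARKER = b"%%EOF"


def count_eof_markers(pdf_bytes: bytes) -> int:
    """Number of %%EOF markers; more than one hints at appended revisions."""
    return pdf_bytes.count(EOF_MARKER)


def has_incremental_updates(pdf_bytes: bytes) -> bool:
    """Heuristic only: True if the file seems to contain appended revisions."""
    return count_eof_markers(pdf_bytes) > 1


def first_revision(pdf_bytes: bytes) -> bytes:
    """Bytes up to and including the first %%EOF, i.e. the earliest revision."""
    idx = pdf_bytes.find(EOF_MARKER)
    if idx == -1:
        raise ValueError("no %%EOF marker found; not a complete PDF?")
    return pdf_bytes[: idx + len(EOF_MARKER)]


if __name__ == "__main__":
    # Synthetic stand-in for a file with one appended revision (not a real PDF).
    sample = (
        b"%PDF-1.4\n1 0 obj\nendobj\ntrailer\n%%EOF\n"
        b"2 0 obj\nendobj\ntrailer\n%%EOF\n"
    )
    print(count_eof_markers(sample), has_incremental_updates(sample))
```

    If a signed copy contains the original bytes unchanged as its first revision, the signature was appended in the well-behaved way; if it does not, the file was rewritten on save.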
