[SOLVED] Wanted: extract pdf annotations created in Mac OSX preview

I just had a colleague return a PDF marked up with "annotations".
I first struggled to find a viewer that displays the annotations, and those that could (evince, okular) all have more dependencies than I'd want or need.
I decided there must be a lighter-weight way to extract the text of annotations from PDFs.  I searched, read, and learned about PDF annotation formats, and I figured out how to extract Adobe annotations from a PDF.  Only then did I realize that my colleague didn't use Adobe.  The annotations were created in Mac OS X's Preview.  Preview, it would seem, does not use the Adobe XFDF format for annotations; it uses some other means to embed them.
I searched the pdf file in text and hex editors, but I couldn't find anything resembling what I knew to be in some of the annotations.  However Preview does it, the text must be encoded and unreadable in the "raw" file.  This is in contrast, it would seem, with adobe's annotations.
My question, then, is two-fold.  First, are there any text-annotation extractor tools that can get notes created in OS X's Preview?  If not, does anyone know of any documentation outlining how Preview embeds this information?  I've been googling the latter question, but all I'm getting is "where-to-click" level tutorials on how to DO annotations in Preview.  I can't find any documentation of how that information is embedded into the file.
Note: evince does get the text of the annotations, but I'd prefer not to keep that installed.  Evince-gtk is only ever so slightly lighter on dependencies.  Also, in evince I get flooded with "sticky notes" containing all the annotation text.  It would take a while to move those around and close them one at a time to be able to actually read them.  I'm hoping to be able to extract all annotation text and dump it into a text file.
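For anyone wondering why the text doesn't show up in a hex editor: one plausible explanation (an assumption, not something confirmed in this thread) is that the annotation dictionaries sit inside Flate-compressed streams, which PDF 1.5 and later allow. Below is a minimal Python sketch of the "dump all annotation text" idea under that assumption. It inflates every stream zlib can handle and collects the /Contents literal strings. The function names are made up for illustration, and the sketch ignores encryption, hex strings, octal escapes, and unescaped nested parentheses.

```python
import re
import sys
import zlib

# Raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)
# A /Contents literal string; backslash escapes allowed, bare parentheses not.
CONTENTS_RE = re.compile(rb"/Contents\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)
ESCAPES = {b"n": b"\n", b"r": b"\r", b"t": b"\t"}


def inflate_streams(data):
    """Return the body of every stream that zlib can inflate; skip the rest."""
    bodies = []
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the end-of-line bytes before "endstream".
            bodies.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not Flate data (images, other filters, ...)
    return bodies


def unescape(raw):
    """Undo the common backslash escapes and decode the PDF text string."""
    body = re.sub(rb"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)),
                  raw, flags=re.DOTALL)
    if body.startswith(b"\xfe\xff"):  # UTF-16BE text string with a BOM
        return body[2:].decode("utf-16-be", errors="replace")
    return body.decode("latin-1")  # rough stand-in for PDFDocEncoding


def annotation_texts(data):
    """Collect /Contents strings from the raw file and from inflated streams."""
    texts = []
    for chunk in [data] + inflate_streams(data):
        for match in CONTENTS_RE.finditer(chunk):
            texts.append(unescape(match.group(1)))
    return texts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as handle:
        for text in annotation_texts(handle.read()):
            print(text)
```

Saved under a name of your choosing, it would be run with the PDF path as its only argument and redirected to a text file. Page dictionaries also carry a /Contents key, but there it is normally an indirect reference rather than a parenthesized string, so the pattern skips it.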
Last edited by Trilby (2012-05-29 00:38:32)

Holy crap, it works.  A couple hours of reading ugly docs and 32 lines of code later, I have exactly what I was looking for.
I'd love it if some volunteers would try this out, beat it up, and see what breaks first so I can improve it.  After a little polishing I might put it up in the AUR.
Get the code from my dropbox here
The Ugly Patchwork Makefile and a very brief TODO list are also posted.
I'll put together a PKGBUILD once this is in better order for distribution and installation.  I just got the darn thing to work, it's time to celebrate, not code more.
Note to Mods: as in my "report", please move this thread to Community Contributions.
Last edited by Trilby (2012-05-29 00:34:38)

Similar Messages

  • Wanted! Website Thumbnail Capture Tool mac osx

    Wanted! Website Thumbnail Capture Tool mac osx/Safari
    ...plenty of them for PC. just enter the URL and it creates a JPG (various sizes) of that URL.

    Greetings,
    What you're looking for is a utility to capture individual web pages, not thumbnails. Safari can already do that using the Save As… command, and choose Web Archive as the format. It will save all the elements of that page which you can open at any time in Safari.
    If you're wanting an image of the page, you can use Paparazzi!

  • Archive creates empty MAC OSX folders and second folder with documents...

    I have been sending .zip documents to customers for some time. I archive using the ctrl + click method (choosing CREATE ARCHIVE...)
    Customers have been saying that they can't find the files, that the folders are empty. I didn't know what they were talking about...until I archived a folder, transferred it to a LaCie external hard drive on the Ethernet network, decompressed it and saw what they were talking about! What is that MAC OSX folder? Why does CREATE ARCHIVE create 2 folders with the document in one of them? I frankly don't get it. The external drive we have is a LaCie Mini 250 MB formatted in ext3 (Linux?). Before that it was formatted HFS+, but that created problems and the LaCie technicians suggested that we format the disk in ext3 (Ext 3, 13 35, I don't quite remember). The disk was originally formatted in FAT32 (Windows), which didn't work (filenames, unable to copy or delete, etc).
    Is CREATE ARCHIVE the problem? or is it network disks which are PC disks or PC-originally-intended disks?

    Hi, Louis. Welcome to the Discussions.
    See my post here and the related comments concerning Create Archive in that topic. I believe that will help you understand what's happening.
    Good luck!
    Dr. Smoke
    Author: Troubleshooting Mac® OS X

  • Open PDF from Projector on Mac OSX

    Hi,
    I need to make a hybrid CD-ROM that has PDFs. I want to
    launch the PDF in Acrobat, not a browser (they are interactive and have
    bookmarks etc.). I can get it to work on the PC side with no problem
    but am having trouble on the Mac.
    I've made an AppleScript that works, and have it in the
    "fscommand" folder.
    However, I can't get the Flash projector to launch it. I'm
    using this code on a button:
    quote:
    on (release) {
        fscommand("exec", "testing.app");
    }
    It's a disk image. I can't work out why it doesn't work and
    it is driving me insane.
    Any help greatly appreciated.

    HI
    Read this
    http://www.flashjester.com/?section=faq&cPath=28_41_58
    Regards
    FlashJester Support Team

  • Problem with PDF files created by Indesign

    PDF files created by Adobe Indesign come up as blank pages whereas PDF files created by Mac OS X or Apple's Pages display OK. Is there a difference in the way PDF files are created?

    No problem with Adobe Reader 8 or 9 .

  • How do I install Mac OSX Lion client on the new Mac Mini Server?

    If Apple would have had the quad-core processor option for the non-server Mac Mini, I would have just bought that, but I wanted the quad-core. I do not, however, need the server software. I found this article talking about how to disable the server functionality, but this article highlights how little that method actually does.
    Ultimately, I just want to do a clean reinstall like I used to do prior to Lion. This process used to be so easy. Create a disk image of the desired operating system on USB, option boot, and you're done. Now it appears that Apple purposely impedes this process, as every time I boot from the USB I created with Mac OSX Lion client, I get a circle with a line through it. Is there any way around this restriction? Editing firmware, editing .plist files, etc?
    Very disappointed that Apple is limiting what used to be such a simple process on Hardware and Software that I paid for but now can't get the functionality I want.

    Sorry to be the bearer of bad news, but I don't think you can do what you're asking for.  The closest thing is going to be disabling the server components like it says in the article, but again, that doesn't do much.

  • Windows or Mac osx Lion?

    I have a MacBook Pro and want to install Windows 7 and Mac OS X Lion, so which OS should I install first?

    Depends entirely on the VM program you choose to use:
    https://discussions.apple.com/docs/DOC-2741

  • I can't print a PDF created in MAC with window 8.1?

    A PDF file created on a Mac cannot be printed with Windows 8.1.

    Hi.
    The problem is now fixed with help from HP.
    Best regards
    Lars Hansson

  • Copying content from a PDF created from Mac os x10.6.7 quartzpdf context to a MS-Word document

    When I select the text from the PDF document, choose the copy option, then paste, I see garbled text in the Word document. I tried OCR recognition using Acrobat Professional 9
    but that failed stating "This page has graphics other than images and text. It cannot be captured"
    I checked the properties of the PDF document, and noticed that it was created from Mac os x10.6.7 quartz pdf context and PDF version was 1.4.
    The text I am trying to copy is not an image and the document has no protection at all.
    What else can I try?

    If you look at the fonts (ctrl-D>fonts) you will probably see that the fonts are not embedded, or if they are, that you do not have them on your system. For instance, a Mac uses Helvetica and Windows uses Arial. They look similar, but are different fonts. Likely when you copy to Word, the problem is that the fonts are not on your system. Why not just ask the author for the original document if you need to copy the information? You should be asking anyway if you are going to use their info.

  • Firefox does not create a pdf properly using "print Save as pdf" in Mac OSX. Why?

    When I "Print > Save As Pdf" in Mac OSX, Firefox will always clip the right margin of the page, leaving out a lot of information from the document. Safari creates the same page effortlessly, but I generally prefer Firefox.
    == This happened ==
    Every time Firefox opened
    == ? After updating to Firefox 3.5.8 - maybe?

    I am able to command-print-save as PDF and get the whole page but Firefox crashes every time.

  • Full screen mode PDF created on Mac not working on PC?

    I've created a PDF presentation on a Mac in Adobe Acrobat 8 Professional with about 80 pages. I've set it to open in full screen mode when viewed, which works fine on Mac when tested. When opened on a PC in my client's corporate environment, however, it does not work. Has anyone else come across this problem or have any suggestions to remedy it? Could it be something to do with the system it is being opened on blocking full screen mode?

    Check this out in Acrobat help. Type: Full Screen Mode
    and look at the following topics:
    Defining initial View as Full Screen Mode
    Defining initial View
    It appears full screen mode is OS-specific, so you need to reopen the file in a PC version of Acrobat and set it again.
    It looks like they could have offered one setting in both programs, a line that says to set full screen mode for Mac, PC, Unix, and Linux all at once.

  • I want read PDF file from SAP directory and create a spool request or print

    Hi all,
    I want to read a PDF file from an SAP directory and create a spool request or print the PDF through SAP. Can anybody help me with this?
    Also, please let me know if it's possible to open a PDF from an SAP directory in Adobe PDF reader.
    Thanks in advance,
    Sunny

    Hi Sunny,
    Check these links.
    http://www.sapdevelopment.co.uk/reporting/rep_spooltopdf.htm
    http://www.erpgenie.com/sap/abap/pdf_creation.htm
    http://www.geocities.com/mpioud/Z_EMAIL_ABAP_REPORT.html
    http://www.thespot4sap.com/Articles/SAP_Mail_SO_Object_Send.asp
    http://www.sapdevelopment.co.uk/reporting/email/attach_xls.htm
    Hope this resolves your query.
    Reward all the helpful answers.
    Regards

  • Document Properties - Extract Application that Created the Pdf

    Hello,
    I need to extract the original application that a pdf was created from in javascript.  eg in the document properties the application is shown as follows:
    Application:  AutoCAD 2010
    The console script I have is as follows:
    console.println('Application: '
    + this.info.Application +' ');
    When the script is run it does not extract the application which the pdf was created from.
    Can anyone please advise how the script can be modified to extract the application from the document properties?
    Any assistance will be most appreciated.
    Thank you.

    Thank you very much for your help, most appreciated .
    I can now extract the application which created the pdf as follows:
    console.println('Application that create pdf: '
    + this.info.Creator +' ');
    Thanks, Again.

  • How do I extract an audio file from a PDF in Pro? (Mac)

    How do I extract an audio file from a PDF in Pro? (Mac) . Properties say it's a .m4a file.
    Thanks

    Hi Tim,
    That isn't a supported feature. If the PDF is unsecured and you have the legal rights to extract the audio, it would be possible to locate and decompress the raw file data from the PDF file using third-party tools to read the low-level file structure (a hex editor or Cos editor) - but Acrobat can't help you.
    Regards,
    Rave

  • Problems opening .pdf files created in Windows in MAC

    My clients that are using Macs are having problems opening password protected PDFs that I created in Windows. Any suggestions?

    Are your clients using Adobe Reader on Mac, or the built-in Mac OS Preview?  Preview does NOT support the full PDF standard :(.
