Globally delete text from a PDF?

Hi,
Firstly, my apologies for wading in with a question that may have been asked. The forum search didn't turn it up but I'm not sure I searched with the right terms.
I'm trying to remove all the text from a PDF to leave the vector images behind - that way I can use the images as a test. I don't mind if it's a solution like making all the text white, just as long as it's invisible in the document.
I had thought this would be reasonably straightforward but I've been looking around for about a day now and no luck. I installed the pdfbox library which had an example code snippet to do this, but my abilities ran short.
Any ideas?
I'm using Acrobat 9 Pro v9.4.6 on Mac OS X 10.7.2 if that helps.
Thanks for reading, and for any help.
Cheers,
Matthew.

George - THANK YOU!
When I pulled up the Profiles in the Preflight dialog, under Create PDF Layers there is a setting for "Put all text objects on a layer". I double-clicked this, gave it an output file name and off it went - lifted out tens of thousands of text objects from a several-hundred page PDF. Took an hour or so - but not too bad after all that searching and hassle. The layer turns off nicely - I presume that will carry over when I print it!
I had no idea that Preflight was capable of this kind of manipulation, but in hindsight it makes sense.
THANK YOU again.

Similar Messages

  • How to extract text from a PDF file?

    Hello Suners,
    I need to know how to extract text from a PDF file.
    Does anyone know what the character encoding of a PDF file is? When I use an input stream to read the file, it gives what look like encrypted characters, not the original text in the file.
    Is there any procedure I should follow while reading a PDF file?
    File f=new File("D:/File.pdf");
    FileReader fr=new FileReader(f);
    BufferedReader br=new BufferedReader(fr);
    String s=br.readLine();
    Any help will be deeply appreciated.

    jverd wrote:
    > First, you set i once, and then loop without ever changing it. So your loop body will execute either 0 times or infinitely many times, writing the same byte every time. Actually, maybe it'll execute once and then throw an ArrayIndexOutOfBoundsException. That's basic java looping, and you're going to need a firm grip on that before you try to do anything as advanced as PDF reading.
    Oops, you are absolutely right, that was a silly mistake.
    > Second, what do the docs for getPageContent say? Do they say that it simply gives you the text on the page as if the thing were a simple text doc? I'd be surprised if that's the case.
    getPageContent returns an array of bytes, so the question will be: how to get text from this array? I was thinking of:
        private void jButton1_actionPerformed(ActionEvent e) {
            PdfReader read;
            StringBuffer buff=new StringBuffer();
            try {
                read = new PdfReader("d:/getjobid2727.pdf");
                read.getMetaData();
                byte[] data=read.getPageContent(1);
                int i=0;
                while(i>-1){
                    buff.append(data);
                    i++;
                }
                String str=buff.toString();
                FileOutputStream fos = new FileOutputStream("D:/test.txt");
                Writer out = new OutputStreamWriter(fos, "UTF8");
                out.write(str);
                out.close();
                read.close();
            } catch (Exception f) {
                f.printStackTrace();
            }
        }
    "D:/test.txt" hasn't been created when I ran the program.
    Are my steps right?
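
A rough sketch of how the byte array from the post above could be turned into a string and written out without any loop. In the posted code, buff.append(data) appends the array's identity string (something like "[B@1a2b3c") on every pass rather than its contents, and the while(i>-1) condition stays true for a very long time, so the write is never reached. The sketch assumes, as the post suggests, that getPageContent returns the page's raw content bytes; the names PageBytes, toText and writeText are hypothetical. The decoded string will most likely be PDF drawing operators rather than readable prose, so a dedicated text-extraction API is probably still needed for real text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; not part of any PDF library.
final class PageBytes {
    private PageBytes() {}

    // Decode the whole array in one call. ISO-8859-1 maps every byte to
    // exactly one char, so no byte is lost or rejected during decoding.
    static String toText(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Write the decoded text to the target file as UTF-8.
    static void writeText(byte[] data, Path target) throws IOException {
        Files.write(target, toText(data).getBytes(StandardCharsets.UTF_8));
    }
}
```

With the code from the post, the call would look something like PageBytes.writeText(read.getPageContent(1), Paths.get("D:/test.txt")), using java.nio.file.Paths.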

  • How to extract text from a PDF file using php?

    How to extract text from a PDF file using php?
    thanks
    fabio

    > Do you know of any other way this can be done?
    There are many ways. But this out of scope of this forum. You can try this forum: http://forum.planetpdf.com/

  • Hi I've a big problem with adobe acrobat reader XI pro and I hope you can help me. The problem is; when I paste copied text from some pdf books (not all of them) it pastes symbols only! wherever I paste it! and even if I copied that text from another pdf reade

    Hi
    I have a big problem with Adobe Acrobat Reader XI Pro and I hope you can help me.
    The problem is: when I paste text copied from some PDF books (not all of them), it pastes symbols only, wherever I paste it, even if I copied that text from another PDF reader (Adobe PDF reader, internet browsers, etc.).
    This problem started yesterday when I installed Adobe Acrobat Reader XI Pro to try it before I buy it. Before that, when I was using the free Adobe PDF reader, I was able to copy any text from any PDF and paste it anywhere with nothing wrong.
    What can I do?
    Thank you a lot.

    There is no product called Adobe Acrobat Reader Pro. There is
    - Adobe Acrobat Pro ($$)
    - Adobe Reader (free)
    Which do you have? And are you a programmer?

  • I am trying to delete pages from a PDF file. I opened the bookmarks, selected the pages to delete and chose Edit > Delete. The selected pages are not deleted. Note: I have to open the file using a password provided by an external party.

    I am trying to delete pages from a PDF file. I opened the bookmarks in the PDF file, selected the pages to delete and chose Edit > Delete. The selected pages are not deleted. Note: I have to open the file using a password provided by an external party.

    Resolved

  • Deleting Text from a Table

    I want to know the shortcut for deleting text from a table on a MacBook Pro with the latest version of Microsoft Word. I want to keep the table and its formatting, but I want the current information deleted and don't want to have to do this cell by cell. I highlight all the text I want deleted and then...?
    Please help, this is very frustrating!!!

    Thanks David! You are the best!

  • Using browser javascript to copy selected text from a pdf file opened in Air app.

    I have posted this question on reader forum as well, but I think it is more suited here...
    I am trying to create a note-taking application in air. I want to extract selected text from pdf file as a string object or to the clipboard.
    Obviously, not all PDFs in my local storage will be scripted to receive postMessages and act accordingly, and that is not practical either. So, my problem is: how can I copy the selected text in the PDF file (opened as an object in htmlloader within my Air app) to the clipboard, or directly into another control, by, say, clicking a button in the Air application? I suppose this is possible using JavaScript; however, I don't know which Reader methods are exposed to the wrapper htmlloader control. In short, I want to execute the app.execMenuItem("Copy") command through htmlloader JavaScript. Any alternate solutions are also welcome.
    This is similar to passing inbuilt commands/methods/functions (of adobe reader) to pdf-reader plugin embedded in a webpage via javascript. This is possible in IE where the pdf is rendered as activex object, and hence JSObject interface of pdf document/reader is accessible to the browser javascript. I have also read that this same JSObject is accessible to VB as interface for IAC, so as the Air is Adobe's own product, I was wondering if equivalent of JSObject is accessible to htmlloader control as well.
    Thanks in advance...
    Mits

    Thank you Thom for your reply...
    from
    http://www.adobe.com/devnet/acrobat/javascript.html
    ...Through JavaScript extensions, the viewer application and its plug-ins expose much of their functionality to document authors, form designers, and plug-in developers...
    As it is explicitly mentioned that the functionality of Adobe Reader is exposed for plug-in development, I thought someone here might have used external JavaScript to execute some safe methods in Adobe Reader. The functionality (i.e. the external JavaScript interface, JSObject) is already available for VB programmers to develop IAC. Further, the Acrobat SDK example called "AcroPDFinHML" shows how one can embed a PDF reader in an HTML page and execute some safe methods (like gotonextpage(), zooming etc.) in IE as an ActiveX plug-in. I have checked it myself for Adobe Reader 9, and it works perfectly, so there is no security issue as such in implementing the same for another browser (in my case, the htmlloader control in a Flex/Air app).
    I intend to create a note-taking application in Air, where I need to be able to copy selected text from various PDF documents that are open in my app, and subsequently paste/collect/save the collected notes and process them afterwards (of course, only from the PDFs that allow copying text). However, it is not working for me here. As the PDFs are opened through the Adobe Reader plug-in, it does not register the copy command executed by my Air app. It registers the system-level copy command (the keyboard shortcut Ctrl+C), but my Air app has no way to execute the system-level copy command programmatically. So I am kind of stuck here...
    Thanks again for your reply. Now that you know what I intend to accomplish, any other (maybe alternative) solutions will be appreciated nonetheless...
    Mits

  • A problem with copying text from english pdf to a word file

    I have a problem with copying text from an English PDF to a Word file. The English text of the PDF turns into unknown signs when I copy it to the Word file.
    I illustrated what I mean in the picture I attached. Note that I have Adobe Acrobat Reader 9. Please help, because I need to copy the text to translate it.

    Is this an e-book? Does it allow for copying? Is it possible that the PDF file is a scan of a book?

  • How can I copy text from a PDF document in bb pbk?

    I've tried to copy text from a PDF document, but Adobe Reader doesn't give me that option. How can I do it? Or is there a better PDF reader that allows copying, bookmarks and highlights?

    If the PDF is not an IMAGE, you can use a free program called PDF-XChange Viewer from Tracker Software. If the PDF was done as an image then you will not be able to select the text.
    Bold 9000 on Rogers Network - Company BES
    Playbook 16G WiFi Only

  • How do i disable copy and paste so a reader can not copy text from my pdf document?

    How do I disable copy and paste so a reader cannot copy text from my PDF document? I have gone into my security preferences but cannot find out how to change the settings so that I can disable the copying option.

    See http://www.adobe.com/content/dam/Adobe/en/products/acrobat/pdfs/adobe-acrobat-xi-protect-pdf-file-with-permissions-tutorial-ue.pdf

  • How can i copy text from a pdf while in firefox?

    I no longer seem to be able to highlight and copy text from a PDF while in Firefox 33.0, using Adobe to view PDFs.
    How do I turn on the select tool so I can do this?

    When viewing a PDF with a browser you are using a browser add-on (Adobe's, Firefox's, or some other provider's).
    An add-on is not the desktop application; add-ons do not provide the full functionality of the desktop application (Reader or Acrobat).
    To have full use of all Comment and Drawing Markup tools save the PDF to the local machine's HDD.
    Launch Adobe Reader XI (pre-XI releases do not have the full set of these tools).
    Open the PDF. Annotate as desired.
    Be well...

  • How can I delete pages from a pdf file

    How can I delete pages from a pdf file

    Hi timk,
    None of the Acrobat.com online services allow you to edit a PDF file. But, in checking your account, I see that you purchased an Acrobat Standard subscription, and you can certainly use that to edit a PDF file.
    Have you downloaded and installed Acrobat Standard? It sounds to me as though you may be opening the PDF file in Adobe Reader, rather than Acrobat. (Reader doesn't allow you to edit PDF files.)
    If you haven't already, download and install Acrobat from https://cloud.acrobat.com/acrobat. Then, make sure that you're opening your PDF files in that rather than Reader, when you want to edit them.
    Please let us know how it goes!
    Best,
    Sara

  • How do i delete text from a table cell

    I want to delete text from a cell or table but not the cell or table themselves. How do I do this? The 'delete' function does not work.

    Mac Word 2011. I found the answer: the 'fn' (function) key + the 'Delete' key. If I had a full desktop keyboard it would be the 'Del' key, NOT the 'Delete' key.

  • How do I delete text from a textArea?

    How do I delete text from a textArea? I basically want the opposite of append(String)... I want something that removes a line from my TextArea. Anyone know?

    If you want to remove only a specific part of the text, you have to get the String, take the parts before and after the removed part, append them back together and use setText to replace it in the textArea.
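
The approach described in the reply above can be sketched roughly as follows: find where the unwanted line starts and ends, then join the text before it to the text after it. The class and method names (TextLines, removeLine) are hypothetical, and the sketch assumes lines are separated by '\n' and that line indexes start at zero.

```java
// Hypothetical helper; the names here are illustrative only.
final class TextLines {
    private TextLines() {}

    // Returns text with the zero-based line at lineIndex removed.
    // Returns text unchanged if the index is out of range.
    static String removeLine(String text, int lineIndex) {
        if (lineIndex < 0) {
            return text;
        }
        // Walk forward to the first character of the requested line.
        int start = 0;
        for (int i = 0; i < lineIndex; i++) {
            int newline = text.indexOf('\n', start);
            if (newline < 0) {
                return text; // fewer lines than lineIndex
            }
            start = newline + 1;
        }
        int end = text.indexOf('\n', start);
        if (end < 0) {
            // Last line: also drop the newline that preceded it, if any.
            return text.substring(0, start > 0 ? start - 1 : 0);
        }
        // Join the part before the line to the part after its newline.
        return text.substring(0, start) + text.substring(end + 1);
    }
}
```

With a text area this would be used along the lines of textArea.setText(TextLines.removeLine(textArea.getText(), 2)), which removes the third line.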

  • How do you delete pages from a PDF?

    I have an 8-page Adobe PDF file that I need 4 pages deleted from. How can I perform this task?
    Thanks!

    Hi, rommerei.
    Deleting pages from a PDF requires Adobe Acrobat. You can download a trial here.
    If this is a feature you'd like to see added to our online service, feel free to submit an Idea.
    Dave
