Recommendations for a Java API -- reading PDF files?

Hi,
I need to write an application that will extract how many pages a particular PDF file has, plus some other information about it (size, date created and so on), but the number of pages is the most important.
I googled and came across iText, PDFBox and a few others, and I was wondering if anyone else had used those APIs for the purpose I described.
I searched the forums, but the replies I found had to do with writing PDFs.
thanks!
bp

Actually, in case anyone else runs across this, this seemed to work with PDFBox:
    // Open the document, print the page count, then release the file.
    PDDocument pd = PDDocument.load(new File("pdfFiles/test.pdf"));
    System.out.println("pd info: " + pd.getNumberOfPages());
    pd.close();
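
For the other details mentioned in the question (size, date created), a minimal sketch using only the standard java.nio.file API might look like the following. The class name PdfFileInfo and its method names are made up for illustration. Note the assumption: these are file-system timestamps, which may differ from the creation date stored inside the PDF's own metadata; reading that would need a PDF library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

public class PdfFileInfo {

    /** Returns the file size in bytes, as reported by the file system. */
    public static long sizeInBytes(Path file) throws IOException {
        return Files.size(file);
    }

    /** Returns a one-line summary of the size and the file-system timestamps. */
    public static String describe(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        return file.getFileName() + ": " + attrs.size() + " bytes, created "
                + attrs.creationTime() + ", modified " + attrs.lastModifiedTime();
    }

    public static void main(String[] args) throws IOException {
        // Default path reused from the snippet above; pass another path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "pdfFiles/test.pdf");
        System.out.println(describe(file));
    }
}
```

This works for any file, not just PDFs, so it can run alongside the PDFBox page count without opening the document twice in the PDF library.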

Similar Messages

  • Reading PDF files in java

    Hi,
    Can anyone help me read PDF files in Java using iText? I have written some code, but it is of no use: it prints garbage.
    import java.io.*;
    import java.util.*;
    import java.lang.*;
    import com.lowagie.text.pdf.PdfReader;
    public class PdfAccess {
        public static void main(String[] args) {
            try {
                String pdfFile = args[0];
                PdfReader reader = new PdfReader(pdfFile);
                int pageCount = reader.getNumberOfPages();
                System.out.println(pageCount);
                String content = " ";
                for (int i = 1; i <= pageCount; i++) {
                    byte[] pageContent = reader.getPageContent(i);
                    content = content + (pageContent.toString());
                    System.out.println(content.trim());
                }
            } catch (Exception e) { }
        }
    }
    Can anyone help me get the contents of the file? Are there examples available?

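    * Try this with PDFBox; it should do what you asked.
        // Assumes PDFBox 1.x, where PDDocument.load(String) exists and
        // PDFTextStripper lives in org.apache.pdfbox.util.
        import java.io.*;
        import org.apache.pdfbox.pdmodel.PDDocument;
        import org.apache.pdfbox.util.PDFTextStripper;

        public void getPdfText(String fileName) throws IOException {
            StringWriter sw = new StringWriter();
            PDDocument doc = null;
            try {
                doc = PDDocument.load(fileName);
                // Extract the text of every page into the StringWriter.
                PDFTextStripper stripper = new PDFTextStripper();
                stripper.setStartPage(1);
                stripper.setEndPage(Integer.MAX_VALUE);
                stripper.writeText(doc, sw);
                // Write the extracted text to a UTF-8 file and close the stream.
                OutputStream out = new FileOutputStream(new File("d://PDFText.txt"));
                PrintStream write = new PrintStream(out, true, "UTF-8");
                write.print(sw.toString());
                write.close();
                //System.out.println(sw.toString());
            } finally {
                if (doc != null) {
                    doc.close();
                }
            }
        }
    Can..Can...If we Try...!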
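
    As for the garbage in the original question: calling toString() on a byte[] does not decode the bytes; it returns the array's type tag and a hash code. The sketch below shows the difference using only the standard library. The class name and sample bytes are made up for illustration. As far as I know, getPageContent returns the page's raw content stream, so even correctly decoded output would be PDF drawing operators rather than readable text, which is why a text extractor such as the PDFTextStripper above is the better route.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayToText {

    /** Decodes bytes into text using an explicit charset. */
    public static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Sample bytes shaped like a tiny PDF content stream (illustrative only).
        byte[] bytes = "BT (Hello) Tj ET".getBytes(StandardCharsets.ISO_8859_1);
        // Prints the array's type tag and hash code, e.g. something starting with "[B@".
        System.out.println(bytes.toString());
        // Prints the decoded characters: BT (Hello) Tj ET
        System.out.println(decode(bytes));
    }
}
```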

  • How to read pdf files using java.io package classes

    Dear All,
    I have a requirement to read and write PDF files at runtime. Reading them with normal Java file I/O is not working. Can anyone suggest how to proceed, ideally with a sample code block?
    Thanks in advance.

    Hi, I also have this problem reading a PDF file using Java.
    Can anybody help me?
    Why is it so difficult to read the thread you posted in? They say: java.io is pointless, use iText. So why don't you?
    Also, I want to read binary-encoded data into ASCII.
    Can anybody give me a hint how to do it?
    That depends on what you mean by "binary encoding". ASCII is a binary encoding too, basically.
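
    To illustrate what plain java.io / java.nio can and cannot do here: it hands you the file's raw bytes, which is enough to check the "%PDF-x.y" header that PDF files start with, but not to interpret pages or text. A minimal sketch, with a made-up class name and assuming a well-formed header at the very start of the file:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PdfHeaderCheck {

    /** Returns the version from a "%PDF-x.y" header, or null if the file does not start with one. */
    public static String pdfVersion(Path file) throws IOException {
        byte[] head = new byte[8];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // Keep reading until the buffer is full or the stream ends.
            while (read < head.length && (n = in.read(head, read, head.length - read)) > 0) {
                read += n;
            }
        }
        // Decoding bytes as ASCII is itself just one way of interpreting binary data.
        String header = new String(head, 0, read, StandardCharsets.US_ASCII);
        return header.startsWith("%PDF-") ? header.substring(5) : null;
    }
}
```

    Everything past the header (objects, compressed streams, cross-reference tables) needs a real parser, which is what iText or PDFBox provide.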

  • How do I make Adobe Reader NOT my default program for reading pdf files?

    I installed Adobe Reader and made it my default program for reading PDF files. I'd like to undo that and go back to Preview as my default.

    Sure you are following instructions:

  • Voiceover support for ABBYY pro and reading pdf files

    I want to know about VoiceOver support for ABBYY Pro on a MacBook Pro, and also whether VoiceOver supports reading PDF files. If so, which is the best PDF reader?

    Not sure if this is what you want, but for reading PDF files out loud, Apple Preview has a command Edit/Speech/Start Speaking, and Adobe Reader has View/Read Out Loud/Activate Read Out Loud.

  • I am getting messages that I can't download and read .pdf files since I have the wrong Adobe reader. I know about their security disasters of course, but I downloaded the latest version of Adobe Reader from the Adobe web site and I have other ,pdf file re

    I am getting messages that I can't download and read .pdf files since I have the wrong Adobe Reader. I know about their security disasters, of course, but I downloaded the latest version of Adobe Reader from the Adobe web site, and I have other .pdf file readers as well, and for some reason they won't work either. I have five computers running top-end processors and RAM: this one, which has an AMD Phenom Black 3.2 quad-core with 8 GB of top Corsair DDR2 RAM; two other AMD machines, one with an Athlon II triple-core and 4 GB of Corsair DDR2 RAM and one with a Phenom X4 965 3.4 GHz quad-core and 8 GB of their best DDR2 RAM; and two Intel machines with i7 920 processors (socket 1366, triple channel), one with 8 GB of low-latency DDR3 RAM and the other with 4 GB of the same RAM. I am getting the message on this one, which has a fresh install of the XP Pro x64 operating system, as do the other four. On this machine I have run Avast Business Pro Anti-virus, which returned a single result that I deleted; Spybot Search and Destroy, which came back clean; Malwarebytes Antimalware, which found a lot of tracking cookies, now removed; and SUPERAntiSpyware, which found a few cookies, also now deleted. Can you tell me what I need to do to get these files to show as .pdf files rather than as a clean blank page? One other issue is that I wish to know how to turn off my downloads so they are saved and Mozilla will give me the option of returning them, instead of my losing them altogether as happens now. If there is another Adobe Reader I should download and install, could you provide me with the link to it? Thanks for your assistance.
    == When I download and try to read a .pdf file and when I am asked to turn off all Firefox files and if I do, I lose them since I need to know how to save them without rebooting my computer.

    Brilliant! Problem solved! Thanks so much.

  • How can I read pdf files from LabVIEW with different versions of Acrobat reader?

    How can I read pdf files from LabVIEW with different versions of Acrobat reader?
    I have made a LabVIEW program that can read a PDF document. When I made this program, Acrobat Reader 5.0.5 was installed on the PC. Later, when Acrobat Reader was upgraded to version 6.0, there was an error when the VI tried to launch the LabVIEW program. Later again, when we upgraded to Acrobat Reader 7.0.5, I once more had to make some changes and rebuild the EXE files.
    Making the changes in one single LabVIEW program is not a very big job, but we have built a lot of LabVIEW programs (as EXE files), so it takes time to make the changes every time we update Acrobat Reader.
    The job is to right-click the ActiveX container and click "Insert ActiveX Object"; then I can browse the computer for the new version of Acrobat Reader. After this I must rebuild all the "methods" in the ActiveX call to make the VI executable again.
    Is there a way to build a LabVIEW program so that I don't have to do this job every time we update Acrobat Reader?
    This LabVIEW program is written in LabVIEW 6.1, but I see the problem is the same in LabVIEW 8.2.
    Jan Inge Gustavsen
    Attachments:
    Show PDF-file - Adobe Reader 7-0-5 - LV61.vi ‏43 KB
    Read PDF file.jpg ‏201 KB
    Show PDF-file - Adobe Reader 5-0-5 - LV61.vi ‏42 KB

    hi there
    try the vi
    ..vi.lib\platform\browser.llb\Open Acrobat Document.vi
    it uses DDE or the command line to run an external application (e.g. Adobe Acrobat)
    Best regards
    chris
    CL(A)Dly bending G-Force with LabVIEW
    famous last words: "oh my god, it is full of stars!"

  • Adobe flash player and reading pdf files.

    I've recently been having issues with Safari reading online PDF files and loading sites that use Adobe Flash Player.
    I've tracked down a temporary workaround by going into Safari's "Preferences", then the Security tab.
    Where it shows "internet plug-ins", I have it checked so it can load Adobe Flash. But when it comes to reading PDF files (Adobe Reader),
    I have to uncheck "internet plug-ins". So I can read the PDFs, but not use Adobe Flash Player.
    There seems to be a conflict between Adobe Reader and Flash Player, so that I can't use both at the same time.
    Any suggestions on how to correct this issue?
    Here's a screenshot of what I have. And if I have to delete one of these plug-ins, how can I do this? I don't see any options to do so.

    Back up all data before making any changes. Please take each of the following steps until the problem is resolved.
    Step 1
    If Adobe Reader or Acrobat is installed, and the problem is just that you can't print PDFs displayed in Safari, you may be able to print by moving the cursor to the bottom edge of the page, somewhere near the middle. A black toolbar may appear under the cursor. Click the printer icon.
    Step 2
    There should be a setting in the preferences of the Adobe application such as Display PDF in Browser. I don't use those applications myself, so I can't be more precise. Deselect that setting, if it's selected.
    Step 3
    If you get a message such as "Adobe Reader blocked for this website," then from the Safari menu bar, select
    Safari ▹ Preferences... ▹ Security
    and check the box marked
    Allow Plug-ins
    Then click
    Manage Website Settings...
    and make any required changes to the security settings for the Adobe PDF plugin.
    Step 4
    Triple-click anywhere in the line of text below on this page to select it, then copy the selected text to the Clipboard by pressing the key combination command-C:
    /Library/Internet Plug-ins
    In the Finder, select
    Go ▹ Go to Folder
    from the menu bar, or press the key combination shift-command-G. Paste into the text box that opens (command-V), then press return.
    From the folder that opens, move to the Trash any items that have "Adobe" or “PDF” in the name. You may be prompted for your login password. Then quit and relaunch Safari.
    Step 5
    The "Silverlight" web plugin distributed by Microsoft can interfere with PDF display in Safari, so you may need to remove it, if it's present. The same goes for a plugin called "iGetter," and perhaps others — I don't have a complete list. Don't remove Silverlight if you use the "Netflix" video-streaming service.
    Step 6
    Do as in Step 4 with this line:
    ~/Library/Internet Plug-ins
    If you don’t like the results of this procedure, restore the items from the backup you made before you started. Relaunch Safari.

  • Install Adobe Reader but cannot read PDF file

    Hi all,
    I am new and not familiar yet with forum so excuse me for a duplicate posting at
    http://www.adobeforums.com/webx/.59b65e89/6
    which is a thread "cannot install Adobe Reader". Since mine is a different problem, am transferring what I posted there into a new thread, which is as follows:
    ===============================
    I cannot read PDF files with web browser so uninstalled and reinstalled with Adobe Reader 9
    I still cannot read PDF file with web browser. After clicking on a link that is a link to a PDF file, I just get a blank page
    Why and what can I do?

    AOL used to have a lot of problems with PDFs; I hope that has not returned. Anyway, the PDF was probably sent as a text file rather than encoded as a binary file. That would make it corrupt, and it is a problem on the sender's end. Older versions of Outlook also had problems with PDFs that would cause a problem on your end.

  • How do I download a free version of Adobe to read pdf files?

    How do I download a free version of Adobe to read pdf files?

    If you go to the following page you can download the free Acrobat Reader program. Deselect any potentially unwanted offerings being pre-checked for inclusion with the download to avoid having them installed as well.
    Adobe - Adobe Reader download - All versions

  • How can I read pdf files in linux?

    I know PDF files can be read by Adobe Acrobat, but I cannot find the software for Linux. How can I download similar tools?
    Thanks in advance.

    You can download Acrobat Reader for Linux from adobe.com, like the versions for other OSes.
    However, if you want a nicer UI than Acrobat, consider using gv with Ghostscript. They might already be installed on your system, since they are part of most Linux distributions.
    It's worthwhile to upgrade Ghostscript to the most recent version; otherwise you will be unable to read PDF files that utilize the newest fads from Adobe.

  • Why can I no longer download and read .pdf files

    I have never had any problems downloading and reading .pdf files until I 'upgraded' Adobe Reader to version 11. Now all attempted downloads fail with messages such as "I don't have the necessary helper application" or even "C:\DOCUME~1\Will-o\LOCALS~1\Temp\Back Channel II - The Vietnam Betrayal - Chaps 1&2.pdf could not be opened, because the associated helper application does not exist. Change the association in your preferences." What in the world is going on and how do I fix it?
    Under 'Tools - Applications - Adobe Acrobat Document', I've selected 'Use Adobe Reader.' However, since I also have Adobe Acrobat 8.3, I've tried substituting that for the Reader, but that doesn't work either. Please help.

    Brilliant! Problem solved! Thanks so much.

  • How to invoke alt-text for images in a PDF file by Automation

    Hi,
    Can any one help me?
    How to invoke Alt-text for Images in a PDF file using script?
    Thanks for looking into this.
    Regards,
    Sudhakar

    What do you mean "invoke" alt-text?  If Alt-text is there, then it will be presented to a screen reader.

  • Searching for links across multiple pdf files

    We have thousands of pdf files that are being moved to a new website. Some of these pdf files have links within them (either as text or as a hyperlink). This number is unknown.
    The issue is how to programmatically search across multiple pdf files (numbering in the thousands) looking for links using a regular expression or part of a path. This will have to be able to search behind the text and search for the link url.
    We first need to identify the number of files with links and create a list of the files with links that need modifying. If the number is too great to modify manually, then we would need the ability to programmatically edit these links.
    The pdf files are stored in a database. Also, the pdf files are different versions and some are password protected.
    Is there an Adobe product that will perform this? If not, are there any 3rd party vendor products that will accomplish this?
    Thanks in advance for your help.

    I have no solution, but a thought: the database factor may seem to be
    a killer. But you could look for a solution designed to read PDF files
    from a web site (by spidering or from a list), which would presumably
    load them.
    Or could do a one off extraction of the files from the database into a
    directory and use that for your process. Probably a very good idea,
    since extracting all files from the database is likely to be costly
    and hammer the server (but can be scheduled at a sensible pace), while
    the search process will (if it is possible at all) doubtless need to
    be run countless times.
    Aandi Inston
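
    Once the files have been extracted to a directory as suggested, the matching step itself could be sketched as below. This is only an illustration under stated assumptions: the class and method names are made up, the URL pattern is deliberately simple, and the input map (file name to text or link URIs) is assumed to have been produced already by a PDF library. Links stored as hyperlink annotations are not part of the page text, and password-protected files would need to be opened with their password first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkFinder {

    // Deliberately simple URL pattern; an assumption, not a full URL grammar.
    private static final Pattern URL = Pattern.compile("https?://[^\\s<>\"')]+");

    /** Returns every URL in the text that contains the given path fragment. */
    public static List<String> findLinks(String text, String pathFragment) {
        List<String> hits = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            if (m.group().contains(pathFragment)) {
                hits.add(m.group());
            }
        }
        return hits;
    }

    /** Maps each file name to its matching links, skipping files without any. */
    public static Map<String, List<String>> filesWithLinks(
            Map<String, String> textByFile, String pathFragment) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : textByFile.entrySet()) {
            List<String> hits = findLinks(e.getValue(), pathFragment);
            if (!hits.isEmpty()) {
                result.put(e.getKey(), hits);
            }
        }
        return result;
    }
}
```

    The size of the map returned by filesWithLinks gives the count of affected files, and its keys give the list of files that need modifying.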

  • Read PDF files in Safari on 10.5 --- HELP PLEASE!

    I accidentally deleted the Internet plug-in for reading PDF files in 10.5. The plug-in is located in /Macintosh HD/Library/Internet Plug-Ins/
    I believe it is called "Adobe Acrobat ???" (or something like this)
    If someone could post it or e-mail it to me, I'd be one grateful person!
    Thanks!!
    e-mail to deut3221 AT mac.com
    Thanks.

    You can download Adobe Reader for free from Adobe.com. It will read PDF files; you shouldn't need someone to email you a copy.
