Recommendations for a Java API for reading PDF files?
Hi,
I need to write an application that extracts how many pages a particular PDF file has, plus some other information about it (size, date created and so on), but the number of pages is the most important.
I googled and came across iText, PDFBox and a few others, and I was wondering if anyone else had used those APIs for the purpose I described.
I searched the forums, but the replies I found had to do with writing PDFs.
thanks!
bp
Actually, in case anyone else runs across this, this seemed to work with PDFBox:
PDDocument pd = PDDocument.load(new File("pdfFiles/test.pdf"));
System.out.println("pd info: " + pd.getNumberOfPages());
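For the "size, date created and so on" part of the question, here is a minimal sketch that uses only the standard library. It reports what the file system knows about the file, which is not necessarily the same as the creation date stored inside the PDF's own metadata. The class name `PdfFileFacts` and its methods are made up for illustration; the default path is the one from the post above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of PDFBox or iText.
final class PdfFileFacts {

    /** Returns the file size in bytes as reported by the file system. */
    static long sizeInBytes(Path file) throws IOException {
        return Files.size(file);
    }

    /**
     * Formats the size and the file-system timestamps of a file.
     * These are file-system attributes, NOT the PDF's internal metadata.
     */
    static String describe(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        return file.getFileName() + ": " + attrs.size() + " bytes, created "
                + attrs.creationTime() + ", modified " + attrs.lastModifiedTime();
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the path used in the post when no argument is given.
        Path file = Paths.get(args.length > 0 ? args[0] : "pdfFiles/test.pdf");
        System.out.println(describe(file));
    }
}
```

Some file systems do not record a creation time, in which case the reported value may simply equal the last-modified time.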
Similar Messages
-
Hi,
Can anyone help me read PDF files in Java using iText? I have written some code, but it is no use: it prints garbage.
import java.io.*;
import java.util.*;
import java.lang.*;
import com.lowagie.text.pdf.PdfReader;
public class PdfAccess {
    public static void main(String[] args) {
        try {
            String pdfFile = args[0];
            PdfReader reader = new PdfReader(pdfFile);
            int pageCount = reader.getNumberOfPages();
            System.out.println(pageCount);
            String content = " ";
            for (int i = 1; i <= pageCount; i++) {
                byte[] pageContent = reader.getPageContent(i);
                content = content + (pageContent.toString());
            }
            System.out.println(content.trim());
        } catch (Exception e) { }
    }
}
Can anyone help me get the contents of the file? Are there examples available?
Try this with PDFBox; it should do what you asked:
public void getPdfText(String fileName) throws IOException {
    StringWriter sw = new StringWriter();
    PDDocument doc = null;
    try {
        // PDFBox 1.x accepts a file name here; 2.x expects PDDocument.load(new File(fileName)).
        doc = PDDocument.load(fileName);
        PDFTextStripper stripper = new PDFTextStripper();
        stripper.setStartPage(1);
        stripper.setEndPage(Integer.MAX_VALUE);
        stripper.writeText(doc, sw);
        // Write the extracted text to a UTF-8 file; try-with-resources closes the stream.
        try (PrintStream write = new PrintStream(new FileOutputStream(new File("d://PDFText.txt")), true, "UTF-8")) {
            write.print(sw.toString());
        }
        //System.out.println(sw.toString());
    } finally {
        if (doc != null) {
            doc.close();
        }
    }
}
Can..Can...If we Try...! -
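A note on the "garbage" in the iText question above: calling `toString()` on a `byte[]` does not decode the bytes; it prints the array's type and identity hash. The sketch below shows the difference using only the standard library. The class name `ByteArrayText` and the sample bytes are made up for illustration. As far as I know, `getPageContent` returns the page's raw content stream, so even decoded it would show PDF drawing operators rather than plain text, which is why a text extractor such as the PDFBox one above is the usual route.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of iText or PDFBox.
final class ByteArrayText {

    /** Decodes raw bytes as ISO-8859-1, which maps every byte to exactly one char. */
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Made-up sample bytes standing in for a page content stream.
        byte[] raw = "BT (Hello) Tj ET".getBytes(StandardCharsets.ISO_8859_1);

        // Prints the array's type and identity hash, something like "[B@1b6d3586".
        System.out.println(raw.toString());

        // Prints the decoded characters.
        System.out.println(decode(raw));
    }
}
```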
How to read pdf files using java.io package classes
Dear All,
I have a requirement to read and write PDF files at runtime. Reading them with normal Java file I/O is not working. Can anyone suggest how to proceed, preferably with a sample code block?
Thanks in advance.
Hi, I also have this problem reading a PDF file using Java.
Can anybody help me?
Why is it so difficult to read the thread you posted in? They say: java.io is pointless, use iText. So why don't you?
Or also, I want to read binary-encoded data into ASCII.
Can anybody give me a hint how to do it?
Depends on what you mean by "binary encoding". ASCII is a binary encoding too, basically. -
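To illustrate what plain java.io can and cannot do here: it gives you the raw bytes, and a PDF file begins with an ASCII header such as `%PDF-1.4`, so a header check is easy. Anything beyond that (pages, text) needs a PDF library, as the reply says. This is only a sketch; the class `PdfHeaderCheck` and its method names are made up for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any PDF library.
final class PdfHeaderCheck {

    /** Reads up to the first 8 bytes of the stream and decodes them as ASCII. */
    static String readHeader(InputStream in) throws IOException {
        byte[] buf = new byte[8];
        int n = 0;
        while (n < buf.length) {
            int r = in.read(buf, n, buf.length - n);
            if (r < 0) {
                break; // stream ended before 8 bytes were available
            }
            n += r;
        }
        return new String(buf, 0, n, StandardCharsets.US_ASCII);
    }

    /** True when the header starts with the PDF marker "%PDF-". */
    static boolean looksLikePdf(String header) {
        return header.startsWith("%PDF-");
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PdfHeaderCheck <file>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            String header = readHeader(in);
            System.out.println(header + " -> PDF? " + looksLikePdf(header));
        }
    }
}
```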
How do I make Adobe Reader NOT my default program for reading pdf files?
I installed Adobe Reader and made it my default program for reading pdf files. I'd like to undo that and go back to "preview" as my default.
Sure you are following instructions:
-
Voiceover support for ABBYY pro and reading pdf files
I want to know about VoiceOver support for ABBYY Pro on a MacBook Pro, and also whether VoiceOver supports reading PDF files. If so, which is the best PDF reader?
Not sure if this is what you want, but for reading PDF files out loud, Apple Preview has a command Edit/Speech/Start Speaking, and Adobe Reader has View/Read Out Loud/Activate Read Out Loud.
-
I am getting messages that I can't download and read .pdf files because I have the wrong Adobe reader. I know about their security disasters of course, but I downloaded the latest version of Adobe Reader from the Adobe web site, and I have other .pdf file readers as well, and for some reason they won't work either.
I have 5 computers running top-end processors and RAM. By this I mean: this one, which I am using, has an AMD Phenom Black 3.2 quad-core with 8 GB of top Corsair DDR2 RAM; my other two AMD machines have an Athlon II triple-core with 4 GB of Corsair DDR2 RAM and a Phenom X4 965 3.4 GHz quad-core with 8 GB of their best DDR2 RAM; and the two Intels have i7 920 processors on the triple-channel 1366 socket, one with 8 GB of low-latency DDR3 RAM and the other with 4 GB of the same RAM.
I am getting the message on this one, which has a fresh install of the XP Pro x64 operating system, as do the other 4. On this machine I have run Avast Business Pro Anti-virus, which gave a single result that I deleted; Spybot Search and Destroy, which came back clean; Malwarebytes Antimalware, which found a lot of tracking cookies, now removed; and SUPERAntiSpyware, which also found a few cookies, also now deleted.
Can you tell me what I need to do to get these files to show as .pdf files rather than as a clean blank page?
One other issue: I wish to know how to turn off my downloads so they are saved and Mozilla gives me the option of returning them, instead of me losing them altogether as it does now.
If there is another Adobe reader I should download and install, could you provide me with the link to it? I appreciate your assistance here.
When I download and try to read a .pdf file, I am asked to turn off all Firefox files; if I do, I lose them, so I need to know how to save them without rebooting my computer.
Brilliant! Problem solved! Thanks so much.
-
How can I read pdf files from LabVIEW with different versions of Acrobat reader?
How can I read pdf files from LabVIEW with different versions of Acrobat reader?
I have made a LabVIEW program that can read a PDF document. When I made this program, Acrobat Reader 5.0.5 was installed on the PC. Later, when Acrobat Reader was upgraded to version 6.0, there was an error when we tried to launch the LabVIEW program. Later again, when we upgraded to Acrobat Reader 7.0.5, I had to make some changes and rebuild the EXE files once more.
It isn't a very big job to make the changes in a single LabVIEW program, but we have built a lot of LabVIEW programs, so it takes time to make the changes every time we update Acrobat Reader. (We have built EXE files.)
The job is to right-click the ActiveX container and click "Insert ActiveX Object"; then I can browse the computer for the new version of Acrobat Reader. After this I must rebuild all the "methods" in the ActiveX call to make the VI executable again.
Is there a way to build a LabVIEW program so that I don't have to do this job every time we update Acrobat Reader?
This LabVIEW program is written in LabVIEW 6.1, but I see the problem is the same in LabVIEW 8.2.
Jan Inge Gustavsen
Attachments:
Show PDF-file - Adobe Reader 7-0-5 - LV61.vi 43 KB
Read PDF file.jpg 201 KB
Show PDF-file - Adobe Reader 5-0-5 - LV61.vi 42 KB
Hi there,
try the VI
..vi.lib\platform\browser.llb\Open Acrobat Document.vi
it uses DDE or the command line to run an external application (e.g. Adobe Acrobat)
Best regards
chris
CL(A)Dly bending G-Force with LabVIEW
famous last words: "oh my god, it is full of stars!" -
Adobe flash player and reading pdf files.
I've recently been having issues with Safari reading online pdf files and loading up sites using adobe flash player.
I've somehow tracked down a temporary solve by going into Safari's "preferences", then the security tab.
Where it shows "internet plug-ins" I have it checked so it can load adobe flash. But when it comes to reading pdf files (adobe reader)
I have to uncheck "internet plug ins". So in turn I can read the pdfs, but not use adobe flash player.
There seems to be a conflict between Adobe Reader and Flash Player, so I can't use both at the same time.
Any suggestions on how to correct this issue?
Here's a screenshot of what I have. And if I have to delete one of these plug-ins, how can I do this? I don't see any option to do so.
Back up all data before making any changes. Please take each of the following steps until the problem is resolved.
Step 1
If Adobe Reader or Acrobat is installed, and the problem is just that you can't print PDFs displayed in Safari, you may be able to print by moving the cursor to the bottom edge of the page, somewhere near the middle. A black toolbar may appear under the cursor. Click the printer icon.
Step 2
There should be a setting in the preferences of the Adobe application such as Display PDF in Browser. I don't use those applications myself, so I can't be more precise. Deselect that setting, if it's selected.
Step 3
If you get a message such as "Adobe Reader blocked for this website," then from the Safari menu bar, select
Safari ▹ Preferences... ▹ Security
and check the box marked
Allow Plug-ins
Then click
Manage Website Settings...
and make any required changes to the security settings for the Adobe PDF plugin.
Step 4
Triple-click anywhere in the line of text below on this page to select it, then copy the selected text to the Clipboard by pressing the key combination command-C:
/Library/Internet Plug-ins
In the Finder, select
Go ▹ Go to Folder
from the menu bar, or press the key combination shift-command-G. Paste into the text box that opens (command-V), then press return.
From the folder that opens, move to the Trash any items that have "Adobe" or “PDF” in the name. You may be prompted for your login password. Then quit and relaunch Safari.
Step 5
The "Silverlight" web plugin distributed by Microsoft can interfere with PDF display in Safari, so you may need to remove it, if it's present. The same goes for a plugin called "iGetter," and perhaps others — I don't have a complete list. Don't remove Silverlight if you use the "Netflix" video-streaming service.
Step 6
Do as in Step 4 with this line:
~/Library/Internet Plug-ins
If you don’t like the results of this procedure, restore the items from the backup you made before you started. Relaunch Safari. -
Install Adobe Reader but cannot read PDF file
Hi all,
I am new and not familiar yet with forum so excuse me for a duplicate posting at
http://www.adobeforums.com/webx/.59b65e89/6
which is a thread "cannot install Adobe Reader". Since mine is a different problem, am transferring what I posted there into a new thread, which is as follows:
===============================
I cannot read PDF files in my web browser, so I uninstalled Adobe Reader and reinstalled Adobe Reader 9.
I still cannot read PDF files in the web browser. After clicking on a link to a PDF file, I just get a blank page.
Why, and what can I do?
AOL used to have a lot of problems with PDFs; I hope that has not returned. Anyway, the PDF was probably sent as a text file rather than encoded as a binary file. That would make it corrupt and is a problem on the sender's end. Older versions of Outlook also had problems with PDFs that would cause a problem on your end.
-
How do I download a free version of Adobe to read pdf files?
How do I download a free version of Adobe to read pdf files?
If you go to the following page you can download the free Acrobat Reader program. Deselect any potentially unwanted offerings being pre-checked for inclusion with the download to avoid having them installed as well.
Adobe - Adobe Reader download - All versions -
How can I read pdf files in linux?
I know PDF files can be read by Adobe Acrobat, but I cannot find the software for Linux. How can I download similar tools?
Thanks in advance.
You can download Acrobat Reader for Linux from adobe.com, like the versions for other OSes.
However, if you want a nicer UI than Acrobat, consider using gv with Ghostscript. They might already be installed on your system, since they are part of most Linux distributions.
It's worthwhile to upgrade Ghostscript to the most recent version; otherwise you will be unable to read PDF files that use the newest fads from Adobe. -
Why can I no longer download and read .pdf files
I never had any problems downloading and reading .pdf files until I 'upgraded' Adobe Reader to version 11. Now all attempted downloads fail with a message that I don't have the necessary helper application, or even: "C:\DOCUME~1\Will-o\LOCALS~1\Temp\Back Channel II - The Vietnam Betrayal - Chaps 1&2.pdf could not be opened, because the associated helper application does not exist. Change the association in your preferences." What in the world is going on and how do I fix it?
Under 'Tools - Applications - Adobe Acrobat Document', I've selected 'Use Adobe Reader.' However, since I also have Adobe Acrobat 8.3, I've tried substituting that for the Reader, but that doesn't work either. Please help.
Brilliant! Problem solved! Thanks so much.
-
How to invoke alt-text for images in a PDF file by Automation
Hi,
Can any one help me?
How to invoke Alt-text for Images in a PDF file using script?
Thanks for looking into this.
Regards,
Sudhakar
What do you mean by "invoke" alt-text? If alt-text is there, then it will be presented to a screen reader.
-
Searching for links across multiple pdf files
We have thousands of pdf files that are being moved to a new website. Some of these pdf files have links within them (either as text or as a hyperlink). This number is unknown.
The issue is how to programmatically search across multiple pdf files (numbering in the thousands) looking for links using a regular expression or part of a path. This will have to be able to search behind the text and search for the link url.
We first need to identify the number of files with links and create a list of the files with links that need modifying. If the number is too great to modify manually, then we would need the ability to programmatically edit these links.
The pdf files are stored in a database. Also, the pdf files are different versions and some are password protected.
Is there an Adobe product that will perform this? If not, are there any 3rd party vendor products that will accomplish this?
Thanks in advance for your help.
I have no solution, but a thought: the database factor may seem to be
a killer. But you could look for a solution designed to read PDF files
from a web site (by spidering or from a list), which would presumably
load them.
Or you could do a one-off extraction of the files from the database into a
directory and use that for your process. Probably a very good idea,
since extracting all files from the database is likely to be costly
and hammer the server (but can be scheduled at a sensible pace), while
the search process will (if it is possible at all) doubtless need to
be run countless times.
Aandi Inston -
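For the "regular expression or part of a path" side of the question above, here is a minimal sketch of the matching step only. It assumes the text of each PDF has already been extracted by some PDF library, which is not shown. It finds only links that appear as visible text; as far as I know, URLs stored in hyperlink annotations are not part of the extracted text and would have to be read through a PDF library, and password-protected files would have to be opened first. The class `LinkFinder`, its method and the pattern are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper operating on already-extracted text.
final class LinkFinder {

    // Deliberately simple: http or https, then characters up to whitespace,
    // a quote, an angle bracket or a parenthesis.
    private static final Pattern URL = Pattern.compile("https?://[^\\s<>\"()]+");

    /** Returns every URL-looking substring of the text, in order of appearance. */
    static List<String> findLinks(String text) {
        List<String> links = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for one extracted document.
        String text = "See http://example.com/a.pdf and https://old.example.org/x for details";
        for (String link : findLinks(text)) {
            System.out.println(link);
        }
    }
}
```

Running this once per extracted document and keeping the file names with a non-empty result would give the list of files that need modifying.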
Read PDF files in Safari on 10.5 --- HELP PLEASE!
I accidentally deleted the Internet plug-in for reading PDF files in 10.5. The plug-in is located in /Macintosh HD/Library/Internet Plug-Ins/
I believe it is called "Adobe Acrobat ???" (or something like this)
If someone could post it or e-mail it to me, I'd be one grateful person!
Thanks!!
e-mail to deut3221 AT mac.com
Thanks.
You can download Adobe Reader for free from Adobe.com. It will read PDF files; you shouldn't need someone to email you a copy.