Reading a PDF file using Java

I tried to read a PDF file using FileInputStream, but it gives junk characters.
How can I read the content of a PDF file using Java?

I just found the "Multivalent" library. It is free and will do exactly what you want: http://www.cs.berkeley.edu/~phelps/Multivalent/
Check out the source of the tools/ExtractText.java file.
Ed
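For background on the junk characters: a PDF is a binary container, and its page text is usually stored in compressed streams, so printing the raw bytes from a FileInputStream will not show readable text. Here is a minimal sketch, using only the standard library, that merely confirms a file is a PDF by looking for the `%PDF-` marker that PDF files begin with. The class and method names are made up for illustration; extracting the actual text still needs a PDF library such as the ones mentioned in this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Illustrative helper (not from the thread): detects a PDF by its header bytes.
final class PdfHeaderCheck {

    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given bytes begin with the ASCII marker "%PDF-".
    static boolean looksLikePdf(byte[] head) {
        if (head.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (head[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] head = new byte[MAGIC.length];
            int read = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
            System.out.println(looksLikePdf(Arrays.copyOf(head, read)) ? "PDF header found" : "not a PDF");
        }
    }
}
```

Run it with the path of the file as the only argument.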

Similar Messages

  • How to read pdf files using java.io package classes

    Dear All,
    I have a requirement to read and write PDF files at runtime. Normal Java file I/O does not work for reading them. Can anyone suggest how to proceed, preferably with a sample code block?
    Thanks in advance.

    Hi, I also have this problem reading a PDF file using Java.
    Can anybody help me?

    Why is it so difficult to read the thread you posted in? They say: java.io is pointless, use iText. So why don't you?

    Also, I want to convert binary-encoded data into ASCII.
    Can anybody give me a hint how to do it?

    Depends on what you mean by "binary encoding". ASCII is a binary encoding too, basically.
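As the reply says, it depends on what is meant. Two common readings can be sketched with the standard library: the bytes already are ASCII text and only need decoding, or they are arbitrary binary data that should be shown as text, for example as hex digits. The class and method names below are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch: two ways to turn bytes into ASCII-only text.
final class BinaryToText {

    // Interpretation 1: the bytes are ASCII-encoded characters.
    static String asAscii(byte[] data) {
        return new String(data, StandardCharsets.US_ASCII);
    }

    // Interpretation 2: arbitrary bytes rendered as two hex digits each.
    static String asHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] text = {0x50, 0x44, 0x46};
        System.out.println(asAscii(text)); // PDF

        byte[] raw = {(byte) 0xFF, 0x00, 0x41};
        System.out.println(asHex(raw));    // ff0041
    }
}
```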

  • Reading/writing PDF files using JAVA

    How do I read and write a PDF file using Java?
    When I read a PDF file using the BufferedReader class, it gives a list of characters that is not readable.
    How can I convert those files to a readable format?
    Is there any special class for doing that?
    Please explain.

    Is there any special class for doing that?

    Yes, I'm sure Google knows a few libraries that can do that.

  • Reading PDF files in java

    Hi,
    Can anyone help me with reading PDF files in Java using iText? I have written some code, but it is of no use; it outputs garbage.
    import java.io.*;
    import java.util.*;
    import java.lang.*;
    import com.lowagie.text.pdf.PdfReader;

    public class PdfAccess {
        public static void main(String[] args) {
            try {
                String pdfFile = args[0];
                PdfReader reader = new PdfReader(pdfFile);
                int pageCount = reader.getNumberOfPages();
                System.out.println(pageCount);
                String content = " ";
                for (int i = 1; i <= pageCount; i++) {
                    byte[] pageContent = reader.getPageContent(i);
                    content = content + (pageContent.toString());
                }
                System.out.println(content.trim());
            } catch (Exception e) { }
        }
    }
    Can anyone help me get the contents of the file? Are there examples available?
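One likely source of the garbage in the code above is `pageContent.toString()`: calling `toString()` on a `byte[]` returns the array's type and hash code, not its content. The sketch below shows the difference using only the standard library. The sample bytes are made up, and note that even when decoded correctly, a page's content stream holds PDF drawing operators rather than plain text, so a text extractor is still needed to get the words out.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch: byte[].toString() versus decoding the bytes.
final class ByteArrayToStringDemo {

    // Decodes the bytes one-to-one into characters.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Made-up stand-in for what a page content stream might contain
        byte[] pageContent = "BT (Hello) Tj ET".getBytes(StandardCharsets.ISO_8859_1);

        // Prints an identity string starting with "[B@", not the content
        System.out.println(pageContent.toString());

        // Prints the decoded characters
        System.out.println(decode(pageContent)); // BT (Hello) Tj ET
    }
}
```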

    Try this with PDFBox; it should do what you are asking for.
        // Needs java.io.* plus PDDocument and PDFTextStripper from PDFBox
        // (their package names depend on the PDFBox version).
        public void getPdfText(String fileName) throws IOException {
            StringWriter sw = new StringWriter();
            PDDocument doc = null;
            try {
                doc = PDDocument.load(fileName);
                PDFTextStripper stripper = new PDFTextStripper();
                stripper.setStartPage(1);
                stripper.setEndPage(Integer.MAX_VALUE);
                // Extract the text of all pages into the StringWriter
                stripper.writeText(doc, sw);
                // Write the extracted text to a file as UTF-8
                OutputStream out = new FileOutputStream(new File("d://PDFText.txt"));
                PrintStream write = new PrintStream(out, true, "UTF-8");
                write.print(sw.toString());
                write.close();
                //System.out.println(sw.toString());
            } finally {
                if (doc != null) {
                    doc.close();
                }
            }
        }
    Can..Can...If we Try...!

  • Query on processing a PDF file using Java mapping

    Hi All,
    I am trying to process XML and PDF files using Java mapping. It works for XML, but I am unable to do it for PDF.
    Below is the code I am using. Can anyone guide me on how to process PDFs?
    byte byte1 = 0;
    java.io.ByteArrayOutputStream bos = (ByteArrayOutputStream)outputstream;
    while((byte1=(byte)inputstream.read())!=-1){
        bos.write(byte1);
    }
    bos.close();
    Thank You,
    Madhav
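A possible reason the loop above works for XML but not for PDF: the result of `read()` is cast to `byte` before it is compared with -1, so a data byte of 0xFF (which becomes -1 as a `byte`) ends the loop early. Text files rarely contain that value, but binary files such as PDFs often do. Here is a minimal sketch of a copy loop that keeps the `int` return value; the class and method names are made up, and it does not address anything specific to the mapping framework.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch: copying binary data without truncating at 0xFF bytes.
final class BinaryCopy {

    // Copies everything from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        // read(byte[]) returns the count as an int; -1 only signals end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Sample data containing 0xFF, which would stop the byte-cast loop early
        byte[] data = {0x25, 0x50, (byte) 0xFF, 0x44, 0x46};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        System.out.println(copied + " bytes copied"); // 5 bytes copied
    }
}
```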

    Hi Madhav,
    I think instead of going with JAVA mapping you can write a custom adapter module for it.
    Ref:  /people/sap.user72/blog/2005/07/31/xi-read-data-from-pdf-file-in-sender-adapter
    Also check : Re: PI 7.1 : Taking a input PDF file and mapping it to a hexBinary attribute
    /people/shabarish.vijayakumar/blog/2009/05/17/trouble-writing-out-a-pdf-in-xipi
    Thanks,
    Edited by: Hareenkumar on Dec 21, 2010 11:12 AM

  • How to print PDF files using java print API

    Hi,
    I have gone through a lot of discussions and read a lot of forums about printing PDF files using the Java API, but nothing seems to work for me. Can anyone tell me how to print PDF files using the Java API?
    Thanks in advance

    Mike,
    Can't seem to get hold of the example described in your reply below. If you could let us have the URL to get then it would be great.
    My GUI application creates a pdf document which I need to print. I want to achieve this using the standard Java class PrinterJob (no 3rd party APIs I'm afraid, commercial restraints etc ..). I had a stab at it using the following code. When executed I get the pretty printer dialog then when I click ok to print, nothing happens!
    boolean showPrintDialog=true;
    PrinterJob printJob = PrinterJob.getPrinterJob ();
    printJob.setJobName ("Contract.pdf");
    try {
        if (showPrintDialog) {
            if (printJob.printDialog()) {
                printJob.print();
            }
        }
        else
            printJob.print ();
    } catch (Exception PrintException) {
        PrintException.printStackTrace();
    }
    Thank you and a happy new year.
    Cheers,
    Chris
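A likely reason nothing comes out in the snippet above: the `PrinterJob` is never given a `Printable` (or `Pageable`), so it has no pages to render, and setting the job name to "Contract.pdf" does not load that file. The standard library has no PDF renderer, so with `PrinterJob` alone the application has to draw the page content itself. Here is a minimal sketch of that wiring; the `OnePagePrintable` class and its text are hypothetical stand-ins for whatever draws the contract.

```java
import java.awt.Graphics;
import java.awt.print.PageFormat;
import java.awt.print.Printable;
import java.awt.print.PrinterException;
import java.awt.print.PrinterJob;

// Hypothetical one-page renderer: draws a single line of text on page 0.
final class OnePagePrintable implements Printable {

    private final String text;

    OnePagePrintable(String text) {
        this.text = text;
    }

    @Override
    public int print(Graphics g, PageFormat pf, int pageIndex) {
        if (pageIndex > 0) {
            return NO_SUCH_PAGE; // tells the job there are no more pages
        }
        // Draw inside the printable area of the page
        g.drawString(text, (int) pf.getImageableX() + 20, (int) pf.getImageableY() + 40);
        return PAGE_EXISTS;
    }

    public static void main(String[] args) throws PrinterException {
        PrinterJob job = PrinterJob.getPrinterJob();
        job.setJobName("Contract.pdf");
        // Without this call the job has nothing to print
        job.setPrintable(new OnePagePrintable("Contract text drawn by the application"));
        if (job.printDialog()) {
            job.print();
        }
    }
}
```

The print system calls `print` once or more per page index and stops when it receives `NO_SUCH_PAGE`.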

  • Printing PDF files using java

    Is there a class or utility that already does this?
    Thx in advance.

    Do you want to create PDF files using Java? If so, there is a library available at http://www.lowagie.com/iText/docs.html. Check out that site. There are many similar PDF generator tools that you can use.

  • How to read pdf file using file adapter

    Hi..
        How to read pdf file using file adapter?
    regards
    Arun

    Hi
    This may help you
    /people/sap.user72/blog/2005/07/27/xi-generate-pdf-file-out-of-file-adapter
    /people/alessandro.guarneri/blog/2007/02/21/sap-xi-acting-as-a-huge-file-mover
    ---Ram

  • I am unable to read pdf files using Acrobat Reader on my MAC OS X. Any suggestions?

    How do I read pdf files using my MAC OS X?

    You can read PDF files with Preview. Select a PDF document, right-click it, and choose Preview. You can also open Get Info and set Preview as the default application for all PDF files.
    You can also install Adobe Reader 10.1.3 for Lion. It will also install a plug-in to read pdf docs in the browser.
    http://www.adobe.com/support/downloads/detail.jsp?ftpID=5360

  • How to enable commenting into pdf files using java?

    Hi All,
    Is there any way to enable commenting in PDF files through Java? I have Adobe Reader 9 and want to put some comments into a PDF file, but the reader does not allow placing a comment until commenting has been enabled in the PDF. Only after comments are enabled can we place comments and use the PDF file with comments.
    Is there any way to enable comments in a PDF file for use in Acrobat Reader?
    Thanks in advance.

    The end users of the web application have Acrobat installed on their machines only for the purpose of enabling comments in PDFs. If my application could enable comments itself, Acrobat would no longer be needed on those machines, which has been requested as a cost-cutting measure.
    For this purpose, I need to know how to enable comments in a PDF through a Java API.
    I have used the iText Java API for other PDF manipulation, but it does not have a feature that meets this requirement. Can anybody suggest a relevant Java API for this task?
    Thanks in advance.

  • Read Text file using Java Script

    Hi,
    I am trying to read a text file using JavaScript from an .HTML file within the webroot of MII. I have provided the path as below, but I am not able to open the file. Is there any clue on how to give a relative path, or are any changes required to the path below?
    var FileOpener = new ActiveXObject("Scripting.FileSystemObject");
    var FilePointer = FileOpener.OpenTextFile("E:\\usr\\sap\\MID\\J00\\j2ee\\cluster\\apps\\sap.com\\xapps~xmii~ear\\servlet_jsp\\XMII\\root\\CM\\OCTAL\\TestTV\\Test.txt", 1, true);
    FileContents = FilePointer.ReadAll(); // we can use FilePointer.ReadAll() to read all the lines
    The Error Log shows as :
    Path not found
    Regards,
    Mohamed

    Hi Mohamed,
    I tried above code after importing JQuery Library through script Tag. It worked for me . Pls check.
    Note : You can place Jquery1.xx.xx.js file in the same folder where you saved this IRPT/HTML file.
    <HTML>
    <HEAD>
        <TITLE>Your Title Here</TITLE>
        <SCRIPT type="text/javascript" src="jquery-1.9.1.js"></SCRIPT>
        <script language="javascript">
    function Read() {
        $.get( "http://ldcimfb.wdf.sap.corp:50100/XMII/CM/Regression_15.0/CrossTab.txt", function( data ) {
          $(".result").html(data);
          alert(data);
          // The file content is available in this variable "data"
        });
    }
    </script>
    </HEAD>
    <BODY onLoad="Read()">
    </BODY>
    </HTML>

  • Can I create MS-Word files or PDF files using java

    We are developing a web-based application for banks, and at some point we need to provide reports that the banks can download and print. I do not want to produce simple .txt files; I want to use a more sophisticated file format that looks nice.
    So is it possible for the server to create files in, say, .doc or PDF format, which the banks can download and print?
    I am using Java servlets and JSP as the front end, Java classes in the middle tier, and Oracle as my database.

    Please try this code:
    import god.java.pdf.*;
    PDFWriter pw=new PDFWriter(new OutputFile(new String("sample.pdf")));
    pw.print("This is your text.");
    pw.close();

  • Error  while  reading pdf file using adobe reader 8

    Hi ,
    I am using iTextSharp to create a PDF file that contains 300 pages.
    When I open that PDF file in Adobe Reader 5, there is no issue; it opens correctly. But when I open it in Adobe Reader 8, it opens the file but I can read only up to 156 of the 300 pages. I can't read beyond that. It displays the following error messages: "An error exists on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF document to correct the problem" and "could not find the XObject named 'Xf2'."
    Please help me to resolve the issue.
    Thanks,
    Tamil.

  • How to read SGML files using Java

    I've got a text categorisation test collection called Reuters-21578 for my Information Retrieval project. It is distributed in 22 files. Each of the first 21 files (reut2-000.sgm through reut2-020.sgm) contains 1000 documents, while the last (reut2-021.sgm) contains 578 documents. The files are in SGML format. Each of the 22 files begins with a document type declaration line:
    <!DOCTYPE lewis SYSTEM "lewis.dtd"> The DTD file lewis.dtd is included in the distribution. Following the document type declaration line are individual Reuters articles marked up with SGML tags.
    My question is how to write a Java program that reads those 21578 documents or transforms them into 21578 separate text files.

    I guess I missed something. What is Renes link? The parser stuff isn't really what I'm looking for. I'm new at this and just learning Java, and I just want to know the easiest way to read an SGML file. Should I use a BufferedReader with a PushbackInputStream?

    Hang on... you want to just read the file without intelligently extracting the SGML data contained within, and so have no need of a parser?
    Well, in that case, it's just text, so just use BufferedReader or whatever to read the text data. If I understand you correctly, all you really wanted to ask was "how do I read a text file?"
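Treating the files as plain text, as suggested above, the splitting step could be sketched like this. It assumes that every article ends with a closing `</REUTERS>` tag; the thread does not name the tag, so check it against the actual files. The class and method names are made up, and a real SGML parser would be more robust.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: split one SGML file into its articles, line by line.
final class SgmlSplitter {

    // Assumed closing tag of one article; adjust if the files use another element.
    private static final String END_TAG = "</REUTERS>";

    static List<String> split(Reader source) throws IOException {
        List<String> documents = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("<!DOCTYPE")) {
                    continue; // skip the document type declaration line
                }
                current.append(line).append('\n');
                if (line.contains(END_TAG)) {
                    // The article is complete: store it and start a new one
                    documents.add(current.toString());
                    current.setLength(0);
                }
            }
        }
        return documents;
    }

    public static void main(String[] args) throws IOException {
        // Tiny made-up sample in the shape described in the question
        String sample = "<!DOCTYPE lewis SYSTEM \"lewis.dtd\">\n"
                + "<REUTERS NEWID=\"1\">\n<BODY>first</BODY>\n</REUTERS>\n"
                + "<REUTERS NEWID=\"2\">\n<BODY>second</BODY>\n</REUTERS>\n";
        System.out.println(split(new StringReader(sample)).size()); // 2
    }
}
```

For the real collection, pass a `FileReader` for each .sgm file and write each returned string to its own text file.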

  • Read text file using Java(streamTokenizer)

    Hi, all,
    I am lost when trying to read data from a text file into a Java program. The text file looks like the following:
    106,62,2322,8159,1
    106,62,3658,8333,1
    106,62,4215,8334,2
    The numbers are separated by "," and each line represents one row of data. I was thinking about using StreamTokenizer to read the data into a multi-dimensional array. Since I am new to Java and have only read a little about StreamTokenizer in a book, I would like some help from someone more experienced with it.
    Thanks for your help!
    Kevin

    Hi Kevin,
    try this:
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.StringTokenizer;

    public class Answer {
        public static void main(String[] args) {
            List<List<Integer>> data = new ArrayList<>();
            try {
                BufferedReader in = new BufferedReader(new FileReader("...your text file ..."));
                String line;
                // reading the file line by line
                while ((line = in.readLine()) != null) {
                    // splitting the line into tokens
                    StringTokenizer st = new StringTokenizer(line, ",");
                    List<Integer> row = new ArrayList<>();
                    while (st.hasMoreTokens()) {
                        row.add(Integer.valueOf(st.nextToken()));
                    }
                    data.add(row); // adding the row of data
                }
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
            // test result
            System.out.println(data);
        }
    }

    I don't like to use arrays because, when I start reading the file, I don't yet know how many rows of data it contains. A java.util.List is therefore much more convenient (you don't have to fix its size up front). The result is a java.util.List whose elements are java.util.List objects containing Integer elements.
    Harri
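Since the question asked about StreamTokenizer specifically, here is a sketch of the same parsing done with it, using only the standard library. Commas are declared as whitespace so they act as separators, and line ends are reported as tokens so rows can be told apart. The class and method names are made up for illustration, and values are assumed to be whole numbers as in the sample.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: parse comma-separated integers with StreamTokenizer.
final class StreamTokenizerCsv {

    static List<List<Integer>> parse(Reader reader) throws IOException {
        StreamTokenizer st = new StreamTokenizer(reader);
        st.whitespaceChars(',', ',');  // treat commas as separators
        st.eolIsSignificant(true);     // report line ends as TT_EOL tokens

        List<List<Integer>> rows = new ArrayList<>();
        List<Integer> row = new ArrayList<>();
        int token;
        while ((token = st.nextToken()) != StreamTokenizer.TT_EOF) {
            if (token == StreamTokenizer.TT_NUMBER) {
                row.add((int) st.nval); // nval is a double; the data holds integers
            } else if (token == StreamTokenizer.TT_EOL && !row.isEmpty()) {
                rows.add(row);
                row = new ArrayList<>();
            }
        }
        if (!row.isEmpty()) {
            rows.add(row); // last line without a trailing newline
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String sample = "106,62,2322,8159,1\n106,62,3658,8333,1\n106,62,4215,8334,2\n";
        System.out.println(parse(new StringReader(sample)));
        // [[106, 62, 2322, 8159, 1], [106, 62, 3658, 8333, 1], [106, 62, 4215, 8334, 2]]
    }
}
```

To read from a file, pass a `BufferedReader` wrapped around a `FileReader` instead of the `StringReader`.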
