Working with .pdf files and JAVA

Hi,
does anyone know where I can find more information on working with .pdf files?
I would like to convert .pdf files to text files and/or XML files. I cannot find this in the J2SE edition, and someone told me it can be found in the J2EE edition, but I cannot find anything there either. Please help.
thanks,
R.

Thanks for your reply. Which tools do you mean? I know lots of tools for converting text to a .pdf file, but no tools for the other direction. There is a commercial API available that lets you work with PDF in Java, but I am interested in the other possibilities.
Regards
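
For the "text to XML" half of the question, a minimal sketch using only classes that ship with the JDK may help. It assumes the page text has already been extracted by some PDF library (that step is not shown), and the class name `PageTextToXml` and the `<document>`/`<page>` element names are made up for illustration.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: wraps already-extracted page text in a simple XML document.
class PageTextToXml {

    /** Builds <document><page number="N">text</page>...</document> from per-page text. */
    static String toXml(List<String> pages) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = dom.createElement("document");
            dom.appendChild(root);
            for (int i = 0; i < pages.size(); i++) {
                Element page = dom.createElement("page");
                page.setAttribute("number", String.valueOf(i + 1));
                // The serializer escapes characters such as '<' and '&' on output.
                page.setTextContent(pages.get(i));
                root.appendChild(page);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Parser and transformer configuration errors are not expected here.
            throw new IllegalStateException("Could not build XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(Arrays.asList("First page, a < b", "Second page")));
    }
}
```

Building the XML through the DOM API rather than by string concatenation means the escaping of special characters in the extracted text is handled by the serializer.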

Similar Messages

  • Quicklook does not work with WMV files and quick look no longer maintains resized views when viewing from a folder using the up/down arrows

    Quicklook does not work with WMV files and quick look no longer maintains resized views when viewing from a folder using the up/down arrows. Any fixes?

    Same problem here...

  • Creating DVD with PDF files and web links

    Hi all, first I'd like to say that these forums are a big help. I've spent DAYS scouring through topics learning. Of course, I know this opens it up for someone to post a link to a thread where my question has already been answered. Unfortunately, I haven't been able to find the specific help I need and would like to open a dialogue with experts.
    I am creating a marketing DVD for a product. We produced a video for it, but the client also wants the audience to have access to a large amount of research in this specific field. This exists as PDF files and links to websites.
    His previous marketing CD was just that: a CD made with FileMaker that held the files and links and only worked on PCs. I do not want to go back in that direction.
    I want to make an informative DVD with the video and a few pages of selling points and cool tricks (I discovered multilayered menus working on this!) for those viewing on TV or Computer, and then an option for Computer users to click for more info.
    How do I put PDF files on the disc and how do I put web links on there?
    Thanks,
    Byron

    DVDSP uses a tool called DVD@Access. It lets a user link to URLs and open documents such as PDFs. The problem is that it never was reliable, especially on the PC side of things.
    There have been many posts in the last 2 days about its use. Do a search and you'll see.
    DVD was not designed with the web in mind; it was conceived long before that time. Linking to "outside" documents requires a third-party tool to take over.
    Just beware! In fact, if your client came to me I would refuse to do the job. I've seen the problems that exist doing work like this, especially if you're distributing to a large audience with different OS setups. If one of the users has Vista you can forget about it working at all.
    My suggestion would be to design a menu that tells the user the file paths to your pdfs or URLs on the disc.

  • Working with pdf files in swing applications

    Hi,
    I have a Swing application which displays a PDF file and contains a text box. I want to display the current page number of the PDF file in the text box.
    Can anyone please guide me on how to implement this functionality?
    Regards,
    Tommy
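
    One possible approach, sketched under assumptions: the post does not say which PDF viewer component is in use, so the sketch below keeps the page state in a small hypothetical `PageTracker` class and pushes a label string to any listener. The class name, method names, and the "3 / 5" label format are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model: tracks the current page and notifies listeners with a label.
class PageTracker {
    private final int pageCount;
    private int currentPage = 1;
    private final List<Consumer<String>> listeners = new ArrayList<>();

    PageTracker(int pageCount) {
        if (pageCount < 1) {
            throw new IllegalArgumentException("pageCount must be >= 1");
        }
        this.pageCount = pageCount;
    }

    /** Registers a listener and immediately sends it the current label. */
    void addListener(Consumer<String> listener) {
        listeners.add(listener);
        listener.accept(label());
    }

    /** Moves to the given page, clamped to the valid range, and notifies listeners. */
    void goTo(int page) {
        currentPage = Math.max(1, Math.min(pageCount, page));
        String text = label();
        for (Consumer<String> listener : listeners) {
            listener.accept(text);
        }
    }

    String label() {
        return currentPage + " / " + pageCount;
    }
}
```

    In a Swing application the text box could be registered with `tracker.addListener(pageField::setText)`, where `pageField` is a `JTextField`, and `goTo` would be called from whatever page-change callback the viewer component offers, on the event dispatch thread.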

  • Change in behavior when working with PDF files in illustrator CC and CC2014. HELP IS NEEDED!

    Make a new CC file. Save it in CC as a PDF. Open the same PDF file in CC 2014, make a change to the file, and save it. Open the same file in CC again. Now a dialog box is displayed: "This file is made in a newer version of Illustrator!" This new behavior is totally stopping our entire production! What to do? NEED HELP ASAP
    Cheers
    Jesper G

    How can I down-save a PDF file in CC 2014?
    This is very unfortunate, because we use some VB scripts together with Illustrator. That process has now stopped because of this message!
    I don't know how I can solve this issue!

  • Full-Text search is not working with PDF files - SQL Server 2012 64 bit

    Hi,
    We are in the process of storing PDF files in SQL Server 2012 with Full-Text search capability.
    I followed the steps below, and it works fine with Word documents but not with PDF files. I tried PDF iFilter 11 and 9, and both were unsuccessful.
    Server/DB level settings:
    1) Enable FileStream.
    2) Install Full-Text, then restart.
    3) Use [specific db]
       alter database [db name] add filegroup Files contains filestream;
       alter database [db name] add file (name = N'Files', filename = N'D:\SQL\DATA') to filegroup [Files];
    3) Database level settings:
       FileStream:
       FileStream Directory name: [Set the name]
       FileStream non-transacted Access: [set Appropriate]
    3a) Add a datafile to the DB with the filestream data filetype.
    4) Share the D:\SQL\DATA directory and add specific accounts with read/write access.
    5) Give bulkadmin access to those specific accounts at server level.
    6) From the page (link), download and install the *.pdf IFilter for FTS. Link:
       http://www.adobe.com/support/downloads/detail.jsp?ftpID=5542
    7) To the PATH global system variable, add the path to the catalog where you installed the plugin. The default for this version is:
       C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin
    8) From the page (link), download FilterPackx64.exe and install it. Link:
       http://www.microsoft.com/en-us/download/confirmation.aspx?id=20109
    9) Now from SSMS execute the following procedures:
       EXEC sp_fulltext_service 'load_os_resources', 1;
       EXEC sp_fulltext_service 'verify_signature', 0;
       EXEC sp_fulltext_service 'update_languages'; -- update language list
       EXEC sp_fulltext_service 'restart_all_fdhosts'; -- restart daemon
       reconfigure with override;
    10) Restart the server.
    11) select document_type, path from sys.fulltext_document_types where document_type = '.pdf'
        select document_type, path from sys.fulltext_document_types where document_type = '.docx'
    12) Results are OK.
    Following is my table/index/catalog script:
    CREATE TABLE dbo.DocumentFilesTest
    (
        DocumentId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        AddDate datetime NOT NULL,
        Name nvarchar(50) NOT NULL,
        Extension nvarchar(10) NOT NULL,
        Description nvarchar(1000) NULL,
        FileStream_Id UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWSEQUENTIALID(),
        FileSource varbinary(MAX) FILESTREAM DEFAULT(0x)
    )
    go
    --Add default add date for document
    ALTER TABLE dbo.DocumentFilesTest
    ADD CONSTRAINT DF_DocumentFilesTest_AddDate DEFAULT sysdatetime() FOR AddDate
    EXEC sp_fulltext_database 'enable'
    GO
    IF NOT EXISTS (SELECT TOP 1 1 FROM sys.fulltext_catalogs WHERE name = 'Ducuments_Catalog_test')
    BEGIN
        EXEC sp_fulltext_catalog 'Ducuments_Catalog_test', 'create', 'D:\SQL\PDFBlob';
    END
    --EXEC sp_fulltext_catalog 'Ducuments_Catalog_test', 'drop'
    DECLARE @indexName nvarchar(255) = (SELECT Top 1 i.Name from sys.indexes i
        Join sys.tables t on i.object_id = t.object_id
        WHERE t.Name = 'DocumentFilesTest' AND i.type_desc = 'CLUSTERED')
    PRINT @indexName
    EXEC sp_fulltext_table 'DocumentFilesTest', 'create', 'Ducuments_Catalog_test', @indexName
    EXEC sp_fulltext_column 'DocumentFilesTest', 'FileSource', 'add', 0, 'Extension'
    EXEC sp_fulltext_table 'DocumentFilesTest', 'activate'
    EXEC sp_fulltext_catalog 'Ducuments_Catalog_test', 'start_full'
    ALTER FULLTEXT INDEX ON [dbo].[DocumentFilesTest] ENABLE
    ALTER FULLTEXT INDEX ON [dbo].[DocumentFilesTest] SET CHANGE_TRACKING = AUTO
    ALTER FULLTEXT CATALOG Ducuments_Catalog_test REBUILD WITH ACCENT_SENSITIVITY=OFF;
    INSERT INTO DocumentFilesTest(Extension, Name, FileSource)
    SELECT 'pdf', 'BOL12006553.pdf', * FROM OPENROWSET(BULK 'd:\SQL\PDFBlob\BOL12006553.pdf', SINGLE_BLOB) AS BLOB;
    GO
    INSERT INTO DocumentFilesTest(Extension, Name, FileSource)
    SELECT 'docx', 'test.docx', * FROM OPENROWSET(BULK 'd:\SQL\PDFBlob\test.docx', SINGLE_BLOB) AS Document;
    GO
    SELECT d.* FROM dbo.DocumentFilesTest d WHERE Contains(d.FileSource, 'BILL')
    Returns nothing. It should return the row for the PDF file.
    SELECT d.* FROM dbo.DocumentFilesTest d WHERE Contains(d.FileSource, 'TEST')
    Returns the row for the Word document, as follows:
    2    2014-06-04 10:11:41.393    test.docx    docx    NULL    [Binary Value]    [Binary Value]
    Any help is appreciated. It's been a long wait.
    Thanks,
    Vel
    Vel Thavasi

    Hello,
    Did you check the full-text log files for more details about the errors? If the filter isn't working, there should be errors in the error log file.
    The following thread is about similar issue, please refer to:
    http://social.msdn.microsoft.com/forums/sqlserver/en-US/69535dbc-c7ef-402d-a347-d3d3e4860d72/sql-server-2008-64bit-fulltext-indexing-pdf-not-working-cant-find-ifilter
    Regards,
    Fanny Liu
    TechNet Community Support

  • File, Place only works with PDF files...why?

    I create documents in Mac Pages that I then want to turn into an interactive PDF (mainly navigation). I am using the demo copy of InDesign to see if it fits the bill.
    The Mac Pages document is fully formatted and ready for export to a static PDF. As a test, I took a few pages of it and exported them to PDF, Word and RTF.
    The only file format that InDesign would import/place is PDF (Pages, Word and RTF were all grayed out and could not be selected via ID Place).
    I had hoped that ID would import/place Pages files directly, but I cannot seem to get it to import any format other than PDF.
    I tried some other .doc files (actually created with Word) and they were selectable, but only the table of contents was imported (no red arrow at the lower right of the text box to continue placing).
    Any suggestions?
    thanks
    bob

    ID is certainly capable of placing RTF as well as native Word files (DOC and DOCX). What you're seeing is quite unusual.
    Try trashing your preferences.
    For the other DOC files, you need to hold down the Shift key when you click the page to place them.
    Bob

  • Working with PDF files

    Hello, we would like to write some functionality that generates PDF files from our Java application and additionally, some functionality that reads them into the app also. What is the best API to use for this? Would it be iText?

    Aha, I'll just show my code and say nothing more.
    1) JACOB, for extracting text from PDF, Word and Excel.
    JACOB is a bridge that connects Java with COM or Win32 functions. It needs a DLL, but the author of JACOB provides it.
    JACOB: http://www.matrix.org.cn/down_view.asp?id=13
    Put the DLL on the PATH and the JAR file on the classpath.
    import java.io.File;
    import com.jacob.com.*;
    import com.jacob.activeX.*;
    public class FileExtracter {
        public static void main(String[] args) {
            // Start Word through COM; Word must be installed on the machine.
            ActiveXComponent app = new ActiveXComponent("Word.Application");
            String inFile = "c:\\test.doc";
            String tpFile = "c:\\temp.htm";
            String otFile = "c:\\temp.xml";
            boolean flag = false;
            try {
                app.setProperty("Visible", new Variant(false));
                Dispatch docs = app.getProperty("Documents").toDispatch();
                // Open the input file (ConfirmConversions = false, ReadOnly = true).
                Dispatch doc = Dispatch.invoke(docs, "Open", Dispatch.Method, new Object[]{inFile, new Variant(false), new Variant(true)}, new int[1]).toDispatch();
                // Save a copy using file format 8, which Word writes as HTML.
                Dispatch.invoke(doc, "SaveAs", Dispatch.Method, new Object[]{tpFile, new Variant(8)}, new int[1]);
                Variant f = new Variant(false);
                Dispatch.call(doc, "Close", f);
                flag = true;
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                app.invoke("Quit", new Variant[] {});
            }
        }
    }
    2) Apache POI, for extracting text from Word and Excel.
    POI package: http://www.matrix.org.cn/down_view.asp?id=14
    Put it on the classpath.
    import java.io.*;
    import org.textmining.text.extraction.WordExtractor;
    /**
     * <p>Title: pdf extraction</p>
     * <p>Description: email:[email protected]</p>
     * <p>Copyright: Matrix Copyright (c) 2003</p>
     * <p>Company: Matrix.org.cn</p>
     * @author chris
     * @version 1.0, anyone who uses this example please keep this declaration
     */
    public class PdfExtractor {
        public PdfExtractor() {
        }
        public static void main(String args[]) throws Exception {
            // Read the Word file and print the extracted text.
            FileInputStream in = new FileInputStream("c:\\a.doc");
            WordExtractor extractor = new WordExtractor();
            String str = extractor.extractText(in);
            System.out.println("the result length is" + str.length());
            System.out.println("the result is" + str);
        }
    }
    3) PDFBox, for extracting text from PDF.
    http://www.matrix.org.cn/down_view.asp?id=12
    import org.pdfbox.pdmodel.PDDocument;
    import org.pdfbox.pdfparser.PDFParser;
    import java.io.*;
    import org.pdfbox.util.PDFTextStripper;
    import java.util.Date;
    /**
     * <p>Title: pdf extraction</p>
     * <p>Description: email:[email protected]</p>
     * <p>Copyright: Matrix Copyright (c) 2003</p>
     * <p>Company: Matrix.org.cn</p>
     * @author chris
     * @version 1.0, anyone who uses this example please keep this declaration
     */
    public class PdfExtracter {
        public PdfExtracter() {
        }
        public String GetTextFromPdf(String filename) throws Exception {
            String temp = null;
            PDDocument pdfdocument = null;
            FileInputStream is = new FileInputStream(filename);
            try {
                // Parse the file into a PDDocument.
                PDFParser parser = new PDFParser(is);
                parser.parse();
                pdfdocument = parser.getPDDocument();
                // Write the extracted text into an in-memory buffer.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                OutputStreamWriter writer = new OutputStreamWriter(out);
                PDFTextStripper stripper = new PDFTextStripper();
                stripper.writeText(pdfdocument.getDocument(), writer);
                writer.close();
                byte[] contents = out.toByteArray();
                String ts = new String(contents);
                System.out.println("the string length is" + contents.length + "\n");
                return ts;
            } finally {
                // Release the document and the input stream even if parsing fails.
                if (pdfdocument != null) pdfdocument.close();
                is.close();
            }
        }
        public static void main(String args[]) {
            PdfExtracter pf = new PdfExtracter();
            PDDocument pdfDocument = null;
            try {
                String ts = pf.GetTextFromPdf("c:\\a.pdf");
                System.out.println(ts);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
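
    A side note on the PDFBox example: it collects the text through a ByteArrayOutputStream, an OutputStreamWriter and new String(bytes), all of which rely on the platform default charset. Since the stripper writes to a plain Writer, a StringWriter can probably be passed instead. The sketch below shows the idea with JDK classes only; `TextSource` is a hypothetical stand-in for any API that writes extracted text to a Writer, not part of PDFBox.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class WriterCapture {

    /** Hypothetical stand-in for an extractor that writes its text to a Writer. */
    interface TextSource {
        void writeText(Writer out) throws IOException;
    }

    /** Captures the written text directly as characters, with no byte round trip. */
    static String capture(TextSource source) {
        StringWriter buffer = new StringWriter();
        try {
            source.writeText(buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // A lambda plays the role of the extractor here.
        String text = capture(out -> out.write("Gr\u00fc\u00dfe from page 1"));
        System.out.println(text.length());
    }
}
```

    Because no bytes are involved, accented or non-Latin characters in the extracted text cannot be damaged by a charset mismatch between writing and reading.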

  • Issues with .pdf files and Firefox Android Beta

    I read the local paper online and the individual pages present themselves as .pdf files. With the NON-beta Firefox for Android, clicking on said file downloads it and automatically calls up the default .pdf reader, which opens the page. Some recent change in the Beta version simply downloads the file. I then have to manually click to open it. I tried clearing defaults on both Firefox and the Kindle app and force-stopping both, all to no avail. Suggestions?
    Bob

    the change was made in firefox 25 - that's why you're currently seeing this behaviour only in firefox beta. the general release for firefox 25 is scheduled for next week.
    i originally also thought to recommend the pdf.js-addon as an alternative but i found it to be very slow on my android too. unfortunately i don't know of any other solution currently.

  • Best way to work with FCP files and Adobe Premiere

    Hi,
    I have a bit of a technical problem. I'm editing a doco for broadcast on TV and have shot on HDV and downloaded onto our system as PAL, Apple ProRes 422, 25 fps. However, our graphics editor is on Adobe Premiere CS4.
    What is the best way to get it onto her system and back off again and onto ours, with minimal or no loss of quality?
    I've heard that an XML might work, but will it work on Adobe CS4? I don't know how this will pan out if it's XML. Will we be able to open it all back up in FCP when she hands it back. What sort of file do we get back from the Adobe Prem Pro Cs4 system?
    I'm a bit nervous about it all.
    Hope you can help.
    Cheers,
    Margie

    This got me curious, Eagleray, as I do this all the time from FCS to CS on my Mac. But going to a PC got me googling. Hope this discussion http://www.animotion.nl/en/tutorials/fcp-xml-premiere-pro-cs5/ helps, as quite a few students seem to have Macs at school and PCs at home. Also be aware of the different formatting of some PC drives, like FAT16, FAT32 and NTFS, vs the Mac GUID. The Mac can read them all with the proper system add-ons/drivers; they show up in System Preferences under "OTHER". I use NTFS-3G to mount NTFS PC drives. I think it was free; if not, there is another one I used to use that is free from some open source NTFS group. FAT has a size limit of, I think, 2 or 4 GB (you can't copy anything bigger than that at one time to a Mac from a FAT drive). I'm on Snow Leopard on my 8-core. My partner, who edits in PP CS3 on a PC, brought over his Maxtor USB drive the other day for me to copy some video files from, and it's been having problems mounting, I believe since I upgraded to SL. I threw it on my 6-year-old PowerBook G4 running 10.5.8 and it mounted right away. Just another weird gotcha. But take a quick look: though the title says CS5, a few students were using CS4 on a PC.
    Message was edited by: TimeKoder13

  • Working with RAW Files and JPG

    I have a high-end digital camera that takes pictures in RAW format as well as JPG format. When I import the photos, it only imports the JPG format. Is there a way to import RAW photos into iPhoto?

    It does (and should by default). However after it stores them in the ORIGINALS folder, it creates a .jpg copy in the MODIFIED folder and uses that by the looks of things. (Use Spotlight or Finder to see if you've actually imported any raw's).
    You should see the RAW's displayed in the iPhoto window after import. If not, check the connection settings in your camera menu ... and make sure you are using a program or creative mode (my 20D does not take raw's in the preset or Green mode).
    To work on RAW's you'd need to go Adobe Photoshop with Adobe Camera Raw (incl.in CS2) or Apple Aperture.... now significantly cheaper.

  • Adobe reader not associating with pdf files

    I've installed Adobe Reader X after reinstalling Windows, but Reader is not associating with PDF files and is also not showing in the "Open with..." list. Even after browsing and trying to associate it manually, it's not visible in the recommended programs list or in the other programs list. To open PDF files I have to open Adobe Reader and then open the files.

    Please use the "Adobe Reader and Acrobat Cleaner Tool" from http://labs.adobe.com/downloads/acrobatcleaner.html and remove any traces of Adobe Reader already on your system.
    Post successfully removing the application, please re-install Adobe Reader from: http://get.adobe.com/reader
    Hope this helps
    Ankit

  • How to open a pdf file and then attach it with images

    I am new to Indesign Server.
    I'm currently working on a pdf.
    I have a white blank pdf template.
    that I want to attach/glue it with images.
    How to open a pdf file and then attach it with images.
    Please, help me.
    Thanks.

    First step would be to make yourself familiar with InDesign desktop version.
    Whatever you intend to achieve, do it there manually. (see regular app docs or forums)
    Then try to automate your steps with scripting (see scripting docs or forum)
    If you can do it with a script in the desktop version, that script will likely also run in ID Server. (see server forum).
    If you can specify missing features not achievable thru scripting or manual use, reconsider to write a plugin (this forum).
    A seasoned C++ programmer will need a few months to learn the basics, wade thru tons of documentation etc. Alternatively consider to hire a consultant to do the development work for you.
    Dirk

  • I tried opening a pdf file and I set it to use always and firefox which did not work, I was wondering how to undo that and what I should use to open pdf files in the future?

    I was attempting to open a document from a frequently used site, I had never been on, on my new mac however. I attempted to open the pdf file and was given the option of which program to use to open it, I mistakenly clicked use always and firefox as the program with which to open pdf files. I do not know how to undo this or what I should do in the future to access the pdf files.

    Hi Melfour-
    Here is a Support article detailing how to work with your Firefox PDF preferences:
    [[Opening PDF files within Firefox]]
    Hope that helps.

  • Issue with PDF links and opened Adobe files

    I have tested and IE 8 opens correct links inside PDF's.
    When I open link in Mozilla or Chrome it will not work.
    Is this related to Adobe Plugin which is installed inside Add-ons?
    Strange is that link inside PDF shows the following link:resource://pdf.js/web/
    After checking support issues there was quoted in the past:
    The current Firefox 27 beta release still doesn't show the links properly using version 0.8.641, but the current Aurora 28.0a2 build does show the links to the Appendices properly using version 0.8.759.
    So that would mean that when Firefox 28 gets released in March it will work for users with this Firefox version.

    This is likely a problem with the way those links are coded in the PDF file, so the file may have to be saved again using different settings to make it work with PDF Viewers other than the Adobe Reader.

Maybe you are looking for

  • Serial ports in Solaris 10

    Hi all. Please help me to solve my little problem. I need to get the names of the serial ports (COM ports) registered in my system. In Linux I use the command "dmesg | grep ttyS" for that, but in Solaris the dmesg output doesn't contain any data about ttya, ttyb etc...

  • Saving as PDF, from Word documents - not good quality?

    Why is it that when I create letterheads and save the Word document as a PDF, it comes out in such poor quality? Is there a setting to save with high-quality output from Acrobat, Word to PDF, or how about Print to PDF? Is anyone able to help us out?

  • How to build a multi-channel analog output task in Visual C++ 6.0 (without Measurement Studio)?

    Hello!  I have a PCI 6251 card, and I am using the DAQmx C functions to generate a waveform (single channel). But how do I create a multi-channel analog output task with a different frequency on each channel? Thanks.

  • Table view help required

    hi, I've a htmlb tableview which gets the data from an internal table which is a data member of a model. I've two column fields in this view which are editable. What are the steps I need to follow in order to get the internal table updated with the v

  • E-Mail synch issues

    Hi All, I have recently added my office Comcast e-mail account to my iPhone 5.  Everything seems to be working fine except for one issue -  When I open an e-mail on my phone, it is still showing as unread in my Outlook inbox; and vica versa.  I've do