Solutions for converting HTML to PDF programmatically?

To start off, I should say that I am rather new to programming in Java.
Here's what I am attempting to do.
I need to write a Java class that takes an HTML string as input and produces a PDF file (or OutputStream) as output. I have spent the last week or so trying to accomplish this using XSL-FO and the FOP library. This solution does not work well because XSL-FO and FOP do not handle complex table layouts well (they require the number of columns and the column widths to be known in advance). It seems that FOP (and XSL-FO) is better suited to structured XML input than to something as unstructured and complex as HTML.
Are there any other libraries/APIs that are specifically well suited to HTML -> PDF conversion?
Remember, this needs to be done programmatically, and it will probably be invoked as a web service.
Thanks,
Vivek

> #1 There are definite copyright issues with your software. Before you go live with anything like this, make sure you're not gonna get reamed.
Ehh? I didn't see anything from the OP's question that implied this. Yes, if he uses it to mine commercial web sites and convert them to PDF's there's a problem, but aside from that, where's the danger?
> #2 The PDF part is the easy part. As the other poster said, lowagie iText can do PDF. The rendered HTML is a much bigger question. The smaller issue is that web pages are defined to fit your browser window, so you've got to choose a size. The much tougher problem is finding a decent HTML renderer in Java. In truth, I don't think there is one; JEditorPane is a piece of ****, and opera is really not a lot better.
Not at all. The OP specifically mentioned web services, so we don't need to assume that Swing is involved. You can, using a 3rd party library (google for java pdf), have a servlet or jsp render its output as a PDF document.

Similar Messages

  • Jar file for converting html to pdf

    Does anybody have a jar file for converting an HTML document to PDF?

    Are you particular about using a jar file?
    I have developed a form which converts any type of file, especially Word, TXT, and HTML, to PDF. Let me check if I still have it.
    Rajesh Alex

  • Using PD4ML library for converting HTML to PDF

    Hi all,
    I am using the PD4ML library API to convert HTML to PDF.
    What I am sending as HTML is a single HTML table. The problem is that, when the table is too large to fit on a single page, I need to specify the number of lines to put on each PDF page. I can decide the number of lines beforehand, e.g. 31 for portrait and 52 for landscape. But a problem arises when the page size is very large, for example when the page size is A1 and I have specified 31 lines.
    In this case a lot of empty space is left at the end of the page.
    So I actually need to calculate the NUMBER OF LINES for a single page as a FUNCTION of PAGE SIZE and FONT SIZE (font size because it also affects the number of lines that fit on a single page).
    I would be really grateful if you could help quickly.
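A rough way to approach the question above is to divide the usable page height by the height of one line. The sketch below is plain arithmetic and does not use any PD4ML API; the class name, the margin, and the leading factor (line height as a multiple of font size) are assumptions made for illustration. Real table rows are usually taller than a bare text line because of cell padding and borders, so treat the result as an upper bound.

```java
// Rough estimate only. Not PD4ML API; every name here is hypothetical.
final class LinesPerPage {
    // 1 pt = 1/72 inch and 1 inch = 25.4 mm
    static final double POINTS_PER_MM = 72.0 / 25.4;

    /**
     * Estimates how many text lines fit on one page.
     *
     * @param pageHeightMm page height in millimetres (297 for A4 portrait, 841 for A1 portrait)
     * @param marginMm     top and bottom margin in millimetres (assumed equal)
     * @param fontSizePt   font size in points
     * @param leading      line height as a multiple of the font size (assumed, e.g. 1.2)
     */
    static int linesPerPage(double pageHeightMm, double marginMm,
                            double fontSizePt, double leading) {
        double usablePt = (pageHeightMm - 2 * marginMm) * POINTS_PER_MM;
        double lineHeightPt = fontSizePt * leading;
        if (usablePt <= 0 || lineHeightPt <= 0) {
            return 0; // nothing fits, or the input is invalid
        }
        return (int) Math.floor(usablePt / lineHeightPt);
    }

    public static void main(String[] args) {
        System.out.println("A4 portrait: " + linesPerPage(297, 10, 12, 1.2));
        System.out.println("A1 portrait: " + linesPerPage(841, 10, 12, 1.2));
    }
}
```

With a 10 mm margin, a 12 pt font, and a leading of 1.2, this gives 54 lines for A4 portrait and 161 for A1 portrait, so a fixed value such as 31 leaves most of an A1 page empty.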


  • Problem with converting html to pdf using LiveCycle ES Java API

    I am using this code to convert html to pdf.
    * 1. adobe-generatepdf-client.jar
    * 2. adobe-livecycle-client.jar
    * 3. adobe-usermanager-client.jar
    * 4. adobe-utilities.jar
    * 5. wlclient.jar
    import java.io.File;
    import java.util.Properties;
    import com.adobe.idp.Document;
    import com.adobe.idp.dsc.clientsdk.ServiceClientFactory;
    import com.adobe.idp.dsc.clientsdk.ServiceClientFactoryProperties;
    import com.adobe.livecycle.generatepdf.client.GeneratePdfServiceClient;
    import com.adobe.livecycle.generatepdf.client.HtmlToPdfResult;
    public class ConvertHTML {
        public static void main(String[] args) {
            try {
                // Set connection properties required to invoke LiveCycle ES
                Properties connectionProps = new Properties();
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_DEFAULT_EJB_ENDPOINT, "t3://localhost:7001");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_TRANSPORT_PROTOCOL, ServiceClientFactoryProperties.DSC_EJB_PROTOCOL);
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_SERVER_TYPE, "WebLogic");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_USERNAME, "administrator");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_PASSWORD, "password");
                // Create a ServiceClientFactory instance
                ServiceClientFactory factory = ServiceClientFactory.createInstance(connectionProps);
                // Create a GeneratePdfServiceClient object
                GeneratePdfServiceClient pdfGenClient = new GeneratePdfServiceClient(factory);
                // Get an HTML document to convert to a PDF document
                String inputFileName = "http://www.adobe.com";
                //String inputFileName = "C:\\Documents and Settings\\venkat\\Desktop\\Adobe.htm";
                String securitySettings = "No Security";
                String fileTypeSettings = "Standard";
                System.out.println("one"); // debug marker: before the conversion call
                // Convert HTML content to a PDF document
                HtmlToPdfResult result = pdfGenClient.htmlToPDF2(inputFileName, fileTypeSettings, securitySettings, null, null);
                System.out.println("two"); // debug marker: after the conversion call
                // Get the newly created document
                Document createdDocument = result.getCreatedDocument();
                // Save the PDF document as a PDF file
                createdDocument.copyToFile(new File("C:\\test.pdf"));
            } catch (Exception e) {
                System.out.println("Error OCCURRED: " + e.getMessage());
                e.printStackTrace();
            }
        }
    }
    I am able to compile this class, but when I run it I get the error below.
    Error OCCURRED: Internal error.
    ALC-DSC-000-000: com.adobe.idp.dsc.DSCRuntimeException: Internal error.
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.doSend(EjbMessageDispatcher.java:160)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:57)
            at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:208)
            at com.adobe.livecycle.generatepdf.client.GeneratePdfServiceClient.htmlToPDF2(GeneratePdfServiceClient.java:666)
            at ConvertHTML.main(ConvertHTML.java:84)
    Caused by: java.rmi.RemoteException: Remote EJBObject lookup failed for 'ejb/Invocation'; nested exception is:
            org.omg.CORBA.COMM_FAILURE:   vmcid: SUN  minor code: 203  completed: No
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initialise(EjbMessageDispatcher.java:101)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.doSend(EjbMessageDispatcher.java:130)
            ... 4 more
    Caused by: org.omg.CORBA.COMM_FAILURE:   vmcid: SUN  minor code: 203  completed: No
            at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(Unknown Source)
            at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(Unknown Source)
            at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.writeLock(Unknown Source)
            at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendFragment(Unknown Source)
            at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendMessage(Unknown Source)
            at com.sun.corba.se.impl.encoding.CDROutputObject.finishSendingMessage(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaMessageMediatorImpl.finishSendingRequest(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete1(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.invoke(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.is_a(Unknown Source)
            at org.omg.CORBA.portable.ObjectImpl._is_a(Unknown Source)
            at weblogic.corba.j2ee.naming.Utils.narrowContext(Utils.java:126)
            at weblogic.corba.j2ee.naming.InitialContextFactoryImpl.getInitialContext(InitialContextFactoryImpl.java:94)
            at weblogic.corba.j2ee.naming.InitialContextFactoryImpl.getInitialContext(InitialContextFactoryImpl.java:31)
            at weblogic.jndi.WLInitialContextFactory.getInitialContext(WLInitialContextFactory.java:41)
            at javax.naming.spi.NamingManager.getInitialContext(Unknown Source)
            at javax.naming.InitialContext.getDefaultInitCtx(Unknown Source)
            at javax.naming.InitialContext.init(Unknown Source)
            at javax.naming.InitialContext.<init>(Unknown Source)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initJndiContext(EjbMessageDispatcher.java:213)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.getJndiContext(EjbMessageDispatcher.java:226)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initialise(EjbMessageDispatcher.java:87)
            ... 5 more
    Can you please suggest a way to do the conversion?

    Yes sir, thanks for your suggestion.
    But I didn't find an exact solution. I found some, but they did not work the way I require. I just need to convert HTML to PDF using the iText API for Java. I have already used some of its classes, like HTMLParser.
    Anything else? Can anyone help me with this?

  • How to convert html to pdf using acrobat sdk 8.0?

    Hi,
    I am a beginner with the Acrobat SDK.
    I want to know how to use Acrobat SDK 8.0 to convert HTML to PDF.
    Here are some questions:
    1: How can I support navigation inside a PDF file generated with Acrobat SDK 8.0? For example: there's a catalog at the top of the HTML file, and the customer wants to navigate inside the PDF file just as in the HTML file.
    2: How can I support operating controls in a PDF file generated with Acrobat SDK 8.0? For example: there are some drop-down lists and text boxes in the HTML file, and the customer wants to type text into the text boxes and click the drop-down lists to see the available options, just as in the HTML file.
    Thanks in advance for any help and suggestions.

    Hello,
    I want a system to re-brand my 37-page PDF for affiliates.
    I want a dynamic PHP link in the online PDF in order to personalize the PDF automatically for each affiliate. I need to change 2 links each time: the affiliate ID and the PayPal email (payment button) on page 36.
    Can you help?
    Please let me know
    Thank you
    Alex
    PS My system is online and i can give you the url if it helps.

  • Oracle IBR : API for converting files in PDF

    IBR and WCC Version : 11.1.1.6.0
    Currently we have configured IBR with UCM for converting documents to PDF format. Is there any IBR API available so that, without checking the document in to WCC, we can call the IBR API and get the document converted to PDF through a synchronous call?
    thanks in advance.
    Yogesh
    Edited by: user10285200 on Mar 28, 2013 12:18 AM

    Hi Yogesh ,
    Without having the content in the WCC server, you can convert it to PDF using the OIT modules, which are the actual engine that does the processing in WCC as well. To do this you will need to have the PDFExport module deployed on your client machine; the conversion can then be done with it.
    This is the link for the OIT PDF Export module: http://www.oracle.com/technetwork/middleware/content-management/downloads/oit-dl-otn-097435.html
    In fact, all the content processing / conversion modules can be used in an independent standalone application using OIT. Each of those modules is available from the above link.
    Hope this helps .
    Thanks,
    Srinath

  • Is anybody programmatically converting HTML to PDF? If so, how?

    Is anybody programmatically converting HTML to PDF? If so, how?
    With InDesign, or something else?
    As long as the application (InDesign or something else) has a command-line interface, I'd like to know about it.
    I am using .NET, but I still want to know what you're doing even if you aren't.
    The source data is HTML pages from random sources, so unfortunately it's not necessarily XHTML, though I could tidy it into a consistent form.

    thanks, but what i'm looking for here is programmatic usage -- that is, scripted or command-line -- not having a human user choosing menu options, etc
    so as to your two suggestions ...
    this would appear to be NOT programmatic ...
    > And Acrobat will install a PDF convert toolbar for Internet Explorer to do this right from the browser.
    and this might or might not be possible to program -- i don't know if people are somehow running Acrobat programmatically, would appreciate further information
    > Acrobat has a Create PDF from Web Page function

  • Trouble in fop(convert html 2 pdf) source.

    To convert an HTML file to PDF, I found this article on JavaWorld:
    (http://www.javaworld.com/javaworld/jw-04-2006/jw-0410-html.html)
    Unfortunately, I can't seem to find the two classes below even though I imported all the FOP 0.94 library files.
    Is there something I'm missing?
    Thanks.
    import org.apache.fop.apps.Driver;
    import org.apache.fop.tools.DocumentInputSource;

    Many thanks for your timely reply. The scanner software requires that one scan to an application. The configuration has to point to an executable. In Program Files I must select an executable, in this case acrobat.exe. I have been using this for years and have never seen a problem. Continued guidance sought.

  • Solution for converting the UFF58--- UFF58b or UFF Ascii.

    Dear all,
    Right now, I'm working on vibration analysis using LV SignalExpress 2010, which seems able to export the measurement file as UFF58 (Universal File Format).
    My problem is that I have to use this measurement file with another program which accepts only the UFF58b and UFF ASCII formats.
    I am trying to find a solution for converting UFF58 ---> UFF58b or UFF ASCII. Does anyone have an idea? Please help.
    Thank you in advance.

    It looks like the LabVIEW Sound and Measurement Suite would do the trick, or possibly Diadem, though I'm not sure if you have LabVIEW.  What program are you trying to open the file with?  It may also have a package you can download that supports all of the formats.  For instance, there appears to be a UFF file reader and writer if you Google "uff files" in the top result, but that's only if you were using a specific program.  But I suppose the easiest way would be if you already had the Sound and Measurement Suite.
    Regards,
    Jake G.
    National Instruments
    Applications Engineer

  • Is there any solution for convert document spreadsheet presentation to images with Office Web Apps?

    Hi there!
    Is there any solution for converting documents, spreadsheets, and presentations to images with Office Web Apps?

    Hi,
    As far as I know, there is no built-in feature that converts Office files to an image format in Office Web Apps yet.
    I'll collect the information and submit it through internal channels. You can also submit the feedback here:
    http://office.microsoft.com/suggestions.aspx
    Regards,
    George Zhao
    TechNet Community Support

  • I download itext  for convert jsp to PDF. How to set content type for PDF.

    I downloaded iText to convert a JSP to PDF. How do I set the content type for PDF? I tried
    <%@ page contentType = "application/pdf;charset=TIS-620" %>
    but the page is not rendered as a PDF.
    Thanks.

    PDF files are usually binary files, JSPs are not well-suited for binary content.
    (If you download the result of your JSP you'll see that it is not a valid PDF file; it will probably have a lot of whitespace and linefeeds that will choke your PDF reader.) The first few characters must be
    "%PDF-" without any leading whitespace.
    You can try using PDF files encoded as text - check if you can use text-encoded PDFs in iText.
    Try using a Servlet instead.
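To check the point about the leading "%PDF-" bytes, one option is to capture the response body and test its first five bytes. The following is a minimal sketch using only the JDK; the class and method names are hypothetical, and it checks nothing beyond the header, so a file can pass this test and still be broken further in.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: checks only the leading magic bytes, not the whole file.
final class PdfHeaderCheck {
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data begins with "%PDF-" with nothing in front of it. */
    static boolean startsWithPdfHeader(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] good = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        // Blank lines emitted by a JSP before the PDF bytes push the header out of place
        byte[] padded = "\r\n\r\n%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithPdfHeader(good));   // true
        System.out.println(startsWithPdfHeader(padded)); // false
    }
}
```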

  • Can anyone post me a solution for converting gif containing text to a .txt

    Can anyone post me a solution for converting gif containing text to a .txt file using Java

    wow!!!
    not gonna post a full solution (since it's huge!!!)
    but here's how you would do it
    open the gif in a BufferedImage, then do some image recognition on it
    easy (hehe) case is if the gif only contains text (black on white, in a standard font)
    then i would scan down the image until i find a raster line that isn't all white (top of a text line), then find the next raster line that is all white (bottom of the text line).
    half(ish) way between these lines, scan from the left until you find a black pixel (start <maybe> of the next character) (be careful you don't aim for the gap in "i")
    from that point (x, y) test a set of pixels (x+n1, y+m1) to (x+nt, y+mt), where t should only have to be ~8 or 10, such that for each character in the font these test points return a unique combination. you then know what character is next; add it to a string and repeat along the line, then down the page
    tahh dahh there ya go
    you can write a program to learn the required test points by giving it a line of the full character set
    good piggin luck is what i say :-)
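The first step described above (finding the top and bottom raster line of each line of text) can be sketched with the JDK's BufferedImage alone. This covers only the row scan, not the character matching; the class and method names are made up for illustration, and it assumes the easy case from the post, where the background is pure white.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the row scan only; character recognition is not attempted.
final class TextLineFinder {

    /** Returns {top, bottom} row indices (inclusive) of each band containing a non-white pixel. */
    static List<int[]> findTextLines(BufferedImage img) {
        List<int[]> bands = new ArrayList<>();
        int top = -1; // -1 means we are currently between text lines
        for (int y = 0; y < img.getHeight(); y++) {
            boolean white = isRowWhite(img, y);
            if (!white && top < 0) {
                top = y;                          // first non-white row: top of a text line
            } else if (white && top >= 0) {
                bands.add(new int[] {top, y - 1}); // first white row after text: line ended
                top = -1;
            }
        }
        if (top >= 0) {
            bands.add(new int[] {top, img.getHeight() - 1}); // text runs to the bottom edge
        }
        return bands;
    }

    /** True if every pixel in row y is pure white (alpha is ignored). */
    static boolean isRowWhite(BufferedImage img, int y) {
        for (int x = 0; x < img.getWidth(); x++) {
            if ((img.getRGB(x, y) & 0xFFFFFF) != 0xFFFFFF) {
                return false;
            }
        }
        return true;
    }
}
```

A real GIF could be loaded with javax.imageio.ImageIO.read before calling findTextLines; anti-aliased or scanned text would need a brightness threshold rather than an exact comparison with white.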

  • Convert HTML to PDF - API or utility

    Hi community,
    Our product generates HTML reports; the users can then edit them and finally want to send them via e-mail to another party. They want to send a PDF document generated from that HTML, so I need to convert the HTML to PDF. Until now we did that with FOP and an XSL file we found (I don't remember where) and improved a bit. However, it has become hard to maintain.
    Searching around the forum and Google I found HTMLDoc, but it is not appropriate because its FAQ states that it currently cannot embed fonts other than the preset ones, and I need Cyrillic font support. I tried several virtual printers that print to a PDF file, but I want to get away from the HTML look (table borders, etc.).
    I need a pointer to an appropriate product, preferably a pure Java library, cross-platform because we will soon migrate from Windows to Linux, with support for external font embedding (like FOP and iText). I am not limited to open-source and free libraries; a commercially licensed one is fine.
    Please share your experience in this area and guide me to a good library.
    Thanks for your time
    Mike

    Thanks for that idea ChuckBing. I will download OpenOffice and try this, it sounds good because OpenOffice seems to support both Linux and Windows.
    Unfortunately the adobe online solution turned out not to be applicable for our case since there are customers that don't have access to Internet, besides there was a note on the site that currently only US and Canada are supported(but maybe I read it wrong)??
    Thanks to all - kylias, MOD, DrClap and ChuckBing - for your participation. If OpenOffice does not solve the problem I intend to continue following the FOP path.
    Mike

  • Need help - CONVERTING HTML to PDF

    A friend of mine has tasked me to research on a Java API that converts HTML files to PDF... Does anyone know about this?
    I've been browsing the net for an hour now and i still haven't got a plausible solution to his problem... Can you guys help me with this?!? Any help would be greatly appreciated... Thanks... :)

    > Done, I'd be glad to send you the spanish MX properties version, where can I send it to you?
    great, please use email address [email protected], thanks
    > Strangely, many options that work fine as standalone, appear to be disabled now, options like... in menu Edition: Undo, Redo; in menu Format: Table, Link; all the Table menu options.
    Undo and redo are active only if there is something to undo or redo. Likewise, many table actions are only active when the caret is inside a table. See also the source code: class FrmMain contains almost all actions as inner classes (a design I changed in later applications). Each action class has a method 'update' which takes care of enabling or disabling an action depending on certain conditions.
    HTH
    Ulrich

  • Converting HTML to PDF substitutes fonts

    Hello!
    On one of our workstations that is running Acrobat 9 Pro, whenever the user converts an HTML document to a PDF for proofing purposes, we get different fonts in the output than we had in the input. For example, any text in Arial Black in the HTML document is Arial Bold in the resultant PDF. Attached are screenshots of the before and after.
    Before:
    After:
    As these are proofs that the client is supposed to be approving, this needs to be fixed quickly. All other machines in the office can convert these to PDF just fine, so it appears to be only the one machine. I uninstalled and reinstalled the software to no avail.
    Please advise.

    Does the errant machine actually have the font available?
    Check the list of fonts available in the system on the machine that is acting up.
    Then check the same list on a machine that works.
    If there are differences, add the missing fonts to the defective machine from the good machine.
    Then try again.
    If a font is missing, Acrobat will attempt to substitute the nearest similar font it can find.
