HTML to PDF Conversion in Linux env

Dear all,
Does anyone know how to convert HTML to PDF using Java in a Linux environment?
Thanks
SS

HTML to PDF with Java, using OpenOffice.org - example here: http://www.dancrintea.ro/html-to-pdf/
You can run OpenOffice.org as a server and command it remotely for document conversion.
Besides HTML to PDF, other conversions are also possible:
doc --> pdf, html, txt, rtf
xls --> pdf, html, csv
ppt --> pdf, swf
Code example:
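import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import officetools.OfficeFile; // this is my tools package

// on Linux, replace the c:/ paths with e.g. /tmp/test.html and /tmp/test.pdf
FileInputStream fis = new FileInputStream(new File("c:/test.html"));
FileOutputStream fos = new FileOutputStream(new File("c:/test.pdf"));
// suppose OpenOffice.org runs on localhost, port 8100
OfficeFile f = new OfficeFile(fis,"localhost","8100", true);
f.convert(fos,"pdf"); // write the converted document to the output stream as PDF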
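As an alternative to a running OpenOffice.org server, a one-shot command-line conversion can be sketched as below. This is only a sketch under assumptions: it assumes a LibreOffice-style `soffice` binary on the PATH that accepts `--headless --convert-to pdf --outdir` (older OpenOffice.org builds may not), and the class name `SofficeConvert` and the file paths are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class SofficeConvert {
    // Builds the command line; assumes a LibreOffice-style soffice on the PATH.
    static List<String> buildCommand(Path html, Path outDir) {
        return Arrays.asList(
                "soffice", "--headless",
                "--convert-to", "pdf",
                "--outdir", outDir.toString(),
                html.toString());
    }

    // Runs the conversion and returns the exit code of the soffice process.
    static int convert(Path html, Path outDir) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand(html, outDir))
                .inheritIO()
                .start();
        return p.waitFor();
    }

    public static void main(String[] args) {
        // Dry run: print the command instead of executing it (hypothetical paths).
        List<String> cmd = buildCommand(Paths.get("/tmp/test.html"), Paths.get("/tmp"));
        System.out.println(String.join(" ", cmd));
    }
}
```

Calling `convert(...)` instead of printing the command would launch the conversion. Some LibreOffice versions may need an explicit export filter for HTML input (for example `pdf:writer_web_pdf_Export` in place of `pdf`), so treat the flag values as a starting point.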
-----------------------------------------------------------------------------------------------------------------------------------------

Similar Messages

  • Acrobat 9 HTML to PDF conversion sets all checkboxes to checked?

    When I convert an HTML file that contains checkboxes to PDF using Acrobat 9 Standard or Pro (fully updated) on Windows XP SP3, all of the checkboxes end up checked in the resulting PDF.  I've looked in settings menus but can't find anything that seems to be a relevant option to prevent this from happening.  I've attached a simple test case .html file to this post that you can use to repeat the problem.
    To convert the file, I right-click on the file in Windows Explorer, and click Convert to Adobe PDF.  I've tried "printing" the document to the Adobe PDF printer, but that introduces other issues and is not really an acceptable solution.
    Has anyone encountered this before, and/or have ideas how to fix it?

    Input elements have no such inheritance on the checked attribute.  Furthermore, the input elements in my test case are not grouped together.  They are each encapsulated within separate list item elements, and so no inheritance should take place after the first input element.
    Just for grins, I changed the order of the elements (moved the checked one below the unchecked one), but that did not make any difference in the Acrobat 9 HTML to PDF conversion.
    I did test this with Acrobat 8 Standard, and the HTML to PDF conversion preserved the correct checked status of the input elements.  It looks to me like this is a bug that was introduced in Acrobat 9.

  • HTML to PDF conversion - problems with page-breaks and bookmarks

    Hello,
    My company is currently considering updating your software (from Acrobat 9 Pro to Acrobat XI Pro), and I’ve been assigned to research its features and make sure that it is the right fit for our goals. Basically, we want to automate the whole process as much as possible and create PDFs directly from HTML. We provide a lot of content in HTML and need a fast way to transfer it into PDF format. There are, however, some guidelines:
    1. We want page breaks in this kind of document, so your app needs to be able to interpret the HTML and put them where we want them.
    2. We need bookmarks. The converter must be able to create them from the headlines in the HTML source, or afterwards, directly in the PDF, using some auto-bookmark feature.
    3. A table of contents has to be generated, based on HTML link tags if possible. Here’s a sample of the TOC structure that we currently have:
    <A NAME="redirect">sample_text</A>
    <A HREF="#redirect">sample_text</A>
    Of course we can modify HTML in any way you want us to. The important thing for us is to make it work in PDF without the need to make a lot of manual changes after conversion.
    I’ve been experimenting with Acrobat 9 Pro and reading some of the documentation you have provided, and I’m convinced that point 3 is not a problem. Acrobat 9 Pro has no difficulties with links in the document, and they work fine in a PDF created from HTML.
    Page breaks, on the other hand, are bothering me. Your app apparently ignores every piece of HTML code that the Internet advises me to use to force a page break where I want one. Honestly, I’ve tested about ten ways to make them and not one worked. That’s why I’m asking for your help.
    Another problematic subject for me is bookmark creation. I know that bookmarks are not a problem when I’m doing DOC to PDF conversion: I’m able to decide which header should be used for a certain bookmark level, and everything works great in the end. However, with direct HTML to PDF conversion, I really don’t know how to generate bookmarks based on the source of the input document. Is there any way to make a fully working two-level bookmark tree in this case? Here’s an example of the structure we want in the end:
    header1
        header2
        header2
    header1
    header1
        header2
    Could you please help me in finding the solutions? Just like I’ve mentioned - we can modify input HTML in any way, but in the end we would like to achieve our goals as quickly as possible.
    Please excuse my English.
    I am looking forward to your response,
    Lucas

    Frankly, we would like to avoid using Word. We are using it currently, but there are long-term plans to improve the whole conversion process, eliminate any intermediate steps, and automate as much as possible, even though the conversion is not going to be done unattended on a server. Thank you for your response, but I hope that maybe someone else has an idea?
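The two-level structure described in the question can be derived from the HTML source before conversion. The sketch below is not an Acrobat feature; it only illustrates the idea, assuming that "header1" and "header2" correspond to `<h1>` and `<h2>` tags. The class name `HeadingOutline` and the sample HTML are hypothetical, and a regular expression is used for brevity where a real HTML parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeadingOutline {
    // Matches <h1>...</h1> and <h2>...</h2>, case-insensitively, with optional attributes.
    private static final Pattern HEADING = Pattern.compile(
            "<h([12])\\b[^>]*>(.*?)</h\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one line per heading; level-2 headings are indented under level 1.
    static List<String> outline(String html) {
        List<String> lines = new ArrayList<>();
        Matcher m = HEADING.matcher(html);
        while (m.find()) {
            int level = Integer.parseInt(m.group(1));
            // Strip any inline tags inside the heading and trim whitespace.
            String title = m.group(2).replaceAll("<[^>]+>", "").trim();
            lines.add((level == 2 ? "    " : "") + title);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical input, for illustration only.
        String html = "<h1>Intro</h1><p>text</p><h2>Scope</h2>"
                + "<H2 class=\"x\">Terms</H2><h1>Usage</h1>";
        outline(html).forEach(System.out::println);
    }
}
```

An outline like this could then be handed to whatever tool creates the bookmarks, or used to check that the headings in the source nest the way the bookmark tree should.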

  • Acrobat 9 HTML to PDF conversion sets all checkboxes to checked - Duplicate question

    I am converting html files to pdf using Acrobat 9 pro. All of the check boxes and radio buttons come out checked.
    This is exactly the same as the following thread:
    Acrobat 9 HTML to PDF conversion sets all checkboxes to checked?
    That thread is old, but not answered. Has there been any update to this?

    Hi Don,
    Have you tried updating to v9.5.5 and checking again?
    Which OS and browser are you using?
    Have you checked the behavior with a new sample form on the browser?
    Regards,
    Rave

  • HTML to PDF Conversion

    I have a requirement to convert a document from HTML to PDF. I need to do this in the background, programmatically (in C++ preferably), and without any UI prompts to the user. Does the Acrobat/PDF SDK provide any such APIs?

    Do you want to do this on the desktop or server?
    Acrobat (and thus the Acrobat SDK) can NOT be used on a server, so you'd need to look at our LiveCycle/ADEP products.
    For the desktop, you can write a C++-based plugin for Acrobat to do this conversion.
    Date: Mon, 28 Nov 2011 03:21:52 -0800
    To: Leonard Rosenthol
    Subject: HTML to PDF Conversion
    created by Subramanya P in Acrobat SDK

  • HTML to PDF conversion - Linux

    Hi,
    We are doing a proof of concept of PDF Generator in a Linux environment (RHEL6). This POC involves converting HTML into PDF.
    As per the installation documentation, PDF Generator requires Adobe Acrobat XI Pro.
    Can we use a trial version of Acrobat? If yes, is there a link to download Acrobat XI for Linux?
    Thanks
    Sudhi

    For converting HTML to PDF on Linux, you need to install OpenOffice.
    I believe you are using LC ES4, so please refer to the documentation below to check the compatible versions:
    LiveCycle Help | Adobe LiveCycle ES4 Supported Platforms
    - Varun

  • HTML to PDF conversion tools?

    Are there any good open-source tools for converting HTML into a PDF document?
    Thank you in advance.

    Here's a tutorial on setting up HTMLDOC for ColdFusion:
    http://tutorial135.easycfm.com
    Also the command line reference gives a good understanding on
    how the command line version works:
    http://www.easysw.com/htmldoc/docfiles/8-cmdref.html

  • Options for HTML to PDF conversion

    I need to convert HTML to PDF from a custom command line app.
    I see that Acrobat XI does a good job of converting HTML files to PDF but I can't seem to find the SDK.  I do see the Acrobat X SDK but I'm not sure if it has the capability I need.  Can anyone confirm if X has that capability?  If it does not, is there an SDK for XI?  If not, when might we expect one?
    Thanks.

    I don't quite follow. You say you can't find the SDK, then you say you do see the Acrobat X SDK. What SDK are you looking for?

  • IPhone: HTML to PDF conversion library?

    Is anybody aware of a way to convert an html page to pdf in code that runs on the iPhone? I have gotten UIWebView to render into a pdf file, but that just creates a bitmap, I would really prefer a normal pdf, with selectable/searchable text.
    I have found a few libraries for this, but so far mostly for Java and .NET. Is there anything that could actually be embedded in an app? Preferably open source, but commercial might be OK as well.

    The method described at http://www.macresearch.org/cocoa-scientists-xxx-developing-iphone is all about displaying pdf's on the iPhone (which are created elsewhere). I want to convert html into a pdf on the phone. I do not actually need to display the pdf at all (I can just display the html).
    I have seen easy ways of doing the conversion in MacOS (using the same API used in printing to a pdf), but these do not appear to be available on the iPhone.
    An online service from Adobe might be a possible approach, but it seems awfully sub-optimal to have to upload the html (and embedded images) and download back a pdf.

  • Error while trying to export a report into PDF using JRC (Linux env)

    Hi all,
    I have my web app installed on a Linux environment. When trying to export a CR report into PDF using JRC
    PrintOutputController controller = reportClientDoc.getPrintOutputController();
    ByteArrayInputStream byteArrayInputStream = (ByteArrayInputStream) controller.export(ReportExportFormat.PDF);
    I got this error message:
    19/02/2009     10:21:37     b     INFO     PdfExporter: PdfDocumentModeller.modelPage (page 1)
    19/02/2009     10:21:37     b     INFO     PdfExporter: Modelling page
    19/02/2009     10:21:37     b     INFO     PdfExporter: Creating document manager, text modeller and image modeller.
    19/02/2009     10:21:37     b     ERROR     PdfExporter: Exception caught in PDFFormatter.formatPage (from PdfDocumentModeller.modelPage); aborting export
    java.lang.IllegalArgumentException: Data type is not supported.
         at java.awt.image.Raster.createInterleavedRaster(Raster.java:212)
         at java.awt.image.Raster.createInterleavedRaster(Raster.java:178)
         at java.awt.image.ComponentColorModel.createCompatibleWritableRaster(ComponentColorModel.java:2826)
         at java.awt.image.BufferedImage.<init>(BufferedImage.java:439)
         at com.crystaldecisions.reports.exporters.format.page.pdf.pdflib.u.<init>(Unknown Source)
         at com.crystaldecisions.reports.exporters.format.page.pdf.b.k.a(Unknown Source)
         at com.crystaldecisions.reports.exporters.format.page.pdf.b.k.a(Unknown Source)
         at com.crystaldecisions.reports.exporters.format.page.pdf.b.d(Unknown Source)
         at com.crystaldecisions.reports.exporters.format.page.pdf.b.a(Unknown Source)
         at com.crystaldecisions.reports.formatter.a.c.a(Unknown Source)
         at com.crystaldecisions.reports.formatter.a.c.if(Unknown Source)
         at com.crystaldecisions.reports.formatter.a.c.a(Unknown Source)
         at com.businessobjects.reports.sdk.b.b.int(Unknown Source)
         at com.businessobjects.reports.sdk.JRCCommunicationAdapter.request(Unknown Source)
         at com.crystaldecisions.proxy.remoteagent.x.a(Unknown Source)
         at com.crystaldecisions.proxy.remoteagent.q.a(Unknown Source)
         at com.crystaldecisions.sdk.occa.report.application.dd.a(Unknown Source)
         at com.crystaldecisions.sdk.occa.report.application.ReportSource.a(Unknown Source)
         at com.crystaldecisions.sdk.occa.report.application.ReportSource.a(Unknown Source)
         at com.crystaldecisions.sdk.occa.report.application.PrintOutputController.export(Unknown Source)
         at com.crystaldecisions.sdk.occa.report.application.PrintOutputController.export(Unknown Source)
         at com.crystaldecisions.reports.sdk.PrintOutputController.export(Unknown Source)
    In my Windows environment the JRC export to PDF works perfectly. I know that there are some known issues regarding the use of JRC in Linux environments. Could this be one of those?
    Any solution (or workaround) would be highly appreciated!
    Thank you!
    PS Maybe this is relevant: I use a MySQL database!
    Edited by: Sandila Catalin on Feb 19, 2009 10:01 AM

    What kind of image do you have in the report?
    Do you have -Djava.awt.headless=true specified as a JVM option?
    Sincerely,
    Ted Ueda
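A quick way to confirm whether the JVM is really running in headless mode is a small check like the one below. It is a sketch only: the class name `HeadlessCheck` is hypothetical, setting the property in code is assumed to happen before any AWT class is initialized, and nothing here is confirmed to resolve the exception in the stack trace above.

```java
import java.awt.GraphicsEnvironment;
import java.awt.image.BufferedImage;

public class HeadlessCheck {
    // Sets the same property as -Djava.awt.headless=true and reports the result.
    // This only takes effect if it runs before AWT is first initialized.
    static boolean enableHeadless() {
        System.setProperty("java.awt.headless", "true");
        return GraphicsEnvironment.isHeadless();
    }

    // Creates an in-memory RGB image; this does not need a display.
    static BufferedImage blankImage(int width, int height) {
        return new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
    }

    public static void main(String[] args) {
        System.out.println("headless = " + enableHeadless());
        BufferedImage img = blankImage(10, 10);
        System.out.println("image = " + img.getWidth() + "x" + img.getHeight());
    }
}
```

In a deployed web app the flag is normally passed on the JVM command line of the application server rather than set in code, since the server may touch AWT before application code runs.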

  • HTML to PDF conversion problem

    Using Adobe Acrobat Pro X, I am having a problem converting HTML to an interactive PDF: not all of the HTML converts to the PDF. My HTML code creates "tabbed" data when displayed in a browser.
    I have had success converting one very LARGE tabbed HTML file to PDF, but most files will only convert the data on the 1st "tab". I am not sure what makes the other files different. Can anyone help, please?
    Thanks,
    Shalayne

    Shgreen3,
    Moving this to Acrobat forum, since this doesn't relate to our CreatePDF service.
    Dave

  • WHY are there pages missing in my HTML to PDF conversion?!?

    Hello all~
    Upon converting/saving webpages to a PDF file (using the "Convert Webpage to PDF" command), there are occasionally pages missing after the conversion. Sometimes the pages are completely gone & sometimes you can see the physical page, but it is empty/blank! I've adjusted the settings & such without any success. Please, if anyone can help me, it would be MUCH appreciated!
    Thanks in advance~ Joey

    Greetings~
    I perhaps should have been more specific about the particular details surrounding my problem. Actually, "graffity" got me pointed in the right direction. 1st of all, the webpage I was trying to convert to PDF was an email hosted by HotMail. When I tried to print it using the print icon on my "Command Bar," information was missing. Then I noticed a printer icon contained within HotMail. Once I clicked on that icon, it reconfigured the page, whereby I was then able to select "Convert Webpage to PDF" with the DESIRABLE RESULTS.
    In short, if anyone else has problems saving ALL THE DATA contained within their HotMail, then you MUST 1st click on the printer icon contained WITHIN HotMail... & then you can either print to Adobe OR Convert Webpage to PDF (providing you have the toolbar add-on).
    Ciao~

  • HTML to PDF Using ActiveX

    Dear Experts,
    I am using PDFCreator for HTML to PDF conversion. PDFCreator has an ActiveX control, but when I access PDFCreator from LabVIEW, some functions do not work.
    Could someone please help me: how can I change the AutoSave file directory using ActiveX?
    I am attaching the LabVIEW code. Is this method correct?
    Attachments:
    PDF.vi 10 KB
    Default1.JPG 167 KB

    I don't understand the logic of what you are trying to do, but anytime code doesn't do what you expect, the first place to start looking is at any errors that are being generated - which you aren't collecting or displaying...
    Mike...
    Certified Professional Instructor
    Certified LabVIEW Architect
    LabVIEW Champion
    "... after all, He's not a tame lion..."
    Be thinking ahead and mark your dance card for NI Week 2015 now: TS 6139 - Object Oriented First Steps

  • HTML to PDF conversion tool

    Hi.
    I need a tool that lets me convert HTML pages to PDF documents. It would be great if I could find a Java library for that (free or paid). Does anyone know of this kind of tool? Once again: I'm interested in HTML to PDF conversion, NOT in creating PDFs from pure Java code.
    Any help would be great...

    I don't think that you'll find a native HTML to PDF converter for Java. If you want something Windows-based, I think you can take the HTML to PDF converter library for .NET from http://www.dotnet-reporting.com or from http://www.winnovative-software.com and build an ASP.NET 2.0 web service that you can then call from your Java application.
    The whole conversion can be done in a few lines of C# code:
            // Create the PDF converter. Optionally you can specify the virtual browser
            // width as parameter. 1024 pixels is default, 0 means autodetect
            PdfConverter pdfConverter = new PdfConverter();
            // set the license key
            pdfConverter.LicenseKey = "P38cBx6AWW7b9c81TjEGxnrazP+J7rOjs+9omJ3TUycauK+cLWdrITM5T59hdW5r";
            // set the converter options
            pdfConverter.PdfDocumentOptions.PdfPageSize = PdfPageSize.A4;
            pdfConverter.PdfDocumentOptions.PdfCompressionLevel = PdfCompressionLevel.Normal;
            pdfConverter.PdfDocumentOptions.PdfPageOrientation = PDFPageOrientation.Portrait;
            pdfConverter.PdfDocumentOptions.ShowHeader = false;
            pdfConverter.PdfDocumentOptions.ShowFooter = false;
            // set to generate selectable pdf or a pdf with embedded image
            pdfConverter.PdfDocumentOptions.GenerateSelectablePdf = selectablePDF;
            // Performs the conversion and get the pdf document bytes that you can further
            // save to a file or send as a browser response
            byte[] pdfBytes = pdfConverter.GetPdfFromUrlBytes(urlToConvert);

  • Convert HTML to PDF or AFP

    As part of the project we have to convert HTML documents to PDF or AFP. We tried different tools, such as HTMLDOC, but have not found one that fully meets our needs. Any help in finding the best tool for converting HTML to PDF or AFP will be appreciated.
    My basic requirement is
    1) The conversion process needs to be automated
    2) the tool has to run on Linux.
    3) Everything in the page (text, image etc) should be extracted in a single file
    Background
    A batch job runs every evening on the Q&R cache servers. The job iterates through a list of 1500 symbols and does an HTTP GET of the Stock Summary page for each ticker in the list. The next step is to launch HTMLDOC or another tool to convert the page to PDF, AFP, or another format.
    Regards,
    Jags.

    I'm not sure it'll help you, but take a look at
    http://xml.apache.org/fop/index.html
    maybe you can go this way
    XHTML->XML->FOP->PDF
    ???
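The batch job described in the question (iterate over the symbols, fetch each page, convert it) could be driven from Java roughly as follows. This is a sketch under assumptions: the URL pattern, the class name `BatchConvert`, and the output directory are hypothetical, and it relies on the `htmldoc` command-line tool being installed, using its `--webpage`, `-t pdf`, and `-f` options and letting it fetch the URL itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchConvert {
    // Hypothetical URL pattern; the real Stock Summary URL is not given in the post.
    static final String URL_TEMPLATE = "http://example.com/summary?symbol=%s";

    // Builds an htmldoc command that fetches one page and writes <symbol>.pdf.
    static List<String> commandFor(String symbol, String outDir) {
        String url = String.format(URL_TEMPLATE, symbol);
        return Arrays.asList(
                "htmldoc", "--webpage", "-t", "pdf",
                "-f", outDir + "/" + symbol + ".pdf",
                url);
    }

    // Runs htmldoc once per symbol and returns the symbols whose conversion failed.
    static List<String> runAll(List<String> symbols, String outDir) throws Exception {
        List<String> failed = new ArrayList<>();
        for (String symbol : symbols) {
            Process p = new ProcessBuilder(commandFor(symbol, outDir))
                    .inheritIO()
                    .start();
            if (p.waitFor() != 0) {
                failed.add(symbol);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Dry run: print the commands for two sample tickers instead of executing them.
        for (String symbol : Arrays.asList("IBM", "ORCL")) {
            System.out.println(String.join(" ", commandFor(symbol, "/tmp/pdf")));
        }
    }
}
```

Collecting the failed symbols rather than aborting on the first error lets an evening run finish the remaining tickers and retry the failures afterwards.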
