Solutions for converting HTML to PDF programmatically?
To start off, I should say that I am rather new to programming in Java.
Here's what I am attempting to do.
I need to write a Java class that takes an HTML string as input and produces a PDF file (or OutputStream) as output. I have spent the last week or so trying to accomplish this using XSL-FO and the FOP library. This solution does not work too well, because XSL-FO and FOP do not handle complex table layouts very well (they require the number of columns and the column widths to be known in advance). It seems that FOP (and XSL-FO) is better suited to handling structured XML input, not something as unstructured and complex as HTML.
Are there any other libraries/APIs out there that are specifically well suited to HTML -> PDF conversion?
Remember, this needs to be done programmatically, and will probably be invoked as a web service.
thanks,
vivek
> #1 There are definite copyright issues with your software. Before you go live with anything like this, make sure you're not gonna get reamed.
Ehh? I didn't see anything in the OP's question that implied this. Yes, if he uses it to mine commercial web sites and convert them to PDFs there's a problem, but aside from that, where's the danger?
> #2 The PDF part is the easy part. As the other poster said, lowagie iText can do PDF. The rendered HTML is a much bigger question. The smaller issue is that web pages are defined to fit your browser window, so you've got to choose a size. The much tougher problem is finding a decent HTML renderer in Java. In truth, I don't think there is one; JEditorPane is a piece of ****, and opera is really not a lot better.
Not at all. The OP specifically mentioned web services, so we don't need to assume that Swing is involved. You can, using a 3rd party library (google for java pdf), have a servlet or JSP render its output as a PDF document.
Similar Messages
-
Jar file for converting html to pdf
Does anybody have a jar file for converting an HTML document to PDF?
Are you particular about using a jar file?
I have developed a form which converts any type of file, especially Word, txt and HTML, to PDF. Let me check if I have that.
Rajesh ALex -
Using PD4ML library for converting HTML to PDF
Hi all
I am using APIs from the PD4ML library for converting HTML to PDF.
What I am sending as HTML is a single HTML table. The problem is that I need to specify the number of lines for a single PDF page, for the case where the HTML table is too large to fit onto a single page. I can decide the number of lines beforehand, like 31 for portrait and 52 for landscape, but then a problem arises when the page size is too large, e.g. the page size is A1 and I have specified the number of lines as 31.
In this case a lot of empty space is left at the end of the page.
So I actually need to calculate the NUMBER OF LINES for a single page as a FUNCTION of PAGE SIZE and FONT SIZE. Font size, because it also affects the number of lines that fit onto a single page.
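One way to frame the calculation asked for here, independent of PD4ML: divide the usable page height by the height of one line. The following is only a minimal sketch under stated assumptions. The class and method names are hypothetical and not part of the PD4ML API, the margin and the line-spacing factor are assumed inputs, and real table rows (borders, padding, wrapped cells) will be taller than a bare text line, so the result is best treated as an upper bound.

```java
// Hypothetical helper, not part of PD4ML: estimates how many text lines fit on one page.
final class LinesPerPage {
    // 1 inch = 25.4 mm = 72 PDF points
    static final double POINTS_PER_MM = 72.0 / 25.4;

    /**
     * @param pageHeightMm page height in millimetres (A4 portrait = 297, A1 portrait = 841)
     * @param marginMm     assumed top and bottom margin, in millimetres
     * @param fontSizePt   font size in points
     * @param lineSpacing  assumed line height as a multiple of the font size (e.g. 1.2)
     */
    static int linesPerPage(double pageHeightMm, double marginMm,
                            double fontSizePt, double lineSpacing) {
        double lineHeightPt = fontSizePt * lineSpacing;
        if (lineHeightPt <= 0) {
            throw new IllegalArgumentException("font size and line spacing must be positive");
        }
        // Height left for text after removing the top and bottom margins, in points
        double usableHeightPt = (pageHeightMm - 2 * marginMm) * POINTS_PER_MM;
        return (int) Math.floor(usableHeightPt / lineHeightPt);
    }

    public static void main(String[] args) {
        System.out.println("A4 portrait: " + linesPerPage(297, 10, 10, 1.2));
        System.out.println("A1 portrait: " + linesPerPage(841, 10, 10, 1.2));
    }
}
```

Because both the page height and the font size are parameters, the same function covers landscape (pass the shorter edge as the height) and larger fonts without hard-coding a line count per orientation.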
Would be really grateful if you guys can help fast. -
Problem with converting html to pdf using LiveCycle ES Java API
I am using this code to convert HTML to PDF.
/*
 * 1. adobe-generatepdf-client.jar
 * 2. adobe-livecycle-client.jar
 * 3. adobe-usermanager-client.jar
 * 4. adobe-utilities.jar
 * 5. wlclient.jar
 */
import java.io.File;
import java.util.Properties;
import com.adobe.idp.Document;
import com.adobe.idp.dsc.clientsdk.ServiceClientFactory;
import com.adobe.idp.dsc.clientsdk.ServiceClientFactoryProperties;
import com.adobe.livecycle.generatepdf.client.GeneratePdfServiceClient;
import com.adobe.livecycle.generatepdf.client.HtmlToPdfResult;

public class ConvertHTML {
    public static void main(String[] args) {
        try {
            //Set connection properties required to invoke LiveCycle ES
            Properties connectionProps = new Properties();
            connectionProps.setProperty(ServiceClientFactoryProperties.DSC_DEFAULT_EJB_ENDPOINT, "t3://localhost:7001");
            connectionProps.setProperty(ServiceClientFactoryProperties.DSC_TRANSPORT_PROTOCOL, ServiceClientFactoryProperties.DSC_EJB_PROTOCOL);
            connectionProps.setProperty(ServiceClientFactoryProperties.DSC_SERVER_TYPE, "WebLogic");
            connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_USERNAME, "administrator");
            connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_PASSWORD, "password");

            //Create a ServiceClientFactory instance
            ServiceClientFactory factory = ServiceClientFactory.createInstance(connectionProps);

            //Create a GeneratePdfServiceClient object
            GeneratePdfServiceClient pdfGenClient = new GeneratePdfServiceClient(factory);

            //Get an HTML document to convert to a PDF document
            String inputFileName = "http://www.adobe.com";
            //String inputFileName = "C:\\Documents and Settings\\venkat\\Desktop\\Adobe.htm";
            String securitySettings = "No Security";
            String fileTypeSettings = "Standard";
            System.out.println("one");

            //Convert HTML content to a PDF document
            HtmlToPdfResult result = pdfGenClient.htmlToPDF2(inputFileName, fileTypeSettings, securitySettings, null, null);
            System.out.println("two");

            //Get the newly created document
            Document createdDocument = result.getCreatedDocument();

            //Save the PDF document as a PDF file
            createdDocument.copyToFile(new File("C:\\test.pdf"));
        } catch (Exception e) {
            System.out.println("Error OCCURRED: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
I am able to compile this class, but when I run it I get the error below.
Error OCCURRED: Internal error.
ALC-DSC-000-000: com.adobe.idp.dsc.DSCRuntimeException: Internal error.
    at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.doSend(EjbMessageDispatcher.java:160)
    at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:57)
    at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:208)
    at com.adobe.livecycle.generatepdf.client.GeneratePdfServiceClient.htmlToPDF2(GeneratePdfServiceClient.java:666)
    at ConvertHTML.main(ConvertHTML.java:84)
Caused by: java.rmi.RemoteException: Remote EJBObject lookup failed for 'ejb/Invocation'; nested exception is:
    org.omg.CORBA.COMM_FAILURE: vmcid: SUN minor code: 203 completed: No
    at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initialise(EjbMessageDispatcher.java:101)
    at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.doSend(EjbMessageDispatcher.java:130)
    ... 4 more
Caused by: org.omg.CORBA.COMM_FAILURE: vmcid: SUN minor code: 203 completed: No
    at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(Unknown Source)
    at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(Unknown Source)
    at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.writeLock(Unknown Source)
    at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendFragment(Unknown Source)
    at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendMessage(Unknown Source)
    at com.sun.corba.se.impl.encoding.CDROutputObject.finishSendingMessage(Unknown Source)
    at com.sun.corba.se.impl.protocol.CorbaMessageMediatorImpl.finishSendingRequest(Unknown Source)
    at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete1(Unknown Source)
    at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete(Unknown Source)
    at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.invoke(Unknown Source)
    at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.is_a(Unknown Source)
    at org.omg.CORBA.portable.ObjectImpl._is_a(Unknown Source)
    at weblogic.corba.j2ee.naming.Utils.narrowContext(Utils.java:126)
    at weblogic.corba.j2ee.naming.InitialContextFactoryImpl.getInitialContext(InitialContextFactoryImpl.java:94)
    at weblogic.corba.j2ee.naming.InitialContextFactoryImpl.getInitialContext(InitialContextFactoryImpl.java:31)
    at weblogic.jndi.WLInitialContextFactory.getInitialContext(WLInitialContextFactory.java:41)
    at javax.naming.spi.NamingManager.getInitialContext(Unknown Source)
    at javax.naming.InitialContext.getDefaultInitCtx(Unknown Source)
    at javax.naming.InitialContext.init(Unknown Source)
    at javax.naming.InitialContext.<init>(Unknown Source)
    at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initJndiContext(EjbMessageDispatcher.java:213)
    at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.getJndiContext(EjbMessageDispatcher.java:226)
    at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initialise(EjbMessageDispatcher.java:87)
    ... 5 more
Can you please give me some way to do the conversion?
Yes sir, thanks for your suggestion.
But I didn't find an exact solution. I found some, but they were not quite what I required. I just need to convert HTML to PDF using the iText API for Java. I have already used some classes from it, like HTMLParser, etc.
Anything else? Anyone? Surely someone can help me with this. -
How to convert html to pdf using acrobat sdk 8.0?
hi
I am a beginner with the Acrobat SDK.
I want to know how to use Acrobat SDK 8.0 to convert HTML to PDF.
Here are some questions:
1: How to support navigation inside a PDF file generated using Acrobat SDK 8.0? For example: there's a catalog at the top of the HTML file, and the customer hopes to navigate inside the PDF file just like navigating inside the HTML file.
2: How to support operating controls in a PDF file generated using Acrobat SDK 8.0? For example: there are some drop-down lists and text boxes in the HTML file, and the customer hopes to input text in the text box and click the drop-down list to see the available options, just like in the HTML file.
Thanks in advance for any help and suggestions.
Hello,
I want a system to re-brand my 37-page PDF for affiliates.
I want a PHP dynamic link in the PDF online in order to personalize the PDF automatically for each affiliate. I need to change 2 links each time: the affiliate ID and the PayPal email (payment button) on page 36.
Can you help?
Please let me know
Thank you
Alex
PS My system is online and i can give you the url if it helps. -
Oracle IBR : API for converting files in PDF
IBR and WCC Version : 11.1.1.6.0
Currently we have configured IBR with UCM for converting documents to PDF format. Is there any IBR API available so that, without checking the document in to WCC, we can call the IBR API and get the document converted to PDF through a synchronous call?
Thanks in advance.
Yogesh
Edited by: user10285200 on Mar 28, 2013 12:18 AM
Hi Yogesh,
Without having the content in the WCC server you can convert it to PDF using the OIT modules, which are the actual engine that does the processing in WCC as well. For this you will need to have the PDFExport module deployed on your client machine; the conversion can then be done with it.
This is the link for the OIT PDF Export module: http://www.oracle.com/technetwork/middleware/content-management/downloads/oit-dl-otn-097435.html
In fact, all the content processing / conversion modules can be used as independent standalone applications using OIT. Each of those modules is available from the above link.
Hope this helps .
Thanks,
Srinath -
Is anybody programmatically converting HTML to PDF? If so, how?
Is anybody programmatically converting HTML to PDF? If so, how?
With InDesign, or something else?
As long as the application (InDesign or something else) has a command-line interface, i'd like to know about it.
Am using .NET, but we still want to know what you're doing even if you aren't.
Source data is HTML pages from random sources, so it's not necessarily XHTML unfortunately, though i could tidy it into a consistent form.
thanks, but what i'm looking for here is programmatic usage -- that is, scripted or command-line -- not having a human user choosing menu options, etc
so as to your two suggestions ...
this would appear to be NOT programmatic ...
> And Acrobat will install a PDF convert toolbar for Internet Explorer to do this right from the browser.
and this might or might not be possible to program -- i don't know if people are somehow running Acrobat programmatically, would appreciate further information
> Acrobat has a Create PDF from Web Page function -
Trouble in fop(convert html 2 pdf) source.
To convert HTML to a PDF file I found this article on JavaWorld.
(http://www.javaworld.com/javaworld/jw-04-2006/jw-0410-html.html)
Unfortunately I can't seem to find the two classes below, even though I imported all the FOP 0.94 library files.
Is there something I'm missing?
Thanks.
import org.apache.fop.apps.Driver;
import org.apache.fop.tools.DocumentInputSource;
Much thanks for your timely reply. The scanner software requires that one must scan to an application. The configuration has to point to an executable. In Program Files I must select an executable, in this case acrobat.exe. I have been using this for years and have never seen a problem. Continued guidance sought.
-
Solution for converting UFF58 ---> UFF58b or UFF Ascii.
Dear all,
Right now I'm working on vibration analysis using LV SignalExpress 2010, which seems to be able to export the measurement file as UFF58 (Universal File Format),
but my problem is that I have to use this measurement file with another program, which accepts only the UFF58b and UFF Ascii formats.
I am trying to find a solution for converting UFF58 ---> UFF58b or UFF Ascii. Does anyone have an idea? Please help.
Thank you in advance.
It looks like the LabVIEW Sound and Measurement Suite would do the trick, or possibly Diadem, though I'm not sure if you have LabVIEW. What program are you trying to open the file with? It may also have a package you can download that supports all of the formats. For instance, there appears to be a UFF file reader and writer in the top result if you Google "uff files", but that's only if you were using a specific program. But I suppose the easiest way would be if you already had the Sound and Measurement Suite.
Regards,
Jake G.
National Instruments
Applications Engineer -
Hi there!
Are there any solutions for converting documents, spreadsheets and presentations to images with Office Web Apps?
Hi,
As far as I know, there is no built-in feature that converts Office files to an image format in Office Web Apps yet.
I'll collect the information and submit it through internal channels. Meanwhile, you could also submit the feedback here:
http://office.microsoft.com/suggestions.aspx
Regards,
George Zhao
TechNet Community Support
It's recommended to download and install the Configuration Analyzer Tool (OffCAT), which is developed by Microsoft Support teams. Once the tool is installed, you can run it at any time to scan for hundreds of known issues in Office programs. -
I downloaded iText to convert JSP to PDF. How do I set the content type for PDF?
I downloaded iText to convert JSP to PDF. How do I set the content type for PDF? I tried
<%@ page contentType = "application/pdf;charset=TIS-620" %>
, but the page is not a PDF.
Thanks.
PDF files are usually binary files; JSPs are not well suited for binary content.
(If you download the result of your JSP you'll see that it is not a valid PDF file; it will probably have a lot of whitespace and linefeeds that will choke your PDF reader.) The first few characters must be "%PDF-" without whitespace.
You can try using PDF files encoded as text - check if you can use text-encoded PDFs in iText.
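The "%PDF-" header rule mentioned above is easy to check on the bytes a page actually returns. A minimal sketch, assuming you have already saved the response body into a byte array; the class and method names are made up for illustration and are not part of iText or the servlet API:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper: checks whether a response body starts like a PDF file.
final class PdfHeaderCheck {
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true only if the very first bytes are "%PDF-", with nothing in front of them. */
    static boolean startsWithPdfHeader(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] clean = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        // Leading linefeeds, as a JSP typically emits for its directive lines
        byte[] fromJsp = "\n\n%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithPdfHeader(clean));   // true
        System.out.println(startsWithPdfHeader(fromJsp)); // false
    }
}
```

If the check fails on the JSP output but passes on the same document written straight to a file, the extra whitespace from the JSP is the likely culprit.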
Try using a Servlet instead. -
Can anyone post me a solution for converting gif containing text to a .txt
Can anyone post me a solution for converting a gif containing text to a .txt file using Java?
wow!!!
not gonna put a full solution (since it's huge!!!)
but here's how you would do it
open the gif in a BufferedImage, then do some image recognition on it
the easy (hehe) case is if the gif only contains text (black on white, in a standard font)
then i would scan down the image until i find a raster line that isn't all white (top of a text line), then find the next raster line that is all white (bottom of the text line).
half(ish) way between these lines, scan from the left until you find a black pixel (start <maybe> of the next character) (be careful you don't aim for the gap in "i")
from that point (x, y) test a set of pixels (x+n1, y+m1) to (x+nt, y+mt), where t should only have to be ~8 or 10, such that for each character in the font these test points return a unique combination. you then know what character is next; add it to a string and repeat along the line, then down the page
tahh dahh there ya go
you can write a program to learn the required test points by giving it a line of the full character set
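The first step above (finding the top and bottom raster line of each text line) can be sketched roughly as follows. This covers only the row scan, not the character matching, and it assumes the easy case described: pure black text on a pure white background with no anti-aliasing. The class and method names are made up for illustration.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the row-scanning step only; not a full OCR solution.
final class TextLineFinder {
    /** True if any pixel in row y is not pure white (alpha is ignored). */
    static boolean rowHasInk(BufferedImage img, int y) {
        for (int x = 0; x < img.getWidth(); x++) {
            if ((img.getRGB(x, y) & 0xFFFFFF) != 0xFFFFFF) {
                return true;
            }
        }
        return false;
    }

    /** Returns {top, bottom} row indices (inclusive) for each run of consecutive inked rows. */
    static List<int[]> findTextLines(BufferedImage img) {
        List<int[]> lines = new ArrayList<>();
        int top = -1; // -1 means we are currently between text lines
        for (int y = 0; y < img.getHeight(); y++) {
            boolean ink = rowHasInk(img, y);
            if (ink && top < 0) {
                top = y;                              // first non-white row: top of a text line
            } else if (!ink && top >= 0) {
                lines.add(new int[] {top, y - 1});    // first all-white row: line ended above it
                top = -1;
            }
        }
        if (top >= 0) {
            lines.add(new int[] {top, img.getHeight() - 1}); // text runs to the bottom edge
        }
        return lines;
    }
}
```

A GIF can be loaded into a BufferedImage with `javax.imageio.ImageIO.read(file)`. For scanned or anti-aliased images the "not pure white" test would need to become a brightness threshold.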
good piggin luck is what i say :-) -
Convert HTML to PDF - API or utility
Hi community,
Our product generates HTML reports; after that the users can edit them, and finally they want to send them via e-mail to another party. They want to send a PDF document generated from that HTML, so I need to convert the HTML to PDF. Until now we did that with FOP and an XSL file we found (I don't remember where) and improved a bit. However, it is becoming hard to maintain.
Searching around the forum and Google I found out about HTMLDoc, but it is not appropriate because its FAQ states that it currently cannot embed fonts other than the preset ones, and I need Cyrillic font support. I tried several virtual printers that print to a PDF file, but I want to escape from the HTML look, like table borders, etc.
I need a pointer to an appropriate product. Preferably a pure Java library, cross-platform because we will soon migrate from Windows to Linux, with support for external font embedding (like FOP and iText). I am not limited to open-source and free libraries; it can be a commercially licensed one.
Please share your experience in this area and guide me to a good library.
Thanks for your time
Mike
Thanks for that idea ChuckBing. I will download OpenOffice and try this; it sounds good because OpenOffice seems to support both Linux and Windows.
Unfortunately the adobe online solution turned out not to be applicable for our case since there are customers that don't have access to Internet, besides there was a note on the site that currently only US and Canada are supported(but maybe I read it wrong)??
Thanks to all - kylias, MOD, DrClap and ChuckBing - for your participation. If OpenOffice does not solve the problem I intend to continue following the FOP path.
Mike -
Need help - CONVERTING HTML to PDF
A friend of mine has tasked me with researching a Java API that converts HTML files to PDF... Does anyone know about this?
I've been browsing the net for an hour now and I still haven't got a plausible solution to his problem... Can you guys help me with this?!? Any help would be greatly appreciated... Thanks... :)
> Done, I'd be glad to send you the spanish MX properties version, where can I send it to you?
great, please use email address [email protected], thanks
> Strangely, many options that work fine as standalone, appear to be disabled now, options like... in menu Edition: Undo, Redo; in menu Format: Table, Link; all the Table menu options.
Undo and redo are active only if there is something to undo or redo. As well, many table actions are only active when the caret is inside a table. See also the source code: class FrmMain contains almost all actions as inner classes (a design I changed in later applications). Each action class has a method 'update' which takes care of enabling or disabling an action depending on certain conditions.
HTH
Ulrich -
Converting HTML to PDF substitutes fonts
Hello!
On one of our workstations that is running Acrobat 9 Pro, whenever the user converts an HTML document to a PDF for proofing purposes, we're getting different fonts in the output than we had in the input. For example, any text in Arial Black in the HTML document is Arial Bold in the resultant PDF. Attached are screenshots of the before and after.
Before:
After:
As these are proofs that the client is supposed to be approving, this needs to be fixed quickly. All other machines in the office can convert these to PDF just fine, so it appears to be only the one machine. I uninstalled and reinstalled the software to no avail.
Please advise.
Does the errant machine actually have the font available?
Check the list of fonts available in the system on the machine that is acting up.
Then check the system on a machine that works.
If there are differences, add the missing ones to the defective machine from the good machine.
Then try again.
If a font is missing, Acrobat will attempt to substitute the nearest similar font it can find.
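If comparing the two font lists by eye is tedious, the comparison above can be sketched as a simple set difference. This is only an illustration under stated assumptions: the class and method names are made up, the font families Java reports may not match exactly what Acrobat sees, and the two lists are assumed to have been collected by running the `main` method on each machine and saving the output.

```java
import java.awt.GraphicsEnvironment;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for comparing the fonts installed on two machines.
final class FontDiff {
    /** Families present on the working machine but absent on the suspect one (case-insensitive, sorted). */
    static Set<String> missingFonts(Collection<String> working, Collection<String> suspect) {
        Set<String> have = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        have.addAll(suspect);
        Set<String> missing = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String family : working) {
            if (!have.contains(family)) {
                missing.add(family);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Run this on each machine and save the output, one family name per line.
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        for (String family : families) {
            System.out.println(family);
        }
    }
}
```

Any family that `missingFonts` returns for the good machine versus the defective one (Arial Black, in the case described) is a candidate for reinstalling.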