HTML/URL to PDF

Hello everybody,
I'm trying to get a module working that transforms HTML code into PDF.
Unfortunately it is not working. I have tried several libraries and approaches:
- iText
- Flying Saucer (https://xhtmlrenderer.dev.java.net/)
- FOP
- http://www.allcolor.org/YaHPConverter/
- http://www.javaworld.com/javaworld/jw-04-2006/jw-0410-html.html?page=3
Mostly I get an error while parsing the HTML code (read as a stream from a URL, e.g. http://www.google.com). Hardly any HTML on the web is 100% correct.
Has anyone built something like this? Can anyone give me a hint on how to get it working?
I would appreciate hints that someone has actually tested, because a lot of libraries promise that this works, but it doesn't.
Best regards,
nobody

Hello,
I've now tested a lot with the approach from http://www.javaworld.com/javaworld/jw-04-2006/jw-0410-html.html?page=1, which mostly works, like a lot of the other options.
Unfortunately, normal pages like http://www.google.com cause errors:
Tidy (vers 4th August 2000) Parsing "InputStream"
line 1 column 106 - Warning: <style> lacks "type" attribute
line 1 column 1.404 - Warning: <script> lacks "type" attribute
line 3 column 1.100 - Warning: <nobr> is not approved by W3C
line 3 column 1.171 - Warning: unescaped & or unknown entity "&tab"
line 3 column 1.264 - Warning: unescaped & or unknown entity "&tab"
line 3 column 1.356 - Warning: unescaped & or unknown entity "&tab"
line 3 column 1.447 - Warning: unescaped & or unknown entity "&tab"
line 3 column 1.544 - Warning: unescaped & or unknown entity "&tab"
line 3 column 1.717 - Warning: missing </nobr> before <div>
line 3 column 1.729 - Warning: inserting implicit <nobr>
line 3 column 1.729 - Warning: discarding unexpected <a>
line 3 column 1.732 - Warning: discarding unexpected </a>
line 3 column 1.775 - Warning: unescaped & or unknown entity "&tab"
line 3 column 1.870 - Warning: unescaped & or unknown entity "&tab"
line 3 column 1.965 - Warning: unescaped & or unknown entity "&tab"
line 3 column 2.060 - Warning: unescaped & or unknown entity "&tab"
line 3 column 2.112 - Warning: missing </nobr> before <div>
line 3 column 2.126 - Warning: inserting implicit <nobr>
line 3 column 2.126 - Warning: missing </nobr> before <div>
line 3 column 2.126 - Warning: trimming empty <nobr>
line 3 column 2.141 - Warning: inserting implicit <nobr>
line 3 column 2.141 - Warning: trimming empty <nobr>
line 3 column 2.141 - Warning: trimming empty <div>
line 3 column 2.147 - Warning: trimming empty <div>
line 3 column 2.153 - Warning: inserting implicit <nobr>
line 3 column 2.153 - Warning: discarding unexpected <a>
line 3 column 2.156 - Warning: discarding unexpected </a>
line 3 column 2.198 - Warning: unescaped & or unknown entity "&tab"
line 3 column 2.303 - Warning: unescaped & or unknown entity "&tab"
line 3 column 2.381 - Warning: unescaped & or unknown entity "&tab"
line 3 column 2.470 - Warning: unescaped & or unknown entity "&tab"
line 3 column 2.561 - Warning: unescaped & or unknown entity "&tab"
line 3 column 2.592 - Warning: missing </nobr> before <div>
line 3 column 2.606 - Warning: inserting implicit <nobr>
line 3 column 2.606 - Warning: missing </nobr> before <div>
line 3 column 2.606 - Warning: trimming empty <nobr>
line 3 column 2.621 - Warning: inserting implicit <nobr>
line 3 column 2.621 - Warning: trimming empty <nobr>
line 3 column 2.621 - Warning: trimming empty <div>
line 3 column 2.627 - Warning: trimming empty <div>
line 3 column 2.633 - Warning: inserting implicit <nobr>
line 3 column 2.633 - Warning: discarding unexpected <a>
line 3 column 2.636 - Warning: discarding unexpected </a>
line 3 column 2.731 - Warning: discarding unexpected </nobr>
line 3 column 2.772 - Warning: trimming empty <div>
line 3 column 2.807 - Warning: trimming empty <div>
line 3 column 2.888 - Warning: <nobr> is not approved by W3C
line 3 column 2.912 - Warning: unescaped & or unknown entity "&pref"
line 3 column 2.920 - Warning: unescaped & or unknown entity "&pval"
line 3 column 2.927 - Warning: unescaped & or unknown entity "&q"
line 3 column 2.979 - Warning: unescaped & or unknown entity "&usg"
line 3 column 3.111 - Warning: unescaped & or unknown entity "&hl"
line 3 column 3.285 - Warning: <table> lacks "summary" attribute
InputStream: Document content looks like HTML proprietary
53 warnings/errors were found!
[WARNING] Screen logger not set - Using ConsoleLogger.
[INFO] setting up fonts
[ERROR] Unknown enumerated value for property 'text-align-last': relative
[ERROR] Error in text-align-last property value 'relative': org.apache.fop.fo.expr.PropertyException: No conversion defined
[ERROR] defaulted font to any,normal,normal
[ERROR] unknown font sans-serif,normal,bolder so defaulted font to any
[ERROR] Error while creating area : Error with image URL: \intl\de_de\images\logo.gif (Das System kann den angegebenen Pfad nicht finden) and no base URL is specified
In that case it's not a big problem; only the pictures are missing from the output PDF.
Pictures are a big problem in general, though, because there is no way to change the relative path into an absolute one. So I get errors when I try to convert www.sun.com:
[ERROR] Error while creating area : Error with image URL: \im\a.gif (Das System kann den angegebenen Pfad nicht finden) and no base URL is specified
Exception in thread "main" java.lang.NullPointerException
at java.io.FileOutputStream.write(FileOutputStream.java:247)
at html2pdf.Html2Pdf.main(Html2Pdf.java:91)
A big problem also appears when I try to open XHTML URLs like http://en.wikipedia.org/wiki/XHTML.
As you can see, there are a lot of problems. I don't know whether the JavaWorld approach to converting HTML to PDF is still up to date. The problem is no longer the syntax of the HTML files. Unfortunately it is also not a problem that can be solved with JTidy or with the graphical module used in this thread: http://forums.sun.com/thread.jspa?forumID=31&threadID=5284184 With that I don't get any XHTML converted.
Is there another way to convert HTML to PDF?
nobody85
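
The FOP errors above ("Error with image URL ... and no base URL is specified"; the German text means "The system cannot find the path specified") appear to come from relative image paths being looked up on the local file system. One possible way around this is to rewrite every relative src attribute into an absolute URL before the HTML is handed to Tidy and FOP. The following is only a sketch under assumptions: the class name ImageUrlResolver and the method absolutize are made up for illustration, the regular expression handles quoted src attributes only (and touches every element with a src, not just img), and it relies on nothing but java.net.URI from the JDK.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: makes relative src attributes absolute against the page URL. */
final class ImageUrlResolver {

    // Matches src="..." or src='...'; group 1 is the quote character, group 2 the value.
    private static final Pattern SRC =
            Pattern.compile("(?i)\\bsrc\\s*=\\s*([\"'])(.*?)\\1");

    static String absolutize(String html, String pageUrl) {
        URI base = URI.create(pageUrl);
        Matcher m = SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String quote = m.group(1);
            String resolved;
            try {
                // Absolute values come back unchanged, relative ones are resolved.
                resolved = base.resolve(m.group(2).trim()).toString();
            } catch (IllegalArgumentException e) {
                // Not a valid URI reference: leave the original value untouched.
                resolved = m.group(2);
            }
            m.appendReplacement(out,
                    Matcher.quoteReplacement("src=" + quote + resolved + quote));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<img src=\"/intl/de_de/images/logo.gif\">";
        System.out.println(absolutize(html, "http://www.google.com/"));
    }
}
```

Pass the page URL with a path (at least a trailing slash), because java.net.URI resolves relative references against the path of the base URL. Whether the converter then fetches the absolute URLs correctly still depends on the library in use.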

Similar Messages

  • Hide URL in PDF output in hyperlink , Maintain History within reports

    Hi,
    I have some issues to discuss regarding report output in PDF format.
    We are using a Java application to call reports via AS 10g. My issues are:
    1) Hide the URL in PDF output for a created hyperlink. (Using username/password in a key is not enough.)
    2) Maintain history: when I call a second report from my parent report, I want to be able to go back to the parent report. (Note that go.history or history.go(-1) works in HTML output format but not in PDF.)
    3) Open the child PDF report in a new window. (Again, target="_blank" works fine for HTML output but not for PDF.)
    Regards,
    abid

    Hi Ram,
    Not sure if this is possible.
    However, one workaround might be the following:
    1. Write a javascript that submits the URL using, say a POST method, and does not show the parameters in the URL. You will have to write this javascript code in the "Before Report" Report Escape. For a generic example on how to use javascript in a report, see Metalink Note 125652.1. This note shows javascript to disable the right-click of the mouse on the report output.
    2. Use the Hyperlink property of the report to call this javascript function, e.g.,
    javascript:myfunction('http://machine:port/reports/rwservlet?report=...+server=...+empno=&empno')
    I am not a javascript expert, so I cannot give you an example of the function, but I hope someone in your team can find out.
    Navneet.

  • Html link to pdf works ok on mac but not on ipad. adobe reader for ipad downloaded.

    HTML link to pdf works ok on Mac but not on ipad. Adobe Reader for ipad downloaded. What's wrong?

    Hi pwillener,
    Thanks for the reply. The PDFs are in a subfolder “PDF” of the folder holding the web page. The web page itself uses
    . This works in full size computer browsers but not in an iPad.
    My excursion through the search engine has produced the idea of adding #page = requesting page to the URL but I have not had time to try it perhaps without the = sign.
    Bryan

  • Bitmap image won't show after converting HTML file to pdf

    I have a program which saves a technical report as an HTML file. I don't have another choice of file type to save as. This program has a template which generates the report, and there are fields which are filled in: some text fields, and one is a company logo. Just to be clear, the logo itself is not in the template; a placeholder is there which points to the location of the logo on my drive. This is all set up by the program and can't be altered.
    If I save the logo as a jpg, png or tiff file, the logo shows up in the final HTML report, and if I use Adobe Professional (version 7.0 on Win XP) to create a PDF the image also shows up. A perfect replica of the report is created in PDF, making it convenient for printing and email.
    If I save the logo as a bitmap file (which creates a higher resolution and sharper looking logo) the logo shows up in the initial HTML technical report, but when I convert this to PDF the file is not found. Instead it shows as a small square red box with the error "Bad file image" or "unable to download image".
    I can't understand why this would be the case, as the logo is saved in the same location whether it is jpg, png, tiff or bmp. It just seems that Adobe Pro doesn't know how to "find" the bitmap logo from the HTML template.
    I did try converting the HTML document to Word, but it threw the formatting out, and when it was converted to PDF it cropped the page and I wasn't able to resolve this. Maybe it can be done, but it is a long way around in any case, particularly if steps then need to be taken to correct the formatting.
    At the moment the document can be printed successfully from the original format; then it's a matter of scanning the printed document in order to email it. It can't be emailed directly because of the embedded information, which is lost if it is then sent to the recipient's computer. However, it would be nice if the document could be converted to PDF, which would make it easier to transfer the file to a backup drive or to email it. Any ideas? Thanks

    Thanks for your response. Yes I did mean .bmp files.
    The problem is not getting it to show in HTML - but Adobe then picking it up from there. I guess though if it is not a standard format for HTML then maybe Adobe has trouble recognising it if it is there. Also, it is possible for Adobe to convert .bmp files if they are inserted in the HTML file, but I have instead a template with the code for the location of the file rather than the graphic itself.  I don't have this problem using the other file formats and getting them into pdf...it's just that the images aren't as sharp as the bmp. I have tried various formats and resolutions so if there isn't a way I can get the bmp from the HTML file into pdf I might just have to live with printing and scanning from the original. Thanks for your input, it confirms what I was thinking. I'll leave this unanswered for now in case someone else has further information.

  • JasperReportIntegration tool - Report coming back as HTML instead of PDF

    Using:
    * Apex 4.1.1 (using EPG)
    * 11G SE
    * Apache FOP as print Server
    * Dietmar Aust's free tool on integrating JasperReports
    I've recently created a second APEX instance locally (from externally hosted instance) and copied the application (via application export) and parsing schema (using exp/imp). Everything works fine except when I try to print reports (for JasperReports) it comes up as garbled HTML instead of PDF.
    In my hosted instance it all works fine.
    Since the necessary packages to integrate with Jasper reports goes into the parsing schema for application, all necessary objects are imported as well during the exp/imp.
    The only thing different about the hosted and my local instance is that the hosted one is XE where as my local one is SE but I don't think this is a factor at all.
    The other difference is the hosted instance uses Apex Listener while mine doesn't. Again don't think this a factor as well.
    My print server is enabled and working and I can print tables as PDF reports.
    I am not sure if I need to enable something or missed a step needed so the reports come up as expected as is the case in hosted instance. Any hints or suggestions welcome.
    Cheers.

    Hi Dietmar, I couldn't do it because I couldn't find the line I needed to modify. The XLIB_HTTP package only goes up to line 45, while the lines referred to in that post are around lines 89 and 90.
    But then I was due to change from EPG to Web Server + Apex Listener (for other reasons) which in turn solved my issue anyway.
    create or replace PACKAGE "XLIB_HTTP"
    AS
    /*=========================================================================
      $Id: xlib_http.pks 21 2010-01-07 07:41:27Z dietmar.aust $
      Purpose  : Make http callouts
      License  : Copyright (c) 2010 Dietmar Aust (opal-consulting.de)
                 Licensed under a BSD style license (license.txt)
                 http://www.opal-consulting.de/pls/apex/f?p=20090928:14
      $LastChangedDate: 2010-01-07 08:41:27 +0100 (Thu, 07 Jan 2010) $
      $LastChangedBy: dietmar.aust $
    Date        Author           Comment
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    19.02.2007  D. Aust          initial creation
    07.08.2008  D. Aust          added check_get_request
                                  display_url_raw: pass all request headers
                                     to the client
    =========================================================================*/
       c_success   CONSTANT CHAR (1) := '1';
       c_fail      CONSTANT CHAR (1) := '0';
       PROCEDURE display_url_raw (
          p_url                       VARCHAR2,
          p_mime_type_override   IN   VARCHAR2 DEFAULT NULL,
          p_charset              IN   VARCHAR2 DEFAULT NULL
       );
       PROCEDURE retrieve_blob_from_url (
          p_url               VARCHAR2,
          o_blob        OUT   BLOB,
          o_mime_type   OUT   VARCHAR2
       );
       FUNCTION escape_form_data (s VARCHAR2)
          RETURN VARCHAR2;
       FUNCTION check_get_request (p_url VARCHAR2)
          RETURN CHAR;
    END;
    Edited by: Arc_x on 11-Sep-2012 21:17

  • Need to sent alv output as html or as pdf attachment in mail

    Hello,
    I want to send an ALV output as an attachment in HTML or PDF format. How can I do that? The line size is greater than 600 (nearly 40 fields).
    Please help me with this query.
    Regards
    Guruvayurappan
    Moderator Message: Please search before posting your question. Thread locked.
    Edited by: Suhas Saha on Dec 29, 2011 4:57 PM

    Hi,
    For sending the ALV output as PDF attachment, you can create a spool (proper page size in print parameters) and convert the spool to PDF using the FM CONVERT_ABAPSPOOLJOB_2_PDF and then send the same as attachment in mail.
    To send the data as an HTML attachment, try the FMs below:
    WWW_ITAB_TO_HTML_HEADERS & WWW_ITAB_TO_HTML_LAYOUT to create the HTML layout
    WWW_ITAB_TO_HTML to create the HTML for the actual data.
    Hope this helps you.
    Regards,
    Sachinkumar Mehta

  • Can Preview recognize hyperlinks (embedded URLs) in PDF docs?

    I've been trying to research this online but I get conflicting answers. Can Preview recognize hyperlinks (embedded URLs) in PDF documents? If it can, then how are those created?
    I am using Adobe Acrobat Professional v7 (on Windows) and Preview doesn't appear to see the links I've created. Acrobat Reader on both Mac and Windows works fine.

    I don't have any documents that I can test straight off, but I know Preview is limited when it comes to some of the advanced features of PDFs and you have to use Acrobat Reader for those. This might be another of those.

  • Writing Html Content into PDF using JSP

    Dear All,
    I am using JSP to generate employee payslips dynamically. At present I am displaying the payslip on screen.
    Here I want to offer a "Save as PDF" option. I am able to create a PDF file using the iText libraries, and I want to integrate HTML code into the PDF file,
    i.e. write HTML content (using HTML tags) into the PDF.

    Well, it is a difficult ask, and I believe we would be reinventing the wheel.
    To keep things simple, I would choose one of the APIs below:
    1) [http://xmlgraphics.apache.org/fop/] (Apache FOP: write an XSL template of your own, substitute the values from the respective DTOs using XSL/XML transformations, and generate content in different formats, including PDF.)
    2) [http://jasperforge.org/plugins/project/project_home.php?group_id=102] (Design a simple report template using iReport to create a jrxml file, then write a small code snippet that passes a few details at runtime and exports the report in different formats, including PDF.)
    3) [http://www.object-refinery.com/jfreereport/] (Much like what Jasper offers. To compile these you can use either the Pentaho product IDE or the BEA/Oracle Actuate report tools.)
    You can search for examples to get more help on using the respective APIs.
    Hope that helps :)
    REGARDS,
    RaHuL
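
The first option above (an XSL template filled from the data and then rendered by Apache FOP) can be sketched roughly as follows. This covers only the first half of that pipeline, the XML to XSL-FO transformation, using the javax.xml.transform classes that ship with the JDK. The payslip XML layout (payslip, employee, net), the stylesheet, and the class name PayslipToFo are all invented for illustration; handing the resulting XSL-FO to FOP for PDF rendering is a separate step that is not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical example: turns a small payslip XML document into XSL-FO. */
final class PayslipToFo {

    // Illustrative stylesheet; element names of the input are assumptions.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/payslip'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='A4'"
        + " page-height='29.7cm' page-width='21cm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='A4'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<fo:block font-weight='bold'><xsl:value-of select='employee'/></fo:block>"
        + "<fo:block>Net pay: <xsl:value-of select='net'/></fo:block>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the payslip XML and returns the XSL-FO text. */
    static String toFo(String payslipXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(payslipXml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<payslip><employee>Jane Doe</employee><net>1234.56</net></payslip>";
        System.out.println(toFo(xml));
    }
}
```

Keeping the layout in the stylesheet means the JSP only has to produce the data XML; the same data could still be rendered as on-screen HTML by a second stylesheet.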

  • RichText with HTML markup in PDF

    Hello
    I've come to the point where I need to display rich text with HTML markup in an output PDF.
    PDF is going to be printed and I don't want anything to be editable.
    I've started with xsd schema for the xdp template, where particular element looks like this:
           <xs:element name="note">
                   <xs:complexType>
                          <xs:simpleContent>
                               <xs:extension base="xs:string">
                                    <xs:attribute ref="xfa:contentType" fixed="text/html"/>
                               </xs:extension>
                          </xs:simpleContent>
                   </xs:complexType>
           </xs:element>
    in xdp form, I am using TextField with RichText option switched on, value type: Read Only
    as a test, xml data, which I am passing into it, looks like this:
    <note>
         <body xmlns="http://www.w3.org/1999/xhtml">
              <b>Note</b>
              <a href='http://somepage.com'>homepage</a>
              <img src = 'pdficon.png'/>
         </body>
    </note>
    It comes from the html snippet, which is also displayed in java web application using the wicket framework, so no problem with that in web browser.
    In general, it works, I am not getting any parsing errors or other exceptions from LC.
    Now, my questions are:
    1. The hyperlink is displayed correctly, in blue and underlined, but it is not clickable. Is that because of the Read Only option?
    2. Obviously, the image is skipped. I don't expect it to be shown in a TextField, and there is no real path to that image anyway, but I wonder: is it
    possible to display it in some other way? Or, to be more precise, is it possible to interpret <img src... correctly in PDF?
    3. I understand from another thread in this forum that it is not possible to paste an HTML snippet into a PDF directly?
    Many thanks
       Martin

    Hi Abhinav
    Which of my questions does your reply apply to? No. 3?
       Martin

  • Generate URL in PDF made with Smartforms

    Hello,
    I wanted to generate a URL in a PDF made with Smartforms. I have done everything necessary in customizing and programming, and the PDF itself works fine, but clicking the link in my PDF does not work, and while debugging I saw that the FM HR_RCF_SF_URL_CALLBACK was not generated.
    Thank you for your help.

    Hi,
    Post your thread in Form Printing forum for better response , here is the link
    Output Management
    Cheers
    bhavana

  • Convert html into tidy html to convert pdf using iText

    Hello.
    I am trying to convert an HTML document into PDF.
    First I tried iText. It works properly, but it needs all the tags to be written correctly.
    When you try HTML that is not well-formed, it throws an exception.
    So is there any way to convert such HTML to PDF,
    or, if not, a way to convert the HTML into properly tagged HTML
    so that it is easy to convert it to PDF?
    If you have a working example of Tidy.jar, please send it to me.
    Thanks.

    Hi,
    I had a similar task to do, i.e. converting HTML to PDF.
    Please follow the link to this site and download the trial code.
    http://www.pd4ml.com
    I was able to convert my HTML to PDF.
    Have a look at it and let me know.
    Regards,
    Joe
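
As a small illustration of what "properly tagged" cleanup involves: one of the most frequent reasons real-world HTML fails in an XML parser is a bare ampersand in a link or in text (for example /search?hl=de&tab=wi). The sketch below repairs only that one case with a regular expression from the JDK. It is not a replacement for a full cleaner such as Tidy, which also closes tags and fixes nesting; the class name AmpersandFixer is made up for illustration. Note also that named references it leaves alone (such as &nbsp;) may still be unknown to an XML parser that has no DTD.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: escapes '&' characters that do not start a character reference. */
final class AmpersandFixer {

    // An '&' NOT followed by a named (&amp;), decimal (&#38;) or hex (&#x26;) reference.
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String html) {
        return BARE_AMP.matcher(html).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        String html = "<a href=\"/search?hl=de&tab=wi\">A &amp; B</a>";
        System.out.println(escapeBareAmpersands(html));
    }
}
```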

  • Convert ITF-Url to HTML-Url

    Hi,
    Is it possible to convert an ITF-Url (BITMAP 'ENJOY' OBJECT GRAPHICS ID BMAP TYPE BCOL) to an HTML-Url by using the function module (FUBA) "CONVERT_ITF_TO_HTML"?
    Thanks!
    Christof

    When I read the comment for this function, it says: "convert internal SAP script format to HTML". It does not talk about URLs, but about HTML content. You can even see that there is an exporting parameter for the HTML text (or a table for the same HTML text). I cannot see anything about a bitmap.
    You can also press that "documentation" button to see a little about the parameters, what they do, and a reference onto an example program.

  • What does HTML have over PDFs when working with data?

    I'm doing some research for a client whose company is moving from PDFs to HTML for their in-house user interfaces.
    What does HTML have over PDFs when working with data?
    Thanks!
    Luke

    PDFs can indeed work with data and can be programmed with javascript. You can do some pretty interesting things with it. It is even possible to create interactive forms on the web using PDF, however it requires server-side support. As a general rule, though, PDFs are terrible as a web interface and it's far easier to work with HTML and PHP.
    HTML is lightweight and PDF isn't (in case that seems like a small thing, it's actually a big negative for PDFs). The success of your PDFs will depend on the versions of acrobat your users use, and getting data in and out of the PDFs will require learning far more about Acrobat's FDF format and XML implementation than you may want to know.

  • HTML Frame inside PDF

    I want to display HTML content inside PDF. This HTML content just contains an Image, nothing else. How can it be possible? I am not able to find any way doing this. Is it possible to render HTML content inside PDF?
    Thanks.
    Abhinav

    Thanks for reply Paul.
    Actually I need to render a Google Map (static JPG image) in an ImageField dynamically. Please have a look at the snapshot below:
    But ImageField accepts only a Base64-encoded value, so I can't use the URI of the static image. I also can't load it into temporary memory inside the PDF (using an HTTP request) and then convert it into a Base64 value. So I thought of putting an HTML frame inside the PDF.
    Now I am thinking that the only way left is to build a web service between Google Maps and the PDF which accepts the location as input and returns the Base64-encoded value. Is that the correct way? Or is there a better way to do this?
    Please help. Thanks.
    Abhinav
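
The server-side piece of the web service idea above could look roughly like the following sketch: fetch the static image and return its bytes as a Base64 string. It uses only JDK classes (java.util.Base64 and java.net.URI). The class name ImageToBase64, the method names, and any image URL passed to fetchAndEncode are assumptions for illustration; how the form then consumes the returned value is not covered.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Base64;

/** Hypothetical helper: turns image bytes into the Base64 text an image field expects. */
final class ImageToBase64 {

    /** Encodes raw image bytes as a single-line Base64 string. */
    static String encode(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    /** Downloads the image at the given URL and returns it Base64-encoded. */
    static String fetchAndEncode(String imageUrl) {
        try (InputStream in = URI.create(imageUrl).toURL().openStream();
             ByteArrayOutputStream buf = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return encode(buf.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The first three bytes of every JPEG file; prints "/9j/".
        byte[] jpegMagic = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
        System.out.println(encode(jpegMagic));
    }
}
```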

  • Is it possible to render an Adobe Interactive form as HTML instead of PDF?

    Currently a form developed as an Adobe interactive form is presented to the user as a PDF document (this is somewhat obvious!!!!)
    If, for whatever reason, we are unable to use Adobe Reader on the front-end PC, is there a way to configure ADS to render the form as HTML and not PDF? (In our case it's because of a bug that requires the installation of Adobe 8, and this will require extensive testing and rollout...)
    I think this might be possible if I read http://store1.adobe.com/devnet/livecycle/articles/forms_coldfusion.html " Form authors can develop a single form design that the Form Server Module can render in PDF or HTML format in a variety of browser environments."
    Any insight would be greatly appreciated,
    Armando.

    Hi Philip and other ADOBE expert:
    You sound like you know a lot about Adobe.
    Please tell me: is there a restriction on the size of a PDF file?
    I am reprinting invoices (RSNAST00) and then downloading these invoices into one big PDF file. When I try to open this PDF file with Adobe Reader, I get the error "damaged file and not able to repair". My program works fine when I
    reprint one invoice and download it to one PDF. But the client wants multiple invoices in
    one big PDF file. Help! Thanks in advance for your reply.
    cheers
    SilviaB
