Convert HTML into tidy HTML for conversion to PDF using iText

Hello.
I am trying to convert an HTML document into a PDF.
First I tried iText, and it works properly, but it needs all the tags to be written correctly.
When the HTML is not well formed, it throws an exception.
So is there any way to convert such HTML to PDF?
If not, is there a way to convert the HTML into properly tagged HTML,
so that it is easy to convert it to PDF?
If you have a working example using Tidy.jar, please send it to me.
Thanks.

Hi,
I had a similar task to do, i.e. converting HTML to PDF.
Please follow the link to this site and download the trial code:
http://www.pd4ml.com
I was able to convert my HTML to PDF with it.
Have a look at it and let me know.
Regards,
Joe
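
Before handing markup to a strict converter, it can help to find out whether the input is well formed in the first place. The sketch below is only an illustration using the JDK's built-in XML parser; it does not use Tidy.jar (JTidy) or iText, and the class and method names are hypothetical. It assumes the input carries no DOCTYPE declaration, since a DOCTYPE may cause the parser to try to load the referenced DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: reports whether markup parses as well-formed XML. */
final class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler keeps parse errors off stderr; fatal errors are still thrown.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Unclosed or mismatched tags end up here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Every tag closed: well formed.
        System.out.println(isWellFormed("<html><body><p>Hello</p></body></html>"));
        // <br> is never closed, so the parser rejects the document.
        System.out.println(isWellFormed("<html><body><p>Hello<br></body></html>"));
    }
}
```

A check like this only detects the problem. Repairing the markup still needs a tidying step such as the Tidy.jar approach asked about above.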

Similar Messages

  • Convert list item attachment from docx to pdf using Word Automation Services

    I have been trying to convert list item attachments from docx to PDF using Word Automation Services. It works in a normal document library, but when I use a list attachment it throws a null reference error.
    var settings = new ConversionJobSettings();
    settings.OutputFormat = Microsoft.Office.Word.Server.Conversions.SaveFormat.PDF;
    var conversion = new ConversionJob("Word Automation Services", settings);
    conversion.UserToken = SPContext.Current.Site.UserToken;
    var wordFile = SPContext.Current.Site.RootWeb.Url + "/" + wordForm.Url;
    var pdfFile = wordFile.Replace(".docx", ".pdf");
    conversion.AddFile(wordFile, pdfFile);
    conversion.Start();
    Using Reflector, I was able to see that my problem lies in Microsoft.Office.Word.Server.FolderIterator.cs, where it uses SPFile.Item, which returns null:
    internal void CheckSingleItem(SPFile inputFile, SPFile outputFile)
    Microsoft.Office.Word.Server.Log.TraceTag(0x67337931, Microsoft.Office.Word.Server.Log.Category.ObjectModel, Microsoft.Office.Word.Server.Log.Level.Verbose, "OM: FolderIterator start a single item: source='{0}'; dest='{1}'", new object[] { inputFile.Url, outputFile.Url });
    Stopwatch stopwatch = Microsoft.Office.Word.Server.Log.StartPerfStopwatch();
    try
    this.CheckInputFile(inputFile.Item);
    this.CheckOutputFile(outputFile.Url);
    Is there any way to get around this?

    Hi Qfroth,
    According to your description, my understanding is that when you use Word Automation Services to convert a Word list item attachment to PDF, it throws a null reference error.
    I suggest creating an event receiver and converting the Word file through a memory stream, like below:
    private byte[] ConvertWordToPDF(SPFile spFile, SPUserToken usrToken)
    {
        byte[] result = null;
        try
        {
            using (Stream read = spFile.OpenBinaryStream())
            using (MemoryStream write = new MemoryStream())
            {
                // Initialise Word Automation Service
                SyncConverter sc = new SyncConverter(WORD_AUTOMATION_SERVICE);
                sc.UserToken = usrToken;
                sc.Settings.UpdateFields = true;
                sc.Settings.OutputFormat = SaveFormat.PDF;
                // Convert to PDF
                ConversionItemInfo info = sc.Convert(read, write);
                if (info.Succeeded)
                {
                    result = write.ToArray();
                }
            }
        }
        catch (Exception ex)
        {
            // Do your error management here.
        }
        return result;
    }
    Here is a detailed code demo for your reference:
    Word to PDF Conversion using Word Automation Service
    Best Regards
    Zhengyu Guo
    TechNet Community Support

  • Adding text to PDF using iText instead of CFPDF

    Hi,
    I know this may seem a bit off topic being posted here, but I'm asking this board since I'm a complete Java noob and I figure some of you CF folk might have had to do this before.
    Anyway, about my question... I'm already adding a watermark image to a PDF using iText (CF8), thanks to the help of fellow poster (=cfSearching=). What I'm looking for is the best way to go about adding some text to this same PDF. I need to add 4 lines of text (with a specific font and size) and center them underneath the added image. Does anyone have a site they could point me to that shows how to add formatted text and how to get the width of that text so as to align it correctly? I've searched Google and looked at a lot of Java code, but being a Java noob it's tough to figure out exactly which libs and methods can be used to do this.
    Any help would be greatly appreciated!
    -Michael

    Hi again!
    Well, the merged image is an idea, but I'd rather have it be actual text so that it is at least copy/paste-able if viewed on a computer.
    The four lines of text are dynamic (company name, broker name, phone number, email address) and limited to 40 characters. Right now they are added via CFPDF and DDX, using the following code in the DDX file to add them to the PDF:
    <PDF result="DestinationFile">
         <PDF source="SourceFile">
              <Watermark
              rotation="0"
              opacity="100%"
              horizontalAnchor="#horzAnchor#"
              horizontalOffset="#horzOffset#"
              verticalAnchor="#vertAnchor#"
              verticalOffset="#vertOffset#"
              alternation="OddPages"
              >
                   <StyledText text-align="center">
                        <p font="#font#" color="#color#" >#left(dCompany,maxlinechars)#</p>
                        <p font="#font#" color="#color#" >#left(dName,maxlinechars)#</p>
                        <p font="#font#" color="#color#" >#left(dPhone,maxlinechars)#</p>
                        <p font="#font#" color="#color#" >#left(dEmail,maxlinechars)#</p>
                   </StyledText>
              </Watermark>
         </PDF>
    </PDF>
    Then, using the PDF created above, I use a slightly modified version of the cfscript code (which uses iText) you provided me previously to add a logo image just above this text. The only changes I made were resizing the image and specifying where to place it. Here is that code:
    <cfscript>
        fullPathToInputFile = "#tempdestfilepath#";
        writeoutput("<br>fullPathToInputFile=#fullPathToInputFile#");
        fullPathToWatermark = osFile("#request.logofilepath##qord.userlogo_file#",request.os);
        writeoutput("<br>fullPathToWatermark=#fullPathToWatermark#");
        fullPathToOutputFile = "#destfilepath#";
        writeoutput("<br>fullPathToOutputFile=#fullPathToOutputFile#");
        ppi = 72; // points per inch
        watermark_x = ceiling(#qord.pdftemplate_logo_x# * ppi); // from bottom left corner of pdf
        watermark_y = ceiling(#qord.pdftemplate_logo_y# * ppi); // from bottom left corner of pdf
        fh = ceiling(0.75 * ppi); // maximum logo height
        fw = ceiling(1.75 * ppi); // maximum logo width
        if ( not fileexists(fullPathToInputFile) ) {
            savedErrorMessage = savedErrorMessage & "<li>Input file pdf for logo add does not exist<br>#fullPathToInputFile#</li>";
        }
        else {
            try {
                // create PdfReader instance to read in source pdf
                pdfReader = createObject("java", "com.lowagie.text.pdf.PdfReader").init(fullPathToInputFile);
                totalPages = pdfReader.getNumberOfPages();
                // create PdfStamper instance to create new watermarked file
                outStream = createObject("java", "java.io.FileOutputStream").init(fullPathToOutputFile);
                pdfStamper = createObject("java", "com.lowagie.text.pdf.PdfStamper").init(pdfReader, outStream);
                // Read in the watermark image
                img = createObject("java", "com.lowagie.text.Image").getInstance(fullPathToWatermark);
                w = img.scaledWidth();
                h = img.scaledHeight();
                //$is[0] = w
                //$is[1] = h
                if ( w >= h ) {
                    orientation = 0;
                }
                else {
                    orientation = 1;
                    // note: max_h and max_w are not defined anywhere in this snippet
                    fw = max_h;
                    fh = max_w;
                }
                if ( w > fw || h > fh ) {
                    if ( ( w - fw ) >= ( h - fh ) ) {
                        iw = fw;
                        ih = ( fw / w ) * h;
                    }
                    else {
                        ih = fh;
                        iw = ( ih / h ) * w;
                    }
                    t = 1;
                }
                else {
                    iw = w;
                    ih = h;
                    t = 2;
                }
                // adding content to each page
                i = 0;
                //while (i LT totalPages) {
                    i = i + 1;
                    content = pdfStamper.getOverContent( javacast("int", i) );
                    img.setAbsolutePosition(javacast("float", watermark_x), javacast("float", watermark_y));
                    if ( t == 1 ) {
                        img.scaleAbsoluteWidth( javacast("float", iw) );
                        img.scaleAbsoluteHeight( javacast("float", ih) );
                    }
                    content.addImage(img);
                    WriteOutput("Watermarked page "& i &"<br>");
                //}
                //WriteOutput("Finished!");
            }
            catch (java.lang.Exception e) {
                savedErrorMessage = savedErrorMessage & "<li>#e#</li>";
            }
            // closing PdfStamper will generate the new PDF file
            if (IsDefined("pdfStamper")) {
                pdfStamper.close();
            }
            if (IsDefined("outStream")) {
                outStream.close();
            }
        }
    </cfscript>
    The above code resizes the image to a certain width/height if needed and adds it to the PDF.
    I just figured there might be a way to tap into one of the Java objects that would allow adding the text. Ideally, the text and image would be added to some sort of 'bounding box' that would allow centering them in relation to that box. Or, if there is no way to add to a bounding box, a way to get the horizontal length of the longest line of text, so I could calculate a common centerline for the image and text.
    I've attached the following PDF to show how the image and text would look together. This example is not to scale, but a similar image and text would be added to a separate PDF.
    Thanks for your help.
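
    The layout arithmetic asked about here (fit the logo into a maximum box, then share one centerline between the logo and the text lines) can be sketched independently of iText. The class below is hypothetical, and the line widths are made-up numbers that would have to be measured first. As far as I know, iText 2.x offers `BaseFont.getWidthPoint(String, float)` for measuring a string in points, and `ColumnText.showTextAligned` with `Element.ALIGN_CENTER` for drawing centered text, but treat that as an assumption to verify against the iText version bundled with CF8. Note that the scaling here uses the smaller of the two width/height ratios, which is a different rule from the comparison in the posted cfscript.

```java
/** Hypothetical layout helper; all lengths are in PDF points. */
final class LogoLayout {

    /** Scale factor that fits (w, h) inside (maxW, maxH) without enlarging. */
    static double fitScale(double w, double h, double maxW, double maxH) {
        return Math.min(1.0, Math.min(maxW / w, maxH / h));
    }

    /** Widest of a set of measured widths (text lines, image, ...). */
    static double widest(double... widths) {
        double max = 0;
        for (double width : widths) {
            max = Math.max(max, width);
        }
        return max;
    }

    /** Left x coordinate for an item of the given width centered on centerX. */
    static double leftEdge(double centerX, double itemWidth) {
        return centerX - itemWidth / 2.0;
    }

    public static void main(String[] args) {
        // A 200 x 100 logo that must fit a 126 x 54 box (1.75in x 0.75in at 72 ppi).
        double scale = fitScale(200, 100, 126, 54);
        double logoWidth = 200 * scale;

        // Made-up measured widths of the four text lines.
        double[] lineWidths = {90, 120, 75, 110};

        // The bounding box is as wide as its widest member.
        double boxLeft = 36;
        double boxWidth = Math.max(logoWidth, widest(lineWidths));
        double centerX = boxLeft + boxWidth / 2.0;

        System.out.println("logo x = " + leftEdge(centerX, logoWidth));
        for (double lineWidth : lineWidths) {
            System.out.println("line x = " + leftEdge(centerX, lineWidth));
        }
    }
}
```

    The resulting x values are what would be passed to `img.setAbsolutePosition` for the logo and used as the left edge of each text line.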

  • How to write special characters in PDF using iText

    How do I write special characters encoded in UTF-8 to a PDF using iText?
    Regards,
    Pandharinath.

    I don't know what your problem is but that's almost certainly the wrong question to ask about it. Java (including iText) uses only Unicode characters. (You may consider some of them to be "special" if you like but Unicode doesn't.) And when it does that, they aren't encoded in UTF-8 or any other encoding.
    So can you describe your problem? That question doesn't make sense.
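
    The distinction made in this reply can be shown with the JDK alone: UTF-8 is a property of bytes, and once the bytes are decoded the resulting `String` is plain Unicode text. The sketch below is only an illustration with hypothetical class and method names, and it does not involve iText.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: UTF-8 applies to bytes, not to Java strings. */
final class Utf8Demo {

    /** Decode UTF-8 bytes (for example, read from a file) into a String. */
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Encode a String back into UTF-8 bytes. */
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The letter e with an acute accent takes two bytes in UTF-8.
        byte[] raw = {(byte) 0xC3, (byte) 0xA9};
        String text = decodeUtf8(raw);
        System.out.println("bytes: " + raw.length + ", chars: " + text.length());
    }
}
```

    So if the text arrives as UTF-8 bytes, the decoding has to happen when the bytes are read; whatever is then handed to the PDF library is an ordinary string.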

  • Creating PDF using ITEXT API's - error

    Hi,
    In my Web Dynpro application I want to generate a PDF (using the iText API) from data retrieved from a back-end system.
    I used this source code:
    Document document = new Document(PageSize.A4);
    document.open();
    PdfPTable table = new PdfPTable(1);
    PdfPCell cell;
    cell = new PdfPCell(new Paragraph("ONE"));
    table.addCell(cell);
    cell = new PdfPCell(new Paragraph("TWO"));      
    table.addCell(cell);
    document.add(table);
    document.close();
    byte[] b = new byte[100 * 1024];
    b =  document.toString().getBytes("UTF-8");
    IWDCachedWebResource pdfRes = WDWebResource.getPublicCachedWebResource(b, WDWebResourceType.PDF, WDScopeType.CLIENTSESSION_SCOPE,      wdThis.wdGetAPI().getComponent().getDeployableObjectPart(),"FileNameHelloText"));
    I used the Window Manager to create an external window with the URL from the pdfRes.getUrl() method.
    After execution I get a pop-up window without a PDF document.
    Please let me know your thoughts and solutions to the above problem.
    Thanks
    Senthil

    Hello Folks,
    Use the following snippet of code to generate a PDF using the iText API.
    Document document = new Document(PageSize.A4);
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // Attach a writer so the document content is actually written to the stream
    PdfWriter.getInstance(document, bos);
    document.open();
    PdfPTable table = new PdfPTable(1);
    PdfPCell cell;
    cell = new PdfPCell(new Paragraph("ONE"));
    table.addCell(cell);
    cell = new PdfPCell(new Paragraph("TWO"));
    table.addCell(cell);
    document.add(table);
    document.close();
    // The PDF bytes come from the stream, not from document.toString()
    byte[] byteContent = bos.toByteArray();
    IWDCachedWebResource cachedResource =
        WDWebResource.getPublicCachedWebResource(
            byteContent,
            WDWebResourceType.PDF,
            WDScopeType.CLIENTSESSION_SCOPE,
            wdThis.wdGetAPI().getComponent().getDeployableObjectPart(),
            "TestPDF");
    IWDWindow externalWindow =
        wdComponentAPI.getWindowManager().createExternalWindow(
            cachedResource.getURL(), "PDF Window", true);
    externalWindow.open();
    Thanks and Regards,
    Gopi

  • Adding a link in PDF using itext

    I am adding a link to a PDF using iText and opening the link from the final PDF using app.launchURL and app.getURL.
    It works fine on Windows XP but does nothing on Mac OS X or iOS.

    Hi Lynn
    And here I was about to suggest you review the link below.
    Silly me.
    Click here to read the article
    Sincerely... Rick

  • Problem while generating PDF using iText

    Hi:
    I have generated a PDF using iText, where I have written all the code in a sequential flow.
    <code>
    com.lowagie.text.Document document = new com.lowagie.text.Document(PageSize.A4, 55, 5, 20, 20);
    OutputStream outputstream = response.getOutputStream();
    PdfWriter.getInstance(document,outputstream);
    </code>
    And I have added all the fields to the document.
    But my problem is how to display the total page count on all pages, e.g. 1/20 (because I have generated the PDF sequentially).
    Also, I want to add a watermark on all pages.
    So, can anybody help me solve this problem?
    Thank You,
    Balaji

    sabre150 wrote:
    Maybe http://itext-general.2136553.n4.nabble.com/
    Nice pron link in there :/

  • Converting xml file with arabic content to pdf using FOP

    Hello all
    I am trying to convert a dynamically generated XML file, in which most of the data comes from an Oracle database with Arabic content, to PDF using FOP. I have used "Windows-1256" encoding for the XML. If I open the generated XML in Internet Explorer, the Arabic content displays properly, but the PDF is not generated correctly and Acrobat Reader reports the file as corrupted or not supported. Please help me. It's very urgent.
    Thanks & Regards
    Gurpreet Singh

    There is no direct support for importing RTF from an XML extract. Perhaps feature 1514 "Mapping formatted XML data into multiline field" will be of some use. This was released in 11.0, I believe.
    Essentially you can establish paragraph and certain text formatting like bold and underline when you include the proper token information in the data. I believe this is similar to simple HTML tokens.
    Example: &lt;FIELD>&lt;P>First paragraph of data.&lt;/P>&lt;P>New paragraph with &lt;B>&lt;U>bold and underline text&lt;/U>&lt;/B>. Rest of paragraph normal.&lt;/P>&lt;/FIELD>
    The result is something like this:
    <P>First paragraph of data.</P><P>New paragraph with <B><U>bold and underline text</U></B>. Rest of paragraph normal.</P>

  • Why can I not Convert a Microsoft Office Document to a PDF using the Context Menu?

    Why can I not Convert a Microsoft Office 2013 Document to a PDF using the options found in the Context Menu? (Ex: Convert to PDF, Combine Supported Files into PDF?)
    I updated to Acrobat XI Pro recently, but now I'm unable to combine or convert Microsoft Word docs to PDF.
    In Adobe Acrobat X I had this feature, and it would combine Microsoft Office documents into a single PDF. Now I no longer have this feature in Adobe Acrobat XI Pro. It seems like it was a program named Adobe Elements that was running the conversion.

    Ajlan. That page is showing as not available. Would the fix apply to Adobe Acrobat X and XI?
    Zach Moses

  • How to insert an html  into another html

    Hi there,
    I have created a slide show with Adobe Bridge CS4:
    www.roulasorour.com/storm/index.html
    I am trying to insert it into the edit3 region of page
    www.roulasorour.com/test
    Steps I am taking to do this:
    Select all code of the slide show
    Paste it into Edit3 region
    As you can see, the slide show is not showing...
    Would appreciate any help,
    sincerely

    To bring one HTML page into another HTML page, use an iframe (inline frame).
    <iframe name="IFRAME1" id="IFRAME1" width="100%" height="500" frameborder="0" allowtransparency="true" scrolling="auto" src="folder/page.html"></iframe>
    More on iframes:
    http://www.w3schools.com/tags/tag_iframe.asp
    Nancy O.
    Alt-Web Design & Publishing
    Web | Graphics | Print | Media  Specialists 
    http://alt-web.com/
    http://twitter.com/altweb

  • I have m rows and n columns in my table. How do I convert it into m columns and n rows using SQL?

    Hi,
    I have a table which has 14 rows and 8 columns.
    I want to convert it into 14 columns and 8 rows.
    How can I do this using SQL? Please help me.

    Oracle Database Search Results: pivot

  • InvalidPDFHeaderSignature Exception while opening pdf using itext jar

    Hi,
    I am getting InvalidPDFHeader while opening pdf file using itext jar.
    How to overcome this?
    Thanks,
    Veera

    Please continue in your previous thread: Sample code to dynamically append bytes to pdf
    Mod: I'm locking up.
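
    An invalid-header error of this kind usually suggests that the data handed to the reader does not begin with the PDF marker `%PDF-`, for instance because an HTML error page or some other non-PDF content was saved under a .pdf name. That is an assumption about the cause, not something stated in the thread. A quick check with the JDK alone might look like the sketch below; the class name is hypothetical, and only the very first bytes are inspected, which is stricter than some readers are.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: checks whether data begins with the PDF marker. */
final class PdfHeaderCheck {

    private static final byte[] MARKER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    static boolean startsWithPdfMarker(byte[] data) {
        if (data == null || data.length < MARKER.length) {
            return false;
        }
        for (int i = 0; i < MARKER.length; i++) {
            if (data[i] != MARKER[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdfLike = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        byte[] htmlLike = "<html>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithPdfMarker(pdfLike));
        System.out.println(startsWithPdfMarker(htmlLike));
    }
}
```

    Running such a check on the bytes just before they are passed to iText shows whether the problem lies in the data itself or in how it is being read.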

  • Japanese text from jsp to pdf using itext

    I am using iText for PDF export in my application.
    It works well for English text, but I have pages in Japanese as well.
    When I export the Japanese pages to PDF, it shows ??? in the PDF files.

    Install Google at your machine, enter the following keywords "java", "pdf" and "api" in that input field, hit the submit button and explore the results.
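
    One common way for text to turn into literal question marks in Java is a round trip through a charset that cannot represent the characters. Whether that is what happens in the poster's JSP is an assumption; a font without Japanese glyphs is another possible cause that this sketch does not cover. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: a lossy charset round trip replaces characters with '?'. */
final class CharsetRoundTrip {

    /** Encode text with the given charset and decode it again. */
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String japanese = "\u65e5\u672c\u8a9e"; // three Japanese characters
        // ISO-8859-1 has no Japanese characters, so each one is replaced.
        System.out.println(roundTrip(japanese, StandardCharsets.ISO_8859_1));
        // UTF-8 can represent them, so the text survives unchanged.
        System.out.println(roundTrip(japanese, StandardCharsets.UTF_8).equals(japanese));
    }
}
```

    If the text already contains question marks before it reaches iText, the request or page encoding is the place to look rather than the PDF code.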

  • Reading a multilevel list from MS Word Doc and converting it into an HTML nested list using C#

    I can achieve the above for a single level list as follows:
    foreach (Paragraph item in app.Selection.Range.ListParagraphs)
    {
        item.Range.InsertBefore("<li>");
        item.Range.InsertAfter("</li>");
    }
    Using C#, how can I programmatically convert a multilevel list (like the following) in a Word doc to a nested HTML list? Note: The bullet icons are not important. Thanks..Nam
    List from Word Doc:
    A
    B
    C
    D
    E
    F
    G
    H
    I

    Hi Nam,
    >>how can we programmatically determine the start and end elements of the sub-list with elements C, D, E, F, G in the example of my original post?<<
    We can check the begin and end elements of the sub-list by the ListLevelNumber. For example, the sub-list's ListLevelNumber starts at 2 by default. Here is the code to find the begin element, for your reference:
    Sub FindBeginSubElement()
        For i = 1 To Selection.Range.ListParagraphs.Count
            If Selection.Range.ListParagraphs(i).Range.ListFormat.ListLevelNumber = 2 Then
                Debug.Print "begin sub element:" & Selection.Range.ListParagraphs(i).Range.Text
                Exit Sub
            End If
        Next i
    End Sub
    Also we can loop the selection in reverse order to find the end element for the sub-list.
    Hope it is helpful.
    Regards & Fei
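
    Once each paragraph's level is known (for example from ListLevelNumber), turning the flat sequence into nested HTML is a small depth-tracking loop. The sketch below is written in Java purely to show the algorithm; it does not touch the Word object model, the class name is hypothetical, and it assumes levels are 1-based. The sample data assumes that C through G sit at level 2, as the question quoted in the reply suggests.

```java
/** Hypothetical helper: builds nested <ul> markup from (level, text) pairs. */
final class NestedListBuilder {

    static String toHtml(int[] levels, String[] texts) {
        StringBuilder html = new StringBuilder();
        int depth = 0; // number of <ul> elements currently open
        for (int i = 0; i < levels.length; i++) {
            int level = levels[i]; // assumed to be 1 or greater
            if (level > depth) {
                // Going deeper: open lists until the target level is reached.
                while (depth < level) {
                    html.append("<ul>");
                    depth++;
                    if (depth < level) {
                        html.append("<li>"); // wrapper item for a skipped level
                    }
                }
            } else {
                html.append("</li>"); // close the previous item
                while (depth > level) {
                    html.append("</ul></li>"); // close a sub-list and its parent item
                    depth--;
                }
            }
            html.append("<li>").append(escape(texts[i]));
        }
        while (depth > 0) {
            html.append("</li></ul>");
            depth--;
        }
        return html.toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        int[] levels = {1, 1, 2, 2, 2, 2, 2, 1, 1};
        String[] texts = {"A", "B", "C", "D", "E", "F", "G", "H", "I"};
        System.out.println(toHtml(levels, texts));
    }
}
```

    The sub-list is nested inside the item that precedes it (B in the sample), which is what valid HTML requires.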

  • Converting flowing layout xdp to background artwork pdf using java

    I design a form in LiveCycle Designer and save it as XDP. Very nice, since I can see the "fixed content" as content of "draw" elements, along with my dynamic fields.
    This PDF is going to be served up through the web, and needs no editing. All the form display information comes either from
    a) the boilerplate text, or
    b) a database.
    The boilerplate text needs to be modified only before the production release of the PDF, after which it really is fixed.
    The second kind of data is really dynamic data.
    Is there a way to programmatically save the XDP as PDF, converting all the draw elements into PDF artwork, after we are satisfied with the changes to the boilerplate?
    I have access to XPAAJ; does that have any way of doing this?
    Thanks

    Things like that are normally done with the "Interactive Form" UI Element with which you can use the integrated Adobe Forms environment. For more information have a look at the Integrate Online Interactive Forms by Adobe
