HTML to RTF conversion

Hello. I am trying to convert a simple HTML string to an RTF file using HTMLEditorKit and RTFEditorKit. I have already found some topics that cover the process, and their authors warn that the output RTF will not contain images or tables. What strikes me is that I can't even get new lines and paragraphs in the RTF file. Here is the example code. The output RTF document contains just a single line of text. Thank you in advance!
import java.io.*;
import javax.swing.JEditorPane;
import javax.swing.JFrame;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.rtf.RTFEditorKit;
import javax.swing.text.html.HTMLEditorKit;
public class MainController {
     private DefaultStyledDocument htmlDoc;
     private HTMLEditorKit htmlKit;
     private RTFEditorKit rtfKit;
     public MainController() {
          htmlDoc = new HTMLDocument();
          htmlKit = new HTMLEditorKit();
          rtfKit = new RTFEditorKit();
     }
     private void convert(String strText) {
          StringReader reader = new StringReader(strText);
          // try-with-resources closes the RTF file even if the write fails
          try (FileOutputStream f = new FileOutputStream("rtfdoc.rtf")) {
               // Parse the HTML into the document, then write that document as RTF
               htmlKit.read(reader, htmlDoc, 0);
               rtfKit.write(
                    f,
                    htmlDoc,
                    0,
                    htmlDoc.getLength());
               // Show the same HTML in a pane for visual comparison
               JEditorPane pane = new JEditorPane("text/html", strText);
               JFrame frame = new JFrame();
               frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
               frame.getContentPane().add(pane);
               frame.setSize(300,300);
               frame.setVisible(true); // show() is deprecated
               Thread.sleep(5000);
          } catch (IOException ie) {
               ie.printStackTrace();
          } catch (BadLocationException ble) {
               ble.printStackTrace();
          } catch (InterruptedException e) {
               e.printStackTrace();
          }
     }
     public static void main(String args[]) {
          MainController conv = new MainController();
          String strHtml =
                    "<html><head><p class=default><span style=\"color: #000000\">Test </span><span style=\"color: #000000\"><b>line</b> </span><span style=\"color: #000000\"><i>1</i> </span></p>" +
                    "<p class=default><span style=\"color: #000000\">Test </span><span style=\"color: #000000\"><b>line</b> </span><span style=\"color: #000000\"><i>2</i> </span></p></head></html>";
          conv.convert(strHtml);
          System.exit(0);
     }
}

When you post code, please use [code] and [/code] tags as described in Formatting Help on the message entry page. It makes it much easier to read and prevents accidental markup from array indices like [i].
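
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the RTF writer seems to treat the direct children of the document root as paragraphs, and in an HTMLDocument those children are head and body rather than the individual <p> elements, so everything collapses into one RTF paragraph. A minimal sketch of a workaround is to copy the parsed text into a plain DefaultStyledDocument, whose root children are the paragraphs, and write that instead. The class name HtmlToRtfSketch and the method htmlToRtf are made up for illustration, and only <b> and <i> are carried over; colors, fonts, images and tables are ignored.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.swing.text.AttributeSet;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class, not part of the original post.
public class HtmlToRtfSketch {

    static String htmlToRtf(String html) throws Exception {
        // Parse the HTML string into an HTMLDocument.
        HTMLEditorKit htmlKit = new HTMLEditorKit();
        HTMLDocument htmlDoc = (HTMLDocument) htmlKit.createDefaultDocument();
        htmlKit.read(new StringReader(html), htmlDoc, 0);

        // Copy the text runs into a flat styled document. Newlines in the
        // copied text make DefaultStyledDocument start new paragraph elements.
        DefaultStyledDocument flat = new DefaultStyledDocument();
        ElementIterator it = new ElementIterator(htmlDoc);
        for (Element el = it.next(); el != null; el = it.next()) {
            AttributeSet src = el.getAttributes();
            // Only real text runs; skip structural and hidden elements.
            if (!el.isLeaf()
                    || src.getAttribute(StyleConstants.NameAttribute) != HTML.Tag.CONTENT) {
                continue;
            }
            int start = el.getStartOffset();
            int end = Math.min(el.getEndOffset(), htmlDoc.getLength());
            if (end <= start) {
                continue;
            }
            String text = htmlDoc.getText(start, end - start);
            // Drop whitespace-only runs before the first real text.
            if (flat.getLength() == 0 && text.trim().isEmpty()) {
                continue;
            }
            // Assumption: <b> and <i> show up as HTML.Tag keys on the run.
            SimpleAttributeSet attrs = new SimpleAttributeSet();
            StyleConstants.setBold(attrs, src.getAttribute(HTML.Tag.B) != null);
            StyleConstants.setItalic(attrs, src.getAttribute(HTML.Tag.I) != null);
            flat.insertString(flat.getLength(), text, attrs);
        }

        // Write the flat document as RTF into memory.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new RTFEditorKit().write(out, flat, 0, flat.getLength());
        return out.toString("ISO-8859-1");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(htmlToRtf(
            "<html><body><p>Test <b>line</b> 1</p><p>Test line 2</p></body></html>"));
    }
}
```

If the assumption holds, the printed RTF should contain a separate \par for each HTML paragraph. To write a file instead, pass a FileOutputStream to RTFEditorKit.write as in the original code.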

Similar Messages

  • HTML to RTF Convertor

    I've got a number of HTML formatted datafields that users have entered using the APEX text editors (FCK).
    I would like to integrate those fields with a report, but the report is not interpreting the HTML very well. As such, I'd actually like to convert the HTML to RTF.
    I see that a number of commercial DLLs exist for doing this within end applications.
    I'd prefer to actually do this conversion at the database side - has anyone done anything similar? Or have any suggestions for an approach?
    Thanks,
    Scott

    I'm working with Crystal Reports - the report must be customer quality. And the issue surrounds the fact that Crystal only supports limited HTML tags (see http://technicalsupport.businessobjects.com/KanisaSupportSite/search.do?cmd=displayKC&docType=kc&externalId=c2014842&sliceId=&dialogID=9876280&stateId=1%200%209874388)
    But its RTF support is a lot better, hence the desire to convert the HTML to RTF.

  • HTML to WML conversion code needed

    Hi all,
    We need Java code for HTML to WML conversion. Can somebody please help? The JTidy source code has errors.
    iwapgrp

    http://www.google.co.uk/search?q=html+2+wml&start=0&ie=utf-8&oe=utf-8

  • Acrobat 9 HTML to PDF conversion sets all checkboxes to checked?

    When I convert an HTML file that contains checkboxes to PDF using Acrobat 9 Standard or Pro (fully updated) on Windows XP SP3, all of the checkboxes end up checked in the resulting PDF.  I've looked in settings menus but can't find anything that seems to be a relevant option to prevent this from happening.  I've attached a simple test case .html file to this post that you can use to repeat the problem.
    To convert the file, I right-click on the file in Windows Explorer, and click Convert to Adobe PDF.  I've tried "printing" the document to the Adobe PDF printer, but that introduces other issues and is not really an acceptable solution.
    Has anyone encountered this before, and/or have ideas how to fix it?

    Input elements have no such inheritance on the checked attribute.  Furthermore, the input elements in my test case are not grouped together.  They are each encapsulated within separate list item elements, and so no inheritance should take place after the first input element.
    Just for grins, I changed the order of the elements (moved the checked one below the unchecked one), but that did not make any difference in the Acrobat 9 HTML to PDF conversion.
    I did test this with Acrobat 8 Standard, and the HTML to PDF conversion preserved the correct checked status of the input elements.  It looks to me like this is a bug that was introduced in Acrobat 9.

  • HTML to PDF conversion - problems with page-breaks and bookmarks

    Hello,
    My company is currently considering updating your software (from Acrobat 9 Pro to Acrobat XI Pro) and I’ve been assigned to research its features and make sure that it is a right fit for our goals. Basically we want to automate the whole process as much as possible and we want to create PDF directly from HTML. We’re providing a lot of content in HTML and we need a fast way to transfer it into PDF format. There are however some guidelines:
    1. We want page breaks in this kind of document, so your app needs to be able to interpret the HTML and put them where we want them.
    2. We need bookmarks. The converter must be able to create them from the headlines in the HTML source, or afterwards, directly in the PDF, using some auto-bookmark feature.
    3. A table of contents has to be generated, based on HTML link tags if possible. Here's a sample of the TOC structure that we currently have:
    <A NAME="redirect">sample_text</A>
    <A HREF="#redirect">sample_text</A>
    Of course we can modify HTML in any way you want us to. The important thing for us is to make it work in PDF without the need to make a lot of manual changes after conversion.
    I’ve been messing with Acrobat 9 Pro and reading some documentation that you have provided and I’m convinced that point 3 is not a problem. I’m aware that Acrobat 9 Pro is not having any difficulties with links in document and they work fine in PDF format that has been created from HTML.
    Page-breaks on the other hand are bothering me. Your app is apparently ignoring every HTML code that the Internet is advising me to use to force page-break where I want. Honestly - I’ve tested like ten ways to make them and not even one was working. That’s why I’m asking for your help.
    Another problematic subject for me is bookmark creation. I know that bookmarks are not a problem if I'm doing DOC to PDF conversion. Then I'm able to decide which header should be used for a certain level of bookmarks, and everything works great in the end. However, with direct HTML to PDF conversion, I really don't know how to generate bookmarks that are based on the source of the input document. Is there any way to make a fully working two-level bookmark tree in this case? Here's an example of the structure we want in the end:
    header1
        header2
        header2
    header1
    header1
        header2
    Could you please help me in finding the solutions? Just like I’ve mentioned - we can modify input HTML in any way, but in the end we would like to achieve our goals as quickly as possible.
    Please excuse my English.
    I am looking forward to your response,
    Lucas

    Frankly, we would like to avoid using Word. We are using it currently, but there are long-term plans to improve the whole conversion process, eliminate any intermediate steps and automate as much as possible, even though the conversion is not going to be done unattended on a server. Thank you for your response, but I hope that maybe someone else has an idea?

  • How do I save a file as an unformatted txt file instead of html or rtf?

    How do I save a file as an unformatted txt file instead of html or rtf?

    Use menu Image>Image Size. In the Image Size dialog, uncheck Resample, enter 300 in the resolution field and click OK. Note that no pixels are changed; only the resolution setting is changed. Then use menu File>Save As, and in the Save As dialog use the file type pull-down to select Tiff, then click Save.
    In the Tiff Options dialog, set the Image Compression section to None, then click OK.

  • Acrobat 9 HTML to PDF conversion sets all checkboxes to checked - Duplicate question

    I am converting html files to pdf using Acrobat 9 pro. All of the check boxes and radio buttons come out checked.
    This is exactly the same as the following thread:
    Acrobat 9 HTML to PDF conversion sets all checkboxes to checked?
    That thread is old, but not answered. Has there been any update to this?

    Hi Don,
    Have you tried updating to v 9.5.5 and checking again?
    Which OS and browser are you using?
    Have you checked the behavior with a new sample form on the browser?
    Regards,
    Rave

  • Problem reading html and rtf emails

    When I send emails from my pc to my iPhone 5 in html or rtf format they are unreadable as all of the coding instructions are also included in the text when it appears on screen. This was never a problem with my iPhone 4 so I am not sure what has changed. I have a business contact who has had similar problems in the recent past with my emails so I know it is not just me.
    I have tried sending HTML emails from other PCs in the office to my phone and they are all readable, so perhaps it is something in the setup of my PC that is causing this issue. As my office is changing over to iPhone 5s, does anybody have a solution to what will become a very annoying problem?
    Obviously I could send all of my emails in plain text, but that doesn't really work for what I need to send (logos, graphics, etc.).

    Hi
    The best way to organize data and images to get them next to each other is to use a table (no borders) in your RTF template. Create a two celled table and drop the image into one and the text next to it.
    Regards
    Tim

  • HTML to PDF Conversion in Linux env

    Dear all,
    Do you have any idea how to convert HTML to PDF using Java in a Linux environment?
    Thanks
    SS

    HTML to PDF with Java, using OpenOffice.org - example here: http://www.dancrintea.ro/html-to-pdf/
    You can run OpenOffice.org as a server and command it remotely for document conversion.
    Besides HTML to PDF, other conversions are also possible:
    doc --> pdf, html, txt, rtf
    xls --> pdf, html, csv
    ppt --> pdf, swf
    Code example:
    import officetools.OfficeFile; // this is my tools package
    FileInputStream fis = new FileInputStream(new File("c:/test.html"));
    FileOutputStream fos = new FileOutputStream(new File("c:/test.pdf"));
    // suppose OpenOffice.org runs on localhost, port 8100
    OfficeFile f = new OfficeFile(fis,"localhost","8100", true);
    f.convert(fos,"pdf");

  • HTML to PDF Conversion

    I have a requirement to convert a document from HTML to PDF. I need to do this in the background programmatically (in C++ preferably) and without any UI prompts to the user. Does the Acrobat/PDF SDK provide any such APIs?

    Do you want to do this on the desktop or server?
    Acrobat (and thus the Acrobat SDK) can NOT be used on a server, so you'd need to look at our LiveCycle/ADEP products.
    For the desktop, you can write a C++-based plugin for Acrobat to do this conversion.

  • HTML to RTF

    Hi all
    I renamed an HTML file to .rtf, and if I open it in Word I see what I want to see.
    If I open it in WordPad I see the HTML code. That's bad.
    Any useful ideas, with XSL or something similar?
    Regards

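    You can do it as follows:
    import java.io.FileOutputStream;
    import java.io.FileReader;
    import java.io.OutputStream;
    import java.io.Reader;
    import javax.swing.text.Document;
    import javax.swing.text.html.HTMLEditorKit;
    import javax.swing.text.rtf.RTFEditorKit;
    HTMLEditorKit htmlEditorKit = new HTMLEditorKit();
    RTFEditorKit rtfEditorKit = new RTFEditorKit();
    // Read the HTML file into a document
    Reader in = new FileReader("theHTMLFile.html");
    Document doc = htmlEditorKit.createDefaultDocument();
    htmlEditorKit.read(in, doc, 0);
    in.close();
    // Write the same document out as RTF
    OutputStream out = new FileOutputStream("theRTFFile.rtf");
    rtfEditorKit.write(out, doc, 0, doc.getLength());
    out.close();
    That will probably only work with simple text, without any complex formatting or graphics.
    Yannick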

  • HTML to Tiff conversion

    Hi,
    I want to perform HTML to TIFF conversion.
    The HTML file is on my system and I want to convert it into a TIFF file using my Java code. The HTML file contains some formatted text and 3-4 images.
    I have a GUI tool that takes the HTML file path, takes a snapshot of the HTML and converts it into TIFF. But I want to do this from my Java program.
    Does Java provide an API for doing this, or does any other vendor provide this as a JAR, so that I can include the JAR and call its API for the conversion?
    Thanks,
    Manish

    Do you know of a method in the xdk that takes a well formed HTML doc and using xsd / xslt convert back to original xml spec?
    Because you created (and as long as you create) the HTML from XML it will be well formed (every tag will be ended with an end-tag) and you can therefore transform it back into XML.
    Most times it will not be possible to convert HTML found on the 'internet' into XML because this HTML is not well formed. For example, many people forget to end a paragraph of text within HTML with the </p> tag.
    We are evaluating using xslt to convert the XML to a form based medium for content maintenance. Wondering if once a XML document is parsed to HTML (DOM) can it be parsed back to XML for subsequent update to stored value in blob column. Specifically interested in conversion (parser) from HTML to XML
    Simply can HTML (in DOM format validated against a xsd) be transformed back to XML ?
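
    The well-formedness condition described above can be checked before attempting any HTML to XML transformation. The following is only a sketch using the JDK's standard XML parser; the class name WellFormedCheck and the method isWellFormed are made up for illustration, and the check says nothing about validity against an XSD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original thread.
public class WellFormedCheck {

    // Returns true if the markup parses as XML, i.e. every tag is closed
    // and properly nested; returns false if the parser rejects it.
    static boolean isWellFormed(String markup) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Every <p> is closed, so this can be treated as XML.
        System.out.println(isWellFormed(
            "<html><body><p>First</p><p>Second</p></body></html>")); // true
        // The <p> tags are never closed, as in the example above.
        System.out.println(isWellFormed(
            "<html><body><p>First<p>Second</body></html>")); // false
    }
}
```

    HTML that fails this check would first need to be cleaned up into well-formed markup before an XSLT transformation back to XML could be applied.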

  • Printing XLS, HTML, or RTF using Apache FOP

    Hi,
    Is it possible to print XLS, HTML, or RTF reports using Apache FOP? If not, is there another open source or free print server that will do this?
    Thank you.
    Martin

    Hello,
    >>
    You're going to want to study the Cocoon sitemap concept to see how to pipeline the XML and XSL to the right output.
    >>
    To get this working correctly, you need to understand the Cocoon sitemap concept and how to read data out of the XML file you're posting in order to pipeline it to the right rendering format. It's pretty much a hands-on affair, so it depends on what you're trying to do. Sorry, there is no easier way to do it.
    In a future version it will be easier to do this, as you will be able to select specific end points per report.
    Regards,
    Carl
    blog : http://carlback.blogspot.com/
    apex examples : http://apex.oracle.com/pls/otn/f?p=11933:5

  • Open HTML and RTF in JEditorPane !!

    Hello,
    Is it possible to use JEditorPane to open a HTML file and then save it as RTF?
    And open a RTF file and save it as HTML ???
    Eric

    I tried. When the HTML or RTF contains a TABLE, neither conversion recognizes it. Is there a problem?
    I mean, if the HTML contains a TABLE and I write it to RTF, the table format is omitted. Conversely, if the RTF contains a table and I write the RTF to HTML, the table format is also omitted.
    Eric

  • Framemaker to RTF conversion script

    Hello,
    I have a requirement for a script that does FM to RTF conversion.
    Can someone provide any references or samples for this.
    Thanks & Regards,
    Shail

    Shail,
    The most robust solution – if FrameMaker’s own RTF export is not sufficient – is Mif2Go from Omni System. Check out
    http://www.omsys.com
    - Michael
