Converting HTML document into Document

I want to convert the text of an HTML page into a document.
How can I do that? Any ideas, please?
Edited by: ping.sumit on Jul 14, 2008 10:28 PM

If you mean a Word document, then simply right-click -> Open with -> Microsoft Word.
If you mean a Java Document, then parse it as an XML file, because the HTML DOM is basically XML syntax (this works only if the page is well-formed, i.e. XHTML).
I don't know about scripts, though; they will probably end up as the content of the <script> tags.
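
A minimal sketch of the "parse it as XML" suggestion, assuming the page is well-formed XHTML and has no DOCTYPE that the parser would try to resolve. The class name XhtmlToDom and the sample markup are made up for illustration; only the standard JAXP classes are used.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of any library.
public class XhtmlToDom {

    /** Parses well-formed XHTML text into an org.w3c.dom.Document. */
    static Document parse(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Ordinary "tag soup" HTML ends up here because it is not XML.
            throw new IllegalArgumentException("Could not parse input as XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample page (an assumption for the demo): well-formed, no DOCTYPE.
        String page = "<html><head><script>var a = 1;</script></head>"
                + "<body><p>Hello</p></body></html>";
        Document doc = parse(page);
        // Text of the first paragraph.
        System.out.println(doc.getElementsByTagName("p").item(0).getTextContent());
        // The script source is just the text content of the <script> element.
        System.out.println(doc.getElementsByTagName("script").item(0).getTextContent());
    }
}
```

Real-world pages are often not well-formed, so they usually need a lenient HTML parser or a cleanup pass before this approach works.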

Similar Messages

  • Converting html file into zip file and send email attaching zip file

    Hi Experts,
    I am trying to send an email with an HTML attachment that is larger than 7 MB, so it throws a "Size exceeded" error.
    I need to compress the data to less than 7 MB, so I decided to convert the HTML file into a ZIP file.
    Kindly suggest how to convert the HTML file into a ZIP file and send the email with the ZIP file attached.
    Correct answers will be rewarded.
    Thanks & Regards,
    N. HARISH KUMAR

    Hi Experts,
    *// HTML_TAB converting into ZIP File
       DATA  : zip_tool TYPE REF TO cl_abap_zip,
               filename TYPE string ,
               filename_zip TYPE string .
       DATA  : t_data_tab TYPE TABLE OF x255,
               bin_size TYPE i,
               buffer_x TYPE xstring,
               buffer_zip TYPE xstring.
       filename = text-007.    "'HTML_TAB
    * describe the attachment (bin_size assumes every line holds the full 255 bytes)
       DESCRIBE TABLE html_tab LINES tab_lines.
       bin_size = tab_lines * 255.
       CALL FUNCTION 'SCMS_BINARY_TO_XSTRING'
         EXPORTING
           input_length = bin_size
         IMPORTING
           buffer       = buffer_x
         TABLES
           binary_tab   = html_tab.
       IF sy-subrc <> 0.
    *     message id sy-msgid type sy-msgty number sy-msgno
    *     with sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
       ENDIF.
    *create zip tool
       CREATE OBJECT zip_tool.
    *add binary file
       CALL METHOD zip_tool->add
         EXPORTING
           name    = 'FSSAI_MAIL.HTML'
           content = buffer_x.
    *get binary ZIP file
       CALL METHOD zip_tool->save
         RECEIVING
           zip = buffer_zip.
       CLEAR: t_data_tab[],bin_size.
       CALL FUNCTION 'SCMS_XSTRING_TO_BINARY'
         EXPORTING
           buffer        = buffer_zip
         IMPORTING
           output_length = bin_size
         TABLES
           binary_tab    = html_tab.
    Thanks & Regards,
    N. HARISH KUMAR
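
For readers outside ABAP, the same compress-before-attaching idea can be sketched in Java with the standard java.util.zip classes. This is only an illustration under assumptions: the class HtmlZipper, its method names, and the sample content are invented here, and sending the mail is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class for the demo.
public class HtmlZipper {

    /** Packs one file into an in-memory ZIP archive and returns the archive bytes. */
    static byte[] zip(String entryName, byte[] content) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buffer)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(content);
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The archive is complete only after the ZipOutputStream has been closed.
        return buffer.toByteArray();
    }

    /** Reads back the first entry of an archive; used here only to check the round trip. */
    static byte[] unzipFirst(byte[] archive) {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            if (zis.getNextEntry() == null) {
                throw new IllegalArgumentException("Archive has no entries");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = zis.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Repetitive sample HTML (an assumption for the demo).
        StringBuilder html = new StringBuilder("<html><body>");
        for (int i = 0; i < 1000; i++) {
            html.append("<tr><td>row</td><td>value</td></tr>");
        }
        html.append("</body></html>");
        byte[] raw = html.toString().getBytes(StandardCharsets.UTF_8);
        byte[] zipped = zip("FSSAI_MAIL.HTML", raw);
        System.out.println("raw bytes: " + raw.length + ", zipped bytes: " + zipped.length);
    }
}
```

How much the archive shrinks depends on the content; markup with many repeated tags tends to compress well.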

  • How can convert HTML file into xml file?

    Hi,
    I am receiving an HTML file as input and I want to convert it into an .xml file. Is there any converter (tool) to do this? If so, please give me the details.
    Regards,
    mahesh.

    Use the HTMLEditorKit to parse the HTML file.
    This kit has callback methods that
    are called whenever a tag appears in the HTML
    stream.
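
A rough sketch of the callback approach, using the Swing classes HTMLEditorKit.ParserCallback and ParserDelegator. The class name HtmlToXmlSketch and the choice to emit only tag names and text (no attributes) are assumptions made to keep the example short; the output is not guaranteed to be well-formed XML for every input.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class.
public class HtmlToXmlSketch {

    /** Escapes the characters that are special in XML text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Re-emits the parser events as XML-style markup (tag names and text only). */
    static String toXml(String html) {
        final StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                out.append('<').append(tag).append('>');
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                out.append("</").append(tag).append('>');
            }

            @Override
            public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                // Empty elements such as <br> are reported here; write them self-closed.
                out.append('<').append(tag).append("/>");
            }

            @Override
            public void handleText(char[] data, int pos) {
                out.append(escape(new String(data)));
            }
        };
        try {
            // The third argument tells the parser to ignore charset declarations.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unclosed <p> and <br>: typical HTML that an XML parser would reject.
        System.out.println(toXml("<p>Hello<br>world"));
    }
}
```

The parser may report tags it does not recognise through handleSimpleTag as well, so a real converter would need extra handling for those and for attributes.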

  • Convert html file into image

    hello,
    I want to create a template for a business card:
    one writes name, tel, fax, and uploads the logo. I want to convert this template,
    including texts and logo, into an image for printing.
    How do I convert this template into an image?
    Regards

    SVG.

  • Converting HTML page into powerpoint file

    Hello Java Gurus,
    I have a Java web application which generates reports in various formats (.rtf, .pdf, .xls and .html files) using JasperReports, but now the client needs the PowerPoint (.ppt) format, which is not supported by JasperReports. As JasperReports already generates the .html format, I thought it would be great if the .html page could be converted to .ppt. I need guidance in this regard; can anyone help in this direction?
    Thanks,
    NS.

    Anybody, please?? I need the solution urgently.
    Thanks in advance,
    NS.

  • How to convert html file into excel file in java

    Hi EveryBody,
    My problem is to read some columns of a table from an HTML page and store them in an Excel file.
    My Id is [email protected]
    Bye .

    cross post

  • How to convert html file into excel file in java (Need Code)

    Hi EveryBody,
    My problem is to read some columns of a table from an HTML page and store them in an Excel file.
    My Id is [email protected]
    Bye .

    Cross post

  • Convert  html to word document

    I want to convert HTML to a Word document.
    I tried poi-3.0.2-FINAL (Apache POI - HWPF - Java API to Handle Microsoft Word Files),
    but it is not working.

    My actual goal is to convert an HTML file into a Word document.
    I posted in the forum and some people suggested HWPF.
    I tried the programs one by one without getting any result; for example, this one:
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.usermodel.CharacterRun;
    import org.apache.poi.hwpf.usermodel.Paragraph;
    import org.apache.poi.hwpf.usermodel.Range;
    import org.apache.poi.hwpf.usermodel.Section;

    // Class name chosen for illustration only.
    public class ReadDoc {
        public static void main(String[] args) {
            try {
                HWPFDocument doc = new HWPFDocument(new FileInputStream("c:\\temp.doc"));
                Range r = doc.getRange();
                System.out.println("Example you supplied:");
                System.out.println("---------------------");
                for (int x = 0; x < r.numSections(); x++) {
                    Section s = r.getSection(x);
                    for (int y = 0; y < s.numParagraphs(); y++) {
                        Paragraph p = s.getParagraph(y);
                        for (int z = 0; z < p.numCharacterRuns(); z++) {
                            // character run
                            CharacterRun run = p.getCharacterRun(z);
                            // character run text
                            String text = run.text();
                            // show us the text
                            System.out.print(text);
                        }
                        // use a new line at the paragraph break
                        System.out.println();
                    }
                }
            } catch (NullPointerException exception) {
                exception.printStackTrace();
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    java.io.IOException: Invalid header signature; read 5789751444030890300, expected -2226271756974174256
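
The "expected" number in this exception is the 8-byte OLE2 file signature (D0 CF 11 E0 A1 B1 1A E1) read as a little-endian long, so the message says that the file does not start like a binary .doc file. A small sketch for checking this up front follows; the class Ole2Check and its method names are invented for illustration, and what the poster's c:\temp.doc actually contained is not known.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of Apache POI.
public class Ole2Check {

    /** First eight bytes of every OLE2 compound file, which includes binary .doc files. */
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if the given bytes begin with the OLE2 signature. */
    static boolean looksLikeOle2(byte[] fileStart) {
        return fileStart.length >= 8
                && Arrays.equals(Arrays.copyOf(fileStart, 8), OLE2_MAGIC);
    }

    /** Interprets the first eight bytes as a little-endian long, as in the message above. */
    static long signatureOf(byte[] fileStart) {
        return ByteBuffer.wrap(fileStart, 0, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        // The signature value matches the "expected" number in the exception.
        System.out.println(signatureOf(OLE2_MAGIC));
        // Plain HTML text saved under a .doc name would fail the check.
        byte[] html = "<html><body>text</body></html>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeOle2(html));
    }
}
```

In practice one would read the first eight bytes of the file and run the check before handing the stream to HWPFDocument.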

  • Merge option during assembly of PDF from html documents.

    Hello,
    Can LiveCycle create a combined PDF document by converting
    HTML documents to PDF with the option of merging them (eliminate whitespace) during ddx assembly.
    Here is a simple case. Combine three html documents such as
    html-1: contains text ONE
    html-2: contains text TWO
    html-3: contains text THREE
    The default assembled document appears to have three pages, each containing its single word of text. However, in some cases a combined document with one page containing the merged text is desired.
    Does LiveCycle handle this case? Thanks for any insight.
    Jesse

    Assembler will only deal with PDFs. PDF/G will take non-PDF content and make a PDF out of it. So in your case you would use PDF/G to change the HTML to PDF, then use Assembler to manipulate the three docs into a single doc.
    Hope that helps

  • Convert html into tidy html to convert pdf using iText

    hello.
    I am trying to convert an HTML document into PDF.
    First I tried iText, and it works properly, but it needs all the tags to be written correctly.
    When the HTML is not well formed, it throws an exception.
    So is there any other way to convert HTML to PDF?
    If not, is there a way to convert the HTML into properly tagged HTML,
    so that it is easy to convert it to PDF?
    If you have a working example of Tidy.jar, please send it to me.
    Thanks..

    Hi,
    I had a similar task to do, i.e. converting HTML to PDF.
    Please follow the link to this site and download the trial code.
    http://www.pd4ml.com
    I was able to convert my HTML to PDF.
    Have a look at it and let me know.
    Regards,
    Joe

  • Convert HTML codes to RTF

    Hi,
    In a Java Servlet, I need to convert HTML codes into an RTF/word document.
    Any help for some related Java API ?
    Regards,
    Priya Ranjan Sahay
    Message was edited by:
    Priya Ranjan Sahay

    Checkout iText:
    http://www.lowagie.com/iText/
    Example code:
    http://www.java-tips.org/other-api-tips/itext/manipulating-pdf,-rtf,-or-html-documents-with-java.html

  • Convert "html to xml"

    Does anyone know if HomeSite or Dreamweaver has the ability
    to convert HTML documents to XML documents? This would save a lot
    compared with buying another program that does this.
    PG

    does htmltidy do the conversion you need?
    homesite integrates with it, though htmltidy itself is open
    source.

  • Converting HTML to XML

    Does anyone know if there is a
    plug-in for HomeSite to convert HTML documents to XML
    documents? This would save a lot compared with buying another program
    that does this. Adobe just finished an online tutorial on
    how to use AJAX, which made it very easy. Now I need to switch my
    HTML docs to XML.
    PG

    In my previous post I have added a singleton tag img with the closing of it.
    Normally the tags which are singleton tags in HTML can be closed by using /> at the end of the tag.
    Like following:
    <img src="abcd.gif" />
    <input type="button" />
    However, I agree with your point about Quotes. In HTML it is not mandatory to put quotes around a value.
    But if it is possible then you  can make it a habit to put quotes around all the HTML attribute values.
    [This is also a standard practice]
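
    To illustrate why the closing slash and the quotes matter once the markup has to be XML, here is a small sketch that asks the standard JAXP parser whether a fragment is well-formed. The class name WellFormedCheck and the sample fragments are assumptions made for the demo.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class.
public class WellFormedCheck {

    /** Returns true if the markup parses as well-formed XML. */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Closed singleton tag with a quoted attribute value.
        System.out.println(isWellFormed("<p><img src=\"abcd.gif\" /></p>"));
        // HTML-style singleton tag that is never closed.
        System.out.println(isWellFormed("<p><img src=\"abcd.gif\"></p>"));
        // Attribute value without quotes.
        System.out.println(isWellFormed("<p><img src=abcd.gif /></p>"));
    }
}
```

    Only the first fragment is accepted by an XML parser; the other two are fine as HTML but fail as XML.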

  • Convert HTML to PDF or AFP

    As part of the project we have to convert html documents to PDF or AFP. We tried with different tools like HTMLDOC and we are not able to get the perfect matching tool. Any help on finding the best tool for conversion of HTML to PDF or AFP will be appreciated.
    My basic requirement is
    1) The conversion process needs to be automated
    2) the tool has to run on Linux.
    3) Everything in the page (text, image etc) should be extracted in a single file
    Background
    A batch job runs every evening on the Q&R cache servers. The job iterates through a list of 1500 symbols and does an HTTP GET of the Stock Summary page for each ticker in the list. The next step is to launch HTMLDOC or another tool to convert the page to PDF, AFP, or another format.
    Regards,
    Jags.

    I'm not sure it'll help you, but take a look at
    http://xml.apache.org/fop/index.html
    maybe you can go this way
    XHTML->XML->FOP->PDF
    ???

  • Converting html file to csv file or excel file

    Can anybody help me convert an HTML file into a CSV file or an Excel file using Java?
    I have no idea how to proceed.
    Are there any third-party APIs available?
    Please guide me.
    Thanks in advance
    Vivek

    dev_vivek wrote:
    "I have no idea how to proceed."
    That could be due to the fact that there is no generic way to transform an HTML file into a CSV or Excel file.
    Could it be that you're really interested in extracting specific values from the html file and then you want to export these values to csv and/or excel?
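
    If the goal is indeed to pull table values out of the page, one possible sketch uses the Swing HTML parser to collect cell text and write it out as CSV. The class HtmlTableToCsv, the sample table, and the decision to flatten every table on the page into one CSV are assumptions; nested tables and colspan are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class.
public class HtmlTableToCsv {

    /** Quotes a field if it contains a comma, a quote, or a line break. */
    static String csvField(String value) {
        if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    /** Collects the text of every td/th cell and emits one CSV line per table row. */
    static String toCsv(String html) {
        final StringBuilder csv = new StringBuilder();
        final List<String> row = new ArrayList<>();
        final StringBuilder cell = new StringBuilder();
        final boolean[] inCell = {false};

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TD || tag == HTML.Tag.TH) {
                    inCell[0] = true;
                    cell.setLength(0);
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (inCell[0]) {
                    cell.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if ((tag == HTML.Tag.TD || tag == HTML.Tag.TH) && inCell[0]) {
                    row.add(csvField(cell.toString().trim()));
                    inCell[0] = false;
                } else if (tag == HTML.Tag.TR && !row.isEmpty()) {
                    csv.append(String.join(",", row)).append('\n');
                    row.clear();
                }
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        // Sample table (made up for the demo).
        String html = "<table><tr><th>Symbol</th><th>Price</th></tr>"
                + "<tr><td>ABC</td><td>1,5</td></tr></table>";
        System.out.print(toCsv(html));
    }
}
```

    Most spreadsheet programs, including Excel, can open the resulting CSV; producing a native .xls file would need a separate library.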
