Connecting CF to OpenOffice (convert html to doc)

I have a simple problem that appears to have a difficult solution.
I need to convert an HTML file to a DOC file (an actual Word document, not just a file with the extension changed).
In the past I would use cfobject to connect to Word, open the file and then save it as a Word file.  This required Word to be installed on the server and got a little fuzzy when running in 64-bit, so the solution needs to work without Word installed.
I've seen Apache's POI and some talk about using OpenOffice.  Using OpenOffice would be the preferred solution as it will already be installed on the server.
Does anyone know an easy way (free or cheap) to convert an HTML file to a DOC file?
Even if you only know how to use ColdFusion to connect to OpenOffice, I can start there.
Thanks for any help!!

Hey Adam, no need to apologize at all.  I was really hoping you had a solution with that link!!
Please don't feel like you have to spend too much time on it, I know I've spent tons of hours on this over the years.
Solution that DID work:
1. Created a Word document and saved it as an HTML file (these files have all the Microsoft tags in them, allowing me to set how the file will open in Word as well as enable Track Changes.  It's not just a simple HTML file)
2. Use CF to replace text and create new HTML files
3. Start Word as a COM object using CF
4. Open the Word file using CF
5. Save as a DOC file using CF (It must be a REAL doc file, not just a file with a doc extension)
6. Close Word
This solution worked until I had to move the site to a 64-bit server; after the move, Word would not open.  I also understand that Word isn't really made to be run on a server, and I don't want to install it and rely on it anymore, so I'm moving past this as a solution.
Solution I want to work:
1. Create the HTML files like I have before
2. Use OpenOffice or POI to convert to an actual Word file
I've had problems with this.  OpenOffice doesn't open the HTML files in Writer.  It opens them in Writer/Web, which doesn't have the ability to save as a DOC, only as other HTML files.  I tried to use POI but I can't seem to get it to simply open a file and save it.  It would be way too difficult to create the document on the fly because I have too much formatting, and I also need Track Changes enabled when the file is opened.
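One avenue that might get around the Writer/Web problem is a headless command-line conversion that names the import filter explicitly. The sketch below only builds and prints such a command. It assumes a LibreOffice-style `soffice` binary that accepts `--headless`, `--infilter`, `--convert-to` and `--outdir`; the OpenOffice build on a given server may not accept these options. The file names are hypothetical placeholders, and whether Track Changes settings survive the conversion has not been verified here.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class HtmlToDocCommand {

    // Builds the argument list for a headless HTML -> DOC conversion.
    // Assumption: a LibreOffice-style soffice binary; paths are placeholders.
    static List<String> build(String sofficePath, File html, File outDir) {
        List<String> cmd = new ArrayList<String>();
        cmd.add(sofficePath);
        cmd.add("--headless");
        // Request the Writer HTML import filter instead of Writer/Web.
        cmd.add("--infilter=HTML (StarWriter)");
        cmd.add("--convert-to");
        // Target extension and export filter for the binary Word format.
        cmd.add("doc:MS Word 97");
        cmd.add("--outdir");
        cmd.add(outDir.getPath());
        cmd.add(html.getPath());
        return cmd;
    }

    // Runs the command, passing its output through to this process's console,
    // and returns the exit code.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = build("soffice", new File("letter.html"), new File("out"));
        System.out.println(String.join(" ", cmd));
        // Uncomment on a machine where soffice is installed:
        // System.out.println("exit code: " + run(cmd));
    }
}
```

From ColdFusion the same argument list could presumably be passed through cfexecute, or the class could be loaded as a Java object. Afterwards it is worth confirming that the output is a real binary DOC and not renamed HTML.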
Anyway, that's where I am. I think a company called Aspose has a product called "Aspose.Words for Java" which may work, BUT it costs more than the project's budget will allow and it's a yearly cost, not just a one-time up-front cost.
Thanks for taking a look. I may need to break up my overall process: create the files, have the user download a zip file of the 100 HTML files, have them manually open and save them as doc files, and then upload a new zip file.  I could then continue the processes my files go through.
Thanks!

Similar Messages

  • How to convert Ms Word (.doc) file to Protected pdf

    Hi all,
    Could anybody out there help me convert an MS Word (.doc) file to a protected PDF file using Java? Maybe there are some JAR files I need to download, or tools you have used before? Thanks in advance... =)

    Hi All,
    Thanks for your replies. I think I have almost found the solution. I found 2 options to do this:
    1. Get Adobe Acrobat and its SDK (has to be bought)
    2. Get OpenOffice 2.0.4 and its SDK (open source)
    So I went with option 2. I installed them on my system and called them from my IDE, following the code from this link:
    http://weblogs.java.net/blog/tchangu/archive/2005/12/open_office_jav_1.html
    Thanx.. =)

  • How do I convert a word doc to pdf?

    I cannot seem to convert a word doc to a pdf- it keeps taking me back to the subscription page and I already have signed up for that. HELP!

    Hello Saundra,
    I'm sorry to hear you're having trouble.  Could you post a screenshot of what you see when you try to log in here: http://createpdf.acrobat.com/signin.html ?
    -David

  • Unable to convert Microsoft Word doc. to PDF in Words (there is no response)

    Unable to convert a Microsoft Word doc to PDF from within Word (it does not respond) or via Create PDF from a Word doc in Adobe Acrobat X Standard 10.1.1 with all updates installed. I receive a pop-up saying "Missing PDF Maker Files: Do you want to run the installer in Repair Mode".  I have done this several times. I have uninstalled and reinstalled the program twice. It still does not work. I'm running Windows 7 Home and Microsoft Office XP 2002. This is a brand new Acrobat program right out of the box. Suggestions please.

    In WORD 2002, I believe you can only print to the Adobe PDF printer. I think that WORD 2003 is the first compatible with AA X. Check out http://kb2.adobe.com/cps/333/333504.html.

  • Problem with converting html to pdf using LiveCycle ES Java API

    I am using this code to convert HTML to PDF, with the following JAR files on the classpath:
    * 1. adobe-generatepdf-client.jar
    * 2. adobe-livecycle-client.jar
    * 3. adobe-usermanager-client.jar
    * 4. adobe-utilities.jar
    * 5. wlclient.jar
    import java.io.File;
    import java.util.Properties;
    import com.adobe.idp.Document;
    import com.adobe.idp.dsc.clientsdk.ServiceClientFactory;
    import com.adobe.idp.dsc.clientsdk.ServiceClientFactoryProperties;
    import com.adobe.livecycle.generatepdf.client.GeneratePdfServiceClient;
    import com.adobe.livecycle.generatepdf.client.HtmlToPdfResult;

    public class ConvertHTML {
        public static void main(String[] args) {
            try {
                // Set connection properties required to invoke LiveCycle ES
                Properties connectionProps = new Properties();
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_DEFAULT_EJB_ENDPOINT, "t3://localhost:7001");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_TRANSPORT_PROTOCOL, ServiceClientFactoryProperties.DSC_EJB_PROTOCOL);
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_SERVER_TYPE, "WebLogic");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_USERNAME, "administrator");
                connectionProps.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_PASSWORD, "password");

                // Create a ServiceClientFactory instance
                ServiceClientFactory factory = ServiceClientFactory.createInstance(connectionProps);

                // Create a GeneratePdfServiceClient object
                GeneratePdfServiceClient pdfGenClient = new GeneratePdfServiceClient(factory);

                // Get an HTML document to convert to a PDF document
                String inputFileName = "http://www.adobe.com";
                //String inputFileName = "C:\\Documents and Settings\\venkat\\Desktop\\Adobe.htm";
                String securitySettings = "No Security";
                String fileTypeSettings = "Standard";
                System.out.println("one");

                // Convert HTML content to a PDF document
                HtmlToPdfResult result = pdfGenClient.htmlToPDF2(inputFileName, fileTypeSettings, securitySettings, null, null);
                System.out.println("two");

                // Get the newly created document
                Document createdDocument = result.getCreatedDocument();

                // Save the PDF document as a PDF file
                createdDocument.copyToFile(new File("C:\\test.pdf"));
            } catch (Exception e) {
                System.out.println("Error OCCURRED: " + e.getMessage());
                e.printStackTrace();
            }
        }
    }
    I am able to compile this class, but when I run it I get the error below.
    Error OCCURRED: Internal error.
    ALC-DSC-000-000: com.adobe.idp.dsc.DSCRuntimeException: Internal error.
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.doSend(EjbMessageDispatcher.java:160)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:57)
            at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:208)
            at com.adobe.livecycle.generatepdf.client.GeneratePdfServiceClient.htmlToPDF2(GeneratePdfServiceClient.java:666)
            at ConvertHTML.main(ConvertHTML.java:84)
    Caused by: java.rmi.RemoteException: Remote EJBObject lookup failed for 'ejb/Invocation'; nested exception is:
            org.omg.CORBA.COMM_FAILURE:   vmcid: SUN  minor code: 203  completed: No
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initialise(EjbMessageDispatcher.java:101)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.doSend(EjbMessageDispatcher.java:130)
            ... 4 more
    Caused by: org.omg.CORBA.COMM_FAILURE:   vmcid: SUN  minor code: 203  completed: No
            at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(Unknown Source)
            at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(Unknown Source)
            at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.writeLock(Unknown Source)
            at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendFragment(Unknown Source)
            at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendMessage(Unknown Source)
            at com.sun.corba.se.impl.encoding.CDROutputObject.finishSendingMessage(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaMessageMediatorImpl.finishSendingRequest(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete1(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.invoke(Unknown Source)
            at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.is_a(Unknown Source)
            at org.omg.CORBA.portable.ObjectImpl._is_a(Unknown Source)
            at weblogic.corba.j2ee.naming.Utils.narrowContext(Utils.java:126)
            at weblogic.corba.j2ee.naming.InitialContextFactoryImpl.getInitialContext(InitialContextFactoryImpl.java:94)
            at weblogic.corba.j2ee.naming.InitialContextFactoryImpl.getInitialContext(InitialContextFactoryImpl.java:31)
            at weblogic.jndi.WLInitialContextFactory.getInitialContext(WLInitialContextFactory.java:41)
            at javax.naming.spi.NamingManager.getInitialContext(Unknown Source)
            at javax.naming.InitialContext.getDefaultInitCtx(Unknown Source)
            at javax.naming.InitialContext.init(Unknown Source)
            at javax.naming.InitialContext.<init>(Unknown Source)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initJndiContext(EjbMessageDispatcher.java:213)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.getJndiContext(EjbMessageDispatcher.java:226)
            at com.adobe.idp.dsc.provider.impl.ejb.EjbMessageDispatcher.initialise(EjbMessageDispatcher.java:87)
            ... 5 more
    Can you please suggest a way to do the conversion?

    Yes Sir, thanks for your suggestion.
    But I didn't find an exact solution. I found some, but they did not work the way I required. I just need to convert HTML to PDF using the iText API for Java. I have already used some of its classes, like HTMLParser.
    Anything else? Can anyone help me with this?

  • Convert  html to word document

    I am trying to convert HTML to a Word document.
    I tried poi-3.0.2-FINAL (Apache POI - HWPF - Java API to Handle Microsoft Word Files),
    but it is not working.

    My actual goal is to convert an HTML file into a Word document.
    I posted in the forum and some people suggested taking a look at HWPF.
    I tried the sample programs one by one without getting anywhere. For example, this program:
    try {
        HWPFDocument doc = new HWPFDocument(new FileInputStream("c:\\temp.doc"));
        Range r = doc.getRange();
        System.out.println("Example you supplied:");
        System.out.println("---------------------");
        for (int x = 0; x < r.numSections(); x++) {
            Section s = r.getSection(x);
            for (int y = 0; y < s.numParagraphs(); y++) {
                Paragraph p = s.getParagraph(y);
                for (int z = 0; z < p.numCharacterRuns(); z++) {
                    // character run
                    CharacterRun run = p.getCharacterRun(z);
                    // character run text
                    String text = run.text();
                    // show us the text
                    System.out.print(text);
                }
                // use a new line at the paragraph break
                System.out.println();
            }
        }
    } catch (NullPointerException exception) {
        exception.printStackTrace();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    java.io.IOException: Invalid header signature; read 5789751444030890300, expected -2226271756974174256
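A note on that exception: the "expected" value -2226271756974174256 is the OLE2 compound-document signature (bytes D0 CF 11 E0 A1 B1 1A E1) read as a little-endian long, so the error generally means the input is not a binary Word file at all. The low bytes of the value actually read here are 0x3C 0x21, i.e. "<!", which would be consistent with an HTML file saved under a .doc name. The sketch below checks a file's first eight bytes before handing it to HWPF. The class and method names are made up for illustration and are not part of POI.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class Ole2SignatureCheck {

    // OLE2 magic bytes D0 CF 11 E0 A1 B1 1A E1 as a little-endian long.
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;

    // True if the first 8 bytes match the OLE2 signature.
    static boolean looksLikeOle2(byte[] header) {
        if (header.length < 8) {
            return false;
        }
        long first8 = ByteBuffer.wrap(header, 0, 8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getLong();
        return first8 == OLE2_SIGNATURE;
    }

    // Reads the first 8 bytes of the file and checks them.
    static boolean looksLikeOle2(String path) throws IOException {
        byte[] header = new byte[8];
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            in.readFully(header);
        } catch (EOFException tooShort) {
            return false; // fewer than 8 bytes cannot be an OLE2 file
        }
        return looksLikeOle2(header);
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "c:\\temp.doc";
        System.out.println(path + " looks like OLE2: " + looksLikeOle2(path));
    }
}
```

If the check returns false for a ".doc" file, the file needs a real conversion step first; HWPF reads and writes the binary format but does not import HTML.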

  • Err msg converting pdf to doc/docx

    "An error occured while trying to access the service" whilst trying several times to convert .pdf into .doc & .docx ((have also had to type this in twice because I did not have a "screen name" - I have one now))
    Maybe I am being dim, but I thought this was a request to a "help desk" or similar - seems to have gone out as a "discussion" - at the risk of coming across as grumpy, I don't want a discussion - I am trying to run a business - I need help, preferably asap from Adobe...

    Sorry, I missed one of your points. Yes, this absolutely is a discussion, a forum or whatever. You'll hear from fellow users, not Adobe staff.
    https://www.acrobat.com/exportpdf/en/faq.html discusses your options for help with ExportPDF.

  • PDF to HTML or .doc conversion

    Hi
    Is there any Java API for converting PDF files to HTML, .doc or XML format?
    It seems PDF files don't maintain any structure inside, so it's difficult to parse or convert them to other file formats.
    Comments, suggestions are welcome
    -Ven

    Do you know about GObcl.com?
    They have a free web service to convert a PDF file to HTML or doc, or a doc file to PDF.
    Just mail your *.doc file to [email protected] and you will get a reply in the form of a zip file containing doc files.
    Maybe you can use this in your app.

  • Content in Jsp to be converted to Word Doc

    I have a .jsp page with some generated content. On that page there is an option "Convert to Word DOC PAGE". When the link is clicked, the content in the JSP page has to be converted to a Word doc. How can I do this?

    <%@ page language="java" %>
    <%@ page import="java.util.*" %>
    <%@ page import = "java.io.*" %>
    <HTML>
    <HEAD>
    <script language="JavaScript">
    var fso = new ActiveXObject('Scripting.FileSystemObject');
    var wdApp = new ActiveXObject("Word.Application");
    // Returns the file's contents, but only for the one expected path.
    function readFromFile(fileName) {
        if (fileName == "C:\\Award_Ltr.TXT") {
            var fs = fso.OpenTextFile(fileName);
            var result = fs.ReadAll();
            return result;
        }
    }
    // Prompts the user, then shows Word's File Open dialog.
    function readFromWord() {
        alert("PLEASE SAVE THE FILE AS C:\\PPY Letter for Annuities to Retirees and Alternate Payees wi_temp.doc");
        var pause = 0;
        var wdDialogFileOpen = 80;
        var wdApp = new ActiveXObject("Word.Application");
        var dialog = wdApp.Dialogs(wdDialogFileOpen);
        var button = dialog.Show(pause);
    }
    </SCRIPT>
    </HEAD>
    <BODY>
    <FORM NAME="formName">
    <INPUT TYPE="file" NAME="fileName">
    <INPUT TYPE="button" VALUE="show"
    ONCLICK="this.form.fileContent.value = readFromFile(this.form.fileName.value)">
    <BR>
    <TEXTAREA NAME="fileContent" ROWS="20" COLS="90" WRAP="off"></TEXTAREA>
    <BR>
    <INPUT TYPE="button" VALUE="SaveExtract" >
    <BR>
    <INPUT TYPE="button" VALUE="Modify Template" onClick = "readFromWord()">
    </FORM>
    </BODY>
    </HTML>

  • Converting HTML to XML

    Does anyone know if there is a plug-in for HomeSite to convert HTML documents to XML
    documents? This would save a lot compared with buying another program that does this.
    Adobe just finished an online tutorial on how to use AJAX which made it very easy,
    so now I need to switch my HTML docs to XML.
    PG

    In my previous post I added a singleton img tag together with its closing.
    Normally, tags that are singletons in HTML can be closed by putting /> at the end of the tag,
    like the following:
    <img src="abcd.gif" />
    <input type="button" />
    However, I agree with your point about Quotes. In HTML it is not mandatory to put quotes around a value.
    But if it is possible then you  can make it a habit to put quotes around all the HTML attribute values.
    [This is also a standard practice]
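As a rough illustration of the "/>" convention described above, the sketch below self-closes a handful of singleton tags with a regular expression. It is a simplification that works under stated assumptions, not a real HTML-to-XML converter: the tag list is partial, it does not add missing attribute quotes, and it can be fooled by a ">" inside an attribute value or an unquoted value ending in "/". The class name is hypothetical.

```java
import java.util.regex.Pattern;

class VoidTagCloser {

    // Matches a few singleton tags, with or without an existing trailing slash.
    private static final Pattern VOID_TAG = Pattern.compile(
            "<(img|br|hr|input|meta|link)\\b([^>]*?)\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    // Rewrites each matched tag so that it ends in " />".
    static String selfClose(String html) {
        return VOID_TAG.matcher(html).replaceAll("<$1$2 />");
    }

    public static void main(String[] args) {
        System.out.println(selfClose("<p>Logo: <img src=\"abcd.gif\"><br></p>"));
    }
}
```

For real documents a proper HTML parser or a tidy-style tool is the safer route, since regular expressions cannot handle every markup edge case.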

  • Converting docx to doc files using wordconv.exe

    Hello,
    I have a requirement wherein I need to convert docx files to doc files. I looked around and found the "Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats" ("http://www.microsoft.com/downloads/details.aspx?FamilyId=941B3470-3AE9-4AEE-8F43-C6BB74CD1466&displaylang=en"). Installing this places a bunch of files in the C:\Program Files\Microsoft Office\Office12\ directory. This folder also contains an executable named "Wordconv.exe" which, according to http://www.oooninja.com/2008/02/office-compatibility-pack-review.html, converts docx files to doc files if you execute something like the following at the command prompt:
    "C:\Program Files\Microsoft Office\Office12\wordconv.exe" -oice -nme <input file> <output file>
    I downloaded the compatibility pack and tried the above command. Nothing happens: no error message, no output, nothing. I wonder what the problem is. In the above link, some people have suggested installing the latest Windows updates.
    I tried this on my Windows XP machine (with Service Pack 2), and the only thing left to install is Service Pack 3. Is that required? This Windows XP machine does not have Office 2007 installed. Is it required?
    I also tried this on a Windows Server 2003 machine which has the compatibility pack, and the result is the same. This machine does have Office 2007 installed.
    Am I missing anything? If yes, please let me know, as I am kind of stuck on this. I don't want to use a commercial product like "Aspose.Words" for this.
    Is there any other tool available from Microsoft to convert docx to doc files? Please let me know.
    I am currently looking into the Office tools.
    Thanks in Advance,
    Ashish

    Hello!
    I've got the same problem. I've tried to google it, and found this topic.
    Have you found the solution?
    Thanks, Victor.

  • Need to convert PDF to doc in Russian; why does the program not recognize it?

    Need to convert PDF to doc in Russian; why does the program not recognize it?

    Hi alsu22,
    The OCR system that converts documents does not recognize any cyrillic languages, such as Russian.  If the text is already renderable (selectable), you may want to try converting your document without the OCR function enabled. You can find steps here: http://forums.adobe.com/docs/DOC-3062
    -David

  • How to convert html to pdf using acrobat sdk 8.0?

    hi
    I am a beginner with the Acrobat SDK.
    I want to know how to use Acrobat SDK 8.0 to convert HTML to PDF.
    Here are some questions:
    1: How can navigation be supported inside a PDF file generated using Acrobat SDK 8.0? For example: there's a catalog at the top of the HTML file, and the customer hopes to navigate inside the PDF file just like navigating inside the HTML file.
    2: How can controls be operated in a PDF file generated using Acrobat SDK 8.0? For example: there are some drop-down lists and text boxes in the HTML file, and the customer hopes to input text in the text box and click the drop-down list to see the available options, just like in the HTML file.
    Thanks in advance for any help and suggestion.

    Hello,
    I want a system to re-brand my 37 pages PDF for affiliates.
    I want a php dynamic link in the PDF online in order to personalize automatically the PDF for each affiliate. I need to change 2 links each time. The affiliate ID and the Paypal email (payment button) in page 36.
    Can you help?
    Please let me know
    Thank you
    Alex
    PS My system is online and i can give you the url if it helps.

  • When I convert my pdf doc to word, the fonts go really weird and it also puts some text into boxes. when I try to select the text and change the font, it does not change it properly?

    When I convert my pdf doc to word, the fonts go really weird and it also puts some text into boxes. when I try to select the text and change the font, it does not change it properly? This is making it impossible to amend.

    Hi Janedance1,
    If the PDF that you converted already has searchable text, please try disabling OCR as described in this document: How to disable Optical Character Recognition (OCR) when converting PDF to Word or Excel. (If the PDF was created from a scanned document and doesn't already have searchable text, disabling OCR isn't a great option, as the text won't be searchable/editable in the converted Word doc.)
    Please let us know how it goes.
    Best,
    Sara

  • A tool can convert HTML to Excel

    Hi All, are you using report 6i and want to output a report in Excel format? If you are, free software which can convert HTML to Excel is available.
    The software is designed to print very large reports. A wonderful function has now been added to the software, through which you can convert HTML to Excel easily. The function is still basic, but it will do better in the future.
    For more information, Please visit
    http://repbrowser.freewebpage.org/
    Thank you ,
    Regards

    Hi,
    the only other ways (as far as I know), if you really want to convert, are:
    a) write a parser to convert html into csv(xls)
    b) use a html2csv script on the os level
    like:
    http://sebsauvage.net/python/html2csv.py (or just google html2csv)
    c) use excel (data source web; local file: "file:///C:/test.htm")
    Kind Regards,
    Dirk
