Export pdf to html/txt/xml

Hi,
I downloaded Adobe Acrobat X Pro to try the "Save As"/export functionality to XML/HTML/text etc., and the result was exactly what I was looking for in terms of output, keeping formatting etc.
However, I am building an application which needs an embedded library to do PDF to HTML/TXT/XML conversion on the fly while keeping formatting.
I have tried a number of libraries for PDF to HTML/TXT/XML conversion and none of them deliver anything near what Adobe Acrobat X Pro does in terms of keeping format/tables etc.
So, my question is: how can I get access to the "Save As"/export functionality of Adobe Acrobat X Pro through any official Adobe library, SDK, service, product etc.? I assume Acrobat X Pro itself does not expose an API for the conversion functionality and may not be used server-side.
Best regards,
Rick

It sounds like you want to use Acrobat as a web service. Before pursuing this route, note that such use of Acrobat is not permitted under the license, so it may not be worth pursuing. Also, why convert to HTML on a regular basis at all? On occasion I can understand the need.
For programmable features you should probably check in the SDK forum.

Similar Messages

  • Export pdf as html by Acrobat 9.2 on windows 7

    Dears,
    I have a problem with exporting PDF as HTML with Acrobat 9.2 on Windows 7.
    After exporting, images and text may have wrong positions, or the images may have the wrong width and height.
    Is the problem a compatibility issue between Acrobat 9.2 and Windows 7?
    What can I do?
    Thank you in advance....
    amt

    Thank you, but I thought I could get a 90% faithful representation of the PDF,
    and that did not happen.
    I also saw examples of some tools which can do that, but for PDF version 1.5, and I think that was on an older Windows too.

  • Issue with Exporting Pdf as HTML

    Hi all,
    I am trying to convert a PDF to HTML using "Save as Other -> HTML (Web Page)" in the File menu. After conversion, some of the characters set in the Symbol font are not converted properly (e.g. the text 8.7 appears as "" instead of numerals). Any thoughts on this issue? Thanks in advance.
    Regards
    Srini

    I can't see this working very well. Even if it exported a correct reference to the symbol font, people who view the converted page probably won't have the font and won't see the symbols. You can perhaps replace them with graphics.

  • How to create PDF/RTF/HTML from XML?

    Hi,
    I would like to generate reports in PDF/RTF/HTML format as per the client's requirements. I have generated an XML file, and I would like to generate the other report formats from it. I tried FOP, but it offers nothing for RTF. I tried aurigadoc, but it has a problem: it is not able to maintain colspan. So please guide me to solve this issue. Thanks.

    Hi,
    Thanks for the suggestion, but I have already implemented FOP for PDF. I cannot generate RTF with it because that facility is not there. That is why I want a single solution which will solve both of my problems. So please let me know the solution. Thanks.

  • How to export PDF to HTML with JPEG image format (not PNG)?

    Hello,
    When I export a ".pdf" file to ".html", using Acrobat 11 Pro, the program creates a subdirectory with ".png" image files.
    I need these images to be in the ".jpg" format, not ".png".
    Do any of you know how to change this setting? I am assuming that it is not a permanent default...
    Thank you,
    brivera0

    Alas, I checked on my Acrobat XI before posting. That setting was removed.

  • Exporting a PDF to HTML without OCR

    Hi All,
    I am using Adobe Acrobat X to export PDFs to HTML files. It looks like the HTML conversion runs an OCR process on the document before the HTML page is written. This is resulting in a lot of images not showing properly because the OCR process strips out the text and puts it in the body of the HTML rather than recognizing that it should be part of the image. I had used Acrobat 9 to convert to HTML in the past and this was not an issue.
    Is there any possible way to disable the OCR portion of the HTML conversion in Acrobat X?
    Thanks,
    Teri

    Hi Teri,
    Edit > Preferences > Converting from PDF > HTML.  Click 'Edit Settings..' and uncheck 'Run OCR if needed'.
    -David

  • Export PDF Form to XML through VBScript

    Hi,
    I was wondering if there is a way to automate the export of an Adobe PDF Form to XML, either using the Adobe SDK/AcroExch.App or the PDF Test Toolkit.
    I want to perform this action via a QuickTest Pro script and thought there might be a function already written for it, rather than having to automate the selection of Document>>Forms>>Export Data... from the Adobe Reader menu and the Export dialog.
    Any suggestions would be much appreciated.
    Thanks in advance,
    Ross

    Hi,
    Are you trying to edit the Adobe Test Toolkit? If yes, I think it is not easy to edit the Adobe Test Toolkit. If you want to export a PDF file to a Microsoft XML file and Adobe is not working for you, then I recommend trying Classic PDF Editor, which can easily export PDF files to Microsoft XML files. Classic PDF Editor also does many other types of file conversion, such as PDF to Doc, XML, PPT etc.
    If Classic PDF Editor solves your problem, please come back and share your views about this software.
    Thanks

  • Pdf to html 4.01 with css 1.0 using c#

    Hi all ,
    I'm working on a C# console application that is used to export pdf to html ,
    Is there a method in Acrobat SDK ,  to export a pdf document to html 4.01 with css 1.0 , or even save it as html 4.0 with css ?
    Thank you.

    Thank you lrosenth, the problem is solved now.

  • Clean pdf to html conversion

    I am searching for a clean conversion from pdf to html or xml. I know that there are many solutions, which follow the way of keeping the positions and layouts of several contents. But I am searching for a conversion tool, which converts into clean html without css or into xml. Existing products create a mess of thousand div-tags and span-tags, but you cannot differentiate between a header and a table.
    Example (what I need):
    <h1>This is a header 1</h1>
    <p>This is text</p>
    <img src="....." />
    <h2> This is header 2 </h2>
    <table>.....This is a table ....... </table>
    Existing solutions:
    <div style="position:.....">Text</div>
    <div style="position:.....">Text</div>
    <div style="position:.....">Text</div>
    <div style="position:.....">Text</div>
    Is there any product, which can do that? (batch conversion on servers (e.g. JAVA))

    Given that CSS is part of HTML - I don't see why that would be an issue.
    Since this is Adobe's forum, we offer a Java-focused server side solution called LiveCycle ES.
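
To illustrate the difference the question describes, here is a minimal sketch of the kind of post-processing that turns layout-style output into semantic tags. It is not how Acrobat, LiveCycle ES, or any named product works. The input shape (`text`, `fontSize`), the function names, and the size thresholds are all hypothetical, and real documents would also need table and image detection, which this does not attempt.

```javascript
// Escape characters that are special in HTML text content.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical heuristic: runs much larger than the body font become headings,
// everything else becomes a paragraph. The thresholds 1.8 and 1.3 are arbitrary.
function toSemanticHtml(runs, bodySize) {
  return runs
    .map(function (run) {
      let tag = "p";
      if (run.fontSize >= bodySize * 1.8) tag = "h1";
      else if (run.fontSize >= bodySize * 1.3) tag = "h2";
      return "<" + tag + ">" + escapeHtml(run.text) + "</" + tag + ">";
    })
    .join("\n");
}

// Assumed input: one entry per text run, as a layout-preserving converter might report it.
const sampleRuns = [
  { text: "This is a header 1", fontSize: 24 },
  { text: "This is text", fontSize: 12 },
  { text: "This is header 2", fontSize: 16 },
];
const semanticHtml = toSemanticHtml(sampleRuns, 12);
console.log(semanticHtml);
```

Font size alone is a weak signal, so a real converter would likely combine it with font weight, spacing, and the PDF's tag structure when one is present.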

  • Exporting form field data to xml

    Hello,
    I want to export pdf form data to xml file.
    I don't want to use adobe Livecycle Designer.
    I want to save the data in xml file where pdf file is present.
    I have created a button labelled Export.
    When the user clicks the Export button, the data has to be exported to an XML file.
    How can I do this using JavaScript? Is there any sample code available?
    Please help me.
    Thanks in advance,

    Thanks for your help.
    I want to save the form data to an XFDF file on click of a button.
    I have written the code below in a .js file.
    Export = app.trustPropagatorFunction(function(oDoc, sPath, bXDP, sPackets) {
        // Raise privilege only around the restricted call.
        app.beginPriv();
        oDoc.exportXFAData(sPath, bXDP);
        app.endPriv();
    });
    TrustedExportXFAData = app.trustedFunction(function(oDoc, sPath, bXDP, sPackets) {
        var bSuccess = false;
        app.beginPriv();
        try {
            Export(oDoc, sPath, bXDP, sPackets);
            bSuccess = true;
        } catch (e) {
            app.alert(sPath + " NOT exported!\n" + e.message + "\n" + e.name);
        }
        app.endPriv();
        return bSuccess;
    });
    On the button's mouse up action I am calling the trusted function as
    TrustedExportXFAData(this, "/c/testing", "True", "*");
    I am getting the error
    /c/testing NOT exported
    Security settings prevent access to this property or method.
    NotAllowedError.
    Please help me to solve this issue.
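
One possible cause, not confirmed anywhere in this thread: the Acrobat JavaScript API reference requires the `cPath` passed to `exportXFAData` to be a "safe path" whose extension matches the data being written (`.xdp` when `bXDP` is true, `.xml` otherwise), and `/c/testing` has no extension. The sketch below only checks that rule, so it can be tried outside Acrobat. The helper name `checkXfaExportPath` is hypothetical, and other safe-path tests (such as system-critical folders) are not covered.

```javascript
// Check a device-independent path against the extension rule documented for
// Doc.exportXFAData: ".xdp" when bXDP is true, ".xml" when it is false.
// This is an illustrative helper, not part of the Acrobat API.
function checkXfaExportPath(cPath, bXDP) {
  const required = bXDP ? ".xdp" : ".xml";
  const ok = cPath.toLowerCase().endsWith(required);
  return { ok: ok, required: required };
}

// The path from the post fails the check; adding the extension passes it.
console.log(checkXfaExportPath("/c/testing", true));
console.log(checkXfaExportPath("/c/testing.xdp", true));
console.log(checkXfaExportPath("/c/testing.xml", false));
```

If this is the cause, calling the trusted function with a path such as "/c/testing.xdp" (or "/c/testing.xml" with `bXDP` set to false) would be the thing to try. It is also worth checking that the .js file sits in one of Acrobat's folder-level JavaScript directories, since `app.trustedFunction` is only available there.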

  • How to import pdf with hyperlink and and maintain its hyperlink while exporting it as HTML?

    We can import a PDF into InDesign either with the PlaceMultipleFile script or with the Place command in the File menu, but both import the PDF as a graphic object and lose all of its hyperlinks. When I export the document as PDF again, the hyperlinks are still there, but when I export it as HTML, IDML, or another format, all hyperlinks are lost. Is there any way to maintain them? If not, is there any way to write a new script utility that will do the same? One solution I can think of is to convert the PDF to DOCX first and then import the DOCX: InDesign imports it as a text frame and maintains all hyperlinks, but the PDF loses some of its formatting when it is converted to DOCX.

    I probably should've remembered "rtfm" before writing all that.  It seems to me now that I need to create a Master Page for each of the different layouts, then dynamically select the one I want to use at run-time, maybe with a script event ?  The Master Page creation is a bit messy ... the only way I seem to be able to get multiple Master Pages is to open each Word doc, let Livecycle import it, then copy all the content into a new Master Page in my main document.  That is, there doesn't seem to be a way to import a series of Word docs directly into separate Master pages of a single Livecycle form.  I'll continue with the "read" part of "rtfm" ...

  • Exporting PDF text to html

    How can I export PDF text and post the exported text on a web page, to which I can then apply Google Translate?  Our organization posts PDF articles from our journal.  (I can manually select and copy the text, so I know the text can be captured.)  I want a program/app/software to run on our website that will allow a user to extract the text from the PDF and display the text as HTML.  From there, the user can apply Google Translate.  So does anyone know how I can do this?  It doesn't seem like a difficult task -- I can do it manually -- but I want an app that will do it automatically.

    Thanks for the reply.  Do you have any idea how I could do what I want to do, perhaps with some other software?
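
A rough sketch of the display half of this task, assuming the text has already been pulled out of the PDF by some extraction tool (the thread names none, and none is assumed here). It wraps the extracted text in paragraph tags that a browser translation tool can work on. The function names and the sample string are hypothetical.

```javascript
// Escape characters that are special in HTML text content.
function escapeText(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Turn extracted plain text into HTML paragraphs.
// Blank lines are treated as paragraph breaks; single line breaks inside a
// paragraph (common in text copied from a PDF) are collapsed to spaces.
function textToHtml(text) {
  return text
    .split(/\r?\n\s*\r?\n/)
    .map(function (p) { return p.replace(/\s+/g, " ").trim(); })
    .filter(function (p) { return p.length > 0; })
    .map(function (p) { return "<p>" + escapeText(p) + "</p>"; })
    .join("\n");
}

// Hypothetical extracted text standing in for one journal article.
const extractedText =
  "First line of the article\ncontinues here.\n\nSecond paragraph with a < sign.";
const articleHtml = textToHtml(extractedText);
console.log(articleHtml);
```

Escaping matters here because extracted article text can contain characters such as `<` or `&` that would otherwise be read as markup.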

  • Exporting pdf to xml

    We want to export PDF data into XML format (the same way as the "Export Data" functionality in Adobe Reader). We have a huge set of PDF files to export (approx. 20K). We learned that we can achieve this using Acrobat 9 Professional. Please suggest whether this is the right approach. Is it possible to achieve the same functionality through a batch program? Your quick response will be appreciated, since we require this information ASAP for a data migration activity.

    Hi Bernd,
    Thanks for your reply. Can you please elaborate on how I can set up the batch sequence, what JavaScript code is required for this purpose, and what prerequisite software is required?
    I read somewhere that with Acrobat 9 Pro's out-of-the-box functionality we can convert PDF form data into XML. We currently have about 20K PDF files that need to be converted to XML. This is just the beginning, and we expect more data in PDF files to be converted to XML in future. We want to evaluate whether we can use Acrobat 9 Pro for this purpose, and how we can set this up as a batch sequence so that no manual intervention is required.
    Thanks
    Ajith Jacob

  • Convert xml file into .pdf and .html

    Hi all,
    Can anyone let me know how I can render the values of an XML file as PDF and HTML using Java?
    Thanks in advance

    sreeks27 wrote:
    Hi all,
    Can anyone let me know how I can render the values of an XML file as PDF and HTML using Java?
    Thanks in advance

    Take a look at Apache FOP:
    http://xmlgraphics.apache.org/fop/

  • Converting PDF Files to Html or Xml

    How can I transform a PDF file to HTML or XML using Acrobat's API? The software already has the function (http://tv.adobe.com/watch/learn-acrobat-x/converting-pdf-files-to-other-file-formats/). In C#, I can use Acrobat's DLL to open the PDF file and invoke the SaveAs menu item,
    like this:
        AcroApp.Show();
        AcroAVDoc.Open(@"D:\xpdf\a.pdf", "aaaa");
        AcroApp.MenuItemExecute("SaveAs");
        AcroApp.CloseAllDocs();
        AcroApp.Exit();
    But this is not automatic.

    Try the forum for Acrobat SDK.
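
As a possible starting point for that SDK discussion: Acrobat's JavaScript `Doc.saveAs` method takes a `cPath` and an optional conversion ID `cConvID`, which avoids the Save As dialog. The sketch below only builds those arguments, so it can be run outside Acrobat. The helper name is hypothetical, and the conversion IDs are the ones recalled from older Acrobat JavaScript API references, so they are an assumption for Acrobat X. Inspecting `app.fromPDFConverters` in the Acrobat JavaScript console should show what a given installation supports.

```javascript
// Conversion IDs assumed from older Acrobat JavaScript API references;
// verify against app.fromPDFConverters in the installed Acrobat version.
const CONVERSION_IDS = {
  txt: "com.adobe.acrobat.plain-text",
  xml: "com.adobe.acrobat.xml-1-00",
  html: "com.adobe.acrobat.html-4-01-css-1-00",
};

// Build the argument object for Doc.saveAs from a device-independent PDF path
// and a target format. Hypothetical helper, not part of the Acrobat API.
function buildSaveAsArgs(pdfPath, format) {
  if (!Object.prototype.hasOwnProperty.call(CONVERSION_IDS, format)) {
    throw new Error("Unsupported format: " + format);
  }
  const cPath = pdfPath.replace(/\.pdf$/i, "") + "." + format;
  return { cPath: cPath, cConvID: CONVERSION_IDS[format] };
}

// "/D/xpdf/a.pdf" is the device-independent form of D:\xpdf\a.pdf.
const saveArgs = buildSaveAsArgs("/D/xpdf/a.pdf", "html");
console.log(saveArgs);

// Inside Acrobat (privileged context, e.g. the console or a batch action),
// the call would then be:
//   this.saveAs(saveArgs);
```

This does not change the licensing point raised in the first reply: automating desktop Acrobat this way is a client-side technique, not a server-side conversion service.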
