Exporting a PDF to HTML without OCR

Hi All,
I am using Adobe Acrobat X to export PDFs to HTML files. It looks like the HTML conversion runs an OCR process on the document before the HTML page is written. As a result, many images do not display properly, because the OCR process strips out the text and puts it in the body of the HTML rather than recognizing that it should be part of the image. I used Acrobat 9 to convert to HTML in the past and this was not an issue.
Is there any possible way to disable the OCR portion of the HTML conversion in Acrobat X?
Thanks,
Teri

Hi Teri,
Edit > Preferences > Converting from PDF > HTML.  Click 'Edit Settings..' and uncheck 'Run OCR if needed'.
-David

Similar Messages

  • Exporting a PDF from keynote without losing hyperlinks

    Exporting a PDF from Keynote without losing hyperlinks
    Does anyone have any idea why Keynote doesn't export the hyperlinks to the PDF?
    I only have a PDF reader on my machine, not the full PDF creator.

    Or, better than that, export the PDF from InDesign as PDF/X-4!
    By default, Acrobat and Reader use overprint preview when displaying PDF/X-4 files. Plus you have the benefit of full color management and not having transparency ruined by premature flattening.
    One other note. Other than for the fairly common practice of overprinting black text over colors to avoid halo effects due to registration, one should be exceptionally careful in using overprint. Trapping, the other good, justifiable use for overprint, is really best performed by the RIP. Most modern RIPs provide for automatic trapping. Overprint should absolutely not be used as a “poor man's transparency,” especially in conjunction with spot colors. With the advent of real transparency in the Adobe imaging model and hence in Illustrator, InDesign, and PDF, there is no good reason to use overprint hacks anymore to simulate transparency.
              - Dov

  • How do I export to PDF in InDesign without the hidden layers

    Hi there
    This is probably a simple solution but I am trying to export a document to PDF in InDesign without the hidden layers.
    That is, I have created a document that includes images etc. from an existing PDF.
    When I open the new PDF and use the 'Find' tool, Acrobat Pro highlights words on the hidden layer (the original PDF).
    I have tried exporting only the visible layers but is there a setting in InDesign that I can use to export only the visible content to pdf?
    Thanks in advance.

    Jenjimay wrote:
    I have tried exporting only the visible layers but is there a setting in InDesign that I can use to export only the visible content to pdf?
    Even if you mask off part of a PDF, or place a white rectangle on top of it, the text will still be selectable. Other than with raster images, InDesign is not smart enough to clip off parts of a PDF (or rather, it's smart enough to leave them as they are). PDFs can contain incredibly complex objects, and any clipping routine would probably make as many mistakes as it solves problems.
    Introduce a small Transparency to this image or one on top of it, then export with Acrobat 4 (PDF 1.3) settings. Since that old version does not support native transparency, ID is forced to redraw the image, and then it will clip off excess data.

  • Creating PDF from HTML without background

    Hi,
    I want to create a PDF from a html-page. I tried several ways without success to get a PDF without a background (white). I need a transparent background to add different backgrounds for different pages in Acrobat X Pro.  I use an iMac with OSX 10.10.2, Safari 8.0.4 and have the MasterCollection 6 installed. I tried:
    - print page -> create PDF Result: white background which hides added background
    - print page -> save as Adobe-PDF Result: white background which hides added background
    - in Acrobat: create PDF from web Result: white background which hides added background
    In all cases it doesn't matter if I select "print background" or not. And it doesn't change if I add in the CSS for the web a body {background-color: rgba(255,255,255,0)} or body {background-color:transparent}.
    How can I create a PDF with no or transparent background?

    Hi Manoj
    If your application generates a static HTML page, you can open those HTML files with Reports 9i and add dynamic data sections, which would fetch the data into your HTML pages with Reports Services. However, you can only run those modified pages as a JSP and show the output in a browser window.
    Thanks
    Rohit

  • Utf-8 export to PDF and HTML

    Hello People,
    In my Oracle DB (Oracle 10g) I use utf-8.
    Now when I have some unusual character, it works if I generate the output as HTML. But when I generate PDF output, my BIP server replaces the character with a question mark (?). What's wrong with my BIP?
    thanks,
    Paul

    Paul
    With HTML (and RTF) outputs, the browser (or Word) will read the fonts you have available on the client machine. A PDF output is a completely portable document, i.e. it can be opened and printed on any machine, because the fonts are embedded in the document. Whatever font you are using to 'see' the special characters needs to be available on the server, so that the Publisher engine can read it and create a subset of the font to embed in the output PDF.
    Please check the user docs for help. You did not mention the version or flavor you are using but the following demo should help
    http://www.oracle.com/technology/products/xml-publisher/demoshelf/shelf.html
    Check the demo second from the right - Font Mapping
    Tim

  • I couldnt change the Pdf in HTML ?

    Hello
    When I try to export from PDF to HTML,
    it alerts me that Paper Capture is not working; maybe it is not available.
    What is the reason? The on-screen message is: "The Paper Capture recognition service cannot be found."
    Kind regards,
    i.A. Sulmena

    You cannot convert from pdf to html in Adobe Reader. Are you sure you are not using Adobe Acrobat?

  • Export from Acrobat PDF to HTML

    Is there any reliable way to export an Acrobat PDF file to HTML using Acrobat 9 Pro Extended? When I try to do it from Acrobat, I get:
    Tables that are a total, horrendous mess, nothing at all like the tables in the original Pdf.
    Text that is randomly surrounded, for no apparent reason, with span tags and completely worthless CSS. A single, uninterrupted sentence can have four or five spans, all with different CSS classes. When you look at the four or five CSS classes, they all have the same, identical style attributes.
    Exporting it to RTF results in a mess that is no less horrendous.
    What I'm trying to do is take some rather lengthy PDFs (200 pages or more each) and use them as input to a RoboHelp WebHelp project. However, the output, whether I try an import into RoboHelp, an export from Acrobat Professional Extended, or one of several third-party PDF converters, is all so horrendous that it would be easier to completely retype the 200 pages times sixteen or so documents than to attempt to clean up the mess that results from an Acrobat export.
    Why can't Adobe get their products to work together? At this point, I've just about reached the conclusion that nothing should ever, for any reason, under any circumstances be converted to PDF unless you absolutely have the documents in the original format locked up someplace, because you can never get anything meaningful and transportable out of Acrobat. Please tell me that, in this day and time, Acrobat PDF files can actually be converted in a meaningful way.

    That's very true. However, you can create RTF documents with Word and WordPerfect, and in this day and time virtually everyone can read those, while they are ultimately transportable to a wide variety of formats, including HTML, without any significant degradation. If Adobe can't make PDFs transportable between document types in the same way, why on earth would anyone ever use the PDF format? All this hoopla about embedding video, sound, etc. is pretty much nonsense. You can pretty much do that in RTF, and few if any actually want to do it anyway.
    All I can say is, why does Adobe advertise that you can export to file types such as HTML and RTF from Acrobat PDFs when you obviously can't? Why do they include the options on their menus? This is obviously a case of false and misleading advertisement. This is the day and time of transportable documents. Adobe has chosen to make Acrobat Pro an anachronistic and archaic dinosaur.
    Once again, why create PDFs when they have such limited usefulness?
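
The redundant markup described in this thread (one sentence split into several spans whose CSS classes have identical style attributes) can sometimes be reduced with a post-processing pass over the exported files. The following is a minimal Python sketch, not an Acrobat or RoboHelp feature. The function name `collapse_duplicate_classes` and the sample class names are made up for illustration, and the sketch assumes the export uses simple `.class { ... }` rules and flat, un-nested `<span class="...">` elements. A real export may well need a proper HTML/CSS parser instead of regular expressions.

```python
import re
from typing import Dict, Tuple

# Assumption: the stylesheet contains only simple single-class rules.
RULE = re.compile(r"\.([\w-]+)\s*\{([^}]*)\}")
CLASS_ATTR = re.compile(r'class="([\w-]+)"')
# Two neighbouring spans with the same class and plain-text content.
ADJACENT = re.compile(r'<span class="([\w-]+)">([^<]*)</span><span class="\1">')


def _normalize(body: str) -> str:
    """Sort declarations and normalize spacing so equal styles compare equal."""
    decls = [re.sub(r"\s*:\s*", ":", d.strip().lower()) for d in body.split(";")]
    return ";".join(sorted(d for d in decls if d))


def collapse_duplicate_classes(css: str, html: str) -> Tuple[str, str]:
    """Keep one class per distinct style, then merge adjacent same-class spans."""
    canonical: Dict[str, str] = {}  # normalized style -> first class seen
    rename: Dict[str, str] = {}     # every class -> surviving class
    kept = []
    for name, body in RULE.findall(css):
        survivor = canonical.setdefault(_normalize(body), name)
        rename[name] = survivor
        if survivor == name:
            kept.append(".%s { %s }" % (name, body.strip()))

    # Point every class attribute at the surviving class.
    html = CLASS_ATTR.sub(
        lambda m: 'class="%s"' % rename.get(m.group(1), m.group(1)), html
    )
    # Repeat until no neighbouring spans with the same class remain.
    while True:
        html, merged = ADJACENT.subn(r'<span class="\1">\2', html)
        if merged == 0:
            break
    return "\n".join(kept), html


css = """.s1 { font-size: 10pt; color: #000 }
.s2 { color:#000; font-size:10pt }
.s3 { font-weight: bold }"""
html = '<p><span class="s1">A single, </span><span class="s2">uninterrupted sentence.</span></p>'
new_css, new_html = collapse_duplicate_classes(css, html)
print(new_css)
print(new_html)
```

This only addresses the span and CSS clutter; it does nothing for the broken tables mentioned in the question.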

  • How can I make hyperlinks work in InCopy when I export to PDF without using InDesign?

    I created an InCopy document with hyperlinks linking to other InCopy documents. When I exported to PDF, these hyperlinks don't work. Is there any way I can make these hyperlinks work without using InDesign? Thanks.

    beer and no prepress wrote:
    If it's for the web, why not export to JPEG?  Why PDF?
    Terrible idea. In a JPG the text will not be live, and you lose all interactivity.
    And making JPGs is not what InDesign is meant for.

  • Export to PDF defaults to HTML file instead of Adobe PDF since moving to CC2014

    Hi friends!
    Help please. Ever since I moved to CC2014, my InDesign and Illustrator are defaulting to HTML files instead of Adobe PDF when exporting to PDF. I also use this same software at another workplace and this has not been a problem there. I am on a PC, on Windows 7. Does anybody know of a preference or similar setting that I can switch to correct this?
    Any help will be greatly appreciated.
    Mindy.

    This is the window I am getting when attempting to use Data merge in ID. Have used this successfully before in previous versions of InDesign. Any ideas?
    Thanks

  • Export to PDF from populated c# Dataset without Database connection

    I have Crystal Reports Developer Full version 11.5.12.1838, all versions of Visual Studio and about 100+ reports targeting printed paper.
    They were developed over the last decade with the Developer Designer (and its predecessors) and are based on SQL Server stored procedures. In the past, they were produced at runtime using an MS Access VBA application that either printed them or exported them to PDF files. I have successfully converted that application to C#; however, a few problems remain: despite loading a populated dataset and setting all parameters, the runtime still wants to refresh the dataset. That works for 95% of all reports, but some just use null values for the parameters (despite having set them correctly), others mix up the sequence of the parameters (despite querying them correctly), and some don't use the provided credentials. But that is not the main problem.
    Now I need a way to produce (export to PDF) these reports at runtime based on a populated C# Dataset without database connectivity for the Crystal environment. The structure of the data in the Dataset is identical to what the stored procedure returns in the development environment.
    The question is: is that possible and what do I need in terms of version, SDK etc. Any advice (sample links) would be much appreciated.

    Hi Ludek;
    Thanks for the hints. After many many hours of research and trial and error here are my findings:
    Assigning a dataset to CR does NOT work - seems CR is just ignoring it completely. What does (partially) work is assigning a DataTable, however, if you try to do that with a main report and more than ONE sub-report, it fails with likely unrelated error messages (not yet implemented).
    I took one of my reports and deleted all sub-reports. Assigned a DataTable at runtime - works fine. Added one sub-report that has a single field (picture) and loaded at runtime with a DataTable. No problem.
    Removed that subreport and added another subreport that has a single text field. All good.
    Now: having both sub-reports and having both subreports loaded from a DataTable fails. It works if one of the datasets is empty. It also works with 2 sub-reports returning text into their single field.
    The DataTables are populated from the exact same stored procedures (and data source) that are embedded in the report from the design (CR XI) - essentially this should be a swap from internal to external fetch with no consequences.
    I tried assigning using subDocument.SetDataSource and subDocument.Database.Tables[0].SetDataSource - no difference.
    The background of this: all my reports work just fine using VBA (Access), but that app will be retired and replaced by something C# based. In the meantime, the same report file has to work with both engines.
    If I cannot get the DataTable assignment for sub-reports to work, there is a plan-B possibility: replacing the stored procedure for some sub-reports at runtime, which works using SubReport.Database.Tables[0].Location = "newSPName" as long as the new stored procedure, like the old one, has no parameters.
    The problem is that the new procedure has 2 parameters and I did not find a way to add them to the parameters collection (not allowed in sub-reports).
    Does anything come to mind?
    Any help will be much appreciated
    Thanks
    Rolf

  • Issue with Exporting Pdf as HTML

    Hi all,
    I am trying to convert a PDF to HTML using "Save as Other -> HTML (Web Page)" in the File menu. After conversion, some of the characters set in the Symbol font are not converted properly (e.g., the text 8.7 appears as "" instead of numerals). Any thoughts on this issue? Thanks in advance.
    Regards
    Srini

    I can't see this working very well. Even if it exported a correct reference to the symbol font, people who view the converted page probably won't have the font and won't see the symbols. You can perhaps replace them with graphics.
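
One possible cause of the symptom in the question, offered as an assumption rather than a diagnosis: text set in a symbol-encoded font is sometimes extracted as Unicode Private Use Area code points (U+F020 to U+F0FF) instead of ordinary characters, and browsers typically show those as blanks or boxes. If that is what the exported HTML contains, digits and basic punctuation can be folded back to ASCII. The Python sketch below is illustrative only. `fold_symbol_digits` is a hypothetical helper, and it deliberately touches only digits, period, and comma, because letter positions in the Symbol font hold Greek glyphs and must not be mapped to Latin letters.

```python
# Assumption: the garbled characters are Private Use Area code points
# offset by 0xF000 from their single-byte position in the symbol font.
SAFE_ASCII = set("0123456789.,")


def fold_symbol_digits(text: str) -> str:
    """Map PUA-encoded digits, '.' and ',' back to ASCII; leave everything else."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0xF020 <= code <= 0xF0FF:
            candidate = chr(code - 0xF000)
            if candidate in SAFE_ASCII:
                out.append(candidate)
                continue
        out.append(ch)
    return "".join(out)


garbled = "\uf038\uf02e\uf037"  # hypothetical extraction of "8.7"
print(fold_symbol_digits(garbled))
```

True symbols (Greek letters, operators) would still need either a mapping to their real Unicode code points or replacement with graphics, as the reply above suggests.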

  • Export pdf to html/txt/xml

    Hi,
    I downloaded "adobe acrobat x pro" for trying the "save as"/export functionality to xml/htm/text etc. and the result was exactly what I was looking for in terms of output, keeping formatting etc.
    However, I am building an application which needs an embedded library in order to do PDF to HTML/TXT/XML conversion on the fly while keeping the formatting.
    I have tried a number of libraries for PDF to HTML/TXT/XML conversion and none of them deliver anything near what Adobe Acrobat X Pro does in terms of keeping format, tables, etc.
    So, my question is how can I get access to the "save as"/export functionality in adobe acrobat x pro in any official adobe library, sdk, service, product etc. since I assume acrobat x pro does not expose any api for convert functionality or may be used serverside?
    Best regards,
    Rick

    It sounds like you want to use Acrobat as a web service. Before pursuing this route, note that such use of Acrobat is not permitted under the license, so it may not be worth pursuing. Why convert to HTML on a regular basis anyway? I can understand the need on occasion.
    For programmable features you should probably check in the SDK forum.

  • Export pdf as html by Acrobat 9.2 on windows 7

    Dears,
    I have a problem with exporting PDF as HTML from Acrobat 9.2 on Windows 7.
    After exporting, images and text may be in the wrong positions, or the images may have the wrong width and height.
    Is this a compatibility problem between Acrobat 9.2 and Windows 7?
    What can I do?
    Thank you in advance....
    amt

    Thank you, but I thought I could get about a 90% faithful representation of the PDF,
    and that did not happen.
    I also saw examples of some tools that can do that, but only for PDF version 1.5, and I think they were for older Windows versions too.

  • Exporting to PDF without clicking any button

    Hi,
    Is there a way we can export the output of the WEB report to PDF without having to hit a button "Export to PDF"?
    The moment I open the report, the contents should get exported to PDF in a new window which I can print later.
    Please share your ideas.
    Regards,
    Shameem

    Hi
    You can try Information Broadcasting.
    Regards,
    Chama.

  • Keep exported image size in HTML as shown in PDF

    I have many inline formulas (imported from a Word file using MathType) in a PDF article made with InDesign. But when I export the articles as HTML, the formula images become much larger than they appear in the PDF version. How can I keep the exported images the same size as they appear in the PDF file? I know I can edit the HTML file to specify the image size, but that is not the ideal workflow.
    Thank you.

    Hi Eric,
    If I understand you correctly, you have the formulas placed as images.
    So, to keep the size fixed, set the image size to Fixed on the Image tab of the HTML Export Options.
    The exported image will then be the same size as it appears in InDesign.
