Pdf file to html conversion with embedded images

Hi all,
I want to convert any PDF file to its HTML equivalent. Currently I am using the PDFBox Java API to do that. It works fine with simple PDF files that have no images, but if a PDF file contains embedded images, they do not appear in the output.
Does anyone have a clue how to solve this problem? Converting individual PDF pages to JPG pictures would also work for me, as long as all embedded images appear in those pictures.
Please help with pointers to other APIs, code snippets, etc. that could serve my purpose.
Thanks in advance.
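
For the fallback mentioned in the question (one JPG per page), here is a minimal sketch of the HTML side only. It assumes the page images have already been rendered by some other tool and are available as byte arrays. The class `PageImagesToHtml` and its methods are hypothetical names, not part of PDFBox; the sketch uses only the Java standard library and inlines each image as a base64 data URI.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.List;

// Hypothetical helper (not a PDFBox class): wraps already-rendered page images in one HTML file.
class PageImagesToHtml {

    // Builds an HTML document with one <img> per page, each embedded as a base64 data URI.
    static String buildHtml(String title, List<byte[]> jpegPages) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html>\n<head><meta charset=\"utf-8\"><title>")
            .append(escape(title))
            .append("</title></head>\n<body>\n");
        int pageNumber = 1;
        for (byte[] jpeg : jpegPages) {
            String encoded = Base64.getEncoder().encodeToString(jpeg);
            html.append("<img alt=\"Page ").append(pageNumber)
                .append("\" src=\"data:image/jpeg;base64,").append(encoded).append("\">\n");
            pageNumber++;
        }
        html.append("</body>\n</html>\n");
        return html.toString();
    }

    // Escapes the characters that would otherwise break the <title> element.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for real JPEG data in this demo.
        byte[] fake = "not a real jpeg".getBytes(StandardCharsets.US_ASCII);
        System.out.println(buildHtml("demo", List.of(fake, fake)));
    }
}
```

The result is a single self-contained file, at the cost of losing selectable text, since every page becomes a picture.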

Hi,
I am really sorry, I do not have a solution for you.
But I have a problem of my own with PDFBox, which I think you know. I am reading a Japanese file using PDFBox, and it gives:
caught a class java.io.IOException
with message: Unknown encoding for 'UniJIS-UCS2-H'
I have written code like this:
// Parse the PDF file into a document object
PDDocument pdfDocument = null;
PDFParser parser = new PDFParser( new FileInputStream(file));
parser.parse();
pdfDocument = parser.getPDDocument();
// Extract the text of the whole document
PDFTextStripper stripper = new PDFTextStripper();
String text = stripper.getText(pdfDocument);
reader = new StringReader(text);

Similar Messages

  • Sample Java code to send an HTML mail with embedded image

    Hello,
    Please can I get sample Java code for sending an HTML mail with an embedded image?
    The HTML message and relevant input parameters will be supplied from a PL/SQL procedure that calls the class; the class will embed the image and send the mail to the recipient.

    tev wrote:
    "Please can I get a sample Java code"
    No. This is a forum, not a code mill.
    Recommended reading: How to ask questions the smart way
    db

  • UTL_SMTP send HTML email with embedded image?

    Hi, I can use UTL_SMTP to send an HTML email ok, but does anyone have an example of how to include an inline embedded image in the email? Thanks!

    If you want to send the html page and have it
    reference the images and css files on your web
    site, that's pretty easy. Just create a message
    with text/html content that is your html page.
    If you want to include all the images and css files
    in your message along with the html page, you'll
    need to create a multipart/related message and
    you'll need to change all the html to reference the
    images and css files using "cid:" references.
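
The multipart/related structure described above can be sketched as raw message text, which is also roughly what would have to be written out by hand over UTL_SMTP. This is a minimal sketch using only the Java standard library; the class `RelatedMessageSketch`, the boundary string, and the Content-ID `logo1` are hypothetical, and envelope headers such as From, To, and Subject are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: assembles the raw text of a multipart/related message body.
class RelatedMessageSketch {
    private static final String CRLF = "\r\n";

    static String build(String boundary, String html, String contentId, byte[] pngBytes) {
        // The MIME encoder wraps base64 output at 76 characters with CRLF line breaks.
        Base64.Encoder enc = Base64.getMimeEncoder();
        StringBuilder msg = new StringBuilder();

        // Top-level headers announcing the multipart/related container.
        msg.append("MIME-Version: 1.0").append(CRLF)
           .append("Content-Type: multipart/related; type=\"text/html\"; boundary=\"")
           .append(boundary).append('"').append(CRLF)
           .append(CRLF);

        // First part: the HTML page, which refers to the image as cid:<contentId>.
        msg.append("--").append(boundary).append(CRLF)
           .append("Content-Type: text/html; charset=UTF-8").append(CRLF)
           .append("Content-Transfer-Encoding: base64").append(CRLF)
           .append(CRLF)
           .append(enc.encodeToString(html.getBytes(StandardCharsets.UTF_8))).append(CRLF);

        // Second part: the image, identified by its Content-ID header.
        msg.append("--").append(boundary).append(CRLF)
           .append("Content-Type: image/png").append(CRLF)
           .append("Content-Transfer-Encoding: base64").append(CRLF)
           .append("Content-ID: <").append(contentId).append('>').append(CRLF)
           .append("Content-Disposition: inline").append(CRLF)
           .append(CRLF)
           .append(enc.encodeToString(pngBytes)).append(CRLF);

        // Closing boundary ends the container.
        msg.append("--").append(boundary).append("--").append(CRLF);
        return msg.toString();
    }

    public static void main(String[] args) {
        String html = "<html><body><img src=\"cid:logo1\"></body></html>";
        byte[] placeholder = {1, 2, 3}; // stands in for real PNG bytes
        System.out.print(build("demo-boundary", html, "logo1", placeholder));
    }
}
```

The key point is the pairing: the HTML says `cid:logo1`, and the image part carries `Content-ID: <logo1>`.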

  • Send HTML Email with Embedded Images and CSS

    Hi All,
    I have an HTML page. I want to send that HTML page (not as an attachment) with all its images and CSS. I have searched and tried, but I can't find a good solution. Can anyone help, please?
    Thank you.

    If you want to send the html page and have it
    reference the images and css files on your web
    site, that's pretty easy. Just create a message
    with text/html content that is your html page.
    If you want to include all the images and css files
    in your message along with the html page, you'll
    need to create a multipart/related message and
    you'll need to change all the html to reference the
    images and css files using "cid:" references.
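
The last step in the reply, changing the HTML so that it points at "cid:" references, can be sketched as a small rewrite pass. This is only an illustration using the Java standard library: the class `CidRewriter` and the path-to-Content-ID map are hypothetical, and the regular expression assumes double-quoted `src` and `href` attributes, so a real HTML parser would be more robust.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces known local resource paths with cid: references.
class CidRewriter {
    // Assumption: attributes are written as src="..." or href="..." with double quotes.
    private static final Pattern REF = Pattern.compile("(src|href)=\"([^\"]+)\"");

    // pathToCid maps a path as it appears in the HTML to the Content-ID of the attached part.
    static String rewrite(String html, Map<String, String> pathToCid) {
        Matcher m = REF.matcher(html);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String cid = pathToCid.get(m.group(2));
            // Paths without a mapping (for example ordinary links) are left untouched.
            String replacement = (cid == null)
                    ? m.group()
                    : m.group(1) + "=\"cid:" + cid + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<link href=\"style.css\"><img src=\"img/logo.png\">";
        System.out.println(rewrite(html, Map.of("style.css", "css1", "img/logo.png", "logo1")));
    }
}
```

Each Content-ID used here must match the Content-ID header of the corresponding part in the multipart/related message.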

  • PDF File Size - Live Type with Flattened Images

    I am looking for a way to Export my PDF to include live type but flatten all other imagery (like saving as a JPG).
    I work on fairly large catalogs - intended for print - that later need to be saved for web. The imagery is just too complex for the PDF to constantly be rendering every image element on every page of my catalog. These catalogs need to include that searchable text data & live links built within InDesign.
    Any suggestions would be greatly appreciated. I am currently on CS5.5, but hoping to upgrade soon. Thank you!

    See this post: https://forums.adobe.com/thread/998458  If you want to make the file even smaller, adjust the compression settings to a lower resolution.

  • Cannot scroll with the mouse in an HTML tab after opening a PDF file in a new tab with the middle button; scrolling works again only after switching to the PDF tab and back

    When I open a PDF file in a new tab with a middle mouse click, after a few seconds my mouse can no longer scroll the page I am reading. Scrolling works again only after I click on any open PDF tab, click on the PDF document, and then click back on the previous tab to continue reading. It happens not only on my laptop but also on my desktop (WinXP, 4 GB/8 GB RAM, Core 2 Duo 2.66).

    There's a Bug filed about that issue.

  • When I forward an HTML email with embedded graphics to someone, it forwards it as plain text.. this is driving me batty.. how do I forward such mails INTACT??

    I have the latest Thunderbird installed on a new 64-bit Winblows Eight netbook.. fantastic program, but one problem is driving me absolutely batty, and after using the latest Thunderbird for weeks, I simply can't figure out how to fix it..
    I'm on a lot of mfr. and other kinds of mailing lists, like eBay watch list alerts, and so on.. these are not s p a m (although I get plenty of that.. who doesn't).. but lists I WANT to be on..
    Many such emails from those mailing lists are in HTML format with embedded graphics.. I'm not talking about graphic file attachments, but embedded graphics which are coming from the senders' servers, and appear AS a graphic in the email.. sometimes such emails are one huge graphic with hardly any text.. all well and good..
    However, here's the problem.. when I want to forward such an email to a friend, Thunderbird ALWAYS formats it as plain ASCII text.. I know this because I look in the "sent mail" folder, and can see that it has turned an HTML email with embedded graphics into plain ASCII text..
    I absolutely can't figure out how to get it to forward an HTML email with embedded graphics INTACT, so the recipient receives it looking the way it looks when I receive it from a mailing list, or an advertiser, or eBay, or whoever..
    Is Thunderbird capable of forwarding an HTML email with embedded graphics INTACT?.. If so, how / where do I turn on that capability?..
    If the capability to do this isn't built into the program, is there an add-on I can install that will give it that ability?..
    I am not new to computers.. but this really has me stumped.. I want to put Thunderbird on my 32-bit Vista laptop and stop using its horrible "Windoze Mail" program, which I've been using for years, and is slower than snot, and has all kinds of other problems..
    So, assuming whoever reads this FULLY understands my question, PLEASE tell me how to get Thunderbird to have the ability to forward an HTML email with embedded graphics AS-IS, so the receiver(s) I forward it to see it exactly the way I see it when I receive it.. if that ability is built in, please tell me how to turn it on.. if that ability is not built-in, please tell me what add-on I need to install to give Thunderbird this capability.. if Thunderbird absolutely can't forward an HTML email with embedded graphics at all, please also tell me that..
    A virtual box of candy and a dozen long-stemmed roses to anyone who can give me a solution that works..
    Thanks..

    Dear Mr. Toad (my all-time favorite ride at Disneyland ;-) ..
    Thanks so much for your detailed reply.. my netbook is in the bedroom, turned off.. I (so far) only use it in the evening, in the bedroom.. I've saved your response, and will try your suggestions, and let you know if they solve the problem I described. I really appreciate you taking the time to post such a detailed reply..
    I can't answer your Thunderbird "configuration" questions, because I'm in the living room, using the crap Vista laptop, on which I plan to install Thunderbird, and then take Windoze Mail out in the street and drive over it a few times.. I'll get back to you one way or the other, and let you know if your instructions solved the problem, or not..
    I don't understand why Thunderbird "out of the box", so to speak, simply doesn't forward HTML emails with embedded graphics (like Outlook Excess and Winblows Mail do).. without having to go through those steps. I personally HATE HTML email, but over the years, it's become more and more prevalent.. so it's a problem I must fix..
    Thanks again..
    Harv..

  • Bursting PDF file cannot be opened with Adobe Reader

    OBIEE 10.1.3.4 on Red Hat Linux 5.2. I configured bursting to the local file system in BIP, with file formats PDF and HTML. The bursting query used is:
    select distinct today KEY,'2297-hen' TEMPLATE,
    'RTF' TEMPLATE_FORMAT,'en-US' LOCALE,'HTML' OUTPUT_FORMAT,
    'FILE' DEL_CHANNEL,'/tmp/cmisout' PARAMETER1,
    'cmis_unmatched_'||to_char(sysdate,'yyyymmdd_hh24_miss') ||'.html' PARAMETER2
    from rpt2298
    union
    select distinct today KEY,'2297-hen' TEMPLATE,
    'RTF' TEMPLATE_FORMAT,'en-US' LOCALE,'PDF' OUTPUT_FORMAT,
    'FILE' DEL_CHANNEL,'/tmp/cmisout' PARAMETER1,
    'cmis_unmatched_'||to_char(sysdate,'yyyymmdd_hh24_miss') ||'.pdf' PARAMETER2
    from rpt2298
    The job ran successfully and two files were generated in the target location. The HTML file is OK, but the PDF file cannot be opened with Adobe Reader. I verified that my Adobe Reader can open PDF files from other sources.

    Do you have any password or encryption settings in your Runtime Properties?
    I do not think so, but I am not sure. Is there a way to check it? Is there a properties file?
    Did I misunderstand something? I can write both PDF and HTML files to the target location; the HTML files are good, and only the PDF files cannot be opened by Adobe Reader.

  • Converting PDF files to html

    I am trying to convert a number of .pdf files into HTML
    format for a webhelp project. Unfortunately, when I convert them, it
    makes several HTML pages, unlike the MS Word files, which just
    produce one long page.
    I have downloaded a trial version of Adobe Acrobat 8, which
    allows you to convert the files into HTML, but it does not hold the
    formatting of the original document, and in one case some of the
    fonts turned white and you could not see the text. (I don’t
    seem to get this when converting MS Word documents.)
    Is there any way that the individual HTML pages can be joined?

    Hi Molaa
    That's normally the way a conversion to HTML format works.
    Normally in the world of HTML, you want your information in several
    smaller "bite size" chunks. I suppose if you honestly want it all
    on a single page, you could perform a copy/paste to put it all
    together after it gets converted to HTML.
    Cheers... Rick

  • Will converting a PDF file to Excel work with Adobe Reader 10.1 on the Windows 7 platform?

    Will converting a PDF file to Excel work with Adobe Reader 10.1 on the Windows 7 platform?  It shows that it is available for purchase, but does it perform?  Will this work only with Reader XI?  Is the conversion done online?

    Moved to Adobe ExportPDF.
    The file is uploaded to the web for conversion. You can manually upload pdfs for conversion with your web browser and/or with Reader 10.1.

  • How can I make a PDF file from folders of layers, where each folder becomes one page?

    How can I make a PDF file from folders of layers, where each folder becomes one page?

    I found an answer to my own question. A work around of sorts.
    Download Photoshop Elements 6 for Macintosh. With PSE6 I made a slide show with 550 images at 1920x1200, without thumbs. I ran into one problem making the slide show. My images included 4 images which had not been created by Photoshop and could not be included in the slide show. Opening the images in Photoshop CS4 and re-saving them still did not make them acceptable. Not a big deal. I probably could have fixed the four images by stripping all EXIF data before opening them in Photoshop. BTW, PSE6 made the slide show in demo mode.
    I hope the bug in Photoshop CS4 will be fixed in Photoshop CS5.

  • I can no longer print PDF files that were scanned with the very same MX479 on Windows XP, SP3

    I can no longer print PDF files that were scanned with the very same MX479 All-in-One.
    I have an (obsolete) Windows XP, SP3 computer.  Obviously, no updates from Microsoft
    have occurred that might have impacted my Canon Pixma MX479.
    When I attempt to send 1 or more pages from a PDF file to the printer, I don't even get the
    Canon PIXMA popup that would allow me to cancel the print job.
    Yet the Print Queue icon appears in the lower right corner of my monitor.
    If I view the Print Queue, the print queue shows "spooling" and does nothing more.
    If I attempt to CANCEL the print job in the Print Queue, it simply changes to "deleting -
    spooling".
    I cannot cancel the print job that shows NOT RESPONDING via Task Manager, either.
    In fact, the only way the Print Queue icon disappears is if I totally reboot my computer!
    "Luckily", I can still print e-mails, Word documents, and even my downloaded PDF of my
    bank statement.
    So, I was thinking that it was *only* PIXMA'S OWN SCANNED PDFS that could not be
    printed ON THE SAME PIXMA 479.
    However, I have since realized that I also cannot print USPS Signature Proof of Delivery
    PDFs that I received from USPS via e-mail.
    So why can I print *some* PDF files, but not *all* PDF files, like I previously could do?
    THESE SCANS ARE OF LEGAL DOCUMENTS, SO I WILL LIKELY HAVE TO E-MAIL
    THE REQUISITE, PREVIOUSLY SCANNED PDFS TO FEDEX TO PAY FOR *THEM*
    TO PRINT THEM.
    NOTE: I *have* tried turning the PIXMA off, unplugging it, rebooting the computer, plugging
    the PIXMA in and restarting the PIXMA - still no ability to print PIXMA-scanned PDFs.
    And, I find it very strange that e-mails, Word documents, and some other PDFs print with
    no problem "around" "NOT RESPONDING" PIXMA print jobs.
    I have owned Canon printers for years, including the MX420, and the MX340.

    There may be some security issues related to the USPS PDFs.
    http://www.certified-mail-envelopes.com/signatures-usps-certified-mail-return-receipt-requested
    I can't help with the scan/print problem. You seem to have done everything I would try.
    I don't know if maybe using a registry cleaner would help.
    John Hoffman
    Conway, NH
    1D Mark IV, Rebel T5i, Pixma PRO-100, MX472

  • Is it possible to convert a PDF file into HTML?

    Dear friends
    Is it possible to convert a PDF file into HTML? I have a few hundred PDF files that I would like to convert into HTML. I hope it can be done through Java, but I don't know how to start coding this. Can anybody give me a brief idea of how to go ahead?

    Why do you want to do this yourself? A quick search on Google showed several utilities to do this, some freeware, some commercial.

  • How do I open a pdf file on my iPad with Adobe reader.

    I can't open a pdf file in an email with Adobe reader.

    This document explains how:
                     Opening PDF Files in Reader for iOS (iPhone and iPad)
