PDF file size increases severalfold after OCR

Hello,
I've been using Acrobat Pro 9. Whenever I receive a scanned PDF from someone, I always try to OCR it. I thought this would reduce the file size because the content would be saved as text characters instead of images. However, the file size actually increases severalfold. For example, I just had a scanned PDF with an original size of 3 MB, but after OCR (settings: searchable image, downsample to 300 dpi), the file size is over 19 MB. This is a legal document with no pictures or diagrams. Even ClearScan gives a 15 MB file. Could you help explain why this happens and how I can change it? Thank you.
Hung

I just ran OCR with Searchable Image (Exact) as well; here are all the results:
Original:  3,046 KB
Searchable Image (300dpi):  19,408 KB
Searchable Image (Exact) (600dpi):  3,377 KB
ClearScan (300dpi):  14,813 KB
So in this example, Searchable Image (exact) only causes about 10% increase in size.  I used Save As "PDF" (not "Reduced Size PDF" or "Optimized PDF"...).  Does this make sense?  Thank you.
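
For reference, the growth factors implied by the figures above can be worked out with a short Python sketch. The sizes are the ones reported in this post; the constant and function names are made up for illustration.

```python
# File sizes in KB, copied from the results listed in the post.
ORIGINAL_KB = 3046
RESULTS_KB = {
    "Searchable Image (300dpi)": 19408,
    "Searchable Image (Exact) (600dpi)": 3377,
    "ClearScan (300dpi)": 14813,
}


def growth(original_kb, new_kb):
    """Return (ratio, percent increase) of new_kb relative to original_kb."""
    ratio = new_kb / original_kb
    return ratio, (ratio - 1) * 100


for name, size_kb in RESULTS_KB.items():
    ratio, pct = growth(ORIGINAL_KB, size_kb)
    print(f"{name}: {ratio:.2f}x the original ({pct:+.0f}%)")
```

Run this way, the Exact setting comes out at roughly an 11% increase, while the two 300 dpi settings are several times the original size.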
Hung

Similar Messages

  • Why does PDF file size increase each time I "save" tagging tasks?

    Given:
    1) I'm running Acrobat Pro 11.0.07 (this is most current version)
    2) My file size starts at 750 KB and ends up at 15 MB when finished tagging.
    3) Only certain documents experience this increase, i.e., no visible pattern of document characteristics
    4) PDFs are not full of images...in fact, mostly <H1> <H2> <H3> <P> <Figure> alt text, etc.
    5) Occurs with text PDFs and/or fillable forms (again, does not happen to all documents...only some)
    6) File increase occurs incrementally as tagging tasks are performed; i.e., each new save increases file size a few megabytes.
    7) Occurs whether I "save" or "save as"
    8) Difficult to optimize these files without corruption
    9) I'm running Mac OS 10.9.4 (this is most current version)

    Thank you so much for responding! I've been experimenting with the SAVE AS vs. SAVE for the past few days...and you are correct. It's funny because I've been tagging files for 2 years and never noticed this. Probably b/c I use both methods depending on the tagging tasks...some are more complicated than others and should not be overwritten. In those cases I SAVE AS new file.
    I love this forum...thank you again!

  • PDF file size increases when I add an Email Submit button

    Hello,
    My PDF form is 87KB. Once I add an Email Submit button, the file size jumps to 525KB. Why?
    This confuses me because it seems to be inconsistent. I have added this function to other PDF forms and the file size only increased a few KB. Every now and then, adding a button or JavaScript will increase the file size.
    I have retraced my steps over and over, but I cannot find an answer to why this is happening.
    Thanks for the help everyone!
    -CM

    The problem is the font. Use Helvetica instead. When you do, Acrobat/Reader actually use a version of Arial anyway.
    When you use a font other than the base-14 fonts (Helvetica, Times, Courier [and their bold, italic, and bold italic variants], Symbol, and Zapf Dingbats) for a form field, Acrobat has to embed the entire font in the file, and some fonts can be rather large. The base-14 fonts are guaranteed to be available to a compliant PDF viewer, so there is no need to embed them in the file. A viewer is allowed to substitute suitable replacement fonts, which is what Acrobat/Reader now do.
    The problem is that you can't simply go back, change the font to Helvetica, and see the full size reduction you'd expect after a Save As, because Acrobat does not clean up everything as well as you might expect. Instead, either go back to a version of the form that does not have the button, or delete the button and then copy and paste the remaining fields from the old form into a new PDF that doesn't have any fields. Then create the button anew.
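
As a rough illustration of the base-14 rule described in this reply, the Python sketch below checks a font name against the 14 standard font names from the PDF specification. The helper name `needs_embedding` is hypothetical, and the check assumes you already have the font's PostScript name; the labels Acrobat shows in its font menu may differ.

```python
# PostScript names of the 14 standard fonts defined by the PDF specification.
BASE_14_FONTS = frozenset({
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Symbol", "ZapfDingbats",
})


def needs_embedding(font_name):
    """Return True if font_name is not one of the base-14 fonts.

    Hypothetical helper: a True result means a form field using this font
    is likely to pull an embedded font into the file, as the reply explains.
    """
    return font_name not in BASE_14_FONTS


for name in ("Helvetica", "Courier-Bold", "Garamond"):
    print(name, "-> embed" if needs_embedding(name) else "-> base-14, no embedding")
```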

  • PDF file size increases when printing, resulting in slow print speed

    I have a .pdf that is saved at 27MB. When I send it to an HP Designjet T520 36in printer, the file size jumps up to 136MB and the printing is incredibly slow. This does not happen with all files, only with a few random ones. Any ideas on how to resolve this issue?

    It's entirely normal for file size to increase. PDFs are much more compressed than print streams. For some printers the print size is almost constant (a huge bitmap), for others, it's a collection of graphical items and the size varies enormously. Rarely anything you can do.

  • PDF file size too large

    Hi,
    I have a report (6i and 9Ids) which contains an image (stored as a blob in the database (8i)). The size of the image in the database (and as a file) is just 750k. The image is sized to fit on to the A4 report page. If I set the desformat of this report to PDF the resulting PDF output file is 10mb in size. I need to make this report available over the web so this is too large. Has anyone got any ideas as to reducing the output file size?
    I have tried the pdfcomp report parameter with no joy.
    Cheers
    Andy

    Hi Andy,
    The image you are using might be a JPEG image. In 6i and 9i, Oracle Reports always converts the image to GIF when generating the PDF file and embeds it. This image type conversion increases the size of the output image, and hence the size of the PDF file. This is fixed in Oracle Reports 10g.
    In Oracle Reports 10g, you can select the outputimageformat based on your need, using either:
    1. commandline: OUTPUTIMAGEFORMAT
    (or)
    2. environment variable: REPORTS_OUTPUTIMAGEFORMAT
    If your image in the database is a JPEG image, set the outputimageformat to JPEG. Hence, there will not be any image type conversion and the PDF file will be very small.
    Please refer to the Publishing Reports manual to learn more about using the command-line option and the environment variable.
    Links:
    http://download-west.oracle.com/docs/cd/B10464_01/bi.904/b10314/pbr_cla.htm#644163
    http://download-west.oracle.com/docs/cd/B10464_01/bi.904/b10314/pbr_rfap.htm#644448
    Thanks,
    Regards,
    Siva B

  • Preview.app increases PDF file size after deleting pages

    Hello, I'm experiencing odd behavior with Preview.app and PDFs.
    If I open a PDF with Preview, delete a page, then save the file, the file size increases anywhere from 2x to 20x. This happens both with PDFs that only contain text and PDFs that contain text and graphics. It is very frustrating because I start with a file that is 150KB, remove some pages, and end up with a 10MB monster that I can't email to people.
    Any help is appreciated. I can post a link to a test PDF for people to try to replicate with if it would be useful.

    I generated the file with pdftex. I'm guessing that there must be different ways to encode a PDF and when Preview gets something with an encoding other than that provided by PDFKit, it rewrites the file how it likes. In my case this is increasing the file sizes. I tried finding docs about PDFKit on the Apple developer site, but couldn't find any details about ways of encoding a PDF.

  • Using Examine Document Remove increases pdf file size !

    Hi,
    I have Adobe Acrobat Pro v9.3.0
    I've been editing a lot of scanned .pdfs - rotating and cropping pages.
    All this has previously worked fine with v8 but now I find that Acrobat 9 is increasing the .pdf file size after using Examine Document and clicking Remove
    For example:
    original file: 16,861 Kbytes
    file after cropping 74 pages (from A4 to A5): 16,879 Kbytes
    file after running Examine Document > Remove cropped metadata : 79,914 Kbytes !!!
    With Acrobat 8 this process would normally have halved the file size.
    Am I now doing something wrong ??
    Thanks in advance.

    Hello - This problem is still here !
    Acrobat 9 Pro version 9.4.2
    I've got a .pdf created by an agency (so not a scanned image) which I want to make as small as possible for emailing to hundreds of people (I'll attach it if possible somehow ?)
    I open it up when it's 331Kb
    Click Document > Examine Document
    Check the Metadata and Deleted/Cropped items
    Click Remove
    Click File > Save As and hey presto, the new file with all that stuff supposedly removed is 2,588 Kb
    Surely I'm not the only one who's bothered about .pdf file size ?

  • PDF signing signature file size increase in increment 3mb

    Hello,
    Has anyone seen a file grow from 85kb to over 3mb after inserting a digital signature in Adobe Reader? The Adobe Reader currently in use is 9.x, and every time a PDF signature is inserted the size changes greatly. The signature file is less than 95kb. Please let me know if you have seen this and know of a fix for it.
    Thanks
    JT

    Hi Ankit,
    Thanks for following up. The issue has been resolved since I found the tip from the forum quoted below. The OS is WinXP Pro and the current version is Adobe Reader 9.4.6. It worked after I deselected the option under Preferences, so I'm good to go. Maybe you can post this on your site as "fixed" for those experiencing the same issue as I did.
    "Re: ACROBAT DIGITAL SIGNATURE ADDS 700KB TO .PDF FILE SIZE"
    Go to Edit > Preferences > Security.
    Click on Advanced Preferences, then select the 'Creation' tab.
    Deselect 'Include signature's revocation status when signing'.
    The extra data is the CRL that gets embedded with the signature if the above preference has been selected.
    Thanks
    John Ta

  • Checking out Pdf files to Local draft Folder option no longer works after office 2013 installed.

    Summary :
    We are using SharePoint 2010 Enterprise edition, and users were able to check out PDF files to the local drafts folder, like any other Office file, without any problem when they used Office 2010.
    Unfortunately, ever since their machines were upgraded to Office 2013 recently, this functionality has completely stopped working for PDF files. This has now become a big problem for users when they need to check out and replace PDF files.
    All browser plugins required for this functionality (i.e. SharePoint OpenDocuments Class) are available and active. All document libraries are configured to require checkout for editing files. The browser version used at the moment is IE9 (32-bit) on Windows 7.
    Can anyone please help with this issue? Any help getting around this problem is much appreciated.

    Hi,
    Based on your description, my understanding is that the PDF files cannot be checked out to local drafts folder after Office 2013 is installed.
    Did this issue occur with Office files?
    I recommend checking whether Office files can be checked out to the local drafts folder with Office 2013.
    As a best practice, it is also recommended to use Office 2010 with SharePoint 2010.
    Thanks,
    Victoria

  • File size increases dramatically after digital signature with Acrobat X

    I signed a PDF file using Acrobat X.
    File size of unsigned document: 200 k
    File size of signed document: 3200 k (more than ten times bigger)
    Any idea?
    BR
    Harald

    Finally I found the solution.
    The file size increase is a result of the embedded information used for long-term signature validation.
    If you do not need this feature, you can turn it off:
    Preferences > Security > Advanced Preferences > Creation tab
    disable "Include Signature’s Revocation"
    Help page: http://help.adobe.com/en_US/acrobat/X/pro/using/WS934c23d7cc8877da1172e0811fde233c98-8000.html
    BR
    Harald

  • PDF file size grows with each save if .access property set on a field

    We are seeing an odd form behavior and have isolated the apparent trigger to something we are doing in the form script.  I'm hoping someone can confirm they see the same problem, details follow:
    We have a form generated in LiveCycle. It contains a text field. In the docReady event for that field we have JavaScript which sets the field to be read-only (TextField1.access = "readOnly"). We Reader-extend the form so we can save it from Reader and/or the plug-in/control which is used by a browser when reading PDFs.
    With that simple form, open the form via the browser (we've tested with both IE and Chrome) and without doing anything else, just save the form (with a new name).  When we do that, the saved copy of the form is significantly bigger than the copy we started with.  If we then repeat the process using the newly saved file, the third copy is bigger than the second.
    This file growth does not happen if you open the file in Adobe Reader (instead of in the browser).
    When we look at the file contents via a text editor, what we have found is that each save via the browser is tacking on a chunk of data to the end of the file AFTER the %%EOF mark.  This new section appears to be one or more object definitions and ends with another %%EOF.  The first portion of the file (prior to the first %%EOF) is identical in all versions of the file.
    If you take a copy of the file that has these extra sections added, then open and save it in Adobe Reader, it eliminates those extra sections and you get a 'clean', small version of the file again. So those extra sections are clearly erroneous and unnecessary.
    Another thing worth noting, we took the script for setting the field access property out of the docReady event and put it as the click event on a button added to the form.  If you then open the form, press the button and save it you see the file growth (but not if you don't press the button.)  So it doesn't appear related to the docReady event, or timing of when we set the access property of the field.
    On the small test form described above, the growth of the file is around 13KBytes. But in testing with our real forms we've found that the amount of growth seems to be tied to the size/complexity of the form and the form data. In our most complex form, with multiple pages, hundreds of fields, and a large amount of XML data (form size is 2+MB), we are getting a file size increase of 700KBytes on each save. This is therefore a significant issue for us, particularly since the process in which the form is used requires the users to save it a couple of times a day over the period of a month or more.
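
The text-editor check described in this post (looking for sections tacked on after a %%EOF mark) can be approximated with a small Python sketch. It only scans the raw bytes for the marker, so treat it as a rough heuristic: the same byte sequence can appear for other reasons, for example in linearized files. The function names and the command-line usage are illustrative assumptions, not part of any Adobe tool.

```python
import sys
from pathlib import Path

EOF_MARKER = b"%%EOF"


def eof_offsets(data):
    """Return the byte offset of every %%EOF marker found in data."""
    offsets = []
    pos = data.find(EOF_MARKER)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(EOF_MARKER, pos + 1)
    return offsets


def appended_section_sizes(data):
    """Return the size in bytes of each section that follows an earlier %%EOF."""
    offsets = eof_offsets(data)
    return [later - earlier for earlier, later in zip(offsets, offsets[1:])]


if __name__ == "__main__" and len(sys.argv) > 1:
    raw = Path(sys.argv[1]).read_bytes()
    print("%%EOF markers:", len(eof_offsets(raw)))
    print("bytes between consecutive markers:", appended_section_sizes(raw))
```

Running it on each saved copy in turn should show one more marker per browser save if the file is growing the way the post describes.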

    I would start by exporting the XML data from the form before and after it grows to see if it is the underlying data that is growing and where. Did you define a schema and bind your form fields to a data connection created from the schema? That is always my first step when creating a form. It makes for a smaller saved file not including multiple levels of sub forms in the data structure.

  • Snow Leopard - Save as PDF file size

    Since my Acrobat Printer no longer works after upgrading to Snow Leopard, when I "save as PDF" the file size is several times what the output used to be using the Printer.  So saving Illustrator or InDesign files where the Mac OS PDF printer can't be used, I end up with a 600k file that used to be 80k.
    Are there settings in the Save As PDF dialog that can get that size down? It always worked fine before for my purposes, so I don't know what advantages are gained with this huge file size.
    Steve Horn

    websentia wrote:
    > But I'm not using the Mac PDF creator. So what I am asking is a way to use the File/Save As/Adobe PDF
    That is the Mac PDF creator. The PDF printer is hosed in Snow Leopard so you basically won't be using Acrobat at all.
    > File/Export/Adobe PDF command or even the Print Dialog/ Adobe PDF
    In InDesign? If so, use the smallest file size setting. Also, that's not Acrobat.
    websentia wrote:
    > Distiller still can make small files but with added steps of creating objects that fill the artboard (rectangle with white background or whatever), save as EPS, drop into distiller, then multiply all that by the number of pages in the document..
    Again, you can use "Save As" in Illustrator to save it as a PDF file. Check the advanced settings to be sure they're tuned up for what you want. In InDesign, use the "export" command to create a PDF using the InDesign PDF generator. Check your presets and choose the one that works best.
    Both of the above are just as easy as printing to the Adobe PDF printer. You just need a little practice using them to get started.

  • PDF file size

    I have an HP Officejet 6500 E710n-z (Network) that I use at home to scan into PDFs, and I know exactly how to move the slider from "Smallest size" to "Best quality" 
    However, the size of the PDFs is unacceptably large when image quality is acceptable, and if I move the slider to reduce the size then the image becomes unacceptable. It is impossible to scan a legible legal document of more than a few pages without producing a file too large to email. I am forced to break these scanned documents into 3-4 page bite-sized chunks.
    A simple one page HOA Disclosure form, at 220 DPI and the slider in the middle for balance, produces a 699K PDF.  I have used other printer scanners at work (different brands) at the same 200 DPI that results in very clear documents at less than 100K per page. 
    I believe there is something wrong with HP's scan-to-PDF algorithm.  The problem must be due to some unskilled (or flawed) software design.  What will it take to have HP or third party developer (and developer staff supervision) take this seriously -- compare HP vs other brand scanners PDF files -- and update the HP drivers to fix this?

    Just a follow up.  I went into chat mode with some low-level tech named Nathan, giving him the link to this thread.  After reading it, he suggested I can improve file size by using greyscale and 200 DPI.  DUH!!!!  I complained he was just humoring me, so he said I should call the tech support number.
    I did that next and spoke with a very nice young lady who actually DID take me seriously.  I could hear her typing away vigorously in the background, capturing every detail of my plea for this to be forwarded up the chain of command for serious consideration.  It was clear she understood and captured from me that there are many forum complaints that can be found with a search term "PDF File Size" that are getting weak or unacceptable "solutions" to push the slider left or use lower DPI etc.  She also captured my assurances that some competitor brands produce PDF file sizes 20 times smaller for the same image quality.
    I believe that this might actually be opened up at a higher level for consideration of an improvement in the compression algorithm.  My fingers are crossed.  Since I have about 11 months left for warranty support, I plan to contact them once or twice again before it expires, using the same case number to see if there is any progress.
    Bottom line: At my default of 200 DPI and 20% image quality, a single sample page averages 281KB, so a 25-page document creates a PDF file that is 7MB! That will just barely make it past the file size limit for my email provider, but might be too large for the recipient. That is still unacceptable, and is forcing me to consider products other than HP for this business purpose.
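
The arithmetic behind that bottom line can be sketched in Python. The per-page figures come from the posts above (281KB per page here, and under 100K per page reported for other scanners); the function name is made up, and the estimate ignores any fixed per-file overhead.

```python
PAGE_KB = 281           # average size of one scanned page, from the post
OTHER_SCANNER_KB = 100  # upper bound reported for other scanners, per page
PAGES = 25


def document_size_mb(page_kb, pages):
    """Estimate total document size in MB, taking 1 MB as 1000 KB."""
    return page_kb * pages / 1000


hp_mb = document_size_mb(PAGE_KB, PAGES)
other_mb = document_size_mb(OTHER_SCANNER_KB, PAGES)
print(f"HP scan, {PAGES} pages: about {hp_mb:.1f} MB")
print(f"Other scanners, {PAGES} pages: under {other_mb:.1f} MB")
```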

  • PDF File Size Problem

    Hi,
    I am using Adobe Acrobat Elements 7.0 in my computer to convert files like .txt and .doc to .pdf files. But currently I'm facing a problem with the converted .pdf file size. For example, my .doc file is 5MB and after I convert to a .pdf file, it becomes 10MB. The file size basically multiplies itself by 2.
    I've tried converting .txt and .doc files and it gives me the same problem. I've tried reinstalling, tried all kinds of settings, and recreating user profile but it doesn't help. Can anyone help me? Thank you very much.

    Sorry. The problem is caused when I want to password protect a .pdf using elements. So I used Adobe Reader to open the .pdf file then I'll print using Adobe Printer so that I can password protect it and output file size is actually 10 times the normal .pdf file.
    Please help!!

  • Large PDF file sizes when exporting from InDesign

    Hi,
    I was wondering if anyone knew why some PDF file sizes are so large when exporting from ID.
    I create black and white user manuals with ID CS3. We post these online, so I try to get the file size down as much as possible.
    There is only one .psd image in each manual. The content does not have any photographs, just Illustrator .eps diagrams and line drawings. I am trying to figure out why some PDF file sizes are so large.
    Also, why the file sizes are so different.
    For example, I have one ID document that is 3MB.
    Exporting it at the smallest file size, the PDF file comes out at 2MB.
    Then I have another ID document that is 10MB.
    Exporting to PDF is 2MB (the same size as the smaller ID document)... this one has many more .eps's in it and a lot more pages.
    Then I have another one where the ID size is 8MB and the PDF is 6MB. Why is this one so much larger than the 10MB ID document?
    Any ideas on why this is happening and/or how I can reduce the file size?
    I've tried adjusting the export compression and other settings but that didn't work.
    I also tried to reduce them after the fact in Acrobat to see what would happen, but it doesn't reduce it all that much.
    Thanks for any help,
    Cathy

    > Though, the sizes of the .eps's are only about 100K to 200K in size and they are linked, not embedded.
    But they're embedded in the PDF.
    > It's just strange though because our marketing department has an 80-page full color catalog that, when exported, is only 5MB. Their ID document uses many very large .tif files. So, I am leaning toward it being an .eps/.ai issue??
    Issue implies there's something wrong, but I think this is just the way
    it's supposed to work.
    Line drawings, while usually fairly compact, cannot be lossy compressed.
    The marketing department, though, may compress their very large TIFF
    files as much as they like (with a corresponding loss of quality). It's
    entirely possible to compress bitmaps to a smaller size than the
    drawings those bitmaps were made from. You could test this yourself.
    Just open a few of your EPS drawings in Photoshop, save as TIFF, place
    in ID, and try various downsampling schemes. If you downsample enough,
    you'll get the size of the PDF below a PDF that uses the same graphics
    as line drawing EPS files. But you may have to downsample them beyond
    recognition...
    Kenneth Benson
    Pegasus Type, Inc.
    www.pegtype.com
