OCR - possible double downsampling?

Hi,
I'm experimenting to find the best settings for digitizing papers for archival. Right now, I have them scanned at 300 dpi, converted to PDF using Acrobat, and then OCR'd in Acrobat so that I can search, highlight, etc. I prefer the retain-image style of OCR (as opposed to creating a new document, which software like Omni does), since it preserves the look of the original document.
I chose the Searchable Image option, with 300 dpi under Downsample. I cannot choose Searchable Image (Exact), since that would prevent Acrobat from automatically rotating any slanted docs, which is a very nifty feature.
However, as it turns out, the quality of the image seems to have degraded. It's definitely more pixelated. When printed out again, it is noticeably worse than the original.
I suspect that when the downsample option (300 dpi) was selected, Acrobat recompressed the image.
What do you think? It doesn't appear possible to select a "do not downsample, just leave it as it is already downsampled" option. It doesn't make sense to choose a higher dpi value either, since the file size will increase but the quality will be the same (or perhaps even worse).
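
To put rough numbers on the file-size point, here is a minimal sketch of the pixel arithmetic only. It assumes a US Letter page (the post does not state a page size), the helper names are made up for illustration, and it does not model what Acrobat actually does during OCR.

```python
# Hypothetical helper names; the US Letter page size is an assumption, not from the post.
LETTER_INCHES = (8.5, 11.0)

def pixel_dimensions(page_inches, dpi):
    """Pixel width and height of a page sampled at the given dpi."""
    width_in, height_in = page_inches
    return round(width_in * dpi), round(height_in * dpi)

def pixel_count(page_inches, dpi):
    """Total number of pixels for a page sampled at the given dpi."""
    width_px, height_px = pixel_dimensions(page_inches, dpi)
    return width_px * height_px

scan = pixel_dimensions(LETTER_INCHES, 300)       # the original 300 dpi scan
upsampled = pixel_dimensions(LETTER_INCHES, 600)  # same page stored at 600 dpi
growth = pixel_count(LETTER_INCHES, 600) / pixel_count(LETTER_INCHES, 300)

print(scan, upsampled, growth)  # (2550, 3300) (5100, 6600) 4.0
```

Doubling the dpi quadruples the number of stored pixels without adding any detail that was not in the 300 dpi scan, which matches the intuition above.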

Hi,
No problem. I've uploaded them on Box.net.
Sample scanned image - Before OCR.pdf - http://www.box.net/shared/nodbkj9i2b
Sample scanned image - After OCR.pdf - http://www.box.net/shared/bllp1m0m1p
Comparison - at 800%.png - http://www.box.net/shared/k0g3rkpuyu
Comparison - at 2400%.png - http://www.box.net/shared/roetu42pxo
Thanks,
Jay

Similar Messages

  • Is it possible to downsample an image with javascript?

    A colleague of mine has created a PDF with functionality in place to allow the user to add pictures into predetermined locations in the document.  His concern is that without the ability to downsample these photos automatically, the users may end up attaching photos from their 10+ MP cameras and this would result in enormous PDF documents.
    Is it possible to downsample these images automatically with javascript?  The dimensions of the images can remain unchanged, but the resolution must reduce to bring down the potential size of the resulting PDF.
    Any advice or pointers would be welcome.
    Thank you

    Are there any other options you can think of to accomplish this using the Acrobat API, and not necessarily using JavaScript exclusively? Could a plug-in accomplish this?

  • Possible to downsample on transfer to iPod

    I keep all my files in iTunes in uncompressed format for playback through AirTunes. Is it possible, when transferring them to my iPod, to downsample them on the fly? My Shuffle does this by default (it can't play uncompressed files), but can I set up iTunes to do it with my iPod photo?

    The option to convert on the fly, is not available as a separate setting in iTunes.
    You will have to do it manually, or maybe this script can be useful.
    M
    17' iMac fp 800 MHz 768 MB RAM   Mac OS X (10.3.9)   Several ext. HD (backup and data)

  • Possible to downsample only certain albums when transferring to device?

    Hey guys,
    I have a question regarding the iTunes feature that lets you downsample higher-bitrate songs when transferring to your device. I'd like to be able to downsample only CERTAIN albums when transferring, instead of all of them. Right now I have it set to 256 AAC for transferring to my iPad, but because of the audio equipment that I use, I can tell the difference between lossless and AAC, and I would like to keep certain albums as lossless on my iPad.
    Is this even possible?
    Thanks!

    Thanks for that link!
    It doesn't seem like it would be too advanced to do. If I think about it, it should be pretty simple. In fact, it almost looked like you could do it, because there is a checkbox that says "only sync checked songs/albums". I thought maybe I could check that and then manually sync the others at whatever bitrate they were at... but no.
    There could be an "Advanced" section at the bottom in a drop-down menu with these options... I'll make a feature request.
    Thanks

  • Double figure for Elimination

    Hi all,
    This is my detailed explanation of the issue.
    When I run balance carryforward for period 1/2008, the figure for posting level 20 is carried forward (RM27,441.00). This is the IU figure from period 16/2007. But after I run the IU Elimination task, the carryforward figure is displayed as if it were posted twice (RM54,882.00).
    i.e
    C4000 15053100 C4400 00      RM 27,441.00 RM 27,441.00
    C4000 15053100 C4400 00      RM 25,855.92 RM 25,855.92
    C4000 15053100 C4400 20 E1 RM 0.00         RM 54,882.00
    C4000 15053100 C4400 20 E1 RM 0.00         RM 25,855.92-
    My method config for IU Task.
    Use for selection 1 and 2 (Two sided elimination)
    Consolidation Company [Our subsidiaries]
    Item [FS Item]
    Trading Partner [Our subsidiaries]
    Posting Level [00-12]
    Document Type
    Posting Level - Two-sided Elimination Entry
    Balance Check - Error when balance not equal to zero
    Application - Other
    Posting - Automatic Posting
    Inversion - Automatic Inversion Also in New Fiscal Year
    Hopefully you will be able to help me solve this issue.
    Thank you.
    Azie

    Hi Azie,
    First of all, you ought to find out whether these possibly doubled amounts came from the BCF or not.
    If not, then I would thoroughly examine the entries of the I/C elimination task.
    Several times I have met situations where a wrong strategy for writing differences led not to an elimination but to just the opposite: the doubling of the difference. To me, that looks very much like your situation.

  • ClearScan OCR in Acrobat 9 deletes portions of text

    I am experimenting with book scanning using a digital camera and various software including Acrobat 9.  I discovered that in some cases, when I perform OCR with ClearScan, apparently random portions of the text in the scanned PDF image are deleted.  Sample1.jpg shows a page before ClearScan OCR, and Sample2.jpg shows it after.  As you can see some of the text has been deleted.  How could this happen?  And, how can it be prevented?

    I'm having the same issue with a scanned document that I am trying to convert to a ClearScan PDF. I have found a workaround, which involves selecting "Optimize Scanned PDF" and using the default setting with the dial set to "high quality" before performing the OCR. This seems to stop chunks of words from disappearing (at least I haven't found any cases), but it has the horrible consequence of dramatically increasing the file size when performing OCR with ClearScan. In a small file this may not matter much, but in my case it makes a 10MB PDF increase to 70MB.
    Can anyone confirm if this also happens on Acrobat X?
    From what I can tell, there are no preferences to control the OCR besides the downsampling. Specifically, I would really appreciate it if anyone knew a way to reduce the size of the fonts that are generated by the ClearScan OCR as they account for over 95% of the size of the file.

  • OCR with HP printer Photosmart 2570 all-in-one

    I prepare a newsletter with Pages once a quarter, and it would help if I could use OCR to avoid having to type some of the contributions.
    When I open the Device Manager of the HP printer, it offers 'Make copies', 'Scan to OCR' and 'Scan Picture'. Scan to OCR, when double-clicked, says "Scanner not found error". This used to happen with Scan Picture as well, but by going to Preview and choosing 'Import from Scanner' you can get a picture scanned. The OCR option still doesn't work.
    The scanner will scan text as well, of course, but how do you activate the OCR software (if it exists)?
    Alternatively, is there another easy, inexpensive way to perform the OCR function?
    I have tried HP, but their website is a bit of a maze. I also found that Microsoft has a new Office for Mac 2011, an upgrade from the 2008 version, with a process called "Document Imaging", but this may not be in the Mac version (it was left out of the 2008 version).
    The scanning function on my present printer gives very good results; I just need to transfer the scan to a Mac-friendly OCR programme.
    I am open to suggestions, and will be very grateful for a solution.
    Thank you all,
    Eric

    Hello anyosua,
    Welcome to the HP Forums.
    I see that you are having an issue with the software and drivers for the printer.
    So I can better assist you, please respond with which Operating System you are running:
    Which Windows Operating System am I running?
    Mac OS X: How Do I Find Which Mac OS X Version Is on My Computer?
    Thanks for your time.
    Cheers,  
    W a t e r b o y 71
    I work on behalf of HP

  • TIFF files and OCR

    I previously set up SharePoint 2013 on Windows Server 2008 R2. In that scenario, TIFF files appeared to be indexed by default and the parser ran OCR on them (which was very slow).
    I have since built a new SharePoint 2013 farm on Windows Server 2012 and, after some research, it appears the TIFF OCR engine doesn't exist under Windows Server 2012 and TIFF files are not indexed by default. As such, I expect OCR of TIFF files under SharePoint 2013 will no longer work.
    Is this correct?
    BTW: This isn't bad; it is actually the preferred behaviour. However, I just need to confirm it.

    http://technet.microsoft.com/en-us/library/dd744701%28v=ws.10%29.aspx
    Forcing optical character recognition of every page of a TIFF image document
    This setting bypasses Windows TIFF IFilter performance optimization mechanisms that are designed to skip the OCR processing for images that do not contain text.
    To force OCR of every page of a TIFF image document:
    1. Open the Local Group Policy Editor as follows: click Start, type gpedit.msc in the Start Search text box, and then press ENTER.
    2. Under Computer Configuration, expand Administrative Templates.
    3. Expand Windows Components, expand Search, and then click OCR.
    4. Double-click Force TIFF IFilter to OCR every page in a TIFF document.
    5. Click Enable, and then select one or more languages.
    6. Click OK.
    http://technet.microsoft.com/en-us/library/dd755985%28v=ws.10%29.aspx

  • Double encryption in Time Machine?

    I used Disk Utility to wipe a disk and initialize it as an encrypted volume to be used for Time Machine backup. But in the Time Machine preferences, there's also an option to encrypt the destination. I'm a bit confused now:
    If I use the encrypted disk AND choose the Time Machine encryption option, will everything be encrypted twice? Or will Time Machine simply use the encrypted volume without further ado? And what if I don't check the encryption option in Time Machine? Will it convert the disk to an unencrypted format?
    Which is the best way? Using encrypted Time Machine backup on a regular disk, using "normal" Time Machine on an encrypted disk, or using encrypted TM on an encrypted disk? Or is it all just the same? I would assume the latter, but I want to be sure.
    I just want to avoid a possible double encryption (which would be slow and pointless), but I do want to make sure everything is really encrypted.
    Thanks for the info,
    Michel Colman

    Michel Colman wrote:
    I used Disk Utility to wipe a disk and initialize it as an encrypted volume to be used for Time Machine backup. But in the Time Machine preferences, there's also an option to encrypt the destination. I'm a bit confused now:
    If the disk is encrypted, when you select it as the destination, the Encrypt backup disk box should be checked automatically.
    If I use the encrypted disk AND choose the Time Machine encryption option, will everything be encrypted twice?
    No.
    Or will Time Machine simply use the encrypted volume without further ado?
    Yes.
    And what if I don't check the encryption option in Time Machine? Will it convert the disk to an unencrypted format?
    I believe so, but don't recall if I tested that.
    Which is the best way? Using encrypted Time Machine backup on a regular disk, using "normal" Time Machine on an encrypted disk, or using encrypted TM on an encrypted disk? Or is it all just the same? I would assume the latter, but I want to be sure.
    Yes, encryption is encryption.
    See Time Machine - Frequently Asked Question #31 for the gory details.

  • Subtracting very small double value

    I'm having trouble figuring this out. I'm sure it's a simple solution. Anyhoo, I'm trying to run the following line of code:
    value=1-((1-b1[8+(i%classes)])*(1-b2[4+(i%classes)]));
    Let me point out that value is a double, and b1 and b2 are double arrays.
    The values in b1 and b2 are very small (around E-60).
    The PROBLEM is that value comes out as 0.0 when I run my code, instead of the small value. I'm sure it's something I'm doing or not doing, but I'd appreciate some help on what is going on. Also, as a note, I'm running J2SE 1.3.1. I saw in the API docs that the minimum double value is supposed to be around E-324. Any suggestions, people? Thanks!!

    1-b1[8+(i%classes)]
    This is 1 - (10^-60), right? A double only has about 16 significant decimal digits, not 60, so the nearest representable double to that is exactly 1. Hence you get 1 - 1*1, which is 0. There's a whole branch of mathematics called "numerical analysis" that deals with questions like this. In this particular case, if b1[x] and b2[x] are always extremely small, you should expand the expression algebraically beforehand so that it reads b1 + b2 - b1*b2, which in turn will round to b1 + b2.
    PC²
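
A minimal sketch of the effect described above, written in Python rather than Java only because Python's float is the same IEEE 754 double. The function names and the sample values of 1e-60 are illustrative assumptions, not the poster's data.

```python
def naive(b1, b2):
    """Original form: each (1 - b) rounds to exactly 1.0 when b is tiny."""
    return 1.0 - (1.0 - b1) * (1.0 - b2)

def rearranged(b1, b2):
    """Algebraically identical form that never subtracts from 1."""
    return b1 + b2 - b1 * b2

# Illustrative inputs of the magnitude mentioned in the question.
b1 = 1e-60
b2 = 1e-60

print(naive(b1, b2))       # 0.0
print(rearranged(b1, b2))  # 2e-60
```

The minimum double of about E-324 is not the limiting factor here. The issue is that 1 - 1e-60 needs about 60 significant digits, and the rearranged form avoids ever forming that intermediate value.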

  • FCP scaling/aspect ratio motion algorithms make good video soft?

    I have tried everything, but converting 16:9 footage into a 4:3 timeline in Final Cut Pro makes the letterboxed video soft. I have, of course, double-checked against an external NTSC broadcast monitor. The original 16:9 clips look great; when converted to a letterboxed 4:3 timeline, they go soft. (I believe it is FCP's ineptness in handling the pixel aspect ratio conversion from 1.2 to 0.9, but I would love to be wrong at this point.) I know FCP has poor scaling algorithms to begin with, but this is a very obnoxious issue when you capture clips widescreen from the deck and your final output needs to be edited and taped off in letterboxed format for SD broadcast.
    I really want to avoid losing any quality when going from 16:9 to 4:3 letter-box, so is there a plugin that can be used to handle the rescaling that does a better job than just FCP by itself? These are pretty complicated edits, (television show) so avoiding having to hop over to After Effects for a second huge render would be preferable. I have exhausted Google looking into this, lol, anyone have any ideas?
    Thanks All,
    Dustin Hoye
    Editor
    Sour Squirrel Studios, llc.

    Patrick,
    Thanks for the quick response. My motion filtering quality is set to "Best". I have looked over the link you provided (I will examine it more thoroughly, but I saw your response and wanted to re-post to answer your question). I also tried switching to "Fastest (linear)" just to see if it would eliminate the softness/possible double lines, but of course, no luck. (Also, just to clarify, this is all being done in SD.)
    I would consider using Compressor or (After Effects for that matter), but I am editing using the letter box conversion because we occasionally are mixing in past regular 4:3 footage with the letter boxed 16:9. (the reg 4:3 being a minority of the clips) It is easier just to matte those 4:3 clips to match the converted 16:9 since there are fewer of them. If I fed the timeline to compressor then those clips (the reg. 4:3) would get interpreted incorrectly. I suppose I could just leave them out and re-insert them later, but depending on the number of them and how they are used that could be quite a work-around. It would be great if I could just get FCP to interpret the conversion internally (i.e. plugin) w/out causing quality loss. Sigh.
    Everything else is set to 100%; editing 10- bit uncompressed, 10-bit material in High Precision YUV. I have tried multiple combinations of capture, codecs and vid processing options, but all seem to have the same result.
    Thanks Again,
    Dustin
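
For reference, the scaling implied by the 1.2 to 0.9 pixel aspect ratio change mentioned in this thread can be worked out as below. This is only a sketch of the arithmetic: the 480-line frame height is an assumption (the thread says NTSC SD but gives no frame size), the function name is made up, and nothing here describes how FCP's scaler works internally.

```python
from fractions import Fraction

def letterbox_active_lines(frame_lines, source_par, target_par):
    """Lines of picture left when wide-pixel footage is letterboxed into the frame."""
    return frame_lines * Fraction(target_par) / Fraction(source_par)

FRAME_LINES = 480  # assumed NTSC SD frame height, not stated in the thread

active = letterbox_active_lines(FRAME_LINES, "1.2", "0.9")
bar = (FRAME_LINES - active) / 2  # black bar height above and below the picture

print(active, bar)  # 360 60
```

Under these assumptions the picture is squeezed from 480 lines to 360, a 75% vertical resample, so some loss of vertical detail is expected from any scaler. How soft the result looks depends on the filter used.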

  • Separate line item for taxes in accounting document of MIGO

    All SAP Gurus,
    We can have a separate line item for the freight amount (freight amount credited) in the accounting document for MIGO.
    Similarly, is it possible to have a separate line item for the tax amount that is inventoried?
    Regards,

    Hi Rajan,
    Freight is from pricing procedure whereas tax is from taxing procedure.
    Tax cannot be shown in separate g/l in MIGO.
    Reason:
    If you want to show tax as a separate item, then we need to debit the tax account in MIGO, and it also has to go to inventory, which is not possible (double accounting entry on the debit side).
    I too tried to do the same by specifying 2 for the posting indicator and selecting non-deductible for the non-setoff account key used in the taxing procedure in OBCN, but it behaves like setoff tax.
    You can also try doing the same and then check.
    Reg
    Raja

  • Editing titles or clicking on clips causes serious error

    Problem with new mac mini (purchased two days ago), no problem on iMac (with
    same software install, checked with system profiler). Tried with two mac minis
    (replaced the just purchased one with a replacement after a discussion with a genius)
    Mac OS X : 10.5.7
    iMovie '09 : 8.0.2 (741)
    Editing titles/text or changing title/background lengths causes the problems below. They could also be associated with generally clicking on clips (possibly double-clicking).
    1) Loss of control of inspector window
    2) Inability to share the video (the options are greyed out)
    3) Inability to return to the project selection window
    4) Inability to add more titles/backgrounds
    5) Inspector seems trapped in a strange state, unable to close
    iMovie reports the following messages via console(as example)
    18/05/2009 16:23:11 iMovie[325] NSMutableRLEArray objectAtIndex:effectiveRange:: Out of bounds
    this error repeats many times.
    Anyone please help!

    Since this is a new rig, I'm assuming this is a PPro CS6 installation you're referring to?    

  • Edit Cell Numeric Attributes may show strange default values

    Those values look a bit surprising... I did not type them myself... It happens if you change the data type let's say from unsigned char to double...
    But it's not a meaningless number: if you now type -1e1000 as a new minimum value, CVI 'corrects' it to -9.2E+18...
    It also can behave properly, sometimes, but shouldn't it always?
    It behaves better if instead of -1e1000 you first type -100. If you then enter -1e1000 it will correct the number to -1.0E+300
    (well, it probably should be E+308...) but I'd better stop here.

    Hi Wolfgang,
    When I first open the Edit dialog, the cell's data type is int64 and the range is (-9223372036854775808, 9223372036854775807). Those happen to be the largest possible values that an int64 can hold, and they are probably there because they were coerced down from the default double range (-Inf,+Inf), in a previous editing session.
    You then changed the data type to char, which changes the range to (-128,127), the coerced range of a char. But the dialog still remembers that your initial preference was (-9223372036854775808, 9223372036854775807), and so, when you then change the data type to double in that same editing session, it restores this range again, since it is now valid for the new data type. The only difference is that it displays it in scientific notation, since doubles automatically use this notation when the exponent goes above a certain size.
    It looks as if you then tried to coerce it back to the largest possible range of a double by typing 1e1000. This is a reasonable assumption on your part, but unfortunately it is a limitation of CVI that it cannot coerce a double entered in scientific notation -- it doesn't recognize 1e1000 as a valid number. Therefore, it keeps whatever value was in the control beforehand. If you had typed 1e308 instead, it would have worked, since that is a valid number. I recognize that it is very inconvenient for you to "guess" what the maximum number might be, so that you can enter it. All I can say is that if you really want to enter the highest possible double value in the max/min controls, you should enter "-Inf" and "+Inf" instead.
    Luis
    Given what I wrote above, the following shouldn't happen (and I wasn't able to reproduce it): "It behaves better if instead of -1e1000 you first type -100. If you then enter -1e1000 it will correct the number to -1.0E+300". It should leave -100 in the control.
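
The numbers in this explanation can be checked against plain IEEE 754 doubles. The sketch below uses Python, whose float is a 64-bit double. It only illustrates the value ranges involved and says nothing about how CVI parses or coerces input; the constant names are made up for illustration.

```python
import math
import sys

INT64_MIN = -2**63      # -9223372036854775808
INT64_MAX = 2**63 - 1   #  9223372036854775807

# The int64 limits shown as doubles: this is where the 9.2E+18 figures come from.
print(float(INT64_MIN))  # -9.223372036854776e+18
print(float(INT64_MAX))  # 9.223372036854776e+18

# The largest finite double is about 1.8E+308, so 1e308 is representable.
print(sys.float_info.max)                    # 1.7976931348623157e+308
print(float("1e308") <= sys.float_info.max)  # True

# 1e1000 is beyond the finite range; Python maps it to infinity instead of rejecting it.
print(math.isinf(float("1e1000")))           # True
```

This agrees with the advice above: 1e308 is a valid finite double, while 1e1000 is not, so "-Inf" and "+Inf" are the natural way to express an unbounded range.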

  • Mail crashes when attempting to view emails

    I have been seeing most of the threads on Mail crashing and have attempted all the different possible solutions, but nothing has worked yet.
    I only have one email account in Mail (Gmail). When I attempt to view emails, Mail simply breaks down and closes.
    Message view is simply not possible. I can reply to emails, but viewing them in the inbox is not possible; double-clicking makes it crash.
    Anyone with an idea?
    Process:         Mail [1053]
    Path:            /Applications/Mail.app/Contents/MacOS/Mail
    Identifier:      com.apple.mail
    Version:         7.0 (1816)
    Build Info:      Mail-1816000000000000~1
    Code Type:       X86-64 (Native)
    Parent Process:  launchd [151]
    Responsible:     Mail [1053]
    User ID:         501
    Date/Time:       2013-10-27 08:24:57.073 +0100
    OS Version:      Mac OS X 10.9 (13A603)
    Report Version:  11
    Anonymous UUID:  05ECDB88-6FEE-C3F4-9595-78842B4637DD
    Sleep/Wake UUID: 243D9E70-D096-4AFE-8F35-255339133C0F
    Crashed Thread:  0  Dispatch queue: com.apple.main-thread
    Exception Type:  EXC_CRASH (SIGABRT)
    Exception Codes: 0x0000000000000000, 0x0000000000000000
    Application Specific Information:
    *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: '-[NSViewController loadView] could not load the "MessageView" nib.'
    abort() called
    terminating with uncaught exception of type NSException

    Greetings,
    I am having the same problem with my wife's MacBook Pro. Here is the crash report... Any ideas or help out there?
    Process:         Mail [1740]
    Path:            /Applications/Mail.app/Contents/MacOS/Mail
    Identifier:      com.apple.mail
    Version:         7.0 (1816)
    Build Info:      Mail-1816000000000000~1
    Code Type:       X86-64 (Native)
    Parent Process:  launchd [220]
    Responsible:     Mail [1740]
    User ID:         501
    Date/Time:       2013-10-27 11:41:54.526 -0600
    OS Version:      Mac OS X 10.9 (13A603)
    Report Version:  11
    Anonymous UUID:  B514437F-961F-36BD-DB70-CF83E04A1312
    Crashed Thread:  15  -[MFSnippetManager _calculateSnippetForMessages]  Dispatch queue: NSOperationQueue Serial Queue
    Exception Type:  EXC_BAD_ACCESS (SIGBUS)
