Converting PDF/A to PDF while combining images
Thanks in advance for your assistance!
I have a number of PDF/A files that I need to convert to standard PDFs. However, here's the wrinkle: each file has separate images that are layered over each other. I really need one image that has all the separate images combined. Further information: there's embedded text that needs to be preserved. An example image is here. An example of what it needs to look like is here.
How do I do this in Acrobat Pro XI? I've tried a bunch of different things but had no success. I was hoping someone with more experience has some suggestions. Thanks!
Well, the situation is this. We have an OCR program that reads TIFFs of historic newspapers and makes PDFs of them. Then we have another piece of software that creates an XML file (containing the text and related metadata) and a PNG for screen display, using the PDF as the source. These are then ingested into a third software package that makes the newspaper pages searchable and displays the PNG via the web. It then offers the PDF as a download.
This works for us BUT we upgraded the OCR program and accidentally changed the output from PDF to PDF/A. For whatever reason, the PDFs are output with a single image, but the PDF/A output takes any pictures or drawings and makes them separate image layers while "cutting" the pictures out of the main image. The software that creates the XML can only process the first image it finds in the PDF. We didn't detect the change until we had OCRed about 30,000 pages.
We unfortunately pay per page for the OCRing. So I'm hoping to fix the incorrect files rather than going back, re-OCRing the whole batch, and having to pay again for the same pages. We corrected the process and now PDFs come out of the OCR with just one image. So moving forward we're fine.
So that's why I'm trying to find a way to flatten the images.
Similar Messages
-
Tag structure lost while combining multiple PDF files
Hi All,
As per client requirements, we have been using ABBYY FineReader to create PDFs. After creating multiple PDFs of a huge book (separated logically into units consisting of different chapters), we have tagged them correctly.
We now need to put these tagged PDFs together. However, when combining multiple tagged files, we lose the tag structure! (using Acrobat 8 Professional)
Any idea, why are we losing our tags in the combined file? Any help in this matter would be highly appreciated!
Thanks in advance!
Mamta
Hello Steve (and all),
We are combining the files using the File --> Combine Files... dialog box. We also tried Document --> Insert Pages. In both cases, the tag structure of the files just doesn't appear!
While combining files we get the following:
Tags
--- Document
-----Part (Empty)
-----Part (Empty)
-----Part (Empty)
(The number of "Part" corresponds to the number of files I try to combine)
While Inserting the pages into a file, we only get to see the tag structure of the first file into which we insert the pages.
Are there any other methods to combine the files???
Regards,
Mamta -
I purchased the Adobe PDF Pack for $7.50/month (or $89.99/yr) but it won't allow me to convert any files to pdf or combine multiple files into 1 pdf form. It keeps saying "An error occurred while trying to access the service". WHY is this happening & How can I fix it?
Hi hpmg,
It seems you are trying to access the service via Adobe Reader. Make sure you are signed in with your Adobe ID. Is Reader updated to the latest patch v 11.0.09?
It might be possible that a firewall or antivirus might be breaking the connectivity from Reader to the server.
Try accessing the service from the browser: https://cloud.acrobat.com/exportpdf and check if that works for you.
Regards,
Rave -
Combining images to PDF in folder based on prefix
I have an arbitrary folder with images in it, named like this:
00_01.tiff
00_02.tiff
00_03.tiff
00_03.tiff
01.tiff
02-1_01.tiff
02-1_02.tiff
02-2_01.tiff
02-2_02.tiff
I want to combine all the files starting with 00 into a new pdf 00.pdf, all starting 01 to 01.pdf, all starting with 02-1 to 02-1.pdf and all starting with 02-2 to 02-2.pdf
In short the rule would be:
if text 3 in fileName = "_" then
set newFileName to text 1 thru 2 of fileName
else
set newFileName to text 1 thru 4 of fileName
end if
It is easy to combine files into PDFs with Automator, but introducing this logic and doing the combining in AppleScript is out of my league.
Does anybody have some suggestions on how I should go about solving this?
Best regards,
Knut
Jacques Rioux wrote:
Should the script search in the subfolders?
No, I did not intend it to search through subfolders. I have thought about maybe first putting each group of files into subfolders, and then iterating through the folders and doing the PDF conversion then. That would be quite close to what the script you wrote is doing. More about that at the end of this reply.
Do you have multiple image formats (the name extension) to find in the source folder or in subfolders?
No, there will only ever be .tiff images in the folder. The contents of the folder are created by another script (as part of a bigger workflow).
Do you have any PDF documents or PDF images to find in the source folder or in subfolders that you want to combine?
No, there will only be images in .tiff format. I mistyped in my previous post - it should have said .tiff there too, and not .pdf. Sorry about that.
xx.tif and xx_y.tif: Can the number in xx be the same, or is it always different?
If the answer is not always different (11.tiff and 11_2.tiff), should the script create one PDF or two PDFs?
If the answer is two PDFs, how should the new PDF files be named? --> (11.pdf and 11_2.pdf) both would have to be named "11.pdf"
I will remove the xx.tiff as a possibility. We will only have xx_yy.tiff or xx-z_yy.tiff to care about, and not xx.tiff.
I will give you some background information, maybe it is easier when understanding the use case. You can skip to the table below if this is not relevant. The reason for this naming convention, is so that yet another script can finalize the workflow later, when doing renaming based on the output name of the created PDF-file. My workflow is for digital archiving of sheet music.
The entire workflow is like this:
1. Scanning the sheet music.
2. Converting pdf-files to tiff.
3. Cleaning, straightening, and converting from grayscale to b/w with ScanTailor.
4. Renaming output files to the naming scheme I describe.
5. Converting the tiff images to pdf, per part. (this is the part I'm asking about here)
6. Renaming all the files to fit the archiving scheme.
7. Grouping and sorting the files, and creating zip-archives.
8. Archiving the entire digital copy of the set in our digital archive.
For our case, we're only interested in item 5 in the list. Another series of scripts and Automator workflows take care of the rest.
Each music piece consists of several parts.
In the filing system I have created, each part has a prefix with the xx-number, like this:
00 = Score
01 = Soprano cornet
02 = Solo cornet
17 = BbBass
18 = Percussion
The final files will have a name like (only an example):
02_SoloCn_31_SilentNight_Gordon.pdf
For some pieces there are more than one solo cornet part. That's why we get the z-component, to differentiate between the parts:
02-1 = Solo cornet 1 --> will get filename 02_SoloCn1_31_SilentNight_Gordon.pdf
02-2 = Solo cornet 2 --> will get filename 02_SoloCn2_31_SilentNight_Gordon.pdf
Many pieces have more than one page per part. That is where the yy component of the file name comes in. The only reason for the yy component is to keep all the pages in the correct order, to make it easy to combine them programmatically:
02-1_10.tiff = Solo cornet 1, page 1
02-1_11.tiff = Solo cornet 1, page 2
02-1_12.tiff = Solo cornet 1, page 3
Here is a table with the pattern, example of a possible filename and what the output pdf should be called:
Pattern        Example file names                                Output pdf
xx_yy.tiff     01_10.tiff, 01_11.tiff, 01_12.tiff, 01_13.tiff    01.pdf
xx_yy.tiff     12_10.tiff, 12_11.tiff, 12_12.tiff                12.pdf
xx-z_yy.tiff   02-1_10.tiff, 02-1_11.tiff, 02-1_12.tiff          02-1.pdf
xx-z_yy.tiff   02-2_10.tiff, 02-2_11.tiff, 02-2_12.tiff          02-2.pdf
Is it possible that there would be other characters after the pattern? --> example: 33_2test.tiff, 40-4_99ball.tiff
No, all the files will have just the components described above in the file name. No need for exception handling, since this is a tightly controlled workflow.
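For illustration only, the naming rule described above can be sketched in Python rather than AppleScript. This is a minimal sketch, assuming every file name follows xx_yy.tiff or xx-z_yy.tiff exactly; the function names part_prefix and plan_pdfs are made up for this example, and the sketch only works out which pages belong to which output PDF without doing any conversion.

```python
from collections import defaultdict
from pathlib import PurePath


def part_prefix(filename: str) -> str:
    """Return the part prefix, i.e. the text before the first underscore."""
    stem = PurePath(filename).stem  # "02-1_10.tiff" -> "02-1_10"
    return stem.split("_", 1)[0]


def plan_pdfs(filenames):
    """Map each output PDF name to its page files, sorted by file name."""
    pages = defaultdict(list)
    for name in sorted(filenames):
        pages[part_prefix(name) + ".pdf"].append(name)
    return dict(pages)


if __name__ == "__main__":
    example = ["02-1_11.tiff", "12_10.tiff", "02-1_10.tiff", "12_11.tiff"]
    for pdf_name, page_files in plan_pdfs(example).items():
        print(pdf_name, page_files)
```

Splitting on the first underscore avoids having to check whether the third character is "_" or "-", since the prefix is simply everything before the page number.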
As I mentioned earlier in this reply, I have played with the thought of first running a script to sort all the files for one part into a subfolder with the correct name.
This I can do with the following script (I guess it would be more efficient with a shell script or using System Events to do the moving):
tell application "Finder"
    set thisfolder to (choose folder)
    set mylist to every file in thisfolder as alias list
    repeat with eachfile in mylist
        set filename to name of eachfile
        -- xx-z_yy.tiff has "-" as its third character, so its prefix is four characters long
        if (character 3 of filename is "-") then
            set prefix to characters 1 thru 4 of filename as text
        else
            set prefix to characters 1 thru 2 of filename as text
        end if
        -- reuse the subfolder for this prefix, or create it if it does not exist yet
        try
            set newfolder to ((thisfolder as text) & prefix) as alias
        on error -- file not found
            set newfolder to make new folder at thisfolder with properties {name:prefix}
        end try
        move eachfile to newfolder
    end repeat
end tell
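As a rough alternative to driving the Finder, the same sorting step could be sketched in Python using only the standard library. This is an assumption-laden sketch, not a tested replacement for the script above: the function name sort_into_part_folders is hypothetical, the folder is expected to contain only .tiff files named xx_yy.tiff or xx-z_yy.tiff, and subfolders are not searched.

```python
import shutil
from pathlib import Path


def sort_into_part_folders(folder: Path) -> None:
    """Move each .tiff file in `folder` into a subfolder named after its part prefix."""
    # sorted() builds the full list first, so moving files does not disturb the loop
    for tiff in sorted(folder.glob("*.tiff")):
        prefix = tiff.stem.split("_", 1)[0]  # "02-1_10" -> "02-1"
        target = folder / prefix
        target.mkdir(exist_ok=True)  # create the part folder on first use
        shutil.move(str(tiff), str(target / tiff.name))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python sort_parts.py /path/to/folder
    sort_into_part_folders(Path(sys.argv[1]))
```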
After this, all the files have moved to folders with the folder name corresponding to the part number, like this:
01/01_10.tiff
01/01_11.tiff
02-1/02-1_10.tiff
02-1/02-1_11.tiff
and so on...
It would perhaps be easier to extend on this script, or create a new one, that will iterate each subfolder in a folder, and run the Automator action for combining images to files with each iteration?
What do you think? I'm sorry if this was too much information. Do not hesitate to ask if something is unclear (I bet it is!).
I really appreciate your help, Jacques!
Knut -
I'm trying to drag PDF files I have and combine them into one PDF file in the account I just purchased with Adobe. When I drag a PDF file over, Adobe doesn't accept it. It says it cannot convert this type of file, but it is an Adobe file. Do I need to change it into some other format before dragging it?
Hello djensen1x,
Could you please let me know which version of Acrobat you are using?
Also, could you describe your workflow for combining those PDF files?
Please share the screenshot of the error message that you get.
Hope to get your response.
Regards,
Anubha -
Error: EXOpenExport() while converting xlsb file to PDF
I am getting this error while converting .XLSB file to PDF using IBR :Step PDFExport forced conversion failure by conversion engine because of error: EXOpenExport() failed: no filter available for this file type (0x0004). Can anyone tell me how to resolve this issue.
Thanks in advance.
Thank you for all your interest.
The error message I get is... "an unexpected error occured. PDFMaker was unable to process the request".
When I then try to convert again, the system opens the ppt, but then does not do anything, and the error message does not show, for all presentations I try to convert.
I have also tried using the ribbon, but the button does not do anything once I have clicked.
I can print to PDF, but this produces output that is not suitable, because it looks as if each slide was printed onto paper and then put into a PDF.
I have also uninstalled and reinstalled Acrobat to make sure nothing was corrupt. -
Distorting text/images in converting from pmd to pdf
My company uses PageMaker 7.0 to create our manuals and brochures and then converts them into PDFs. I've run into this problem before and found that if I didn't properly place or import an image (rather than copying and pasting an image or text), then after converting to PDF only a portion of the image would be there, or only every third letter would appear. We thought we had corrected it. The creator of this last document saw everything correctly, but when another employee opened the PDF file over the network it was distorted and wrong. I opened the document and used "Save As" to save it in My Documents, and then everything seemed fine on my computer, but then the creator opened the file on his computer and it was wrong. The link info on some of the images is wrong even though they were placed properly. The text was entered in PageMaker, but in the PDF it is missing letters, which causes odd spacing. I'm at a loss now on how to correct this. I hate to upgrade when PageMaker is more than sufficient for the basic manuals we do. I did notice that I was using Reader 9 whereas they still had 8 on their computers. I've already had them upgrade, but I didn't think this would be a real issue. Any ideas on the problem or how to correct it would be greatly appreciated!!
Look in the application where the users are generating the PostScript.
Many of them have an option "print last page first", so this naturally
produces a backwards PDF.
Aandi Inston -
I am using Adobe Reader with paid up service through May 2015. When I attempt to convert a file either to PDF or from PDF, I get the error message, "An error occurred while trying to access the service". What do I need to do to get access to the service I have paid for?
Hi DeaconTomColorado,
Please see "Error occurred when trying to access this service" when logging on to Acrobat.com.
Adobe has just released an update to Adobe Reader, so if you're accessing the service via Reader, please let us know whether the update helps resolve the issue.
Best,
Sara -
After 9.3 update getting error printing to Adobe PDF while converting from Powerpoint to PDF
This happened after upgrading Acrobat Pro from 9.1.2 to 9.2 and then 9.3 on Windows XP SP2. Converting from other Microsoft Office 2003 products (Excel, Word) works normally. Has anyone come across this and know how to fix it? We have many users with exactly the same issue.
thanks...figured out my problem!
Date: Wed, 3 Feb 2010 12:24:29 -0700
From: [email protected]
To: [email protected]
Subject: Re: After 9.3 update getting error printing to Adobe PDF while converting from Powerpoint to PDF
What happens if you try to print to the Adobe PDF printer?
> -
Problem while converting smartform out to PDF.
Hi,
I have an issue while converting Smartform output to PDF. After converting the Smartform output to PDF, the apostrophe (') appears as # in the PDF file. For example, the word Indian's is printed as Indian#s. I'm using the Helvetica font for printing this text.
However, the print preview looks fine.
Could anybody provide me some inputs to solve this.
Thanks,
Rick.
I'm using FM: CONVERT_OTF
CALL FUNCTION 'CONVERT_OTF'
  EXPORTING
    format                = 'PDF'
  IMPORTING
    bin_filesize          = lv_filesize
  TABLES
    otf                   = ls_job_info-otfdata
    lines                 = lt_pdf_table
  EXCEPTIONS
    err_max_linewidth     = 1
    err_format            = 2
    err_conv_not_possible = 3
    err_bad_otf           = 4
    OTHERS                = 5.
IF sy-subrc <> 0.
  MESSAGE ID sy-msgid TYPE sy-msgty NUMBER sy-msgno
          WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
ENDIF. -
Convert rtf to pdf with good font and sharp images
I can convert rtf to pdf in Microsoft Word (2004) using the Print function but the resulting document doesn't maintain the same font and the images look a bit hinky as a result.
I tried to convert from rtf to pdf in Acrobat 9 but it sez unsupported file type.
I just want a really clean conversion that keeps the images sharp and the fonts looking good. Am I out of luck?
<!--- source:http://livedocs.adobe.com/coldfusion/7/htmldocs/wwhelp/wwhimpl/common/html/wwhelp.htm?context=ColdFusion_Documentation&file=00001585.htm --->
<cflock scope="Application" type="exclusive" timeout="120">
<cfif not StructKeyExists(application, "MyWordObj")>
<!--- First try to connect to an existing Word object --->
<cftry>
<cfobject type="com"
action="connect"
class="Word.application"
name="Application.MyWordobj"
context="local">
<cfcatch>
<!--- There is no existing object, create one --->
<cfobject type="com"
action="Create"
class="Word.application"
name="Application.MyWordobj"
context="local">
</cfcatch>
</cftry>
<cfset Application.mywordobj.visible = False>
</cfif>
</cflock>
<!--- Convert the RTF document in temp.rtf to an HTML file in temp.htm. --->
<!--- Because this example uses a fixed filename, multiple pages might try
to use the file simultaneously. The lock ensures that all actions from
reading the input file through closing the output file are a single "atomic"
operation, and the next page cannot access the file until the current page
completes all processing.
Use a named lock instead of the Application scope lock to reduce lock contention. --->
<cflock name="WordObjLock" type="exclusive" timeout="120">
<cfset docs = application.mywordobj.documents()>
<cfset docs.open("c:\ColdFusion8\wwwroot\RTFtoPDF\temp.rtf")>
<cfset converteddoc = application.mywordobj.activedocument>
<!--- Val(8) works with Word 2000. Use Val(10) for Word 97 --->
<cfset converteddoc.saveas("c:\ColdFusion8\wwwroot\RTFtoPDF\temp.htm",val(8))>
<cfset converteddoc.close()>
</cflock>
<!--- Read the HTML file --->
<cffile action="read" file="#expandPath('temp.htm')#" variable="fileToConvert">
<!--- Convert from HTML to PDF --->
<cfdocument overwrite="yes" filename="#expandPath('temp.pdf')#" format="PDF"><cfoutput>#fileToConvert#</cfoutput></cfdocument>
<p>Conversion from temp.rtf to temp.pdf complete</p> -
Hello,
I have included a web link in an image.
When I convert the document to PDF, the link remains inactive. Could you help me?
Thank you.
No version of Pages will allow images to be links. We always got around this by bracketing the image with text and making the link in that.
Pages 5 has lost the ability to have cross links in pdfs and seems to have a bug that disables any links after the first page.
Peter -
Images downsized and converted when exporting to PDF
I have PageMaker 7.0.1 installed on Virtual XP so that I can help out a family member.
PageMaker's embedded Acrobat 4 was not working properly, so I installed my Acrobat 9.0 on Virtual XP. This unfortunately has not solved the problem, which is:
My images are greyscale and 300 ppi when imported into PageMaker. When exported to PDF using Acrobat 9.0, the images are converted to CMYK and exported at about 50% of the actual size. Interestingly, those images that are only 200 ppi are converted but not downsampled.
This is happening no matter how I fiddle with the settings. Happens to both jpegs and tiffs.
I found an old thread on another forum where this problem occurred but a solution was not found or posted.
Any ideas?
Michelle
This reply is a series of comments and suggestions rather than a definitive answer.
1. Your set up is less than perfect, if not unstable, and will be prone to all sorts of bizarre behaviours.
2. Never copy/paste a graphic or table into PageMaker. Always use the Place command. The only "exception" would be that you can copy and paste a properly Placed graphic from within your PageMaker document.
When placing images in a PageMaker file, the following message appears: “The graphic in the linked file would occupy xxxxxxx bytes in the publication. Include the complete copy in the publication anyway?” The correct answer to this question is always "NO".
(You can permanently avoid this popup in future docs by unchecking "store copy in publication" under Element -> Link Options with no publication open. You can do the same to avoid further prompts in pre-existing documents.)
3. Try updating PM to Ver 7.0.1a which had a better Export… AdobePDF…macro. http://www.adobe.com/support/downloads/product.jsp?product=34&platform=Windows
4. In PageMaker’s Print dialogue, click on the Options button, and check "Write PostScript to File." Select the "Normal" radio button. Set both Type 1 and TrueType fonts to download. As always, set the Send TIF/Images option to Normal and NOT Optimized. Make other settings elsewhere in the print dialog as desired, such as "paper" size, crop marks, separations, etc.
5. With your non-standard setup, try the two stage print then distill process. Create a PS file using a postscript printer file, then load Distiller and convert the PS file to PDF. Note: if you have hyperlinks in your PMD file, they will not appear in your PDF and need to be re-created using Acrobat.
6. It's time to leave PM and move on to InDesign - see it as PM Ver 13.
7. Alternatively, use PM7 on a contemporary Win2K PC. http://bigjohnd.org.uk/PageMakerExportPDF/index.htm -
Multiple images are blocky when converting from powerpoint to PDF?
Here's my problem:
If I have a PowerPoint with ONE image (be it jpeg, gif, png, etc.), it appears fine when you convert the document to PDF (via Print > Print to PDF or by using the Acrobat plug-in in PowerPoint).
BUT
If I have 2 or more images in a powerpoint slide, the images appear blocky and sections of the images are missing. I tried to ungroup the photos and group the photos yet I get the same result. Anyone know how to correct this??
The images were originally created in Adobe Illustrator. There must be a setting in Illustrator or PowerPoint that is causing this problem, but I can't figure it out.
Thanks
Correction, it appears to be a problem with the CROP button in PowerPoint. I have a PNG file and it has two images (a map and a legend) on one PNG. When I crop it to only show ONE image (either the map or the legend), it becomes blocky. So weird.
-
Hi, I have converted different smartforms to PDF format. How to combine the PDFs?
Hi All,
I have converted different smartforms to pdf format. How to combine all the pdf's into single pdf.
I need all the smartforms to be in single pdf.
Please help me in this regard.
Thanks in advance.
Hi Keshu,
Each individual PDF should be sent to an individual user.
And at the end, all the PDFs of the smartform should be combined into one.
And the admin should be able to download it.
I mean the requirement is
For example.
For the month of September I will generate a pay slip for each employee, and each PDF will be sent to the corresponding employee.
And finally all the pay slips of all the employees will be combined into single PDF and admin will download it and keep it for reference.
So as of now. I have generated individual pdfs and mailed it accordingly.
But how to combine it into one PDF is my question.
Please help me in this regard.
Thanks in advance.