Extract PDF embedded in XML

Hi All,
I have tried searching but did not find proper content on this topic. Could you please help?
I have to extract PDF content embedded in an XML and send it via FTP.

Hi,
Have you read this document http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/9913a954-0d01-0010-8391-8a3076440b6e?QuickLink=index&overridelayout=true&5003637721700
http://scn.sap.com/community/pi-and-soa-middleware/blog/2009/05/17/trouble-writing-out-a-pdf-in-xipi
Regards
srinivas
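
Independent of the PI-specific approach in the links above, the general technique can be sketched in a few lines of Python. This is only a sketch under assumptions: it presumes the PDF is carried as base64 text inside a single element, and the element name `PdfContent`, the host, the credentials and the function names are hypothetical placeholders, not taken from the linked documents.

```python
import base64
import io
import xml.etree.ElementTree as ET
from ftplib import FTP


def extract_pdf_bytes(xml_text: str, tag: str = "PdfContent") -> bytes:
    """Return the decoded PDF stored as base64 text in the first <tag> element."""
    root = ET.fromstring(xml_text)
    # The root itself may be the carrier element; otherwise search its descendants.
    node = root if root.tag == tag else root.find(f".//{tag}")
    if node is None or not (node.text or "").strip():
        raise ValueError(f"no <{tag}> element with content found")
    # Remove line breaks and indentation before decoding.
    data = base64.b64decode("".join(node.text.split()))
    if not data.startswith(b"%PDF-"):
        raise ValueError("decoded content does not look like a PDF")
    return data


def upload_pdf(data: bytes, host: str, user: str, password: str, remote_name: str) -> None:
    """Upload the bytes in binary mode so the PDF is not altered in transit."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.storbinary(f"STOR {remote_name}", io.BytesIO(data))
```

If the XML uses namespaces, the tag has to be given in the `{namespace-uri}PdfContent` form that ElementTree expects. The binary transfer mode matters: a text-mode transfer can rewrite line endings and damage the PDF.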

Similar Messages

Extract All Embedded Files in All Folders and Save Each? Copy/Paste from PDF to Word?

I have most of what I need here, but I’m missing 2 important pieces.
#1) I want to copy/paste from all PDF files in a folder and paste the copied data into a single Word file.
It works fine if I have ONLY Word docs in my folder. When I have PDF files and Word files, the contents of the Word files are copied in fine, but the contents of the PDF files seem to come in as Chinese, and there is no Chinese in the PDF, so I have no idea where that’s coming from.
#2) I want to extract all embedded files (in all my Word files) and save the extracted/opened file into the folder. Some embedded files are PDFs and some are Excel files.
Here is the code that I’m working with now.
Sub Foo()
    Dim i As Long
    Dim MyName As String, MyPath As String
    Dim Myshape As InlineShape
    Dim IndexCount As Integer
    Application.ScreenUpdating = False
    Documents.Add
    MyPath = "C:\Users\001\Desktop\Test\" ' <= change this as necessary
    MyName = Dir$(MyPath & "*.*") ' not *.* if you just want doc files
    On Error Resume Next
    Do While MyName <> ""
        If InStr(MyName, "~") = 0 Then ' skip temporary files
            ' Insert the contents of the file at the cursor of the new document
            Selection.InsertFile _
                FileName:="""" & MyPath & MyName & """", _
                ConfirmConversions:=False, Link:=False, _
                Attachment:=False
            ' Activate embedded objects whose alternative text matches PDFname
            ' (PDFname is not declared or assigned anywhere in this macro)
            IndexCount = 1
            For Each Myshape In ActiveDocument.InlineShapes
                If Myshape.AlternativeText = PDFname Then
                    ActiveDocument.InlineShapes(IndexCount).OLEFormat.Activate
                End If
                IndexCount = IndexCount + 1
            Next
            Selection.InsertBreak Type:=wdPageBreak
        End If
        Debug.Print MyName
        MyName = Dir ' gets the next doc file in the directory
    Loop
End Sub
If this has to be done using 2 Macros, that’s fine.
If I can do it in 1, that’s great too.
Knowledge is the only thing that I can give you, and still retain, and we are both better off for it.

Hi ryguy72,
>>When I have PDF files and Word files, the contents of the Word files are copied in fine, but the contents of the PDF files seem to come in as Chinese, and there is no Chinese in the PDF, so I have no idea where that’s coming from.<<
Based on the code, you are inserting the file via Selection.InsertFile. I tried to reproduce this issue but could not. I suggest that you insert the PDF file manually to see whether the issue is related to the specific file. You can insert a PDF file via Insert->Text->Object->Text from file.
If the issue can also be reproduced manually, I suggest that you open a new thread in the forum to narrow down whether it is related to the specific PDF file or to the Word application.
>> I want to extract all embedded files (in all my Word files) and save the extracted/opened file into the folder. Some embedded files are PDFs and some are Excel files.<<
We can save the embedded spreadsheet via the Excel object model. Here is an example that checks whether the inline shape is an embedded workbook and saves it to disk, for your reference:
If Application.ActiveDocument.InlineShapes(1).OLEFormat.ClassType = "Excel.Sheet.12" Then
    Application.ActiveDocument.InlineShapes(1).OLEFormat.DoVerb xlPrimary
    Application.ActiveDocument.InlineShapes(1).OLEFormat.Object.SaveAs "C:\workbook1.xlsx"
    Application.ActiveDocument.InlineShapes(1).OLEFormat.Object.Close
End If
Since the Word object model doesn't provide an API to save an embedded PDF, I suggest asking in a PDF support forum whether the PDF application supports automation. If it does, the embedded PDF can be exported in the same way as the embedded spreadsheet in the code above.
Hope it is helpful.
Regards & Fei
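
An alternative that avoids the Word object model altogether: a .docx file is a ZIP package, and embedded objects are stored as parts under word/embeddings/. The Python sketch below is an illustration under that assumption, not part of the macro discussed above; the function name is hypothetical. It copies those parts out of every .docx in a folder. Embedded workbooks usually come out as .xlsx files, while embedded PDFs are typically wrapped in oleObject*.bin containers that need a further unpacking step.

```python
import zipfile
from pathlib import Path


def extract_embeddings(folder: str, out_dir: str) -> list[str]:
    """Copy every part under word/embeddings/ from each .docx in folder to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for docx in sorted(Path(folder).glob("*.docx")):
        if docx.name.startswith("~"):
            continue  # skip Word lock files such as ~$report.docx
        with zipfile.ZipFile(docx) as package:
            for name in package.namelist():
                if not name.startswith("word/embeddings/") or name.endswith("/"):
                    continue
                # Prefix with the document name so parts from different files do not collide.
                target = out / f"{docx.stem}_{Path(name).name}"
                target.write_bytes(package.read(name))
                saved.append(target.name)
    return saved
```

This only works for the XML-based formats (.docx, .docm); the older binary .doc format is not a ZIP package.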

After I optimize my pdf I get this error "Cannot extract the embedded font 'FONT NAME'. Some characters may not display or print correctly."

This Acrobat forum may be a better place to ask: https://forums.adobe.com/community/acrobat/creating__editing_%26_exporting_pdfs

Error: cannot extract the embedded font 'WTNNGO+LiberationMono-Bold' when reading a PDF

I am trying to read a pdf file and I get the error: cannot extract the embedded font 'WTNNGO+LiberationMono-Bold'. Then some of the words are displayed as sets of dots, a dot per character. I have tried Adobe Reader 7 through 11 all with the same error. Apparently some fonts are missing. My OS is 64 bit windows 7.

Tell them to embed the font information into the PDF. You don't have the font on your system so as-is, the PDF needs it to display that font. With embedded font information, you don't need the font.

Error message from Adobe Reader. cannot extract the embedded font 'LICCMC+MyriadPro-Light'. some characters may not display or print correctly. Print looks like gibberish

Trying to view/print PDF documents from website. Print looks like gibberish and is unreadable. Problem is with the embedded fonts. Error message from Adobe says cannot extract the embedded font 'LICCMC+MyriadPro-Light'. some characters may not display or print correctly.

Try Adobe support, that's not a Firefox support issue. 
http://forums.adobe.com/index.jspa

PDF form with XML data connection comes up blank at run time

Hello All,
I am a newbie to Adobe LiveCycle 9, but am very proficient in C#. I would like to request your guidance on the following issue.
We have a desktop application in C#, WPF, SQL Server. The requirement is to launch a LiveCycle form from the application for the user to read/edit/save data.
I have done this much so far -
Downloaded the trial version of LiveCycle 9
Developed an interactive PDF form
Created an XML-based data connection. Generated fields on the form using the fields from this connection.
Set the .XML file as preview source for the form
The controls on the form are bound to the XML data source
In design mode, the form works fine, it displays my data correctly
I have created a WPF form with a button. On click of this button, I call Process.Start(pdf-file-path). My PDF is launched properly
I have added a combo box to my WPF form. I select a parameter from this, then call a stored procedure which returns a DataTable depending on the parameter passed
Using the returned DataTable, I have used DataTable.WriteXml and DataTable.WriteXmlSchema to create my XML and XSD files. As mentioned above, this XSD is used to create the data connection for the PDF, and the XML is used as the preview source
This is what I want to do -
Launch the PDF from my WPF form, pre-populated with the newly created XML data from my WPF form.
So basically, as the user changes the selection criteria from the combo box, the XML file data will change and the PDF file will be launched each time with new data.
The XSD format will always be constant
Problem -
My XML and XSD get created properly, my PDF launches, but it is empty
If I change my selection criteria and run the WPF application, and then open the PDF in design mode, it asks me whether it should refresh the XML source. This means that the PDF form is connecting correctly to the XML source
So why then, does the form come up empty at run time?
What link am I missing?
I have found some sites that help using Web applications, but nothing for desktop applications. It would be fantastic if you could point me to some help for developing Livecycle forms with C# / SQLServer
Your help in this case will be highly appreciated.
Thanks and Regards

Oops, something happened with the above post. I will try again...
I have tried your suggestion, but I still get the same garbled XML (with data repeated and some values "cut in half"). Here is what I get after the decode service and the extract-to-XML service. This is just the first barcode; the others are similar. Sorry for the poor formatting, but I get a CDATA tag in front of the "istensen" value.
CDATA:istensen</fld_ForMellemEfterNavn ><fld_VejNRpostByEnLinie >Superroad 99, 1330 Supertown</fld_VejNRpostByEnLinie ><fld_PrivatTelefonnummer >20724283</fld_PrivatTelefonnummer ></sub_Person ></sub_PktA ><fld_BlanketNr >kb0371ff</fld_BlanketNr ><fld_BarcodeCount /></form1 >/sub_Adresse ><sub_Person ><fld_ForMellemEfterNavn>Kim Christensen</fld_ForMellemEfterNavn ><fld_VejNRpostByEnLinie > Superroad 99, 1330 Supertown </fld_VejNRpostByEnLinie ><fld_PrivatTelefonnummer >20724283</fld_PrivatTelefonnummer ></sub_Person ></sub_PktA ><fld_BlanketNr >kb0371ff</fld_BlanketNr ><fld_BarcodeCount /></form1
Obviously this is not a legal XML string, so I can do nothing with it. When I use a custom .NET component (ClearImage) to read the same form (with the barcode), I get the correct data out of the barcodes. So I guess something is wrong with the decode service in Barcoded Forms ES when I use compressed XML. Since the ClearImage component can read the barcodes, I can conclude that they are compressed correctly. Can you help me get further with this problem?
Sincerely,
Kim

Cannot extract the embedded font - Adobe Reader XI

Hello,
I have an XPS file converted to a PDF with an embedded font inside, when I open the file, as I scroll to a page having '&' in the text (that's my case, but it's probably the same with any special character), a message box shows up saying "Cannot extract the embedded font 'HDGSLM+[my font]'. Some characters may not display or print correctly." and I can't see the special character in the page.
I checked that my font was actually embedded in the PDF (File -> Properties -> Fonts) and seems to be ok, normal characters look fine and the application doesn't say anything until I scroll to the page with '&' inside. Also, other PDF readers show the document fine.
I looked around the forum and saw that other people had the same problem; they believed it was a bug introduced in Adobe Reader 8.x, but nothing was confirmed by the developers.
Can you help me? Is it a bug? If yes, is Adobe working on it?
Thank you.
P.S.
My Adobe Reader version is 11.0.4

Reader can not create PDF files, so you should look for bugs in the software that was used to generate it.

Cannot extract the embedded font 'IEOFAD+TimesNewRoman'.

I am using Mac OS X 10.6.8 and Adobe Reader X 10.1.4. When I open a PDF file, this error message comes up: Cannot extract the embedded font 'IEOFAD+TimesNewRoman'. Some characters may not display or print correctly.
How do I solve this problem? Thanks.

Your question pertains to the Mac version of Adobe Reader. I've moved it to the Adobe Reader forum.

Cannot extract the embedded font arial.Some char might not be displayed

Hi Folks,
Our requirement is to generate the customer account statement for a set of customers during a particular period.
We use a script to generate the statement and convert it to PDF. This PDF is then stored on the application server and retrieved using transaction CG3Y. The transaction allows the statement to be generated for multiple customers and company code.
The problem we face is that the first customer account statement is generated properly, but the rest of the statements are not displayed properly. When we open one of these statements we get the error 'Cannot extract the embedded font arial.Some characters might not be displayed properly'. None of the headings set in bold Arial were displayed.
This error does not occur for the first PDF file generated (say we have 3 customers: the PDF for the 1st customer is fine, but the next two show the error).
When I checked the application server I found that the first file has font subtype Type1 and the rest of the files have font subtype TrueType.
This one is working fine. I downloaded the file to the PC using CG3Y and the PDF looks fine.
Directory: /home/nwfound/n_us_cas
Name: 0010798791US13122009.PDF
%PDF-1.3#
2 0 obj#
/WinAnsiEncoding#
endobj#
3 0 obj#
<<#
%Devtype SAPWIN Font COURIER normal Lang EN#
/Type /Font#
/Subtype /Type1# ****see here font subtype is type1
/BaseFont /Courier#
/Name /F001#
/Encoding 2 0 R#
>>#
endobj#
4 0 obj#
<<#
/Filter 5 0 R#
/Length 6 0 R#
/Length1 352224#
>>#
stream#
This one is not working. It gives the error 'Cannot extract the embedded font arial.Some characters might not be displayed properly' when downloaded to the PC using CG3Y. The headings set in Arial Bold were not displayed.
Directory: /home/nwfound/n_us_cas
Name: 0010000105US13062010.PDF
%PDF-1.3#
2 0 obj#
/WinAnsiEncoding#
endobj#
3 0 obj#
<<#
%Devtype SAPWIN Font COURIER normal Lang EN#
/Type /Font#
/Subtype /TrueType# - see here the subtype is True type.How can i change this?
/BaseFont /Courier#
/Name /F001#
/Encoding 2 0 R#
>>#
endobj#
4 0 obj#
<<#
/Filter 5 0 R#
/Length 6 0 R#
/Length1 352224#
>>#
I have to change the font subtype to Type1, as in my first customer statement. Is there a way to do this?
There can be no problem with the code, because the account statement generated for the first customer is perfect.
Any suggestions will be appreciated.
IF hotfdata[] IS NOT INITIAL.
*Convert otf data to pdf lines.
  CALL FUNCTION 'CONVERT_OTF'
    EXPORTING
      format                = 'PDF'
    IMPORTING
      bin_filesize          = l_size
    TABLES
      otf                   = hotfdata
      lines                 = li_pdf_output
    EXCEPTIONS
      err_max_linewidth     = 1
      err_format            = 2
      err_conv_not_possible = 3
      err_bad_otf           = 4
      OTHERS                = 5.
  IF sy-subrc <> 0.
    MESSAGE ID SY-MSGID TYPE SY-MSGTY NUMBER SY-MSGNO
            WITH SY-MSGV1 SY-MSGV2 SY-MSGV3 SY-MSGV4.
  ENDIF.
*clear hotfdata otherwise next empty cust acc. statement may have
*this data.
  REFRESH:hotfdata.
  CLEAR hotfdata.
*If account statement is generated then set w_flag to generate email.
  w_flag = 'X'.
*The converted pdf lines are of char132 format.
*They have to be converted to char255 format.
  LOOP AT li_pdf_output INTO lwa_pdf_output.
    TRANSLATE lwa_pdf_output USING ' ~'.
    CONCATENATE l_gd_buffer lwa_pdf_output INTO l_gd_buffer.
    CLEAR lwa_pdf_output.
  ENDLOOP.
  TRANSLATE l_gd_buffer USING '~ '.
  REFRESH li_mess_att.
  DO.
    li_mess_att = l_gd_buffer.
    APPEND li_mess_att.
    SHIFT l_gd_buffer LEFT BY 255 PLACES.
    IF l_gd_buffer IS INITIAL.
      EXIT.
    ENDIF.
  ENDDO.
  CLEAR x_objcont.
  REFRESH x_objcont.
  LOOP AT li_mess_att.
    x_objcont = li_mess_att.
    APPEND x_objcont.
  ENDLOOP.
*application file name
  CONCATENATE po_filun kna1-kunnr save_bukrs datum02+4(2)
              datum02(4) lc_pdf INTO l_filename.
  CONDENSE l_filename NO-GAPS.
  CONCATENATE file l_filename INTO file.
  CONDENSE file NO-GAPS.
*Transfer pdf data to app. server.
  "data len type i.
  OPEN DATASET file FOR OUTPUT IN BINARY MODE.
  LOOP AT x_objcont.
    TRANSFER x_objcont TO file.
  ENDLOOP.
  CLOSE DATASET file.
ELSE.
  REFRESH:hotfdata.
  CLEAR hotfdata.
ENDIF.

Hi,
this can be related to the compression of the PDF file.
You can turn off the FlateDecode compression again via report RSTXPDF3 as described in Note #843480. It is a little confusing. The option 'FLATE_COMPR_OFF' needs to be set to 'On' to turn off the FlateDecode compression.
To set this please run as follows:
se38 -> RSTXPDF3 -> enter 'FLATE_COMPR_OFF' in the 'Name' field -> Select 'Change Settings' radio button
You will get a pop-up 'Do not use flat compression'.
Select the 'on' button.
After this check if the PDF is created correctly.
regards,
Aidan
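
To compare the generated files before and after changing the setting, the font dictionaries can be listed without opening the PDF in a reader. The Python sketch below is an illustration only, with a hypothetical function name: it scans the raw bytes for uncompressed `/Type /Font` dictionaries like the ones in the dumps above, so it will miss fonts that a PDF producer stores inside compressed object streams.

```python
import re

# A dictionary body without nested dictionaries: everything between << and >>.
DICT = re.compile(rb"<<((?:(?!<<|>>).)*)>>", re.DOTALL)
# A PDF name token after the leading slash.
NAME = rb"/([^\s/<>\[\]()%]+)"


def list_fonts(pdf_bytes: bytes) -> list[tuple[str, str]]:
    """Return (BaseFont, Subtype) for each uncompressed font dictionary found."""
    fonts = []
    for match in DICT.finditer(pdf_bytes):
        body = match.group(1)
        # \b keeps /Type /FontDescriptor from being counted as a font.
        if not re.search(rb"/Type\s*/Font\b", body):
            continue
        subtype = re.search(rb"/Subtype\s*" + NAME, body)
        base = re.search(rb"/BaseFont\s*" + NAME, body)
        fonts.append((
            base.group(1).decode("latin-1") if base else "?",
            subtype.group(1).decode("latin-1") if subtype else "?",
        ))
    return fonts
```

Running it on the file downloaded in binary mode for the first customer and for one of the failing customers shows at a glance whether the same base font is declared with different subtypes.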

Cannot extract the embedded font 'F2'. Some characters may not display or print correctly.

This question was previously published but no answer has been found.
Error message is :
"Cannot extract the embedded font 'F2'. Some characters may not display or print correctly."
Many pdf documents display this error with Adobe Reader 8.1.0. We have errors for embedded fonts 'F0','F3','F7' and so on.
It looks like an Adobe Reader bug because:
- All PDF files can be opened with Acrobat Reader 5.0.5, 6.x and 7.x, but can't be opened with version 8.1.0.
- The 8.1.1 update removes only the 'F0' error message (issue #1572280).
The solution:
- Publish an 8.1.2 update to fix this important bug.
- Is there a registry parameter or tool option to disable the checking added in version 8.x of Adobe Reader? Version 8.x catches more errors to be compliant with the Adobe specification, but Adobe Reader must be compatible with all documents generated by third-party products.
This Adobe Reader bug applies to Windows Vista, XP Pro SP2, 2000.
Thx,
Regards

Just to let you know, for anyone else with this problem: I had it occur on a Mac when saving to PDF from Excel. In my case it all happened at the point where the PDF was generated.
The fix was to delete ALL the Microsoft preferences, but perhaps only the font cache needed to be deleted.
I deleted the following areas from the local user's profile on the Mac. On Windows, I would probably log in as a different user to see whether the problem exists for just one particular user.
Here are the sections I deleted:
Library/caches/metadata/Microsoft*
Username/library/preferences/com.microsoft* (and anything with microsoft in it)
I did leave the Entourage settings though.
Hope it helps someone with a similar issue.

Error message: "Cannot extract the embedded font DAAAAA+Arial MT..."

Hello,
After opening a PDF file I get the error message "Cannot extract the embedded font DAAAAA+Arial MT...". I get this message with Adobe Reader versions 6.x, 7.x, 8.x and 9.0 on three different PCs (Windows XP). The file was written with Adobe Acrobat version 8.1.
I found kb402950, but the solution of using version 7.x didn't work.
Any other ideas?

Hey k street,
There seems to be a font issue within this particular file, but you cannot do anything about it using only Reader.
As long as it displays correctly, there should be no problem with printing. It is simply a warning that something is apparently amiss in an embedded font and that the result might not be as expected.
Regards,
Anubha

Acrobat Professional 8.1.6 "Cannot extract the embedded font...."

I created a PDF by exporting from InDesign, and when the PDF opens I get the message: "Cannot extract the embedded font...Some characters may not display or print correctly." I made sure the font was a PostScript font and is not corrupted. I changed the font and the same problem happened.
I have looked all over the internet for an answer and have not found a solution. Can you please give me some advice on how to solve the problem?
iMac G5-PowerPC G5
Mac OSX version 10.5.6
Adobe Acrobat Professional 8.1.6
InDesign CS3 5.0.4
Thank you.

Hi Markus,
The problem is that the email program encodes the PDF file as ASCII when it is really binary. To get around it, zip the file, use a different email program, or change the encoding method if possible.
Regards
Sukrit Dhingra
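
The zipping workaround can be scripted. Here is a minimal Python sketch, assuming the PDF is available on disk; the function names are hypothetical. A ZIP archive stores a CRC-32 checksum of each member, so damage in transit is detected when the recipient unpacks the archive rather than showing up later as a font error.

```python
import zipfile
from pathlib import Path


def zip_pdf(pdf_path: str) -> Path:
    """Pack a PDF into a .zip next to it and return the archive path."""
    src = Path(pdf_path)
    archive = src.with_suffix(".zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(src, arcname=src.name)
    return archive


def unzip_and_verify(archive_path: str) -> bytes:
    """Return the bytes of the first member; raise if the archive is damaged."""
    with zipfile.ZipFile(archive_path) as zf:
        bad = zf.testzip()  # name of the first member with a bad CRC, or None
        if bad is not None:
            raise ValueError(f"corrupt member: {bad}")
        return zf.read(zf.namelist()[0])
```

The sender calls `zip_pdf` and attaches the resulting archive; the recipient calls `unzip_and_verify` and writes the returned bytes to a .pdf file in binary mode.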

How to fix error "Cannot extract the embedded font ...."?

When I open the pdf file on the following webpage, I get the following error message "Cannot extract the embedded font 'BLKDIK+PLBIserif-Medium'. Some characters may not display or print correctly". Is there a way to get rid of such error in the pdf file?
http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.0030213

Install Foxit PDF reader, and then you can view the PDF document without any problem.
I think your book was generated from Quartz.
As far as I know there is no free solution for this problem. Updates of Adobe Reader will not help.
Tips:
You don't need to set Foxit Reader as your default PDF reader.
Foxit Reader shows the document with a plain standard font.

Cannot extract the embedded font "Times_New_Roman"

About a month ago we started receiving this error on some PDFs being sent to our company: "Cannot extract the embedded font 'Times_New_Roman'. Some characters may not display correctly."
Any idea what's causing this and what I can do to correct the issue? I've tried upgrading the Reader to 9, but still no luck.

I am getting the same error when making PDFs from Illustrator. Though everything looks good, it is not acceptable to send a PDF with this error to the client for review, and it was not happening before.
My system
OS 10.8.4
Acrobat 10.1.7
This does not happen on all files, but repeatedly on certain series of files, so it must be a font conflict in the files.
Hmm, it might be this Zapf Dingbat; did not get a warning about missing

Cannot extract the embedded font...

I am encountering the following message when opening a particular .pdf:
Cannot extract the embedded font "Arial,Bold". Some characters may not display or print correctly.
I've attempted to open the file on different PCs; the message persists.
Adobe Reader 8.1.2
Windows XP SP3
Dell Optiplex 745
2GB RAM

According to this thread: http://choorucode.wordpress.com/2010/02/02/adobe-acrobat-embedded-font-error/
the error relates to a font that is used in the document that is incompatible with the version of Adobe Reader being used (or is an outdated font). On the extreme side, the font itself could be corrupted on the PC of the person who created the PDF. So, you'd want to check:
1) Is the font file on the PC corrupted (does this happen only for documents created by one user on one machine)?
2) Is the font outdated/unsupported?
3) Is the font incompatible with your version of Adobe Reader?
