PDF to XML to Docbook
Anyone have experience with this?
I have a customer who would like to have textbook InDesign files converted to DocBook. I am thinking the simplest way would be to take the PDF files, extract the content to XML, and then port that to DocBook.
Any thoughts?
Thanks for your feedback.
Rich
Hi Van Kurtz,
I have converted the PDF to a Word document and also to WordML format.
How should I proceed from here?
Word doc => DocBook?
or
WordML => DocBook?
Please explain the next steps in FrameMaker.
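For the last hop (intermediate XML to DocBook), one common approach is an XSLT stylesheet applied with the JDK's built-in javax.xml.transform API. The sketch below is only an illustration under stated assumptions, not a finished converter: the input element names (doc, heading, para) and the class name XmlToDocBook are hypothetical stand-ins for whatever the PDF, Word or InDesign export actually produces, and the output uses only the DocBook article, title and para elements.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XmlToDocBook {
    // Hypothetical stylesheet: maps made-up <doc>/<heading>/<para> input
    // elements to DocBook <article>/<title>/<para>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/doc'><article><xsl:apply-templates/></article></xsl:template>"
      + "<xsl:template match='heading'><title><xsl:value-of select='.'/></title></xsl:template>"
      + "<xsl:template match='para'><para><xsl:value-of select='.'/></para></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the extracted XML and returns the DocBook markup.
    static String convert(String extractedXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(extractedXml)),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            convert("<doc><heading>Chapter 1</heading><para>Hello.</para></doc>"));
    }
}
```

A real stylesheet would need one template per structure in the export (lists, tables, figures), and DocBook 5 output would also need the http://docbook.org/ns/docbook namespace on the result elements.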
Similar Messages
-
Print PDF Report(XML) in Dot matrix Printer in Oracle E-Biz R12
Hi Friends,
Is it possible to print a PDF report (XML report) on a dot matrix printer in Oracle E-Biz R12, assuming the dot matrix printer supports PostScript?
If so, what configuration needs to be performed in Oracle E-Biz?
Regards,
DB
I do not think it is possible. Please confirm with your printer vendor first whether your printer supports PDF printing.
Thanks,
Hussein -
Post processing PDF to XML.
Compton MacKenzie - 08:48am Oct 29, 2008 Pacific
Hi,
Sorry for the basic question... We want to have users fill out a fillable PDF form using Acrobat Reader and then upload it to a web page. Once we get the PDF, we need to extract the data that they have entered. Short of using LiveCycle Data Services (not currently feasible as we have no Java presence on our server platform), is there any API that I can use to extract the data or convert the PDF to XML? I understand that it is possible to export XML using the Acrobat client (and it might be possible to script this with COM), but I don't think this would work reliably in a server environment.
(We need both the PDF and the data as the PDF will contain an electronically captured image of a customer's signature and need to preserve the actual image of the document.)
Any suggestions?
Thanks!
There are server-based products under the LiveCycle banner for this, but they all run on a Java-based app server. You can use a turnkey install where the app server (JBoss) and a database (MySQL) are provided for you, but you need to have the Java SDK present. The LiveCycle servers can run on Windows, Linux and AIX, to name a few.
Note that if you script Acrobat to do this on the server you are in violation of your license agreement. -
Large Pdf using XML XSL - Out of Memory Error
Hi Friends.
I am trying to generate a PDF from XML, XSL and FO in Java. It works fine if the PDF to be generated is small.
But if the PDF to be generated is big, it throws an "Out of Memory" error. Can someone please give me some pointers about the possible reasons for this error? Thanks for your help.
RM
Code:
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.apache.fop.apps.Driver;
import org.apache.fop.apps.Version;
import org.apache.fop.apps.XSLTInputHandler;
import org.apache.fop.messaging.MessageHandler;
import org.apache.avalon.framework.logger.ConsoleLogger;
import org.apache.avalon.framework.logger.Logger;

public class PdfServlet extends HttpServlet {
    public static final String FO_REQUEST_PARAM = "fo";
    public static final String XML_REQUEST_PARAM = "xml";
    public static final String XSL_REQUEST_PARAM = "xsl";
    Logger log = null;
    Com_BUtil myBu = new Com_BUtil();

    public void doGet(HttpServletRequest request,
                      HttpServletResponse response) throws ServletException {
        if (log == null) {
            log = new ConsoleLogger(ConsoleLogger.LEVEL_WARN);
            MessageHandler.setScreenLogger(log);
        }
        try {
            String foParam = request.getParameter(FO_REQUEST_PARAM);
            String xmlParam = myBu.getConfigVal("filePath") + "/" + request.getParameter(XML_REQUEST_PARAM);
            String xslParam = myBu.SERVERROOT + "/jsp/servlet/" + request.getParameter(XSL_REQUEST_PARAM) + ".xsl";
            if ((xmlParam != null) && (xslParam != null)) {
                XSLTInputHandler input = new XSLTInputHandler(new File(xmlParam), new File(xslParam));
                renderXML(input, response);
            } else {
                PrintWriter out = response.getWriter();
                out.println("<html><head><title>Error</title></head>\n" +
                            "<body><h1>PdfServlet Error</h1><h3>No 'fo' " +
                            "request param given.</body></html>");
            }
        } catch (ServletException ex) {
            throw ex;
        } catch (Exception ex) {
            throw new ServletException(ex);
        }
    }

    public void renderXML(XSLTInputHandler input,
                          HttpServletResponse response) throws ServletException {
        try {
            // The whole PDF is buffered in memory before it is written to the response.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            response.setContentType("application/pdf");
            Driver driver = new Driver();
            driver.setLogger(log);
            driver.setRenderer(Driver.RENDER_PDF);
            driver.setOutputStream(out);
            driver.render(input.getParser(), input.getInputSource());
            byte[] content = out.toByteArray();
            response.setContentLength(content.length);
            response.getOutputStream().write(content);
            response.getOutputStream().flush();
        } catch (Exception ex) {
            throw new ServletException(ex);
        }
    }

    /**
     * creates a SAX parser, using the value of org.xml.sax.parser
     * defaulting to org.apache.xerces.parsers.SAXParser
     * @return the created SAX parser
     */
    static XMLReader createParser() throws ServletException {
        String parserClassName = System.getProperty("org.xml.sax.parser");
        if (parserClassName == null) {
            parserClassName = "org.apache.xerces.parsers.SAXParser";
        }
        try {
            return (XMLReader) Class.forName(
                parserClassName).newInstance();
        } catch (Exception e) {
            throw new ServletException(e);
        }
    }
}
Hi,
I did try that initially. After executing the command I get this message.
C:\>java -Xms128M -Xmx256M
Usage: java [-options] class [args...]
(to execute a class)
or java -jar [-options] jarfile [args...]
(to execute a jar file)
where options include:
-cp -classpath <directories and zip/jar files separated by ;>
set search path for application classes and resources
-D<name>=<value>
set a system property
-verbose[:class|gc|jni]
enable verbose output
-version print product version and exit
-showversion print product version and continue
-? -help print this help message
-X print help on non-standard options
Thanks for your help.
RM -
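A note on the output above: the usage text appears because java was given only options and no class or jar file to run. The -Xms and -Xmx options have an effect only on the command line that actually starts the JVM hosting the servlet; where that command line is configured depends on the servlet container, which the thread does not name. To confirm which limit a JVM is really running with, a small check like the following can help. It is a minimal sketch, and HeapCheck is a hypothetical class name.

```java
final class HeapCheck {
    /** Maximum heap the running JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Prints the effective heap ceiling of the JVM that runs this class.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it as java -Xms128M -Xmx256M HeapCheck; the reported figure should be roughly the -Xmx value. Logging the same Runtime.getRuntime().maxMemory() value from inside the servlet would show whether the container picked up the setting.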
How to automatically create fillable PDFs from XML
We are looking for a way to create fillable PDFs that users can enter data into and save (so the PDF needs Reader Rights applied) and we want to be able to automatically create these PDFs from XML files. We are able to do what we want in a manual fashion using Acrobat Pro v8 but we need a way to automate this due to the volume of XML files that we will need to convert.
Is there any way that this can be scripted from a command line interface with Acrobat.. like:
Acrobat.exe –input FileName.xml –output FileName.pdf –applyReaderRights true
If there are no command-line options for this, is there any way that this could be coded using .NET or any other programming language?
You cannot automate applying usage rights with Acrobat. For that you'd need to use Adobe's LiveCycle Reader Extensions.
How exactly are you currently converting XML into fillable forms? Are you using an XDP to somehow convert to an XFA-based PDF? -
Creating PDF from XML directly in a content management system?
Hi!
This is my first post here and I've tried to find any previous posts that could answer my question but to no avail. Also I think and hope this is the correct sub forum to post it in.
I work at a company that produces a product catalogue that is published as a webpage, a PDF document (used as the basis for a tablet app) and a printed catalogue from a PDF. For the PDF (used in the tablet app and the printed catalogue) we are using a CMS based on XML that produces Adobe FrameMaker documents which we then export the PDF from. We are looking in to updating the system and make it flow much better but are a bit uncertain of what the best way to go would be.
A solution I'm thinking of would be to have the content of the product catalogue in some kind of XML based service that can export the information in XML. This would hopefully make it possible to send the documents either directly to PDF by XML and some style sheets or import the XML into some InDesign templates (for the more complicated designs at intro pages etc).
One important aspect of the product catalogue is that we have all information saved in different languages, so there has to be some kind of connection between the templates and the different language versions, i.e. the same page design but a different text flow for each language edition.
What I wonder is: what kind of services/solutions are there that handle XML to PDF for a fairly complicated product catalogue (i.e. one with different language versions)?
Thanks in advance!
The difference between the two packages is that PatternStream effectively works on a "pull" principle (the content is retrieved into the template(s) by queries at the appropriate locations), while Miramo is a "push" (the tagged content is processed by Miramo using templates to create the DTP files). Since Miramo allows programmatic processing before the content is pushed into the DTP app, you can do all sorts of manipulations, conditional processing, automatic insertion of markers and variables, etc., so it allows for a fairly complex layout, even with FrameMaker. It also allows APIs and scripts to be triggered at the back end once the publication has been assembled, for further processing/manipulation.
Is there any particular reason that you want to move from the FM engine to the ID one? In terms of throughput, FM streams run very much faster than ID ones. Also, in an automated environment, there are very few catalogue layouts that I've seen that couldn't also be handled using an FM workflow, unless the layouts are extremely complex.
Are there any samples on line of the types of catalogues that you are currently producing? This would help in assessing which tools and workflows might be more appropriate to your situation. -
I have a button on a PDF form which emails a PDF and an XML attachment of the current form in two separate emails. The code is as follows:
event.target.submitForm({cURL:"mailto:" + "[email protected]" +"?subject="+"Request for action " + "&body=Please find attached..." ,cSubmitAs:"XML",cCharset:"utf-8"});
event.target.submitForm({cURL:"mailto:" + "[email protected]" +"?subject="+"Request for action " + "&body=Please find attached..." ,cSubmitAs:"PDF",cCharset:"utf-8"});
This results in 2 separate emails, when all I need is a single email with two attachments. Is there a way to do this?
As far as I know you can only attach one attachment at a time.
-
Hi,
Can somebody please recommend software for converting PDF to XML?
Thanks,
Nikesh Shah
Hi Nikesh,
Try this:
http://www.pdf2text.com/ConvertPDFToText-standard-edition.htm
Regards,
Sudheer -
Fastest Way to Render PDF from XML
Hi
I have used XSLTC (pre-compiled XSL files) to render PDFs. The parsers I have used are the latest from Xalan.
I am using FOP 0.20.5.
With JAXP 1.3 I am using the validation framework to validate the XML against XSDs.
This is the fastest approach I could come up with for rendering XML into PDF.
I am just wondering: with so many APIs available, is there an accepted fastest way of rendering PDFs from XML?
This should also take into account the rendering capability of the transformers.
Any insights would be welcome.
Thanks.
Logging and Capturing? Or Logging and TRANSFERRING? This being P2...I'll assume the latter. It should take about a gig a min to transfer...the same time needed to back up a card.
But you are L&T right from the camera...so I assume that you are not backing up the footage...just importing? Why not back up? That's step #1. Unless you are backing up later. Connected to the computer via USB or Firewire? And the drive is connected to the computer how? USB or Firewire? -
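On the XSLT stage of the rendering question above: whichever processor is used, one inexpensive habit is to compile each stylesheet once and reuse it, since a JAXP Templates object is thread-safe while each Transformer is not. The sketch below shows the idea with the standard javax.xml.transform API only. CachedStylesheet and the COUNT_ITEMS demo stylesheet are hypothetical names, and it makes no claim about being the fastest overall pipeline.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class CachedStylesheet {
    // Demo stylesheet (an assumption for illustration): counts <item> elements.
    static final String COUNT_ITEMS =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='count(//item)'/></xsl:template>"
      + "</xsl:stylesheet>";

    private final Templates templates;

    // Parses and compiles the stylesheet exactly once.
    CachedStylesheet(String xslt) {
        try {
            templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not compile stylesheet", e);
        }
    }

    // Creates a cheap per-call Transformer from the shared compiled stylesheet.
    String apply(String xml) {
        try {
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        CachedStylesheet sheet = new CachedStylesheet(COUNT_ITEMS);
        System.out.println(sheet.apply("<list><item/><item/></list>"));
    }
}
```

In a servlet, a CachedStylesheet instance would typically be created once at start-up and shared across requests.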
Hi Experts,
I have a PDF which I created in Adobe Acrobat Pro. Using Acrobat Pro I was able to export the form data to XML (More Form Options --> Export Data --> Save As Type --> xml), which is what I want. But I want to trigger the same export from a button click. Is this possible in Acrobat Pro using JavaScript? Kindly suggest any other method for doing the same. Thanks in advance.
Try the following forum:
http://forums.adobe.com/community/acrobat/acrobat_scripting -
Hi All..
I could see many posts regarding XML to PDF conversion.
I just want to know if there is any command line tool available to convert PDF to XML
(I can find many tools but they are of GUI. I am searching for a command line tool so that I can invoke it like c:>TOOL <source PDF file> <destination XML file>)
Thanks in advance,
Sasi
I cannot imagine that there is a free tool which converts a PDF file to an XML file, because Adobe requires an expensive licence for using their PDF API to read PDF files. Therefore you will find only commercial products that do that!
Or does somebody know of some tools? -
Call from Java to combine PDF and XML
What Adobe software do I need to combine a PDF form template and XML data to produce a PDF with the XML data embedded?
Also, are there any samples/references available for making a call to the above from a Java program?
Any help is deeply appreciated.
Thanks!
Priya
Hi Priya,
Just curious if you figured out how to generate a PDF out of a template? I've a similar requirement where I transform XML schemas into templates, convert the template into a PDF document, and at run time bind the PDF with the form data.
To sort of answer your question of how to bind an XML instance file to a form, if that form is an XML-FORM, then all you'd need to do is open that document as a PDF document using the PDFFactory, and invoke the importFormData() method on it, passing the XML instance as the input stream.
The harder part though is getting the PDF out of the template in the first place, without using any adobe user interface period.
Karthick -
Producing pdf from XML via XSL:FO
Can anybody point me in the direction of a good tutorial, book or any other reference on how to produce a PDF from XML, please?
Elliotte Rusty Harold's XML Bible has a decent chapter on XSL-FO. The online version might even be up to date (the print edition that I have is not).
The Apache FOP documentation is pretty good. -
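As a starting point alongside those references, the first half of the XML to PDF pipeline is an XSLT step that turns the source XML into an XSL-FO document; a formatter such as Apache FOP then renders that FO to PDF. The sketch below covers only the first half, using the JDK's javax.xml.transform API. The greeting input element and the XmlToFo class name are hypothetical, and the page size is just an A4 example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XmlToFo {
    // Stylesheet that wraps the text of a made-up <greeting> element
    // in a minimal single-page XSL-FO document.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/greeting'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='page' page-width='21cm' page-height='29.7cm'>"
      + "<fo:region-body/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='page'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<fo:block><xsl:value-of select='.'/></fo:block>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns the XSL-FO markup produced for the given source XML.
    static String toFo(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toFo("<greeting>Hello</greeting>"));
    }
}
```

The resulting FO file is what gets handed to the formatter; the exact command or API call for that step depends on the FOP version, so it is left to the FOP documentation mentioned above.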
Send pdf and xml to URL on one click
My scenario is:
-online Adobe form in ZCI layout with XML based interface created in webdynpro.
-In webdynpro (form is embedded) Vendor fills out the form with the requested Payment, clicks the button and the form is saved at URL.
-Then Approver will open PDF from that link, make his change and save it at that URL.
-Then another Approver will open PDF from that link, make his change and save it at that URL.
-Finally the Webdynpro program will upload the XML data from that URL into the embedded form and the User will initiate sending the form data to SAP.
I understand that every time the form is saved, it needs to be saved as both PDF and XML.
PDF is for the user to make changes on the form.
XML is for webdynpro to upload the form data and then send it to SAP.
In LiveCycle Designer there is a Submit button in the Standard library.
The button has a dropdown, Submit As: xdp, pdf, xml.
So I can select only pdf OR xml, not both.
How could I send both PDF and XML to the URL with one click?
Thank you,
Tatyana.
From this conversation's thread it appears that Acrobat is not the application / service at issue.
As this user forum has as its focus "Acrobat" it seems that the OP is in the wrong place.
Acrobat's one click send feature provides the user a "draft" email. The user still has to process ("click") through steps with the email client/service to actually get the email out and on its way.
I'm sure that a "one click does everything" solution is available somewhere. Considering all it would have to do, and do right, it would be rather expensive.
Be well... -
How do I convert multiple PDFs to XML without having to open each file individually?
XML for what? XML is just a way to structure data - it's nothing in and of itself.
However, you have to open the file (even if it's not user visible) in order to perform any operation on it - otherwise how would the data be read??
Are you looking to do this on a desktop or server machine? What OS platform(s)? What programming environment?