Converting images as well as text from HTML to PDF

I used Acrobat XI to convert a text and photo document I created in Squarespace (HTML) to PDF. Acrobat successfully retained the format of the document and all the text but left blank all the spaces where photos had been placed. I then had to re-import all the photos, re-size and re-caption them -- a lot of work! Is there a way for Acrobat to import the entire document as it displays on the web, including all the photos in their proper placement?

Thanks for your suggestion. Where do I find the setting for View Large Images?

Similar Messages

  • Problem to extract text from HTML document

    I have to extract some text from HTML files into my database (about 1000 files).
    The HTML files come from the ACM Digital Library. http://portal.acm.org/dl.cfm
    Each HTML page holds the information about one paper. I only want to get the text of "Title", "Abstract", "Classification" and "Keywords".
    The problem is that I can't find any pattern to use for parsing the HTML files.
    Example: I need to get Classification = "Theory of Computation", "ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY", "Numerical Algorithms and Problems", "Mathematics of Computing", "NUMERICAL ANALYSIS", etc.
    The section of code for "Classification" is below.
    Please give me any idea of how to do this, or how to find a pattern for extracting the text.
    <div class="indterms"><a href="#CIT"><img name="top" src=
    "img/arrowu.gif" hspace="10" border="0" /></a><span class=
    "heading"><a name="IndexTerms">INDEX TERMS</a></span>
    <p class="Categories"><span class="heading"><a name=
    "GenTerms">Primary Classification:</a></span><br />
    &nbsp; <b>F.</b> <a href=
    "results.cfm?query=CCS%3AF%2E%2A&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Theory of Computation</a><br />
    &nbsp; <img src="img/tree.gif" border="0" height="20" width=
    "20" /> <b>F.2</b> <a href=
    "results.cfm?query=CCS%3A%22F%2E2%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">ANALYSIS OF ALGORITHMS AND PROBLEM
    COMPLEXITY</a><br />
    &nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border="0" height=
    "20" width="20" /> <b>F.2.1</b> <a href=
    "results.cfm?query=CCS%3A%22F%2E2%2E1%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Numerical Algorithms and Problems</a><br />
    </p>
    <p class="Categories"><span class="heading"><a name=
    "GenTerms">Additional&nbsp;Classification:</a></span><br />
    &nbsp; <b>G.</b> <a href=
    "results.cfm?query=CCS%3AG%2E%2A&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Mathematics of Computing</a><br />
    &nbsp; <img src="img/tree.gif" border="0" height="20" width=
    "20" /> <b>G.1</b> <a href=
    "results.cfm?query=CCS%3A%22G%2E1%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">NUMERICAL ANALYSIS</a><br />
    &nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border="0" height=
    "20" width="20" /> <b>G.1.6</b> <a href=
    "results.cfm?query=CCS%3A%22G%2E1%2E6%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Optimization</a><br />
    &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border=
    "0" height="20" width="20" /> <b>Subjects:</b> <a href=
    "results.cfm?query=CCS%3A%22Linear%20programming%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Linear programming</a><br />
    </p>
    <br />
    <p class="GenTerms"><span class="heading"><a name=
    "GenTerms">General Terms:</a></span><br />
    <a href=
    "results.cfm?query=genterm%3A%22Algorithms%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Algorithms</a>, <a href=
    "results.cfm?query=genterm%3A%22Theory%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Theory</a></p>
    <br />
    <p class="keywords"><span class="heading"><a name=
    "Keywords">Keywords:</a></span><br />
    <a href=
    "results.cfm?query=keyword%3A%22Simplex%20method%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">Simplex method</a>, <a href=
    "results.cfm?query=keyword%3A%22complexity%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">complexity</a>, <a href=
    "results.cfm?query=keyword%3A%22perturbation%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">perturbation</a>, <a href=
    "results.cfm?query=keyword%3A%22smoothed%20analysis%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
    target="_self">smoothed analysis</a></p>
    </div>

    One approach is to download HTMLParser from SourceForge
    http://htmlparser.sourceforge.net/ and write rules to match the title, abstract, etc.
    Another approach is to write your own parser that extracts only the title, abstract, etc.:
    1. Tokenize the HTML file, i.e. convert the HTML into tokens (tags and values).
    2. Write a simple parser to extract the information you need.
    Find the pattern that marks the text you want to extract, for instance <p class="abstract">,
    then write a rule for extracting the abstract, such as
    if (tag is abstract) then extract abstract text
    and apply the same concept to the other tags.
    Below is the sample parser that was used to extract the title and abstract from ACM HTML files. Please modify it to include keywords and other fields.
    Good luck.
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;

    public class ACMHTMLParser {
        private String m_filename;
        private URLLexicalAnalyzer lexical;

        public ACMHTMLParser(String filename) {
            m_filename = filename;
        }

        /** Parses only the title and the abstract. */
        public void parse() throws Exception {
            lexical = new URLLexicalAnalyzer(m_filename);
            String word = lexical.getNextWord();
            boolean isabstract = false;
            while (null != word) {
                if (isTag(word)) {
                    if (isTitle(word)) {
                        System.out.println("TITLE: " + lexical.getNextWord());
                    } else if (isAbstract(word) && !isabstract) {
                        parseAbstract();
                        isabstract = true;
                    }
                }
                word = lexical.getNextWord();
            }
            lexical.close();
        }

        public static void main(String[] args) throws Exception {
            ACMHTMLParser parser = new ACMHTMLParser("./acm_html.html");
            parser.parse();
        }

        public static boolean isTag(String word) {
            return word.startsWith("<") && word.endsWith(">");
        }

        public static boolean isTitle(String word) {
            return "<title>".equals(word);
        }

        // Please modify according to the HTML source.
        public static boolean isAbstract(String word) {
            return "<p class=\"abstract\">".equals(word);
        }

        // Prints the first text token that follows the abstract tag.
        private void parseAbstract() throws Exception {
            while (true) {
                String abs = lexical.getNextWord();
                if (null == abs) {
                    break; // end of file reached without finding any text
                }
                if (!isTag(abs)) {
                    System.out.println(abs);
                    break;
                }
            }
        }

        // Splits the input into tokens: either a whole tag ("<...>") or the text between two tags.
        static class URLLexicalAnalyzer {
            private BufferedReader m_reader;
            private boolean isTag; // true when scanValue() has already consumed the '<' of the next tag

            public URLLexicalAnalyzer(String filename) {
                try {
                    m_reader = new BufferedReader(new FileReader(filename));
                } catch (IOException io) {
                    System.out.println("ERROR, file not found " + filename);
                    System.exit(1);
                }
            }

            public URLLexicalAnalyzer(InputStream in) {
                m_reader = new BufferedReader(new InputStreamReader(in));
            }

            public void close() {
                try {
                    if (null != m_reader) m_reader.close();
                } catch (IOException ignored) {
                }
            }

            public String getNextWord() throws IOException {
                int c = m_reader.read();
                // Skip whitespace between tokens.
                while (-1 != c && Character.isWhitespace((char) c)) {
                    c = m_reader.read();
                }
                if (-1 == c) return null;
                if ('<' == c || isTag) {
                    return scanTag(c);
                }
                return scanValue(c);
            }

            private String scanTag(final int c) throws IOException {
                StringBuilder result = new StringBuilder();
                // If scanValue() already consumed the '<', put it back.
                if ('<' != c) result.append('<');
                result.append((char) c);
                int ch;
                while (true) {
                    ch = m_reader.read();
                    if (-1 == ch) throw new IllegalArgumentException("unterminated tag");
                    if ('>' == ch) {
                        isTag = false;
                        break;
                    }
                    result.append((char) ch);
                }
                result.append((char) ch); // the closing '>'
                return result.toString();
            }

            private String scanValue(final int c) throws IOException {
                StringBuilder result = new StringBuilder();
                result.append((char) c);
                int ch;
                while (true) {
                    ch = m_reader.read();
                    if (-1 == ch) throw new IllegalArgumentException("unterminated value");
                    if ('<' == ch) {
                        isTag = true; // remember that the next token is a tag
                        break;
                    }
                    result.append((char) ch);
                }
                return result.toString();
            }
        }
    }
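
    For the "Classification" block shown in the question, a regular expression may be enough, since the wanted text is always the content of a link to results.cfm inside a <p class="Categories"> paragraph. The following is only a minimal sketch under that assumption; the class name ClassificationExtractor and the method extract are made up for illustration, and the patterns would need adjusting if the real pages differ from the posted fragment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class ClassificationExtractor {
    // One <p class="Categories"> ... </p> paragraph (markup assumed from the posted fragment).
    private static final Pattern BLOCK = Pattern.compile(
            "<p\\s+class=\\s*\"Categories\"\\s*>(.*?)</p>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // The visible text of a link pointing at results.cfm; heading anchors use name=, so they are skipped.
    private static final Pattern LINK = Pattern.compile(
            "<a\\s+href=\\s*\"results\\.cfm[^\"]*\"[^>]*>(.*?)</a>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    static List<String> extract(String html) {
        List<String> terms = new ArrayList<>();
        Matcher block = BLOCK.matcher(html);
        while (block.find()) {
            Matcher link = LINK.matcher(block.group(1));
            while (link.find()) {
                // Link text may be wrapped over several lines; collapse the whitespace.
                terms.add(link.group(1).replaceAll("\\s+", " ").trim());
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String html = "<p class=\"Categories\"><span class=\"heading\">"
                + "<a name=\"GenTerms\">Primary Classification:</a></span><br />\n"
                + "<b>F.</b> <a href=\n\"results.cfm?query=CCS%3AF\"\ntarget=\"_self\">Theory of Computation</a><br />\n"
                + "</p>";
        for (String term : extract(html)) {
            System.out.println(term);
        }
    }
}
```

    The keywords paragraph could probably be handled the same way by matching class="keywords" instead of class="Categories".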

  • Read Text from HTML-Pages and want to solve "ChangedCharSetException"

    Hello,
    I have an app that connects to pages via threads, parses them and gives me only the text version of each HTML page. It works fine, but when it finds a page where the text is inside images, the whole app stops and gives me this message:
    javax.swing.text.ChangedCharSetException
            at javax.swing.text.html.parser.DocumentParser.handleEmptyTag(DocumentParser.java:169)
            at javax.swing.text.html.parser.Parser.startTag(Parser.java:372)
            at javax.swing.text.html.parser.Parser.parseTag(Parser.java:1846)
            at javax.swing.text.html.parser.Parser.parseContent(Parser.java:1881)
            at javax.swing.text.html.parser.Parser.parse(Parser.java:2047)
            at javax.swing.text.html.parser.DocumentParser.parse(DocumentParser.java:106)
            at javax.swing.text.html.parser.ParserDelegator.parse(ParserDelegator.java:78)
            at aufruf.main(aufruf.java:33)
    So I tried to catch it with "getCharSetSpec()" and "keyEqualsCharSet()" from the class "javax.swing.text.ChangedCharSetException" and hoped that this would solve the problem. But it still doesn't work...
    Then I searched the web and found that I have to add the line:
    doc.putProperty("IgnoreCharsetDirective", new Boolean(true));
    "doc" is a new HTML document created with the HTMLEditorKit. I do not know much about that, so I hope someone can explain how I can solve the problem within my code.
    Here we go:
    import javax.swing.text.*;
    import java.lang.*;
    import java.util.*;
    import java.net.*;
    import java.io.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;
    public class myParser extends Thread {
        private String name;

        public void run() {
            try {
                URL viele = new URL(name);   // "name" is a variable with a lot of links
                URLConnection hs = viele.openConnection();
                hs.connect();
                if (hs.getContentType().startsWith("text/html")) {
                    InputStream is = hs.getInputStream();
                    InputStreamReader isr = new InputStreamReader(is);
                    BufferedReader br = new BufferedReader(isr);
                    Lesen los = new Lesen();
                    ParserDelegator parser = new ParserDelegator();
                    parser.parse(br, los, false);
                }
            } catch (MalformedURLException e) {
                System.err.print("Doesn't work");
            } catch (ChangedCharSetException e) {
                e.getCharSetSpec();
                e.keyEqualsCharSet();
                e.printStackTrace();
            } catch (Exception o) {
                // any other failure is silently ignored
            }
        }

        public void vowi(String n) {
            name = n;
        }
    }
    And in case it is important, here is the class "Lesen":
    import java.net.*;
    import java.io.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;
    import javax.swing.text.html.parser.*;
    class Lesen extends HTMLEditorKit.ParserCallback {
        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            try {
                if ((t==HTML.Tag.P) || (t==HTML.Tag.H1) || (t==HTML.Tag.H2) || (t==HTML.Tag.H3) || (t==HTML.Tag.H4) || (t==HTML.Tag.H5) || (t==HTML.Tag.H6)) {
                    System.out.println();
                }
            } catch (Exception q) {
                System.out.println(q.getMessage());
            }
        }

        public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            try {
                if (t==HTML.Tag.BR) {
                    System.out.println(); // new line
                    System.out.println();
                }
            } catch (Exception qw) {
                System.out.println(qw.getMessage());
            }
        }

        public void handleText(char[] data, int pos) {
            try {
                System.out.print(data);   // prints the text from HTML pages
            } catch (Exception ab) {
                System.out.println(ab.getMessage());
            }
        }
    }
    Thanks a lot for helping...
    Stephan

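    Change
    parser.parse(br,los, false);
    to
    parser.parse(br,los, true);
    The third argument of ParserDelegator.parse is ignoreCharSet. Passing true makes the parser ignore the charset directive in the page instead of throwing ChangedCharSetException.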
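
    As a rough illustration of that flag, the sketch below feeds a page containing a charset <meta> tag to ParserDelegator with ignoreCharSet set to true and collects the text. The class name CharsetTolerantText and the method extractText are invented for this example; only ParserDelegator, HTMLEditorKit.ParserCallback and handleText come from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class for illustration only.
final class CharsetTolerantText {
    static String extractText(String html) {
        final StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Called once per run of text between tags.
                text.append(data).append('\n');
            }
        };
        try {
            // Third argument = ignoreCharSet: true means a charset <meta> tag
            // does not make the parser throw ChangedCharSetException.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String page = "<html><head>"
                + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=ISO-8859-1\">"
                + "</head><body><p>Hello</p></body></html>";
        System.out.print(extractText(page));
    }
}
```

    Note that with the flag set, the reader's own encoding is used as is, so the InputStreamReader should be created with the charset you expect the page to have.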

  • How to retrieve the font color, style and size of the copied text from html

    I have a requirement where I need to retrieve the font size and style of text copied from an HTML page. By copied text I mean the text we select and copy using either the Windows copy command or Ctrl+C.
    Please help me find a solution for this requirement.
    Thanks in advance,
    Amodnk.

    You can also try this, especially if you've got the Text Inspector and the Color Picker open already.
    Select the text to be colored (note that if the text is already multiple different colors the swatch under Color & Alignment section of the Text Inspector still only shows one out of the several)
    Find the color you want in the Color Picker
    Click and drag from the Color Picker into the swatch under Color & Alignment in the Text Inspector
    That will also change all the selected text to the chosen color.
    Also, regarding web safe colors, that SHOULD come as a part of the Color Picker. With the Color Picker open, select the third icon at the top (if you mouse over it, it should indicate Color Palettes). Click the popup menu button and you should see Web Safe Colors as one of the choices. With this and the Text Inspector open, you can drag and drop your way to identical colors in no time!
    That same drag and drop trick works for text on the slide as well. If you just created a bit of text and you want to apply a color, scroll until you find the color you want, then drag and drop over to the text (it will highlight in blue showing you what you're about to color).

  • Cannot copy text from DRM-free PDF on PC

    I can on my Mac. Both use Adobe Reader v11.x. I think the behaviour started when I updated to version 11 on the PC.
    I cannot copy text from DRM-free PDFs that I read. I can switch to the Select Tool cursor but drag-select does not work as it did before. Double-click can highlight (and copy) a word. A triple-click can do so for a line. This is clumsy and inefficient and I cannot grab more than a line of text.
    File > Properties: Security shows 'No Security'. I am stumped. Is there some detail I am missing? (I downloaded and switched to FoxIt Reader to take notes now but prefer my knowledge of navigation shortcuts in Adobe Reader.)

    Hi jroth,
    How are you "re-printing" the PDF? Once you have a PDF file, it shouldn't be necessary to re-create it just because you've made changes. You could simply save the file at that point. But it sounds like you're concerned about preventing changes to the comments. If that's the case, rather than reprinting, you could assign some document permissions to the PDF to prevent people from changing the content. To assign that security, choose File > Properties and click the Security tab.
    I'm guessing here, because I'm not sure what process you're using, but it sounds as though that second iteration is being printed as an image. That would change the searchable text to an image, and could certainly cause the issue you're encountering.
    Best,
    Sara

  • Read text from a simple PDF file

    Is it possible to extract text from a simple PDF (Non-Interactive) in ABAP? May be using the class CL_FP_PDF_OBJECT ?
    Let's say I have a pdf document with a couple of lines of text. How can I read the "actual" text in an ABAP program?
    Thanks for your help!

    Of course you can do this, but not using the standard SAP/ABAP/Adobe.
    Check out the Java library called iText. Now you may think you don't need Java, of course. If you cannot use Java, at least you can study the code of this library and implement it in ABAP yourself.
    But there is no such tool in standard SAP, because why would anyone buy an Interactive Forms licence if it were that easy?

  • How to convert a Word document to text or html in an ABAP program

    Hi,
    At my client's site, for the recruitment system, they have the word processing system set to RTF, instead of SAP Script. This means that all the correspondence is in Word format. A standard SAP program takes the word letter, loads word, does the mail merge with the applicant's info and then sends the document to a printer.
    The program name is RPAPRT05. The program creates a document proxy (interface I_OI_DOCUMENT_PROXY) and manipulates the document using the methods of the interface.
    Now what we want to do is to instead of sending the document to a printer, we want to email the document contents to the applicant. But I don't know how to get the content from the Word document into text or html format so that I can make an email from it.
    I know I can send an email with the word document as an attachment, but we'd prefer not to do that.
    I would appreciate any help very much.
    Thanks

    Ok, here's what I ended up doing:
    First off, in order to call FM 'CONVERT_RTF_TO_ITF' you need the RTF document in a table with line length 256. The document is returned from FM 'DP_CREATE_URL' in a table with line length 132. So first I convert the table:
        " Transform data table from 132 character lines to
        " 256 character lines
          LOOP AT data_table INTO dataline.
            IF newrow = 'X'.
            " Add row to new table
              APPEND INITIAL LINE TO xdatatab ASSIGNING .
              newrow = space.
            ENDIF.
          " Convert the raw line of old table to characters
            ASSIGN dataline TO .
          " Check line lengths to determine how to add the
          " next line of old table
            newlinelen = STRLEN( newline ).
            ADD addspaces TO newlinelen.
            linepos = linemax - newlinelen.
            IF linepos > datalen.
            " Enough space available in new table line for all of old table line
              newline+newlinelen = oldline.
              oldlinelen = STRLEN( oldline ).
              addspaces = datalen - oldlinelen.
              CONTINUE.
            ELSE.
            " Fill up new table line
              newline+newlinelen(linepos) = oldline(linepos).
              ASSIGN newline TO .
              newrow = 'X'.
            " Save the remainder of old table to the new table line
              IF linepos < datalen.
                oldlinelen = STRLEN( oldline ).
                addspaces = datalen - oldlinelen.
                CLEAR newline.
                newline = oldline+linepos.
              ELSE.
                CLEAR newline.
              ENDIF.
            ENDIF.
          ENDLOOP.
        " Write the last line to the table
          IF newrow = 'X'.
            APPEND INITIAL LINE TO xdatatab ASSIGNING .
    Next I call FM 'CONVERT_RTF_TO_ITF' to get the document in SAPScript format:
        " Convert the RTF format to SAPScript
          CALL FUNCTION 'CONVERT_RTF_TO_ITF'
            EXPORTING
              header            = dochead
              x_datatab         = xdatatab
              x_size            = xsize
            IMPORTING
              with_tab_e        = withtab
            TABLES
              itf_lines         = itf_table
            EXCEPTIONS
              invalid_tabletype = 1
              missing_size      = 2
              OTHERS            = 4.
    This returns the document still containing the mail merge fields, which need to be filled in:
          LOOP AT itf_table INTO itf_line.
            WHILE itf_line CS '«'.
              startpos = sy-fdpos + 1.
              IF itf_line CS '»'.
                tokenlength = sy-fdpos - startpos.
              ENDIF.
              token = itf_line+startpos(tokenlength).
              REPLACE '_' IN token WITH '-'.
              ASSIGN (token) TO .
              ENDIF.
              MODIFY itf_table FROM itf_line.
            ENDWHILE.
          ENDLOOP.
    And finally I use FM 'CONVERT_ITF_TO_ASCII' to convert the SAPScript to text. I set the line lengths to 60, since that's a good length to format emails to.
        " Convert document to 60 char wide ascii document for emailing
          CALL FUNCTION 'CONVERT_ITF_TO_ASCII'
            EXPORTING
              formatwidth       = 60
            IMPORTING
              c_datatab         = asciidoctab
              x_size            = documentsize
            TABLES
              itf_lines         = itf_table
            EXCEPTIONS
              invalid_tabletype = 1
              OTHERS            = 2.
    And then the text document gets passed to FM 'SO_NEW_DOCUMENT_ATT_SEND_API1' as the email body.
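
    For readers who do not work in ABAP, the mail-merge step above (finding «FIELD» markers in each line and replacing them with values) can be sketched in Java roughly as follows. This is only an illustration of the idea: the class MergeFieldFiller, the method fill and the use of a Map are assumptions for this example, whereas the ABAP code resolves each token dynamically to a program variable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class MergeFieldFiller {
    // \u00AB and \u00BB are the « and » characters that delimit a merge field.
    private static final Pattern FIELD = Pattern.compile("\u00AB([^\u00BB]*)\u00BB");

    static String fill(String line, Map<String, String> values) {
        Matcher m = FIELD.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Unknown fields are left in place so they stay visible in the output.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("NAME", "Ada");
        System.out.println(fill("Dear \u00ABNAME\u00BB, thank you for applying.", values));
    }
}
```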

  • Acquiring HTML Text From HTML Page

    I just want to make a simple HTML editor. I read the documentation, and yes, I know that not all pages conform to the specifications, but I just want to know how to get all that yummy HTML as text rather than rendering it as a web page. I am using a JEditorPane, and I think the answer may lie in the javax.swing.text.* packages.

    Just trying to make a code editor by allowing the loading of an html page as a text file.

    Assuming you have the HTML source and now wish to get the plain text from it, the following piece of code can help you:
    private String getPlainText(String htmlText) {
        String str = null;
        JTextPane text = new JTextPane();
        text.setContentType("text/html");
        text.setText(htmlText);
        try {
            str = text.getDocument().getText(0, text.getDocument().getLength());
        } catch (BadLocationException e) {
            e.printStackTrace();
        }
        return str;
    }
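
    If creating a Swing component just for the conversion is not wanted, a similar result can probably be had with HTMLEditorKit and a document alone. The sketch below is one way to do it under that assumption; the class name HtmlToPlainText and the method toPlainText are invented for the example, and the "IgnoreCharsetDirective" property is set so that a charset <meta> tag in the source does not interrupt parsing.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper for illustration only.
final class HtmlToPlainText {
    static String toPlainText(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        // Ignore any charset <meta> tag; the input is already a Java String.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        try {
            kit.read(new StringReader(html), doc, 0);
            return doc.getText(0, doc.getLength());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("<html><body><p>Hello <b>world</b></p></body></html>"));
    }
}
```

    The returned text may carry leading or trailing line breaks from the document structure, so trimming it before use is usually a good idea.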

  • Need help...retrieving text from html

    I have successfully converted my text into HTML. However, I have a problem restoring the HTML back to text.
    Below is my HTML:
    <html>
    <head>
    <style> <!-- p.default { italic:; size:3; bold:normal; family:Times New Roman; foreground:#000000; } --> </style>
    </head>
    <body> <p class=default>
    <span style="color: #000000; font-size: 72pt; font-family: Times New Roman"> <u><i><b>HELLO WORLD</b></i></u>
    </span>
    </p>
    </body>
    </html>
    this is the method i used to restore the text:
    String htmlInString=CustomPanel.getHTML();
    JTextPane p=new JTextPane(new HTMLDocument());
    p.setContentType("text/html");
    jPanel4.add(p);
    p.setText(htmlInString);
    The problem is that when I restore the text, I only get the text and its bold, underline and italic attributes. My font and font size are not restored.
    For example, based on the HTML above, it should come up with "HELLO WORLD" in bold, underlined and italic with font size 72. But when I restore it, the font and font size are not as stated in the HTML. Why is this so?

    Try this ..
    import java.awt.BorderLayout;
    import java.awt.event.ActionEvent;
    import java.awt.event.ActionListener;
    import javax.swing.JButton;
    import javax.swing.JFrame;
    import javax.swing.JScrollPane;
    import javax.swing.JTextPane;
    import javax.swing.text.BadLocationException;
    import javax.swing.text.html.HTMLEditorKit;
    public class HtmlText extends JFrame {
        JTextPane m_area;

        public HtmlText() {
            m_area = new JTextPane();
            HTMLEditorKit kit = new HTMLEditorKit();
            m_area.setEditorKit(kit);
            JButton m_btn = new JButton("Show Text");
            m_btn.addActionListener(new ActionListener() {
                public void actionPerformed(ActionEvent e) {
                    System.out.println("Html :-");
                    System.out.println(m_area.getText());
                    System.out.println("Text :-");
                    try {
                        System.out.println(m_area.getDocument().getText(0, m_area.getDocument().getLength()));
                    } catch (BadLocationException ee) {
                        System.out.println("Exception " + ee.toString());
                    }
                }
            });
            JScrollPane m_scrollPane = new JScrollPane();
            m_scrollPane.setVerticalScrollBarPolicy(JScrollPane.VERTICAL_SCROLLBAR_ALWAYS);
            m_scrollPane.getViewport().add(m_area);
            getContentPane().add(m_scrollPane, BorderLayout.CENTER);
            getContentPane().add(m_btn, BorderLayout.SOUTH);
        }

        public static void main(String[] args) {
            HtmlText ht = new HtmlText();
            ht.setBounds(30, 30, 400, 400);
            ht.setTitle("Html Text");
            ht.setVisible(true);   // replaces the deprecated show()
        }
    }

  • How do I copy text from an uploaded pdf file?

    I have a document and want to copy the text of one page into a new document.  How do I proceed to do that?  I am using the Adobe cloud product.
    Thanks in advance.

    Just some observations.
    When PDF page content is an image (and music notes may very well be that in a given PDF), all that can be exported is the image.
    Only an image editor can "edit" such content.
    As to textual content - once in TXT, RTF, DOC or DOCX yes, expect to have to do "cleanup"
    Of course there's the old school alternative that is always available.
    Paper copy beside you as you transcribe to a fresh word processing file.
    Be well...

  • Looking for a convertor from HTML to PDF with J2SE

    Hi all,
    My current project would generate some XML data and display it on the web. Needless to say I have to prepare a matching set of XSL to transform the XML into HTML. I got that part done.
    However, my project also requires a PDF output of the same XML data for download. Well, I planned to do the XSL-fo stuff on it as well. But at some point the client thought since new kind of XML data may be added and it's too troublesome to maintain 2 sets of XSLs, they are requesting a converter to translate HTML directly into PDF now.
    I've searched the web, and found a few converter. But all of them are written in C and can't be embeded directly into my Java code. And my servlet might not have rights to call an external program on the production machine.
    I would appreciate very much if anyone can share their experience on similar issue, using a Java component to convert HTML into PDF. Suggestion of commercial products or open source freeware are all welcome.
    Thank you very much.

    HTML to PDF with Java, using OpenOffice.org - example here: http://www.dancrintea.ro/html-to-pdf/
    You can run OpenOffice.org as a server and command it remotely for document conversion.
    Besides HTML to PDF, other conversions are also possible:
    doc --> pdf, html, txt, rtf
    xls --> pdf, html, csv
    ppt --> pdf, swf
    Code example:
    import officetools.OfficeFile; // this is my tools package
    FileInputStream fis = new FileInputStream(new File("c:/test.html"));
    FileOutputStream fos = new FileOutputStream(new File("c:/test.pdf"));
    // suppose OpenOffice.org runs on localhost, port 8100
    OfficeFile f = new OfficeFile(fis,"localhost","8100", true);
    f.convert(fos,"pdf");

  • Select a text from a word/pdf document for tagging.

    Hi,
    After an overwhelming response to my first thread, "Thumbnail creation during Image Upload", I am hoping that my second post will have a solution.
    I am currently working on a knowledge management tool using ABAP WD. The requirement is to open existing documents, select some text and tag/categorize it. I also found out that, using Flash Islands, it is possible to get the selected text from a textField and pass it back to the bound ABAP WD variable.
    Is there a way to display a Word/PDF file and perform the text tagging in WD, perhaps using office integration? Any help would be greatly appreciated.

    Never mind--I figured it out.

  • Extracting text from .doc,.ppt,.pdf files

    How can I extract ASCII text from file types like .doc, .ppt, .pdf, .xls, etc.?
    Any tips/hints would be helpful.
    Thanks
    Rama

    Hi, I tried for PDF but didn't succeed.
    The following is for text/Doc files:
    import java.io.*;
    public class Doc {
        public static void main(String[] args) {
            File file = new File("c:\\downloads\\WP2001.doc");
            StringBuffer buff = new StringBuffer();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader buffer = new BufferedReader(new FileReader(file))) {
                String line;
                // readLine() returns null at end of file
                while ((line = buffer.readLine()) != null) {
                    buff.append(line).append("\n");
                }
                System.out.println(buff);
            } catch (IOException fne) {
                System.out.println("Could not read file: " + fne);
            }
        }
    }
    pathreading

  • Help with converting image to B&W apart from a single area/item??

    Hello
    I am new to Adobe and have downloaded the trial version of Master Collection. No idea what it's all about but have a month to work it out!!
    What I would like to know how to do (hopefully I can explain it!) is to convert a picture to black and white apart from a single object, e.g. a wedding photo of a couple in black and white but with the flower on his lapel in its original colour.
    Can this be done in any of the Master Collection programs and, if so, can anyone explain in layman's terms how I do it, and whether I first need to make sure my picture has been taken correctly, e.g. if it needs to be in raw format?
    Thank you in advance to anyone who can help me and who has the patience to explain it to me very basically!!

    Photoshop: View> Actual Pixels
    use Lasso Tool (press L key) to draw a selection around part you want to maintain color
    Select> Modify> Feather (1 pixel)
    then Command+J (Mac), or Layer> New> Layer via Copy
    (this copies your selection to a new layer)
    in Layers palette, select your original Layer
    then Command+J (Mac), or Layer> New> Layer via Copy
    to make a backup copy of your original layer
    now your copy layer should be highlighted in Layers palette
    Image> Adjustments> Black & White (should Desaturate the Layer and leave your colored selection on top)....
    Of course there are many ways to do this...google for tutorials...you may want to post in the Photoshop forum since this is not a colormanagement issue...

  • Convert TEXT or HTML to PDF in SharePoint 2013 Without Third Party tool

    Dear All,
    I want to convert the HTML content or TEXT or SharePoint list item to PDF file.
    Scenario.
    1) There will be a form where a student submits the feedback/survey (a custom survey with objective-type questions).
    2) The feedback/survey result should then be sent to the student by email in PDF format once it has been submitted.
    Without a third-party tool.
    Please, it's a bit urgent.
    Thanks in Advance.
    --Murthy

    Hi,
    Here is a blog post with source code for your reference; you can adapt the source code to build your own solution for this requirement.
    http://www.codeproject.com/Articles/28050/Generate-PDF-Using-C
    Best Regards
    Dennis Guo
    TechNet Community Support
