Extract text of JTextArea in WindowClosing event()...

I downloaded the Framework.java sample code, which creates a new instance of a window. I want to be able to retrieve the text from the JTextArea of a JFrame when the user closes the window.
I am using this code:
        frame.addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent e) {
                System.out.println("pane.getText()=" + frame.pane.getText());
                System.exit(0);
            }
        });
Unfortunately, I am getting this error:
    "Error: variable pane not found in class javax.swing.JFrame"
How do I get at this text? How do I get the content pane in general? Full source code below:
import java.net.*;
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
public class Framework extends WindowAdapter {
    public int numWindows = 0;
    private Point lastLocation = null;
    private int maxX = 500;
    private int maxY = 500;
    public Framework() {
        Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();
        maxX = screenSize.width - 50;
        maxY = screenSize.height - 50;
        makeNewWindow();
    }
    public void makeNewWindow() {
        final JFrame frame = new MyFrame(this); //*
        numWindows++;
        System.out.println("Number of windows: " + numWindows);
        if (lastLocation != null) {
            //Move the window over and down 40 pixels.
            lastLocation.translate(40, 40);
            if ((lastLocation.x > maxX) || (lastLocation.y > maxY)) {
                lastLocation.setLocation(0, 0);
            }
            frame.setLocation(lastLocation);
        } else {
            lastLocation = frame.getLocation();
        }
        System.out.println("Frame location: " + lastLocation);
        frame.setVisible(true);
        frame.addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent e) {
                System.out.println("pane.getText()=" + frame.Container.getContentPane.getText());
                System.exit(0);
            }
        });
    }
    public static void main(String[] args) {
        Framework framework = new Framework();
    }
}
class MyFrame extends JFrame {
    protected Dimension defaultSize = new Dimension(200, 200);
    protected Framework framework = null;
    private Color color = Color.yellow;
    private Container c;
    public MyFrame(Framework controller) {
        super("New Frame");
        framework = controller;
        setDefaultCloseOperation(DISPOSE_ON_CLOSE);
        setSize(defaultSize);
        //Create a text area.
        JTextArea textArea = new JTextArea(
                "This is an editable JTextArea " +
                "that has been initialized with the setText method. " +
                "A text area is a \"plain\" text component, " +
                "which means that although it can display text " +
                "in any font, all of the text is in the same font."
        );
        textArea.setFont(new Font("Serif", Font.ITALIC, 16));
        textArea.setLineWrap(true);
        textArea.setWrapStyleWord(true);
        textArea.setBackground ( Color.yellow );
        JScrollPane areaScrollPane = new JScrollPane(textArea);
        //Create the status area.
        JPanel statusPane = new JPanel(new GridLayout(1, 1));
        ImageIcon icoOpen = null;
        URL url = null;
        try {
            icoOpen = new ImageIcon("post_it0a.gif"); //("doc04d.gif");
        } catch(Exception ex) {
            ex.printStackTrace();
            System.exit(1);
        }
        setIconImage(icoOpen.getImage());
        c = getContentPane();
        c.setBackground ( Color.yellow );
        c.add ( areaScrollPane, BorderLayout.CENTER ) ;
        c.add ( statusPane, BorderLayout.SOUTH );
        c.repaint ();
    }
}

Hi,
you get the error message because pane is not declared anywhere in your program. The JTextArea holds the text; it is created in the constructor of class MyFrame. The variable textArea holds the reference to this JTextArea. The JTextArea is put in a JScrollPane, and the JScrollPane is added to the content pane of the frame. Once all that is done, at the end of the MyFrame constructor, the variable textArea is no longer accessible, because it is declared in the constructor itself and is local to it, which means it "lives" only as long as the constructor is executing.
One way to let the variable textArea live longer is to declare it outside the constructor, for example in the body of the class where the Container c is declared. Simply add "JTextArea textArea;" there. In the constructor you then use
textArea = new JTextArea(....) instead of JTextArea textArea = new JTextArea(...)
Now the textArea variable will "live" as long as the MyFrame instance exists. If you can reach the MyFrame instance from the windowClosing method, you can reach textArea too, but at the moment you can't.
The MyFrame instance is constructed in the makeNewWindow method. The variable frame holds the reference to the newly created frame, and because it is declared final, the anonymous WindowAdapter can use it. The obstacle is its declared type: frame is a JFrame, and JFrame has no textArea field, so the compiler rejects frame.textArea even though the object really is a MyFrame.
In the makeNewWindow method you therefore use
final MyFrame frame = new MyFrame(this) instead of final JFrame frame = new MyFrame(this)
Now we are near the solution: the JTextArea is accessible as frame.textArea, so in the windowClosing method you can use frame.textArea.getText().
There are some other syntax errors in your text too; your compiler will show them.
Hope this helps. I have tried to explain why it should be done this way rather than post only a few lines of code. Sorry that it is now so much to read.
greetings Marsian
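
A minimal sketch of this approach, with the text area kept in a field so that a window listener can read it when the window closes. The class names NoteFrame, TextReportingCloser and ClosingTextDemo are made up for illustration and are not part of the Framework sample; the listener is written as its own class only so that it can be exercised without opening a window.

```java
import java.awt.GraphicsEnvironment;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;

// Hypothetical listener: keeps a reference to the text area it should read.
class TextReportingCloser extends WindowAdapter {
    private final JTextArea source;
    private String lastText; // text captured at close time

    TextReportingCloser(JTextArea source) {
        this.source = source;
    }

    @Override
    public void windowClosing(WindowEvent e) {
        lastText = source.getText();
        System.out.println("textArea.getText()=" + lastText);
    }

    String getLastText() {
        return lastText;
    }
}

// Hypothetical frame: textArea is a field, not a local of the constructor.
class NoteFrame extends JFrame {
    final JTextArea textArea = new JTextArea("Editable text");

    NoteFrame() {
        super("New Frame");
        setDefaultCloseOperation(DISPOSE_ON_CLOSE);
        add(new JScrollPane(textArea));
        addWindowListener(new TextReportingCloser(textArea));
        setSize(200, 200);
    }
}

class ClosingTextDemo {
    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available, nothing to show.");
            return;
        }
        // Swing components are created on the event dispatch thread.
        SwingUtilities.invokeLater(() -> new NoteFrame().setVisible(true));
    }
}
```

Closing the window prints whatever the text area contains at that moment, including edits made by the user.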

Similar Messages

Clickable text in JTextArea?

Hi all,
I'm writing a program that populates a textarea with medical terms and such. And, the problem is I need to give an explanation to the more difficult words by allowing the user to click on the words and a popup window with the explanations will appear. I'm not familiar with the mouse clicks technology in Java, so please help if you can!
Also, any suggestions on how the data can be stored/retrieved when I'm using a MySQL database?
Many thanks!

For this you have to add a mouse listener to the text area, so that you can do something on a mouse click.
Try the following code. Run it and double-click on some word.
import java.awt.BorderLayout;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import javax.swing.*;
public class TextAreaDemo extends JPanel {
    private JTextArea text = null, answer = null;
    public TextAreaDemo() {
        text = new JTextArea("Hi all,\nI'm writing a program that populates a textarea with medical terms and " +
            "such. And, the problem is I need to give an explanation to the more difficult words " +
            "by allowing the user to click on the words and a popup window with the explanations " +
            "will appear. I'm not familiar with the mouse clicks technology in Java, so please " +
            "help if you can!\n\nAlso, any suggestions on how the data can be stored/retrieved " +
            "when I'm using a MySQL database?\n\nMany thanks!");
        answer = new JTextArea(3, 20);
        answer.setEditable(false);
        setLayout(new BorderLayout());
        add(new JScrollPane(text));
        add(new JScrollPane(answer), BorderLayout.SOUTH);
        text.addMouseListener(new MyMouseListener());
    }
    class MyMouseListener extends MouseAdapter {
        public void mouseClicked(MouseEvent e) {
            // A double click selects the word under the cursor.
            if (e.getClickCount() == 2) {
                answer.setText("Do you want to know about:\n\t" + text.getSelectedText());
            }
        }
    }
    public static void main(String argv[]) {
        JFrame frame = new JFrame();
        frame.setContentPane(new TextAreaDemo());
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setSize(400, 300);
        frame.setVisible(true);
    }
}
May help you.
Regards,
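
To go from a click to an explanation, one possible sketch is to find the word at the clicked character offset and look it up in a map. GlossaryLookup and its sample entry are hypothetical; in a mouse listener the offset could come from text.viewToModel(e.getPoint()) (superseded by viewToModel2D in newer JDKs), and the result could be shown with JOptionPane.showMessageDialog.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps terms to explanations and finds the word at an offset.
class GlossaryLookup {
    private final Map<String, String> explanations = new HashMap<>();

    void define(String term, String explanation) {
        explanations.put(term.toLowerCase(Locale.ROOT), explanation);
    }

    /** Returns the run of letters around the given offset, or "" if there is none. */
    static String wordAt(String text, int offset) {
        if (offset < 0 || offset >= text.length() || !Character.isLetter(text.charAt(offset))) {
            return "";
        }
        int start = offset;
        while (start > 0 && Character.isLetter(text.charAt(start - 1))) {
            start--;
        }
        int end = offset;
        while (end < text.length() && Character.isLetter(text.charAt(end))) {
            end++;
        }
        return text.substring(start, end);
    }

    /** Returns the explanation for the word at the offset, or "" if it is unknown. */
    String explain(String text, int offset) {
        String word = wordAt(text, offset).toLowerCase(Locale.ROOT);
        return explanations.getOrDefault(word, "");
    }

    public static void main(String[] args) {
        GlossaryLookup glossary = new GlossaryLookup();
        glossary.define("tachycardia", "an abnormally fast heart rate"); // sample entry
        String text = "Patient shows tachycardia at rest.";
        System.out.println(glossary.explain(text, 16)); // offset 16 falls inside "tachycardia"
    }
}
```

Keeping the lookup separate from the listener means the map could later be filled from a database query instead of hard-coded entries.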

How to extract text from a PDF file?

Hello Suners,
I need to know how to extract text from a PDF file.
Does anyone know what the character encoding in a PDF file is? When I use an input stream to read the file, it gives encrypted-looking characters, not the original text in the file.
Is there any procedure I should follow while reading a PDF file?
File f=new File("D:/File.pdf");
FileReader fr=new FileReader(f);
BufferedReader br=new BufferedReader(fr);
String s=br.readLine();
Any help will be deeply appreciated.

jverd wrote:
> First, you set i once, and then loop without ever changing it. So your loop body will execute either 0 times or infinitely many times, writing the same byte every time. Actually, maybe it'll execute once and then throw an ArrayIndexOutOfBoundsException. That's basic java looping, and you're going to need a firm grip on that before you try to do anything as advanced as PDF reading.
Oops, you are absolutely right, that was a silly mistake.
> Second, what do the docs for getPageContent say? Do they say that it simply gives you the text on the page as if the thing were a simple text doc? I'd be surprised if that's the case.
getPageContent returns an array of bytes, so the question becomes: how do I get text from this array? I was thinking of:
    private void jButton1_actionPerformed(ActionEvent e) {
        PdfReader read;
        StringBuffer buff=new StringBuffer();
        try {
            read = new PdfReader("d:/getjobid2727.pdf");
            read.getMetaData();
            byte[] data=read.getPageContent(1);
            int i=0;
            while(i>-1){
                buff.append(data);
                i++;
            }
            String str=buff.toString();
            FileOutputStream fos = new FileOutputStream("D:/test.txt");
            Writer out = new OutputStreamWriter(fos, "UTF8");
            out.write(str);
            out.close();
            read.close();
        } catch (Exception f) {
            f.printStackTrace();
        }
    }
"D:/test.txt" was not created when I ran the program.
Are my steps right?
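
Some background on why reading the file directly gives unreadable characters: a PDF is a binary container, and page content streams inside it are usually stored compressed, commonly with the Flate (zlib) filter. Separately, StringBuffer.append(data) with a byte[] argument appends the array's identity string, not its bytes. The sketch below uses only java.util.zip to show the compression effect; FlateStreamDemo is a made-up class, and the content string is a hand-written stand-in for a page stream, not data taken from a real file.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo: what a Flate-compressed stream looks like before and after inflating.
class FlateStreamDemo {
    /** Compresses raw bytes in zlib format. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the original bytes; rejects truncated or non-zlib input. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException ex) {
            throw new IllegalArgumentException("not zlib data", ex);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Stand-in for a page content stream: drawing operators around the text.
        byte[] content = "BT /F1 12 Tf (Hello) Tj ET".getBytes(StandardCharsets.ISO_8859_1);
        byte[] stored = deflate(content);
        System.out.println("stored bytes as text: " + new String(stored, StandardCharsets.ISO_8859_1));
        System.out.println("after inflating:      " + new String(inflate(stored), StandardCharsets.ISO_8859_1));
    }
}
```

Even the inflated stream is a sequence of drawing operators rather than plain text, and the strings in it may use font-specific codes, which is why a PDF library is the practical route for text extraction.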

How to extract text from a PDF file using php?

How to extract text from a PDF file using php?
thanks
fabio

> Do you know of any other way this can be done?
There are many ways, but this is out of scope for this forum. You can try this forum: http://forum.planetpdf.com/

How to read/extract text from pdf

Respected All,
I want to read/extract text from a PDF. I tried using etymon but did not succeed.
Could anyone guide me on this?
Thanks and regards,
Ajay.

Thank you very much Abhilshit, PDFBox works for reading pdf.
Regards,
Ajay.

Infoobject change for extracting texts data.

Hi BW guys,
Here is my requirement.
I have one InfoObject, 'salesmen', which is already used in some other ODSs and cubes.
Now I want to extract text data for the object 'salesmen'. For that I need to change my InfoObject (the change is adding the credit control area object under compounding).
But when I activate the InfoObject again, it gives errors.
Error messages:
1) InfoObject XXXXX (or ref.) is used in data targets with data -> Error:
2) Characteristic XXXXX: Compound or reference was changed
3)InfoObject XXXXX being used in InfoCube XXXX (contains data)
etc....
But i don't want to delete the data in any data target.
Is there any way to solve this problem?
Thanks in advance......

Hi,
If you don't have many cubes and ODSs with this salesman, you can consider another, better, but more time-consuming way.
1. Create a new IO for your salesman and add the compounding attribute you want.
2. Load master data for the new IO.
3. Create copies of your InfoProviders.
4. In each of them, delete the old salesman IO and insert the new one.
5. Create export DataSources for the old cubes.
6. Create update rules for the new data targets based on the old ones.
7. In the URs, map your new IO to the old one. All other IOs should be mapped 1:1 (new<-old).
8. Reload the data targets.
That's all.
The way I proposed earlier is less preferable, because you would have to change the data already loaded into the data targets anyway. In that case it's better to change the data model the way you want.
Best regards,
Eugene

How to print different colors and different sizes of text in a JTextArea?

Hello All,
I want to make a JFrame that has a JTextArea, and append text to the
JTextArea in different sizes and different colors, and also with different
fonts.
Can anybody give me an example or help me?
I am thankful.
Arif.

You can't have multiple text attributes in a JTextArea.
JTextArea manages a "text/plain" content type document that can't hold information about attributes ( color, font, size, etc.) for different portions of text.
You need a component that can manage styled documents. The most basic component that can do this is JEditorPane. It can manage the following content types :
"text/rtf" ==> via StyledDocument
"text/html" ==> via HTMLDocument
I've written for you an example of how a "Hello World" string could be colorized in a JEditorPane with "Hello" in red and "World" in blue.
import javax.swing.JEditorPane;
import javax.swing.JFrame;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledEditorKit;
import javax.swing.text.StyledDocument;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.SimpleAttributeSet;
import java.awt.Color;
public class ColorizeTextTest{
 public static void main(String args[]){
  //build gui
  JFrame frame = new JFrame();
  JEditorPane editorPane = new JEditorPane();
  frame.getContentPane().add(editorPane);
  frame.pack();
  frame.setVisible(true);
  //create StyledEditorKit
  StyledEditorKit editorKit = new StyledEditorKit();
  //set this editorKit as the editor manager [JTextComponent] <-> [EditorKit] <-> [Document]
  editorPane.setEditorKit(editorKit);
  StyledDocument doc = (StyledDocument) editorPane.getDocument();
  //insert string "Hello World"
  //this text is going to be added to the StyledDocument with positions 0 to 10
  editorPane.setText("Hello World");
  //create an attribute set
  MutableAttributeSet atr = new SimpleAttributeSet();
  //set foreground color attribute to RED
  StyleConstants.setForeground(atr,Color.RED);
  //apply attribute to the word "Hello"
  int offset = 0; //we want to start applying this attribute at position 0
  int length = 5; //"Hello" string has a length of 5
  boolean replace = false; //should we override any other attribute not specified in "atr" : answer "NO"
  doc.setCharacterAttributes(offset,length,atr,replace);
  //set foreground color attribute to BLUE
  StyleConstants.setForeground(atr,Color.BLUE);
  //apply attribute to the word "World"
  offset = 5; //we include the whitespace
  length = 6;
  doc.setCharacterAttributes(offset,length,atr,replace);
 }
}
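
Since the question was about appending text in different colors, sizes and fonts, here is a small sketch that inserts each piece at the end of a StyledDocument with its own attribute set. StyledAppendDemo and its append helper are hypothetical names; the document could be displayed with new JTextPane(doc).

```java
import java.awt.Color;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

// Hypothetical helper: appends text runs, each with its own color, font family and size.
class StyledAppendDemo {
    static void append(StyledDocument doc, String text, Color color, String family, int size) {
        SimpleAttributeSet attrs = new SimpleAttributeSet();
        StyleConstants.setForeground(attrs, color);
        StyleConstants.setFontFamily(attrs, family);
        StyleConstants.setFontSize(attrs, size);
        try {
            // Inserting at getLength() appends to the end of the document.
            doc.insertString(doc.getLength(), text, attrs);
        } catch (BadLocationException ex) {
            throw new IllegalStateException("end of document is always a valid offset", ex);
        }
    }

    public static void main(String[] args) {
        StyledDocument doc = new DefaultStyledDocument();
        append(doc, "Hello ", Color.RED, "Serif", 16);
        append(doc, "World", Color.BLUE, "SansSerif", 24);
        System.out.println(doc.getLength()); // prints 11
    }
}
```

Compared with calling setCharacterAttributes afterwards, this avoids having to track offsets and lengths by hand.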

Problem to extract text from HTML document

I have to extract some text from HTML files into my database (about 1000 files).
The HTML files are from the ACM Digital Library. http://portal.acm.org/dl.cfm
Each HTML page holds the information for one paper. I only want to get the text of "Title", "Abstract", "Classification" and "Keywords".
The problem is that I can't find any pattern for parsing the HTML files.
EX: I need to get the Classification = "Theory of Computation","ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY","Numerical Algorithms and Problem","Mathematics of Computing","NUMERICAL ANALYSIS"......etc .
The section of code for "Classification" is below.
Please give me any idea of how to do this, or how to find a pattern to extract text from this.
<div class="indterms"><a href="#CIT"><img name="top" src=
"img/arrowu.gif" hspace="10" border="0" /></a><a name="IndexTerms">INDEX TERMS</a>
<a name=
"GenTerms">Primary Classification:</a> 
&nbsp; F. <a href=
"results.cfm?query=CCS%3AF%2E%2A&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Theory of Computation</a> 
&nbsp; <img src="img/tree.gif" border="0" height="20" width=
"20" /> F.2 <a href=
"results.cfm?query=CCS%3A%22F%2E2%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">ANALYSIS OF ALGORITHMS AND PROBLEM
COMPLEXITY</a> 
&nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border="0" height=
"20" width="20" /> F.2.1 <a href=
"results.cfm?query=CCS%3A%22F%2E2%2E1%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Numerical Algorithms and Problems</a> 

<a name=
"GenTerms">Additional&nbsp;Classification:</a> 
&nbsp; G. <a href=
"results.cfm?query=CCS%3AG%2E%2A&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Mathematics of Computing</a> 
&nbsp; <img src="img/tree.gif" border="0" height="20" width=
"20" /> G.1 <a href=
"results.cfm?query=CCS%3A%22G%2E1%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">NUMERICAL ANALYSIS</a> 
&nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border="0" height=
"20" width="20" /> G.1.6 <a href=
"results.cfm?query=CCS%3A%22G%2E1%2E6%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Optimization</a> 
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <img src="img/tree.gif" border=
"0" height="20" width="20" /> Subjects: <a href=
"results.cfm?query=CCS%3A%22Linear%20programming%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Linear programming</a> 

 
<a name=
"GenTerms">General Terms:</a> 
<a href=
"results.cfm?query=genterm%3A%22Algorithms%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Algorithms</a>, <a href=
"results.cfm?query=genterm%3A%22Theory%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Theory</a>
 
<a name=
"Keywords">Keywords:</a> 
<a href=
"results.cfm?query=keyword%3A%22Simplex%20method%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">Simplex method</a>, <a href=
"results.cfm?query=keyword%3A%22complexity%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">complexity</a>, <a href=
"results.cfm?query=keyword%3A%22perturbation%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">perturbation</a>, <a href=
"results.cfm?query=keyword%3A%22smoothed%20analysis%22&coll=ACM&dl=ACM&CFID=22820732&CFTOKEN=38147335"
target="_self">smoothed analysis</a>
</div>

One approach is to download Htmlparser from SourceForge
http://htmlparser.sourceforge.net/ and write the rules to match title, abstract etc.
Another approach is to write your own parser that extracts only title, abstract etc.
1. Tokenize the HTML file, that is, convert the HTML into tokens (tag and value).
2. Write a simple parser to extract certain information.
Find out the pattern of the text you want to extract, for instance "<class "abstract">.
Then write a rule for extracting the abstract, such as
if (tag is abstract ) then extract abstract text
and apply the same concept to the other tags.
Attached is the sample parser that was used to extract title and abstract from ACM HTML files. Please modify it to include keywords and other fields.
Good luck
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
public class ACMHTMLParser {
    private String m_filename;
    private URLLexicalAnalyzer lexical;
    List urls = new ArrayList();
    public ACMHTMLParser(String filename) {
        super();
        m_filename = filename;
    }
    /**
     * parses only title and abstract
     */
    public void parse() throws Exception {
        lexical = new URLLexicalAnalyzer(m_filename);
        String word = lexical.getNextWord();
        boolean isabstract = false;
        while (null != word) {
            if (isTag(word)) {
                if (isTitle(word)) {
                    System.out.println("TITLE: " + lexical.getNextWord());
                } else if (isAbstract(word) && !isabstract) {
                    parseAbstract();
                    isabstract = true;
                }
            }
            word = lexical.getNextWord();
        }
        lexical.close();
    }
    public static void main(String[] args) throws Exception {
        ACMHTMLParser parser = new ACMHTMLParser("./acm_html.html");
        parser.parse();
    }
    public static boolean isTag(String word) {
        return ( word.startsWith("<") && word.endsWith(">"));
    }
    public static boolean isTitle(String word) {
        return ( "<title>".equals(word));
    }
    //please modify according to the html source
    public static boolean isAbstract(String word) {
        return ( "".equals(word));
    }
    private void parseAbstract() throws Exception {
        while (true) {
            String abs = lexical.getNextWord();
            if (abs == null) break; // end of input reached before any abstract text
            if (!isTag(abs)) {
                System.out.println(abs);
                break;
            }
        }
    }
    class URLLexicalAnalyzer {
        private BufferedReader m_reader;
        private boolean isTag; // true when the previous value ended at a '<'
        public URLLexicalAnalyzer(String filename) {
            try {
                m_reader = new BufferedReader(new FileReader(filename));
            } catch (IOException io) {
                System.out.println("ERROR, file not found " + filename);
                System.exit(1);
            }
        }
        public URLLexicalAnalyzer(InputStream in) {
            m_reader = new BufferedReader(new InputStreamReader(in));
        }
        public void close() {
            try {
                if (null != m_reader) m_reader.close();
            } catch (IOException ignored) {}
        }
        // returns the next tag or text value, or null at end of input
        public String getNextWord() throws IOException {
            int c = m_reader.read();
            if (-1 == c) return null;
            if (Character.isWhitespace((char)c))
                return getNextWord();
            if ('<' == c || isTag)
                return scanTag(c);
            else
                return scanValue(c);
        }
        private String scanTag(final int c) throws IOException {
            StringBuffer result = new StringBuffer();
            if ('<' != c) result.append('<'); // the '<' was consumed by scanValue
            result.append((char)c);
            int ch = -1;
            while (true) {
                ch = m_reader.read();
                if (-1 == ch) throw new IllegalArgumentException("un-terminate tag");
                if ('>' == ch) {
                    isTag = false;
                    break;
                }
                result.append((char)ch);
            }
            result.append((char)ch); // the closing '>'
            return result.toString();
        }
        private String scanValue(final int c) throws IOException {
            StringBuffer result = new StringBuffer();
            result.append((char)c);
            int ch = -1;
            while (true) {
                ch = m_reader.read();
                if (-1 == ch) throw new IllegalArgumentException("un-terminate value");
                if ('<' == ch) {
                    isTag = true;
                    break;
                }
                result.append((char)ch);
            }
            return result.toString();
        }
    }
}
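
For the classification and keyword lists in the question, where every wanted value is the text of a link, a lighter sketch is to collect the text of all <a href=...> elements with a regular expression. AnchorTextExtractor is a hypothetical name, and the approach assumes markup as regular as the posted fragment; a real HTML parser is more robust for anything less uniform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the visible text of every <a href=...> link.
class AnchorTextExtractor {
    // DOTALL lets the link text span line breaks; anchors without href are not matched.
    private static final Pattern LINK = Pattern.compile(
            "<a\\s[^>]*href=[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> linkTexts(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            String text = m.group(1)
                    .replaceAll("<[^>]*>", "") // drop nested tags such as <img>
                    .replaceAll("\\s+", " ")   // join text wrapped across lines
                    .trim();
            if (!text.isEmpty()) {
                result.add(text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<a name=\"GenTerms\">Primary Classification:</a> F.2 <a href=\n"
                + "\"results.cfm?query=x\"\ntarget=\"_self\">ANALYSIS OF ALGORITHMS AND PROBLEM\n"
                + "COMPLEXITY</a>";
        System.out.println(linkTexts(html));
    }
}
```

Labels such as "Primary Classification:" sit in <a name=...> anchors, so they are skipped and only the linked terms are returned.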

Reversed brackets in Arabic extracted text

I'm working on a system that is reasonably good at extracting text from two different PDF documents and comparing them. It's built using PDFL (I'm hoping the community for Acrobat SDK will be willing to help me out since I can't find a forum for PDFL.)
I run into a problem when working with Arabic text. The issue is reversible symbols like brackets ( ( ), { }, etc) and some other things (like < >) are visibly identical in the two documents, but are encoded as their opposites.
i.e.
Document 1 - Text looks like (ABCDE) and is encoded with the unicode values for (ABCDE)
Document 2 - Text looks like (ABCDE) and is encoded with the unicode values for )ABCDE(
I figure this has something to do with right-to-left read order and mixed font detection or perhaps some other font-setting.
I need a way of detecting when this reversal happens so I can compensate for it when extracting the text. I'm stumbling in the dark at this point and would appreciate any direction that could be given.
Thanks,
NN

I can't actually post the PDF (confidentiality agreement prevents it), but I can give you some info:
Document 1: (where encoding is correct)
Created by easyPDF SDK and uses SimplifiedArabic font family for the problem characters and their surrounding text.
Using the Acrobat TextSelection tool to copy/paste the problem text from this document into Notepad results in text that looks right.
Document 2: (where encoding is reverse of displayed character)
It was generated by InDesign CS3 and is using the WinSoft Pro font family for the problem characters and their surrounding text.
Using the Acrobat TextSelection tool to copy/paste the problem text from this document into Notepad results in text with brackets reversed.
What should I be looking for to be missing/wrong from the font definition or content stream?
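
One way to compensate on the comparison side, assuming the second document simply stores each mirrored character as its opposite, is to swap mirrored punctuation in one extracted string and compare again. This is only a sketch: MirroredBracketFixer and its small pair table are made up for illustration (Character.isMirrored reports the Unicode mirrored property, but the JDK has no built-in lookup for the mirrored counterpart), and it does not explain why the producer wrote the text that way.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: swaps mirrored punctuation so two extractions can be compared.
class MirroredBracketFixer {
    private static final Map<Character, Character> PAIRS = new HashMap<>();
    static {
        String pairs = "()[]{}<>"; // hand-picked subset of mirrored characters
        for (int i = 0; i < pairs.length(); i += 2) {
            PAIRS.put(pairs.charAt(i), pairs.charAt(i + 1));
            PAIRS.put(pairs.charAt(i + 1), pairs.charAt(i));
        }
    }

    /** Replaces every known mirrored character with its counterpart. */
    static String swapMirrored(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            Character mirror = Character.isMirrored(c) ? PAIRS.get(c) : null;
            out.append(mirror != null ? mirror : c);
        }
        return out.toString();
    }

    /** True when the strings differ, but become equal once mirrored characters are swapped. */
    static boolean differOnlyByMirroring(String a, String b) {
        return !a.equals(b) && swapMirrored(a).equals(b);
    }

    public static void main(String[] args) {
        System.out.println(swapMirrored(")ABCDE("));                     // prints (ABCDE)
        System.out.println(differOnlyByMirroring("(ABCDE)", ")ABCDE(")); // prints true
    }
}
```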

Extract Text from pdf using C#

Hi,
We are solution developers using Acrobat. As we have a requirement to extract text from PDFs using C#, we have downloaded and installed the Adobe SDK. We have found only four examples in C#, and those are used only for viewing PDFs in a Windows application. Can you please guide us on how to extract text from a PDF using the SDK in C#?
Thank you for your help.
Regards
kiranmai

Okay so I went ahead and actually added the text extraction functionality to my own C# application, since this was a requested feature by the client anyhow, which originally we were told to bypass if it wasn't "cut and dry", but it wasn't bad so I went ahead and gave the client the text extraction that they wanted. Decided I'd post the source code here for you. This returns the text from the entire document as a string.
 // Needs System.Reflection (BindingFlags) and System.Collections.Generic (List<T>).
 private static string GetText(AcroPDDoc pdDoc)
 {
     AcroPDPage page;
     int pages = pdDoc.GetNumPages();
     string pageText = "";
     for (int i = 0; i < pages; i++)
     {
         page = (AcroPDPage)pdDoc.AcquirePage(i);
         object jso, jsNumWords, jsWord;
         List<string> words = new List<string>();
         try
         {
             jso = pdDoc.GetJSObject();
             if (jso != null)
             {
                 object[] args = new object[] { i };
                 jsNumWords = jso.GetType().InvokeMember("getPageNumWords", BindingFlags.InvokeMethod, null, jso, args, null);
                 int numWords = Int32.Parse(jsNumWords.ToString());
                 // Word indices run from 0 to numWords - 1.
                 for (int j = 0; j < numWords; j++)
                 {
                     object[] argsj = new object[] { i, j, false };
                     jsWord = jso.GetType().InvokeMember("getPageNthWord", BindingFlags.InvokeMethod, null, jso, argsj, null);
                     words.Add((string)jsWord);
                 }
             }
             foreach (string word in words)
                 pageText += word;
         }
         catch { } // skip pages whose words cannot be read
     }
     return pageText;
 }

Extract text from pdf

Hi, is it possible to extract text from a pdf file using the command line to get an output like you would get by using the File menu and then 'Save as text..."?
I also noticed that in the installation folder there is a small executable called AcroTextExtractor which sounds interesting, but I was unable to figure out how to use it.

What's wrong with using Automator for this? It certainly seems the easiest. I'm not aware of any built-in AppleScript commands that will do this, but you should also ask on the AppleScript forum under Mac OS Technologies.

PDWordFinder does not extract text in order

Hi,
My word document had few comments.
I converted the word document to PDF by File->SaveAs->Adobe PDF.
I did not convert the comments to sticky notes. Hence they appear the same as in word document.
My application uses PDWordFinder API to extract text from the document.
I notice that the text in these comments is retrieved only at the very end.
Why is the text in the comments (not sticky notes) retrieved last and not in the order it appears in the document?
Is there any option to make the word finder retrieve text in order of appearance?

I need to extract text in 'reading' order, but it's not very clear how to use PDWordFinderAcquireWordList parameters.
Can I use different 'reading order' for PDDocCreateWordFinderUCS method, or can I use xySortTable?
Which are sorting parameters (if they exist) for AcquireWordList or WordFinder ? Thanks

Extracting text to Unicode (Korean, Japanese, ...)

Hi,
I am using the PDFWordFinder to extract text from PDFs in Unicode.
This works fine for a lot of documents, even with Japanese, Korean, Chinese ones.
But, I have some documents, using Korean fonts, which do not seem to be 'compatible' with the PDFWordFinder API.
The returned char codes are using Unicode surrogates range (ie the first value is 0xDBC0 and the next one 0xD801 for example).
It seems that the font has an internal /ToUnicode table (I have seen this resource using a COS viewer).
I thought that the PDFWordFinder was able to read and process internal /ToUnicode tables in order to return the corresponding Unicode chars. Am I wrong ?
If the PDFWordFinder is able to do the job, what option am I missing if it does not work ?
Thanks for your help.
Pierre

When I copy / paste text into Word I get squares...not the characters that are displayed in the PDF itself.
If I do the same with docs for which text extraction using PDFWordFinder is working, copy / paste is OK.
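
A quick way to inspect such returned values is to check whether two consecutive UTF-16 code units form a valid surrogate pair. Taken at face value, the example values above (0xDBC0 followed by 0xD801) are both in the high-surrogate range 0xD800 to 0xDBFF, so they do not combine into a code point. SurrogateInspector is a hypothetical helper name; only java.lang.Character methods are used.

```java
// Hypothetical helper: classifies two UTF-16 code units returned by a text extractor.
class SurrogateInspector {
    static String kind(char c) {
        if (Character.isHighSurrogate(c)) return "high surrogate";
        if (Character.isLowSurrogate(c)) return "low surrogate";
        return "non-surrogate";
    }

    static String describe(char first, char second) {
        if (Character.isSurrogatePair(first, second)) {
            return String.format("valid pair -> U+%X", Character.toCodePoint(first, second));
        }
        return "not a valid pair: " + kind(first) + ", " + kind(second);
    }

    public static void main(String[] args) {
        // The two values quoted in the post.
        System.out.println(describe((char) 0xDBC0, (char) 0xD801));
        // prints: not a valid pair: high surrogate, high surrogate

        // For comparison, a well-formed pair.
        System.out.println(describe((char) 0xD801, (char) 0xDC00));
        // prints: valid pair -> U+10400
    }
}
```

A valid pair that starts with 0xDBC0 decodes to U+100000 or above, which lies in a private-use plane, so either way these values carry no standard Unicode meaning.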

Titles are not visible, they show up on timeline as a grey box no text does not effect other events or projects

titles are not visible, they show up on timeline as a grey box no text does not effect other events or projects. could this have something to do with motion

Without seeing the timeline, it sounds as if they've been turned off. Maybe you pressed V, or the role is switched off in the timeline index.

How do I extract text from an email?

Hello!
I am in the process of trying to automate orders from my website. How do I extract text from an email and paste it into specific cells in an Excel spreadsheet using Automator?
Many thanks,
Toby Bateson

If you select the message on the Inbox list, or open the message, you can then go to the Message menu of Mail and select Remove Attachments.
Bob N.
Mac Mini 1.5 GHz; iBook 900 mHz; iPod 20 GB Mac OS X (10.4.7)
