Extracting text from a Hebrew PDF with Adobe IFilter 6.0 reverses the letters

Hello pdf Users
I'm using Adobe IFilter 6.0 to extract text from Hebrew PDF documents. The text returned by the filter is reversed, both in the letter order inside each word and in the word order.
Example (given in English letters):
Who am I
will give
I ma ohW
This is a known issue with bidi (bidirectional) languages that are written right-to-left, like Hebrew and Arabic, but I think I saw that IFilter is supposed to support Hebrew?
Any help?
Roee

Try the Adobe Acrobat Pro forums.
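
As a stopgap on the consuming side, the symptom described above (letters and words both reversed) is what you get when a whole line is stored in visual order, so reversing each line restores the logical order. A minimal sketch, assuming each extracted line is purely right-to-left text separated by "\n"; the class and method names are hypothetical and not part of IFilter:

```java
final class VisualOrderFix {
    /**
     * Reverses each line on its own, so "I ma ohW" becomes "Who am I".
     * Assumption: every line is purely right-to-left text in visual order.
     */
    static String toLogicalOrder(String extracted) {
        // The -1 limit keeps trailing empty lines instead of dropping them.
        String[] lines = extracted.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            // StringBuilder.reverse() keeps surrogate pairs intact.
            out.append(new StringBuilder(lines[i]).reverse());
        }
        return out.toString();
    }
}
```

Embedded numbers or Latin words inside a Hebrew line would come out reversed after this step, so mixed-direction lines would need those runs flipped back separately.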

Similar Messages

  • How do I extract pages from a pdf using 'Adobe PDF Pack'?

    How do I extract pages from a pdf using 'Adobe PDF Pack'?

    I think you have to buy extractor for 1.99 a month to extract PDF.  But I am having trouble activating it.  Good luck.

  • How to extract text from a PDF file using php?

    How to extract text from a PDF file using php?
    thanks
    fabio

    > Do you know of any other way this can be done?
    There are many ways, but this is out of scope for this forum. You can try this forum: http://forum.planetpdf.com/

  • How to extract text from a PDF file?

    Hello Suners,
    I need to know how to extract text from a PDF file.
    Does anyone know what the character encoding in a PDF file is? When I use an input stream to read the file, it gives encrypted-looking characters, not the original text in the file.
    Is there any procedure I should follow while reading a PDF file?
        File f = new File("D:/File.pdf");
        FileReader fr = new FileReader(f);
        BufferedReader br = new BufferedReader(fr);
        String s = br.readLine();
    Any help will be deeply appreciated.

    jverd wrote:
    > First, you set i once, and then loop without ever changing it. So your loop body will execute either 0 times or infinitely many times, writing the same byte every time. Actually, maybe it'll execute once and then throw an ArrayIndexOutOfBoundsException. That's basic Java looping, and you're going to need a firm grip on that before you try to do anything as advanced as PDF reading.
    Oops, you are absolutely right, that was a silly mistake to forget that.
    > Second, what do the docs for getPageContent say? Do they say that it simply gives you the text on the page as if the thing were a simple text doc? I'd be surprised if that's the case.
    getPageContent returns an array of bytes, so the question will be: how to get text from this array? I was thinking of:
        private void jButton1_actionPerformed(ActionEvent e) {
            PdfReader read;
            StringBuffer buff = new StringBuffer();
            try {
                read = new PdfReader("d:/getjobid2727.pdf");
                read.getMetaData();
                byte[] data = read.getPageContent(1);
                int i = 0;
                while (i > -1) {
                    buff.append(data);
                    i++;
                }
                String str = buff.toString();
                FileOutputStream fos = new FileOutputStream("D:/test.txt");
                Writer out = new OutputStreamWriter(fos, "UTF8");
                out.write(str);
                out.close();
                read.close();
            } catch (Exception f) {
                f.printStackTrace();
            }
        }
    "D:/test.txt" hasn't been created when I ran the program.
    Are my steps right?
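
A likely reason the output file never appears in the snippet above: counting up from 0, `i > -1` stays true for every value the loop reaches in practice, so the loop keeps appending until memory runs out, and `buff.append(data)` appends the array's identity string rather than its bytes. A minimal sketch of the byte-to-text step, decoding the array once and writing it out. The class and method names are hypothetical, and ISO-8859-1 is an assumption, chosen only because it maps every byte to exactly one character:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PageBytesDump {
    /** Decodes the whole array in one step; no loop is needed. */
    static String decode(byte[] data) {
        // Assumption: ISO-8859-1, so no byte is dropped or replaced.
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    /** Writes the decoded page content to the target file as UTF-8. */
    static void dump(byte[] data, Path target) throws IOException {
        Files.write(target, decode(data).getBytes(StandardCharsets.UTF_8));
    }
}
```

This only makes the raw page content readable. As jverd hints, that content is generally a stream of PDF drawing operators rather than plain text, so turning it into readable text still needs a real text extractor, for example a text-extraction helper in the PDF library if the version in use provides one.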

  • Hello, is there a way to redact a word or item from a pdf using adobe pro X the same way you can if you convert a word doc using pro X on a pc?

    hello, is there a way to redact a word or item from a pdf using adobe pro X the same way you can if you convert a word doc using pro X on a pc?

    If the document is not a scanned image or protected from editing then you should be able to edit it. I would have to guess you have a scan and when you converted to Word, you ran OCR (Optical Character Recognition) on it converting the scanned image to live text.
    This is the forum for the free Adobe Reader which can not edit or redact.

  • I want to extract data from a PDF using Java

    I would prefer to extract data from a PDF and convert it to XML. Is there an API that will convert a PDF to some Adobe format XML? Ideally I would like to add some JAR files to my classpath, similar to PDFBox. I don't want to install a bunch of server-side components or anything like that.
    Thanks!

    Thank you for the reply!
    If I installed the server side components, how would a Java client invoke a service to export data from a PDF? RMI, Web Services?

  • Strange text display in pdf using adobe reader exported by Indesign CS5

    Hi there,
    I exported an interactive PDF from InDesign CS5 (PC version, with the version 7 update), and it looks fine in Adobe Reader on most PCs and Macs.
    But two of the computers show strange text. One is an iMac with the latest version of Adobe Reader:
    <== you can see that the text "78" and "454" is abnormal
    On the same computer, Safari with the Preview plugin shows the following result:
    <== you can see that the text is normal.
    The PDF has the following fonts embedded, with these properties:
    Another machine, running Windows XP with Adobe Acrobat 9, shows the same display as the iMac.
    I confirmed that the Windows XP machine does not have the Helvetica Neue font installed, while the Mac is running OS X Lion, which bundles that font.
    Now the customer blames the PDF I created, and I have no idea how to explain this to the customer or how to find a solution.
    I tried a trial version of CC, but got the same result, with no errors during export.
    Any ideas are highly appreciated. The customer hotline and the live chat were hopeless: they sent me to this community because I use CS5, not the latest version.
    Regards,
    Ichi

    Thanks Willi,
    I have the Helvetica Neue fonts (Bold, Light, etc.), which were created by Adobe; the font properties show they were created in 1993.
    My PDF output is set to interactive, not print. Other settings are default.
    The purpose of this PDF is for users to print it out and to read it on the website. The funny thing is that printouts have no problem at all on HP and Ricoh printers using the PCL 6 driver.
    Apart from Safari, the strange font happens only in Adobe products; other PDF readers, such as Foxit Reader and XChange reader, have no problem.
    I tried not embedding the font, but could not do that in InDesign; according to a Google search, it always embeds the font. Also, that font family is preloaded on Mac OS.
    I can run any test if there is a way to find the solution.
    Or what should I do so that the PDF displays correctly on all computers?
    Thanks a lot.

  • Extract text from a line using power shell

    I'm beginner, exploring options in powershell.
    File with the following info:
    Line 1: 22-Aug-2014 : Entry 1.Info : Alex : here it says "Hello World" ok?
    Line 2: 24-Aug-2014 : Entry 1.Info : Micheal : here it says "Welcome to my world"
    Line 3: 24-Aug-2014 : Entry 1.Info : Alex : here it say "Remember? i said Hello world", done.
    Search through the file and only extract specific info related to Alex
    Alex: Hello World
    Alex: Remember? i said Hello world
    Any suggestions to achieve this output using powershell?

    Hello jrv, in fact I have started writing some scripts (not an expert though ;) )
    I tried something like this:
    Get-Content .\demo.txt | Select-String -Pattern "Alex" | %{$_.Line.split(":")[3,4]}
    Alex
    here it says "Hello World"
    Alex
    here it say "Remember? i said Hello world"
    however as i mentioned.. trying to get output in this specific format:
    Alex: Hello World
    Alex: Remember? i said Hello world
     cheers!
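
The missing piece in the attempt above is the parsing step: keep only lines whose name field is Alex, then take the first double-quoted run instead of the whole field. A sketch of that logic, written in Java to match the other code on this page rather than PowerShell; the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedEntryExtractor {
    /**
     * For each line with ": <name> :" followed by a double-quoted string,
     * returns "<name>: <quoted text>". Other lines are skipped.
     */
    static List<String> extract(List<String> lines, String name) {
        // Group 1 captures the text between the first pair of double quotes
        // that follows the name field.
        Pattern p = Pattern.compile(
                ":\\s*" + Pattern.quote(name) + "\\s*:[^\"]*\"([^\"]*)\"");
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = p.matcher(line);
            if (m.find()) {
                out.add(name + ": " + m.group(1));
            }
        }
        return out;
    }
}
```

The core pattern (`:\s*Alex\s*:[^"]*"([^"]*)"`) uses only constructs that .NET regular expressions also support, so it should carry over to PowerShell's -match operator with the quoted text available as the first capture group.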

  • Hello... I can't convert an ebook to PDF using Adobe Digital Editions... the PDF files I get cannot be read...

    Hello
    I am a Mac User
    I followed the procedure to convert an ebook into a PDF... (document ==> digital documents...) but the PDFs I find there are not readable.... this drives me nuts !   ;-)
    As I click on the pdf file, it leads me to an adobe website...
    Thanks a lot ! I need to stamp the book i just bought...
    Thanks for your help.
    Daurig.

    Can you please specify the exact steps that you followed? Did you purchase the book(an acsm file gets downloaded) and try to open it?
    Thanks

  • Hi I've a big problem with adobe acrobat reader XI pro and I hope you can help me. The problem is; when I paste copied text from some pdf books (not all of them) it pastes symbols only! wherever I paste it! and even if I copied that text from another pdf reade

    Hi
    I have a big problem with Adobe Acrobat Reader XI Pro and I hope you can help me.
    The problem is: when I paste text copied from some PDF books (not all of them), only symbols are pasted, wherever I paste it, and even if I copied that text from another PDF reader (Adobe PDF reader, internet browsers, etc.).
    This problem started yesterday, when I installed Adobe Acrobat Reader XI Pro to try it before buying it. Before that, when I was using the free Adobe PDF reader, I was able to copy any text from any PDF and paste it anywhere with nothing wrong.
    What can I do?
    thank you a lot.

    There is no product called Adobe Acrobat Reader Pro. There is
    - Adobe Acrobat Pro ($$)
    - Adobe Reader (free)
    Which do you have? And are you a programmer?

  • How to replace numbers with text in tax return pdf using Adobe Acrobat X Pro

    How do I replace numbers with text in tax return pdf using Adobe Acrobat X Pro? The tax return was created using CCH software. Thanks for your review.

    Thanks Bill for your quick reply. CCH software is one of the major
    suppliers of tax return software. I found an internal source that helped me
    make the changes from numbers to text i.e. "$123,456" to "See Schedule O".
    I am not sure if I am working in form or final text. Thanks again! Kelly

  • Extract text from pdf

    Hi, is it possible to extract text from a PDF file using the command line, to get output like you would get by using the File menu and then 'Save as text...'?
    I also noticed that in the installation folder there is a small executable called AcroTextExtractor which sounds interesting, but I was unable to figure out how to use it.

    What's wrong with using Automator for this? This certainly seems the easiest. I'm not aware of any built-in AppleScript commands that will do this, but you should also ask on the AppleScript forum under Mac OS Technologies.
    Message was edited by: V.K.

  • How do I extract pages from a PDF?

    I am trying to extract pages from a pdf document through Adobe XI and the command is not there through page thumbnails or through tools.  What can I do?  We have the current version which I verified through checking for updates.
    [Please choose only a short description for the thread title.]
    Message was edited by: Jim Simon

    "Adobe XI" does not exist.
    There is Acrobat XI (Standard or Pro).
    There is Adobe Reader XI.
    Acrobat XI Pro (and maybe Standard) can extract pages from a PDF.
    Adobe Reader XI cannot extract pages from a PDF.
    Once either application is open it is easy to determine which you have open.
    The "name" is in the top most "ribbon" of the application window.
    From what you've written it appears that you are using Adobe Reader XI.
    Be well...

  • How can I extract pages from a PDF? The Tools menu is missing.

    I used to be able to extract pages from my PDF file. I don't see the tools icon anymore. How can I access the tools icon?

    Hi lenm,
    To extract pages, you need to use Acrobat (not Adobe Reader). As I can attest (because I do have both Reader and Acrobat installed on the same computer), it is quite easy to open files in Reader when you mean to open them in Acrobat. So, please make sure you have the right app open. (I pull this one all the time!)
    Now, if the Tools menu is missing from Acrobat, choose View > Show/Hide > Toolbar Items > Show Toolbars to make them reappear.
    Please let us know how it goes.
    Best,
    Sara

  • Extracting data from a pdf form

    Hi,
    livecycle es2, workbench 9.0
    I'm new to workbench and have a problem extracting data from a pdf form submitted to a short lived process.
    I have set up the following very simple process :
    default startpoint >  ProcessForm > exportData > set value > set value > Write Document
    The intention is to update the document and write it to disk. So far, each step works except for the 'export data' where I cannot get the pdf to extract to xml.
    The Input to the 'export data' step is a variable (myDoc), Data Type: Document,  created from the incoming PDF form.
    If I write out myDoc it is an exact copy of the incoming document, so I guess the start and finish steps of the process are OK.
    The incoming (PDF) form I was given had no data schema, but  I thought I could access the form data by exporting to an xml variable....
      Service : FormDataIntegration  / exportData
    input (PDF Document)    variable : myDoc
      output(Data extracted)     variable : myXMLData
    Then in the next step (set value) access the xml element I am after ..
    Mappings
    Location:  /process_data/@groupId      Expression: /process_data/myXMLData/xdp/datasets/data/form1/mainPage/groupId
    This did not work, so I took the incoming form, exported the form data to an XML file, and created a schema using Stylus Studio. I then imported that into the myXMLData definition. (BTW, do I need to specify the root node after importing it?)
    Still not working !
    Extra info : The XML view of my incoming  form shows I have a minimal dataset definition- is this OK ??
    <connectionSet xmlns="http://www.xfa.org/schema/xfa-connection-set/2.8/">
       <?originalXFAVersion http://www.xfa.org/schema/xfa-connection-set/2.4/?></connectionSet>
    <xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
       <xfa:data xfa:dataNode="dataGroup"/>
    </xfa:datasets>
    The schema created by stylus studio has none of the xfdf, xfa settings I have seen on other schemas - is this OK ?
    Any help to get this fixed greatly appreciated
    thanks
    steve

    Hey, thanks for the offer, but I am now sorted after I found a simple working example online.
    This is a similar process to the one I am working on, and is clearly described and easy to follow...
    http://eslifeline.wordpress.com/2009/04/25/extracting-data-from-signed-pdf-using-livecycle-server/
    girish bedekar - I thank you !
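
When a set-value mapping like the one in this thread returns nothing, one way to narrow it down is to evaluate candidate XPath expressions against the exported data file outside Workbench. A minimal sketch using the JDK's built-in XPath support; the class name, the method name, and the sample XML in the usage note are hypothetical, and the real exported data may be rooted differently or use namespaces, which this sketch does not handle:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class FormDataLookup {
    /**
     * Returns the string value of the first node matched by the XPath,
     * or an empty string when nothing matches.
     */
    static String valueAt(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            // Malformed XML or an invalid expression ends up here.
            throw new IllegalArgumentException("Cannot evaluate " + xpath, e);
        }
    }
}
```

For example, with a hypothetical export of `<form1><mainPage><groupId>G-17</groupId></mainPage></form1>`, the path `/form1/mainPage/groupId` yields the value while `/xdp/datasets/data/form1/mainPage/groupId` yields an empty string, which is one way a mapping can fail silently when the assumed root does not match the data.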
