Layout in Arabic, Russian and Chinese. Exporting text from a PDF

I am laying out long documents in Arabic, Russian and Chinese. The text has been provided as a PDF. When I copy and paste it into InDesign, it comes up as boxes, question marks and other characters that have nothing to do with the text I am trying to lay out. I have set the typeface to Myriad Arabic and the dictionary to Arabic, and I still get nothing resembling Arabic, or any language for that matter. The same happens with Chinese and Russian. Any suggestions on how to get the text in from the PDF in the actual language? I'd appreciate any help with this. Thank you.

Thanks for the callout, Ellis
Soooo, KK: you are in for a world of hurt. The initials "WP" at the beginning of these fonts mean that the text came out of WordPerfect. Doing multilingual layouts in WP was annoying, but possible. It was developed in the pre-Unicode world, where every single method of complex-script layout was a dirty hack. If you like knowing All of the Nerdy Dirty Details, I can tell you how it worked, but suffice it to say that trying to harvest non-Latin-script text from WP and repurpose it for use in InDesign is just pure pain. The WordPerfect-specific codepages were never really supported anywhere outside of WP.
That being said, I have a script lying around somewhere for conversion of WP-Cyrillic into Unicode. (Actually, I think it does Windows CP 1251, but that works just as well.) But that is only one out of forty-five languages? And the Chinese has been rasterized? And the PDFs were originally generated by Distiller 3? If you have any choice, it's time to walk away. If you don't have any choice, I really hope you are billing hourly. My experience in this area (painfully extensive) is that it will cost three to five times as much to extract the text as it would to have a translation professional rekey the text, and then to have a second translation professional review the rekeyed text looking for typos.
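To illustrate the CP 1251 half of that idea only: the sketch below is not the script mentioned above, and it does not handle WordPerfect's own codepages. It assumes the pasted text is Windows CP 1251 bytes that some application displayed as Latin-1, which is one common way Cyrillic turns into accented Latin gibberish. The function name `recover_cyrillic` is made up for this example.

```python
def recover_cyrillic(garbled: str) -> str:
    """Undo one specific kind of mojibake: CP 1251 bytes shown as Latin-1.

    Assumption: every character in `garbled` fits in Latin-1 (U+0000..U+00FF).
    If it does not, encode() raises UnicodeEncodeError, which is a hint that
    the text was damaged in some other way and this trick does not apply.
    """
    raw = garbled.encode("latin-1")   # get back the original byte values
    return raw.decode("cp1251")       # reinterpret them as Windows Cyrillic


if __name__ == "__main__":
    print(recover_cyrillic("Ïðèâåò"))  # → Привет
```

If the result is still gibberish, the source was probably not CP 1251 in the first place, and a WP-specific mapping table would be needed instead.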
Russian OCR is pretty damn good these days, but Chinese OCR is hit-or-miss. I have never seen good Arabic OCR - doesn't mean it's not out there, but I couldn't help you find it. But the chances that all 45 languages have reliable OCR available, and that the result of said OCRing will not need to be reviewed by someone who knows the language, are basically nil.
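Whichever route you take (copy and paste, conversion script or OCR), a quick sanity check on the result can help if you don't read the language yourself. The sketch below is only a rough diagnostic, not a substitute for review by a speaker: it counts which scripts the characters in a piece of text belong to, using the Unicode character names from Python's standard `unicodedata` module. The function name `script_histogram` and the coarse categories are choices made for this example.

```python
import unicodedata
from collections import Counter


def script_histogram(text: str) -> Counter:
    """Count non-space characters by a coarse script label.

    The label is taken from the start of each character's Unicode name,
    e.g. "ARABIC LETTER ALEF" or "CYRILLIC SMALL LETTER A". Boxes, question
    marks and replacement characters all land in "Other".
    """
    counts = Counter()
    for ch in text:
        if ch.isspace():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            counts["Arabic"] += 1
        elif name.startswith("CYRILLIC"):
            counts["Cyrillic"] += 1
        elif name.startswith("CJK"):
            counts["CJK"] += 1
        elif name.startswith("LATIN"):
            counts["Latin"] += 1
        else:
            counts["Other"] += 1
    return counts


if __name__ == "__main__":
    # Paste a sample of the extracted text here.
    print(script_histogram("Привет мир"))
```

If text that is supposed to be Arabic comes back as mostly "Latin" or "Other", the problem is in the character data itself, and no choice of typeface or dictionary in InDesign will fix it.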

Similar Messages

  • How do I export text from a PDF?

    I don't know which Adobe product to purchase in order to export text from a PDF and move it to a spreadsheet. Can someone please help?

    Hi losjovenes1,
    You can use Adobe Acrobat for the purpose.
    If you need just a part of the PDF file in another format, you don’t need to convert the entire file and then extract the relevant content. You can select parts of a PDF file and save it in one of the supported formats: DOCX, DOC, XLSX, RTF, XML, HTML, or CSV.
    Use the Select tool and mark the content to save.
    Right-click on the selected content and choose Export Selection As.
    Select a format from Save As Type list and click Save.
    Regards,
    Rave

  • Using SQL*Loader to Load Russian and Chinese Characters

    We are testing our new 11.2.0.1 database using Oracle Linux 6. We created the database using the AL32UTF8 NLS Character set. We have tried using sqlldr to insert a few records that contain Russian and Chinese characters as a test. We can not seem to get them into the database in the correct format. For example, we can see the correct characters in the file we are trying to load on the Linux server, but once we load them into a table in the database, some of the characters are not displayed correctly (using SQL*Developer to select them out).
    We can set the values within a column on the table by inserting them into the table and then select them out, and they are correct, so it appears the problem is not in the database but in the way sqlldr inserts them. We have tried several NLS_LANG settings on the Linux server, including AMERICAN_AMERICA.AL32UTF8 and AMERICAN_AMERICA.UTF8, without success.
    Can someone provide us with any guidance on this? Would really appreciate any advice as to what we are not getting here.
    Thanks!!

    The character set of the database does not change the encoding used in your input data file. The character set of the data file can be declared by using the NLS_LANG parameter or by specifying the SQL*Loader CHARACTERSET parameter. I suggest moving this question to the appropriate forum, Export/Import/SQL Loader & External Tables, for closer topic alignment.
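    One way to act on this is to confirm what the data file is actually encoded in before declaring it to SQL*Loader. The sketch below is an illustration only: it checks whether the file decodes cleanly as UTF-8 and, if so, renders a minimal control file with a CHARACTERSET clause. The function names, the table name and the column names are hypothetical, and the control-file layout should be checked against the SQL*Loader documentation for your release.

```python
from pathlib import Path


def oracle_charset_for(path):
    """Return 'AL32UTF8' if the file decodes cleanly as UTF-8, else None.

    None means the file is in some other encoding (for example a Windows
    codepage) and a different CHARACTERSET value would have to be chosen.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "AL32UTF8"


def control_file(data_path, table, columns, charset):
    """Render a minimal comma-delimited control file (hypothetical names)."""
    cols = ", ".join(columns)
    return (
        "LOAD DATA\n"
        f"CHARACTERSET {charset}\n"
        f"INFILE '{data_path}'\n"
        f"APPEND INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ','\n"
        f"({cols})\n"
    )
```

    If `oracle_charset_for` returns None, the file that looks correct in a terminal may only look correct because the terminal happens to use the same non-UTF-8 encoding.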

  • Russian and Chinese Flash movies - general advice needed please

    Hi all -
    This is a plea for some general 'jumping off' advice. I am an experienced Flash developer but now have a request to convert an existing xml-fed movie into both Russian and Chinese. I speak neither of these languages so we have had the content of the movie translated by a professional translation service.
    The movie contains both PNGs/JPGs with embedded text, created in Fireworks, and (for the bulk of the content) external XML files. I still need to be able to develop in an English environment, so purchasing a full version of Flash/Fireworks in Russian/Chinese would be folly. How should I go about this? If it is a matter of fonts, where should I get them from? And are there any considerations to be met with regard to the XML files? Basically, I would really appreciate some general advice on this subject, as it is completely new ground for me.
    Much obliged,
    Hugh

    Thank you. Having the airports all in proximity was the key and, of course, I eventually found the same advice in an apple help file. I set the new AExtreme up as WDS main with an ethernet disk for backups and music. An old AExtreme as WDS remote serves the Cube by ethernet and a usb printer. An AExpress as WDS remote serves one stereo. The other AExpress is WDS relay serving another stereo and helping the network reach the office where the last old AExtreme is WDS remote with another USB printer. The 3 mac laptops are happy. I have yet to try any PCs.

  • Hello, I am doing a layout in arabic language, and I am missing option of change direction in paragraph. I read on forum that I am supposed to download some kind of an plug in or version of indesign for arabic language support, but I just don't know what

    Hello, I am doing a layout in Arabic, and I am missing the option to change the direction of a paragraph. I read on the forum that I am supposed to download some kind of plug-in or a version of InDesign with Arabic language support, but I just don't know what to download.
    Can you tell me step by step what to do and download.
    Thank you very much!

    InDesign is from a technical viewpoint a Plugin Activator.
    Every function is a plugin.
    With CC and higher you can change the language in the CC app preferences.
    When you change the language, additional plugins necessary for that language are installed; with the MENA (Middle East North Africa) versions, RTL functionality is added into your primary version.
    InDesign chooses to start the program in the same language as your OS is set up in, so the RTL functionality appears translated into your normal user interface.
    After installing a MENA version you can switch back to your main language.

  • Exporting text from fcp

    I have a lot of separate text boxes in my sequence. Since I can't do a spell check in FCP, I was wondering if I can export all the text boxes out as a plain text document to do a spell check.
    I tried exporting as a xml document, but navigating through all the tags was quite a problem.
    TIA for any suggestions.

    The usual workflow is as follows.
    1. Put all Text-generator clips to one track
    2. Export sequence to XML
    3. Launch TitleExchange, point to exported XML and say where to put converted file.
    Exporting to STL, you'll get a text file like this:
    00:01:27:26 ,00:01:29:26 ,They had this idea that
    00:01:29:30 ,00:01:30:31 ,behind scary
    ... which you can open in any spell-checker.
    We use TitleExchange to bring subtitles from FCP to DVD Studio and back (in case we need subtitled DVD and Betacam output)
    Though, this may be a rather expensive way of checking spelling (135 Euro per TE licence).
    There is a lot of subtitle software out there; I only wanted to mention a way of exporting text from FCP with the help of subtitle processors.
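    If the timecodes get in the way of the spell checker, the text column can be pulled out of such a file with a few lines of script. This is a minimal sketch that assumes every subtitle line has the shape shown in the sample above (start timecode, end timecode, then the text, separated by commas); real STL files may carry extra header or formatting lines, which this simply skips. The function name `stl_text` is made up for this example.

```python
def stl_text(lines):
    """Return just the subtitle text from 'start ,end ,text' lines.

    Lines that do not split into three comma-separated parts are ignored.
    Stray '+' characters at either end of a line are stripped as well,
    since they can show up when the sample is copied from a forum post.
    """
    texts = []
    for line in lines:
        cleaned = line.strip().strip("+")
        # Split on the first two commas only, so commas inside the
        # subtitle text itself are preserved.
        parts = [p.strip() for p in cleaned.split(",", 2)]
        if len(parts) == 3 and parts[2]:
            texts.append(parts[2])
    return texts


if __name__ == "__main__":
    sample = [
        "00:01:27:26 ,00:01:29:26 ,They had this idea that",
        "00:01:29:30 ,00:01:30:31 ,behind scary",
    ]
    print("\n".join(stl_text(sample)))
```

    The resulting plain text can then be pasted into any word processor for the spell check.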

  • Exporting Text from multiple text boxes?

    I'm using InDesign CS3 on the Macintosh. I need to export text from multiple text boxes/stories into one text file. The Export File command only exports text when the text tool is selected and the cursor is in the text box. Unfortunately, I have 8-20 individual text boxes per page, none are linked, and my document is 100+ pages, so selecting each text box individually is much too time consuming. There must be a better way - Please help!
    Thanks!
    Carolyn

    The text exporter plug-in seems to work! I did a quick test - text still will need some clean-up to make sure it's in the correct order, but MUCH better than exporting each story individually. THANKS!

  • Exporting Highlighted Text from a PDF

    I'm creating a publication in InDesign, with a PDF of each chapter being reviewed by its author, who usually highlights "keywords".
    I need to be able to export text "highlighted" in PDFs as a text file.
    In Acrobat XI's Comment/Comments List I get a list of the highlighted items, page and time and choose "Export all to Data File..." which creates a smaller ".fdf" file which is of no use as it needs the source file to open.
    Creating a "Comment summary" also adds the fact that the document was highlighted but not the content or text that was highlighted.
    The PDF documents I have will be reviewed by their authors and have selected words highlighted and added to an index or reference. I need a simple way of allowing them to highlight the "keyword" in the pdf which I can export as a text document.

    I think you're more likely to get a good answer over in the Acrobat forums....

  • Export text from pdf to csv or xls / identity-H text

    Hello,
    Is it possible to export the text in a PDF to a CSV or an Excel file (with the corresponding page numbers where the text was found)?
    For a product we make, we need to put all the text of a page into the metadata of the page. Normally we use Ghostscript for this, but when customers provide a PDF with Identity-H text, this won't work most of the time. When this doesn't work, we create a PostScript file from the PDF and recreate the PDF with Distiller; quite often Ghostscript will then recognize the text. But if that doesn't work either, we need to put all the text manually into an Excel file, and with all the text boxes and layout in the PDF this is quite a frustrating task (especially for a few hundred pages).
    The question at the top describes a solution that would work, if it is possible, because it is failproof. But if someone knows another solution to the actual problem we experience with Identity-H, that would be very helpful too.
    Thanks in advance!

    Here is a visual example of what I am referring to, showing the right-click location and the different options that appear. (The blue selection box is from me selecting and dragging around the question from a non-text area)

  • Hi I've a big problem with adobe acrobat reader XI pro and I hope you can help me. The problem is; when I past copied text from some pdf books (not all of them) it past symbols only! wherever I past it! and even if I coped that text from another pdf reade

    Hi
    I have a big problem with Adobe Acrobat Reader XI Pro and I hope you can help me.
    The problem is this: when I paste text copied from some PDF books (not all of them), it pastes symbols only, wherever I paste it, and even if I copied that text from another PDF reader (Adobe PDF Reader, internet browsers, etc.).
    This problem started yesterday, when I installed Adobe Acrobat Reader XI Pro to try it before I buy it. Before that, when I was using the free Adobe PDF Reader, I was able to copy any text from any PDF and paste it anywhere with nothing wrong.
    What can I do?
    thank you a lot.

    There is no product called Adobe Acrobat Reader Pro. There is
    - Adobe Acrobat Pro ($$)
    - Adobe Reader (free)
    Which do you have? And are you a programmer?

  • I have the new iOS 7 and am getting texts from another family members phone and they are getting mine how do I disable this without changing accounts

    I have the new iOS 7 and am getting texts from another family member's phone, and they are getting mine. How do I disable this without changing accounts?

    Welcome to the Apple Community.
    You could simply set each device to only use a single telephone number (Settings > Messages > Send & Receive), but this isn't really an ideal solution, since the other person can always change their settings and see your messages and send messages that appear as if they are from you. There is only one real solution, and that's to have your own accounts.

  • I have the new iOS 7 and am getting texts from another family members phone and they are getting mine how to I disable this ??

    I have the new iOS 7 and am getting texts from another family member's phone, and they are getting mine. How do I disable this?

    You each need to have separate Apple IDs and separate user accounts. Better yet, separate computers as well.

  • How do i disable copy and paste so a reader can not copy text from my pdf document?

    How do I disable copy and paste so a reader cannot copy text from my PDF document? I have gone into my security preferences but cannot find out how to change the settings to disable the copying option.

    See http://www.adobe.com/content/dam/Adobe/en/products/acrobat/pdfs/adobe-acrobat-xi-protect-pdf-file-with-permissions-tutorial-ue.pdf

  • Trying to copy and paste text from a pdf to a webpage

    I am trying to copy and paste text from a PDF into my iWeb edit page. It shows up as red lines and only shows up on the website if I highlight it. Please help.

    If it only shows up when you highlight it, then it could be your font or your font color. Try changing both to a different font to see if that helps. If not, copy it into a word processor such as Microsoft Word, Pages, or Text Edit to see if it shows up there. If it does, try copying it again from the processor to your webpage.

  • HT201269 how do i unregister my iphone? I have the galaxy s4 now and cannot receive texts from iphone users.

    How do I unregister my old iPhone? I no longer have it in my possession. I now have the Galaxy S4 and cannot receive texts from iPhone users.

    Hello AnnH2
    If you have access to your old iPhone, just sign out of Messages. If you do not have access to it, then change your Apple ID password.
    iOS and OS X: Link your phone number and Apple ID for use with FaceTime and iMessage
    http://support.apple.com/kb/HT5538
    Unlink a phone number
    To remove a phone number from an Apple ID, sign out of FaceTime and Messages on your iPhone:
    Settings > Messages > Send & Receive. Tap your Apple ID, then tap Sign Out.
    Settings > FaceTime. Tap your Apple ID, then tap Sign Out.
    This should remove your phone number from other devices using the same Apple ID with FaceTime and Messages. If the phone number is still available on other devices after you sign out of FaceTime and iMessage on the iPhone, you may need to sign out of iMessage and FaceTime on all your devices, then sign in to FaceTime and Messages again on devices you want to use.
    Note: If you no longer have access to the iPhone that is using the number you want to remove, reset your Apple ID password.
    Regards,
    -Norm G.
