Converting MS-WORD characters such as smart quotes into standard quotes

Hi,
This is a problem when taking text from an MS Word file and pasting it into a browser editor. The Java code doesn't interpret these smart quotes correctly.
Any ideas on how to convert these to standard characters that can be understood by Java?
Regards
Chris

>sorry what i meant was an application called Morello
Not a web application, then? Doesn't run in a browser?
If so, then where does Java come into this? Is Morello written in Java, or do you have a Java application that is interfacing with Morello in some unspecified way? And where does the console come into it? You said that people were pasting text into Morello, so surely it isn't console-based?
You clearly have an encoding problem but from your description it's impossible to tell where it arises or what you can do to fix it.
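If the pasted text does reach Java as correctly decoded Unicode, one common approach is simply to map the typographic characters to ASCII equivalents before further processing. The following is a minimal sketch under that assumption; the class and method names (`SmartQuoteCleaner`, `toAscii`) are hypothetical and the character list is illustrative, not complete. It will not help if the text was already mis-decoded before it got to your code.

```java
final class SmartQuoteCleaner {

    // Replaces common Word "smart" punctuation with plain ASCII equivalents.
    // The list is illustrative only; extend it for your own data.
    static String toAscii(String text) {
        return text
                .replace('\u2018', '\'')   // left single quotation mark
                .replace('\u2019', '\'')   // right single quotation mark / apostrophe
                .replace('\u201C', '"')    // left double quotation mark
                .replace('\u201D', '"')    // right double quotation mark
                .replace('\u2013', '-')    // en dash
                .replace('\u2014', '-')    // em dash
                .replace("\u2026", "..."); // horizontal ellipsis
    }

    public static void main(String[] args) {
        String pasted = "\u201CIt\u2019s done\u201D \u2013 almost\u2026";
        System.out.println(toAscii(pasted));
    }
}
```

Plain `String.replace` is used rather than a regular expression because each mapping is a fixed character, which keeps the table easy to read and extend.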

Similar Messages

  • Is there a way to convert a Word document with Red and Black into Spot colors in Acrobat X?

    Customer sent us PDF files created in MS Word. Of course, Word creates CMYK versions of colors. I need the red text and black text converted into spot colors for 2 color output on press. Is there a way to do this? I saw another thread that said there is a Fixup in the Print Production/Preflight section of Acrobat X that has a convert to Spot, but I'm not seeing that. All I see is Convert to CMYK. I'm using Acrobat X 10.1.13 on OS X. Any help would be appreciated. He also sent the Word docs so I can drop that into InDesign and reformat the whole thing and fix the colors myself but if I can do it in Acrobat, that would be a lot quicker. Thanks!

    Frequently in this scenario the red will be 100% of either magenta or yellow with 0% black. In these cases you can merely plate the Black & Magenta (or yellow) for press.
    I've created a Preflight Fixup on occasion for these also.
    An example shown - Note it requires the document is cmyk. MS Files will be rgb. You could create a Preflight looking for rgb colors.
    With Fixup tab selected - Options > Create New Fixup > Type of Fixup - search for Spot in the field

  • Pages 2 -- return of a smart-quotes problem

    In Pages 2, if one puts a word or phrase in smart quotes within parentheses ("like this, except that here there aren't any smart quotes"), the opening smart-quote mark faces the wrong way -- as though it were the "closing" mark. This was originally the case in Pages 1, but it seemed to be cured in a later bug fix. The bug appears to be back with Pages 2. This is a major problem for lawyers, among others, who use words and phrases in quote marks and within parentheses, to define certain terms in a document. Obviously, one can do a cut-and-paste fix, but it is a pain.

    DennisG you da man, Now I know re those pesky quotes that look backwards.
    Not going for brownie points or literary awards with my Newsletter, when the quotes look away from the text, I'll be able to explain to critics like Veronica says on the computer, "Not my fault". Blame Apple.
    Next time those quotes go bonkers, I'll try your tip. Thanks. Ol' Jim.

  • How to convert a word file into PDF in command line ?

    Hi,
    I want to convert a word (excel, powerpoint, ...) into a PDF using a windows command line.
    The command :
    c:\> acrodist toto.ps
    is working fine and the output is a PDF file (toto.pdf)
    But if i try with a MS office word file it is not working
    c:\> acrodist toto.doc
    The output in the console is:
    %%[ Error: undefined; OffendingCommand: ÐÏࡱá ]%%
    %%[ Flushing: rest of job (to end-of-file) will be ignored ]%%
    %%[ Warning: PostScript error. No PDF file produced. ] %%
    If i do a right click on the file with the mouse and select in the context menu the option convert to PDF it is working fine :-\
    How can i convert that kind of file with a command line ?
    Thanks for your help
    Xav

    >But, from the windows explorer, if i do a right clic on the word document, with the context menu, i can directly convert to PDF
    This is equivalent to using the PDFMaker facility in Word - that is,
    the Acrobat button. Which is also the same thing that is done when you
    use File > Create PDF > From File in Acrobat.
    What it does is print to a PS file *and* do a lot of additional
    processing to write stuff about links, tags etc. into the PS file.
    >(no tmp PS file is used, cause the links are still working).
    This isn't true, but it's certainly the case that you won't get links.
    There seems to be no API, via any mechanism, still less the "obsolete"
    (Microsoft's view) command line, to use PDFMaker from your program.
    Aandi Inston

  • Convert smart quotes and other high ascii characters to HTML

    I'd like to set up Dreamweaver CS4 Mac to automatically convert smart quotes and other high ASCII characters (m-dashes, accent marks, etc.) pasted from MS Word into HTML code. Dreamweaver 8 used to do this by default, but I can't find a way to set up a similar auto-conversion in CS 4.  Is this possible?  If not, it really should be a preference option. I code a lot of HTML emails and it is very time consuming to convert every curly quote and dash.
    Thanks,
    Robert
    Digital Arts

    I too am having a related problem with Dreamweaver CS5 (running under Windows XP), having just upgraded from CS4 (which works fine for me) this week.
    In my case, I like to convert to typographic quotes etc. in my text editor, where I can use macros I've written to speed the conversion process. So my preferred method is to key in typographic letters & symbols by hand (using ALT + ASCII key codes typed in on the numeric keypad) in my text editor, and then I copy and paste my *plain* ASCII text (no formatting other than line feeds & carriage returns) into DW's DESIGN view. DW displays my high-ASCII characters just fine in DESIGN view, and writes the proper HTML code for the character into the source code (which is where I mostly work in DW).
    I've been doing it this way for years (first with GoLive, and then with DW CS4) and never encountered any problems until this week, when I upgraded to DW CS5.
    But the problem I'm having may be somewhat different than what others have complained of here.
    In my case, some high-ASCII (above 128) characters convert to HTML just fine, while others do not.
    E.g., en and em dashes in my cut-and-paste text show as such in DESIGN mode, and the right entries
        –
        —
    turn up in the source code. Same is true for the ampersand
        &
    and the copyright symbol
        ©
    and for such foreign letters as the e with acute accent (ALT+0233)
        é
    What does NOT display or code correctly are the typographic quotes. E.g., when I paste in (or special paste; it doesn't seem to make any difference which I use for this) text with typographic double quotes (ALT+0147 for open quote mark and ALT+0148 for close quote mark), which should appear in source code as
        “[...]”
    DW strips out the ASCII encoding, displaying the inch marks in DESIGN mode, and putting this
        "[...]"
    in my source code.
    The typographic apostrophe (ALT+0146) is treated differently still. The text I copy & paste into DW should appear as
        [...]’[...]
    in the source code, but instead I get the foot mark (both in DESIGN and CODE views):
    I've tried adjusting the various DW settings for "encoding"
        MODIFY > PAGE PROPERTIES > TITLE/ENCODING > Encoding:
    and for fonts
        EDIT > PREFERENCES > FONTS
    but switching from "Unicode (UTF-8)" to "Western European" hasn't solved the problem (probably because in my case many of the higher ASCII characters convert just fine). So I don't think it's the encoding scheme I use that's the problem.
    Whatever the problem is, it's caused me enough headaches and time lost troubleshooting that I'm planning to revert to CS4 as soon as I post this.
    Deborah
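For anyone who ends up doing this conversion outside Dreamweaver, the general idea (turn every character above 127 into an HTML character reference) can be sketched in a few lines of Java. This is only an illustration, not what Dreamweaver does internally: the class and method names are hypothetical, it emits numeric references such as `&#8220;` rather than named entities, and it does not escape `&`, `<` or `>`.

```java
final class HtmlEntityEncoder {

    // Rewrites every non-ASCII code point as a numeric character reference,
    // for example U+201C becomes &#8220;. ASCII characters pass through unchanged.
    static String encodeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeNonAscii("\u201CDW\u201D \u2014 caf\u00E9"));
    }
}
```

Iterating over code points rather than `char` values means characters outside the Basic Multilingual Plane come out as a single reference instead of two surrogate halves.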

  • Characters such as apostrophes and smart quotes turning into boxes or question marks

    We recently upgraded from CF5 to CF7 and are having a problem
    with previously saved text that no longer displays correctly. Some
    characters (apparently, non-ASCII characters such as curly
    apostrophes and smart quotes) are rendering as boxes or question
    marks. We recently upgraded to Oracle 10g from Oracle 8i, but this
    problem appears to be independent of the database that the text is
    stored in. Here is sample code that will illustrate the problem:
    <CFSET string1="Department’s">
    <CFSET string2="hey—there">
    <CFOUTPUT>
    string1 is #string1#
    <BR>
    string2 is #string2#
    </CFOUTPUT>
    output looks like this:
    string1 is Department?s
    string2 is hey?there
    These are rendered as boxes when viewed in Internet Explorer.
    (They show up as question marks when I copy and paste them here.)
    The Demoronize UDF helps *some* of the time, but this is
    still happening with a lot of text, especially text that gets
    pasted from a website into a form, then saved to a database. Does
    anybody have a solution for this? This is breaking my applications
    and is incredibly annoying. I'd like to either replace the
    problematic characters at the time they are displayed, or replace
    them when they are input in the database in the first place (and go
    back and update all the previously saved data to replace the
    problematic characters with plain text equivalents).
    Any suggestions appreciated.
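One way to end up with exactly this `Department?s` output is for the text to pass through a character set that has no curly apostrophe or em dash, such as ISO-8859-1, somewhere between the form, the database and the page. Whether that is what happens in this particular CF7 setup is an assumption; the Java sketch below (hypothetical class name) only reproduces the symptom.

```java
import java.nio.charset.StandardCharsets;

final class LossyEncodingDemo {

    // Encodes text to ISO-8859-1 and decodes it again. Characters that the
    // charset cannot represent are replaced with '?' during encoding.
    static String roundTripLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(roundTripLatin1("Department\u2019s")); // Department?s
        System.out.println(roundTripLatin1("hey\u2014there"));    // hey?there
    }
}
```

Once a character has been replaced by `?` in this way the original is gone, so replacing characters at display time only helps if the stored data still holds the real code points.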

    I finally isolated the problematic characters so I edited the
    DeMoronize UDF (available at cflib.org) by adding the following
    text replacements at the bottom:
    text = Replace(text, chr(8208), "-", "ALL");
    text = Replace(text, chr(8209), "-", "ALL");
    text = Replace(text, chr(8210), "&ndash;", "ALL");
    text = Replace(text, chr(8211), "&ndash;", "ALL");
    text = Replace(text, chr(8212), "&mdash;", "ALL");
    text = Replace(text, chr(8213), "&mdash;", "ALL");
    text = Replace(text, chr(8214), "||", "ALL");
    text = Replace(text, chr(8215), "_", "ALL");
    text = Replace(text, chr(8216), "&lsquo;", "ALL");
    text = Replace(text, chr(8217), "&rsquo;", "ALL");
    text = Replace(text, chr(8218), ",", "ALL");
    text = Replace(text, chr(8219), "'", "ALL");
    text = Replace(text, chr(8220), "&ldquo;", "ALL");
    text = Replace(text, chr(8221), "&rdquo;", "ALL");
    text = Replace(text, chr(8222), """", "ALL");
    text = Replace(text, chr(8223), """", "ALL");
    text = Replace(text, chr(8226), "&middot;", "ALL");
    text = Replace(text, chr(8227), "&gt;", "ALL");
    text = Replace(text, chr(8228), ".", "ALL");
    text = Replace(text, chr(8229), "..", "ALL");
    text = Replace(text, chr(8230), "...", "ALL");
    text = Replace(text, chr(8231), "&middot;", "ALL");

  • MS Word smart quotes don't paste the same into Forms 10g as Forms 6i

    Hi all,
    I have users who write text in Microsoft Word and then cut-and-paste it into Oracle Forms.
    After some prodding by the developers, the users have switched to using the 10G version of their application instead of the 6i version.
    As a background you should know that Microsoft Word uses Smart Quotes by default, which you can turn off. Smart Quotes are different ascii characters than the ascii 39 single quote and ascii 34 double quote.
    When the users cut-and-paste the Microsoft Word text into the Forms 10G field, the single apostrophe smart quote does not convert to ascii 39. I wouldn't be surprised about this except that in Forms 6i the form does convert the single apostrophe smart quote into ascii 39.
    So if the users use the 6i Form they can cut-and-paste and the form/database has the character as ascii 39. If the users paste into Forms 10G then the database shows the character as ascii 191 (hex 0xBF), which is an inverted question mark.
    Does anyone know of any settings in Forms 10g to revert back to 6i functionality for this?
    Thanks much,
    Troy

    I am afraid that Jan is right.
    And this might be a bit of a hassle, since it might affect your forms. If you are using some standard Windows lettertype in your forms, though, you should be OK.
    Bear in mind that the NLS_LANG character set will have to be compatible between database and forms (the latter at both compile time and runtime).
    Good Luck!
    Remco

  • Microsoft Word "Smart Quotes"

    I hope this will save other developers some time.
    This may be obvious to others, but I just spent several hours Googling and testing to determine what actually happens when a user copies text containing "Smart Quotes" from Microsoft Word into a Java JTextComponent. For those not familiar with Smart Quotes, by default, MS Word changes double-quoted strings from using the US-ASCII character for quote (0x22) into left- and right- curly quotes (UTF-16: 0x201c and 0x201d). Word also does this with several other characters. This plays havoc with the display and Java Strings later encoded with java.beans.XMLEncoder, unless treated carefully. Here is what I discovered (obviously, this applies to MS Windows):
    All values are in hexadecimal.
    - Word is storing the character for double quote as UTF-16 internally (201C).
    - When the character is copied to the clipboard, it is copied as UTF-8 (E2 80 9C).
    - When the clipboard is pasted into Java, Java assumes that it was originally Windows-1252 encoded, because that is the default for the US-EN locale in Windows XP (probably also Vista, but I only tested in XP).
    - Java translates this into a-circumflex, euro-sign, o-e-ligature, the characters corresponding to E2, 80, and 9C respectively in Windows-1252 and represents it internally in UTF-16 as 00E2 20AC 0153.
    - When the String is XML-encoded using java.beans.XMLEncoder, it is written in UTF-8 as C3A2 E282AC C593, which equates to UTF-16 00E2 20AC 0153 -- the characters a-circumflex, euro-sign, o-e-ligature.
    I am not sure how to fix this, but maybe another reader does. I am experimenting with the Clipboard (java.awt.datatransfer) to see if I can programmatically find out the original character encoding (in this case, UTF-16).
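The byte-level steps described above can be reproduced without the clipboard. The sketch below is only an illustration of the mechanism, with hypothetical class and method names: it decodes the UTF-8 bytes of U+201C as Windows-1252 to get the three garbled characters, then reverses the mistake by recovering the raw bytes and decoding them as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    static final Charset CP1252 = Charset.forName("windows-1252");

    // Simulates the misreading: UTF-8 bytes decoded as Windows-1252.
    static String misread(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Reverses it: recover the raw bytes, then decode them as UTF-8.
    // Only works if every byte survived the first (wrong) decoding.
    static String repair(String garbled) {
        return new String(garbled.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = misread("\u201C");
        garbled.chars().forEach(c -> System.out.printf("%04X ", c)); // 00E2 20AC 0153
        System.out.println();
        System.out.println(repair(garbled).equals("\u201C"));        // true
    }
}
```

This kind of repair is not reliable for every character. The closing quote U+201D, for instance, encodes as E2 80 9D, and 0x9D has no assignment in Windows-1252, so that byte is typically lost on the first decoding and cannot be recovered afterwards. Reading the data with the right encoding in the first place is the safer fix.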

    Doesn't the DataFlavor contain the character encoding? What is the content of the InputStream returned by
                InputStream is = (InputStream)contents.getTransferData(DataFlavor.getTextPlainUnicodeFlavor());
    If I use
                    DataFlavor df = DataFlavor.getTextPlainUnicodeFlavor();
                    String mimeType = df.getMimeType();
                    String encoding = mimeType.replaceAll(".*?charset=(.*?)\\s*$", "$1");
                    InputStream is = (InputStream) contents.getTransferData(df);
                    ByteArrayOutputStream baos = new ByteArrayOutputStream();
                    byte[] buffer = new byte[1024];
                    for (int count = 0; (count = is.read(buffer)) != -1;)
                        baos.write(buffer, 0, count);
                    baos.close();
                    result = baos.toString(encoding);
    to transfer
    Hello "World"
    in which Word has changed the quotes to the 'smart quotes' version, I get as a result
    Hello “World”
    which is what I expect.
    Am I missing something?
    Edited by: sabre150 on Sep 4, 2009 1:27 PM

  • MS Word -- Smart Quotes Query

    Anyone know how to query the CRX for "smart quotes" that come in through a copy and paste in MS Word? Examples: “ ” ’
    I thought this would be easy through CRXDE Lite but I might have to actually write code.


  • Text to speech not working with characters such as quotes? Any ideas

    text to speech not working with characters such as quotes? Any ideas

    Hi linz-kirby
    I posted this (link) https://discussions.apple.com/message/27015600?ac_cid=op123456#27015600

  • Convert smart quotes

    How can I replace smart quotes that have been entered into a form where the user has pasted from an MSWord document? I want to replace the smart quote with a simple apostrophe. I understand that the smart quote for a single quotation is represented by the following in Flex: ’
    Thank you!

    If you use regular expressions, you could do it in a single line and much faster:
    In the following example, only the first instance of "sh" (case-sensitive)  is replaced:
    var myPattern:RegExp = /sh/; 
    var str:String = "She sells seashells by the seashore.";
    trace(str.replace(myPattern, "sch")); 
        // She sells seaschells by the seashore.
    In the following example, all instances of "sh" (case-sensitive)  are replaced because the g (global) flag is set in the regular expression:
    var myPattern:RegExp = /sh/g; 
    var str:String = "She sells seashells by the seashore.";
    trace(str.replace(myPattern, "sch")); 
        // She sells seaschells by the seaschore.
    Dany

  • Pasting smart quotes and apostrophes in code view.

    Since upgrading to Dreamweaver CS5, I haven't been able to copy/paste smart quotes and apostrophes into code view without them automatically being converted to straight quotes.
    For example, the following sentence (notice the curly quotes):
    John’s new song is called “DW Blues”
    would get pasted into Code View as:
    John's new song is called "DW Blues"
    Notice the smart quotes and apostrophe are replaced with single and double ticks, or "straight quotes."  While this seems like a minor detail, it's extremely important to our writers and editors to have them appear on the website exactly as typed.
    If I do the same copy/paste in Design View (doc type is XHTML Transitional), it appears as:
    John's new song is called &quot;DW Blues&quot;
    The characters are still replaced, and the straight quotes are then entity encoded (as expected).
    This doesn't happen with other valid UTF-8 characters like ™, ®, —, etc., or with any other code editors I've used, including DW CS3.
    Is there a hidden preference somewhere to disable this "feature," or is it just a bug?
    Please help!

    It's now 4 years since jsparacio posted this, and I just wanted to let everyone know that I had -- and am still having -- the exact same problem with Dreamweaver CS5 (running first under Windows XP, then Windows 7, and now again with Windows 8.1). So it's not just Macs that are affected.
    FWIW, I have set my DW CS5 Paste preferences to the 3rd of 4 options available
        1 - Text Only
        2 - Text With Structure
        3 - Text With Structure Plus Basic Formatting
        4 - Text With Structure Plus Full Formatting
    in the EDIT > PREFERENCES > Copy/Paste Preferences dialog box.
    But the Paste Special command ignores this setting, giving me only the first 2 options from which to choose, with option 2 the default selection for Paste Special operations (options 3 and 4 are grayed out, and can't be selected).
    According to David Sawyer McFarland's _Dreamweaver CS5: The Missing Manual_ (O'Reilly Media, 2010), the reason these are grayed out is because I am pasting unformatted ASCII text which I generated in a program editor called "UltraEdit":
        "... Choose EDIT > PASTE SPECIAL to open the Paste Special window. Here, you can choose which of the four techniques you wish to use ... sort of. You're limited to what Dreamweaver can paste. For non-Microsoft Office products, you can use only the first two options--the others are grayed out--whereas you can choose from any of the four with text copied from Word or Excel." (McFarland, p. 81)
    Regardless of such restrictions, standard copy-and-paste (CTRL+C followed by CTRL+V) works just fine for me using Dreamweaver CS4 (i.e., I have never needed to use the Paste Special command), but with DW CS5, neither Paste command (CONTROL+V or CTRL+SHIFT+V) works properly with typographic/curly/smart quotes.
    All typographic quotes -- ASCII-0147 and ASCII-0148 (double quote marks); plus ASCII-0145 and ASCII-0146 (single quote marks, for quotes within a quote) -- are converted to inch (&quot; is entered in the code) and foot (' is entered in the code) marks when I copy-and-paste text with these characters into Dreamweaver's Design View.
    When I copy this same plain ASCII text directly into the code (rather than using Design View), typographic double open & close quotes are converted to the inch (") mark, and typographic single open & close quotes are converted to the foot (') mark.
    The beginning of this week, I installed Dreamweaver CS5, ver. 11.0, Build 4909 under Windows 8.1 OS on my new Ultrabook. I was hoping that under Windows 8.1, DW CS5's handling of typographic quotes might improve so that I can actually use this program that I purchased 4 years ago. Alas, no such luck: I continue to have the same problem I had when I first upgraded to Dreamweaver CS5 back in August 2010 (then running under Windows XP on my desktop computer).
    Back in August 2010, when I first asked about fixes, I was told to change the Title/Encoding setting of Page Properties to "Western European" -- which I tried, but it didn't work then, and it doesn't work now ... and even if it did, it wouldn't be a proper fix for the problem as I have plenty of good reasons for wanting my HTML page Title/Encoding set to Unicode (UTF-8), not Western European ("charset=iso-8859-1").
    The ability to copy-and-paste typographic quotes is such a big deal for me that I chose back in August 2010 to revert to Dreamweaver CS4, which I've been using ever since.
    It is *very* frustrating that, 4 years later, I still can't use this program, and shall be reverting to DW CS4, yet again.
    I continue to be completely flummoxed by this. Every other program with which I am familiar converts non-typographic quotes to typographic/curly/smart quotes ... never have I seen the process automated in reverse!

  • "Smart Quotes" and Extensions

    I'm working on a text processing extension and want to implement a "Convert special characters to safe HTML entities" (that's just a working title!) menu option.  When I copy and paste some blocks of text from Word into the editor, then right-click my selection and choose my extension to process that text, the text that gets sent via the xml packet to my handler converts (in this case) Word's smart quotes (&#8220 and &#8221) to question marks.  This is before my handler processes anything.  I'm just using cfdump to look at the data.  Am I missing some intermediary processing that needs to be done somehow?  I can't figure out where it would be done since I don't have any control over the selected editor content until my handler is fired.
    Thanks!
    Andy

    The encoding is utf-8 (which is the default in CF 9, and maybe CF 8 too), but I explicitly set it anyway, and still no luck.  It seems like Word Smart Quotes are actually seen by ColdFusion as 3 different characters.  I don't know that Smart Quotes are actually representable as specific entities, though.  When I copy a closing smart quote out of Word into a cfm file and do something like <cfset q = "{smart quote here}">, and then loop over that variable one character at a time and output the ascii values, I get 3 values: 226, 8364, 65533.
    Here's a test that contains Smart Quotes that I just copied from MS Word into this editor, it will be interesting to see how they are represented after I post this message:
    “Test”
    Andy
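Those three values are consistent with the UTF-8 bytes of a closing smart quote (E2 80 9D) being decoded as Windows-1252: E2 is 226, 80 is the euro sign (8364), and 9D is unassigned in that code page, so a decoder may substitute U+FFFD (65533). That this is what ColdFusion does with the cfm file is an assumption. The Java sketch below (hypothetical class name) only shows that the JDK's windows-1252 decoder, as far as I know, produces the same numbers.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ThreeCharsDemo {

    // Encodes the text as UTF-8, decodes those bytes as Windows-1252,
    // and returns the code points of the resulting (garbled) string.
    static int[] codePointsWhenMisread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8, Charset.forName("windows-1252"));
        return garbled.codePoints().toArray();
    }

    public static void main(String[] args) {
        // U+201D is the right (closing) double quotation mark.
        System.out.println(Arrays.toString(codePointsWhenMisread("\u201D")));
    }
}
```

If that is the cause, the quotes are perfectly representable as entities (`&#8220;` and `&#8221;`); the thing to fix would be the encoding used when the file or request is read, not the characters themselves.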

  • Smart quotes in languages other than English

    Hi to all,
    As a technical writer, I deal with texts in several languages. One of the problems I have with FrameMaker, is that French and German "smart" (or "typographic") quotes are not entered automatically. Quotation marks remain in the English format. The solutions offered in Adobe's Knowledge Base are less than optimal:
    - Replacing the way FrameMaker writes smart quotes in the Maker.ini file is made on a general base, and not per language.
    - Manually entering all quotation marks cannot be done easily when you receive a text from others.
    Is there a viable solution to this problem?
    Paolo

    Paolo,
    One way would be to create a small custom config file for each specific language that modifies the definition of the standard double-quote characters used for the Smart quotes. The definition for the standard characters in cmds.cfg  (in the \fminit\configui folder) is:
    <Command CharLeftDblQuote
        <Label Left Double Quote>
        <Definition \xd2>>
    <Command CharRightDblQuote
        <Label Right Double Quote>
        <Definition \xd3>>
    where the Definition parameter specifies the hex value (i.e. character in the FrameRoman encoding) to use for the character that you wish to display in FM.
    For the French version, create a file with those two entries (call it French_smartquotes.cfg) and change the hex values to the appropriate language-specific quote, i.e. guillemotleft is \xc7 and guillemotright is \xc8. Likewise for the German.
    Then, when you need to work in any of those languages, use View > Menus > Modify... and select the appropriate .cfg file. This will then temporarily change the mapping for the double-quote characters such that when you enter them from the keyboard, the appropriate language-specific quote will be used. To switch languages, simply select another .cfg file using View > Menus > Modify... The setting only persists for the current FM session.
    The alternative is to edit the maker.ini file to enable the language specific smart quotes, listed in the [Spelling] section. The entry is:
    ; Smart Quote Characters
    ; SmartQuotes \xd4\xd5\xd2\xd3 )  English curved quotes
    ; SmartQuotes \xe2\xd4\xe3\xd2 )  German-style quotes with base quotes
    ; SmartQuotes \xd5\xd5\xc7\xc8 )  French-style quotes using guillemets
    ; SmartQuotes \xd5\xd5\xd3\xd3 )  Swedish- and Finnish-style quotes
    ; SmartQuotes \xd4\xd5\xd2\xd3 )  Italian curved quotes
    ; English curved quotes:
    SmartQuotes=‘’“”
    Delete the ";" at the front to un-comment the appropriate entry and add a ";" to the English one.
    However, this route requires you to close FM every time you wish to make a change.

  • Smart quotes not working in Pages 5.5.2

    I'm unable to get smart quotes in Pages.
    I'm on a 2009 iMac, using Yosemite 10.10.1, Pages 5.5.2
    I've followed the instructions on Apple's page (http://help.apple.com/pages/mac/5.0/#/tanad45f9cce) to make all quotes smart quotes in my document and it didn't work.
    When I opened Substitutions in Pages the box for smart quotes was ticked, unticking and reticking and doing 'replace all' made no difference.
    Prior to this I'd just changed the font style setting for the document from Courier New to Times New Roman.
    I've also checked in System Prefs/Keyboard/Text where 'Use smart quotes' is ticked.
    I'm formatting a book, this is holding up production of my book as all quotes are non-curly, straight quotes and need to be changed to smart quotes.
    I'm wondering whether I shouldn't go back to the previous version of Pages where everything seemed to work, including auto capitalisation.

    Additionally re my first post above I've tested for smart quotes in MS Word 2008 and Mail 8.1 and they both support smart quotes. Word also converts between quote types.
    I've had to create a workaround to get my Pages document showing smart quotes:
    I tested smart quote support in Word by pasting a page from the Pages document into Word and then converting the straight quotes to curly quotes, which worked. I then copied that text and pasted it back into a new blank page in Pages, and the smart quotes stayed curly. So, I'm about to copy and paste my whole book, 55,000+ words, into Word and convert quotes and then paste it back into Pages.
    It seems like the smart quotes issue is Pages 5.5.2 specific and has been for quite some while.
