Unicode: non-Latin characters in identifiers and data

I would like to use Unicode escapes in identifiers, say, to create
a variable name that is Japanese. But I can't seem to get this
to work.
I have a product (Japanese Partner) that lets me key in Latin
characters, then converts these to Japanese (kana or kanji,
depending on various options) in Unicode and passes them
to the input line.
But I can't get these to compile.
Also, if I code:
char Jletter = '\u2f80';
System.out.println("Jletter = " + Jletter);
The runtime output is:
Jletter = ?
I thought it was supposed to display as a Unicode escape.
TIA for any help.

Perhaps, but I'm going on:
"Programs are written in Unicode (§3.1), but lexical translations are provided (§3.2) so that Unicode escapes (§3.3) can be used to include any Unicode character using only ASCII characters."
Then ...
"3.2 Lexical Translations
A raw Unicode character stream is translated into a sequence of tokens, using the following three lexical translation steps, which are applied in turn:
1. A translation of Unicode escapes (§3.3) in the raw stream of Unicode characters to the corresponding Unicode character. A Unicode escape of the form \uxxxx, where xxxx is a hexadecimal value, represents the UTF-16 code unit whose encoding is xxxx. This translation step allows any program to be expressed using only ASCII characters.
2. A translation of the Unicode stream resulting from step 1 into a stream of input characters and line terminators (§3.4).
3. A translation of the stream of input characters and line terminators resulting from step 2 into a sequence of input elements (§3.5) which, after white space (§3.6) and comments (§3.7) are discarded, comprise the tokens (§3.5) that are the terminal symbols of the syntactic grammar (§2.3)."
I take this to mean you can use Unicode escapes for Unicode characters. But
it doesn't seem to work, so maybe my understanding is deficient. Maybe
the docs need to be clearer.
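
For what it's worth, here is a minimal sketch of what the quoted passage seems to allow. The class name UnicodeEscapeDemo, the helper methods and the identifier (two kanji written as escapes) are made up for illustration. It assumes that the "?" in the output comes from the console's encoding rather than from the compiler, so it prints through a PrintStream that is explicitly set to UTF-8; whether the glyph actually shows up still depends on the terminal and its font.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class UnicodeEscapeDemo {

    // Formats a char as its code point, which is safe to print on any console.
    static String describe(char c) {
        return String.format("U+%04X", (int) c);
    }

    // The local variable below is named with two Unicode escapes (two kanji).
    // The compiler translates the escapes before it tokenizes the source,
    // so this is an ordinary identifier.
    static int escapedIdentifierValue() {
        int \u540d\u524d = 42;
        return \u540d\u524d;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        char jLetter = '\u2f80';

        // Assumption: the terminal understands UTF-8. System.out otherwise
        // uses the platform default encoding and may substitute '?'.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println("jLetter = " + jLetter);
        out.println("code point = " + describe(jLetter));
        out.println("value = " + escapedIdentifierValue());
    }
}
```

Printing the code point next to the character makes it easier to tell a compile-time problem from a display problem: if the code point is right but the glyph is "?", the escape itself worked.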

Similar Messages

Cannot create file with Non-latin characters- I/O

I'm trying to create a file w/ Greek (or any other non-latin) characters ... for use in a RegEx demo.
I can't seem to create the characters. I'm thinking I'm doing something wrong w/ IO.
The code follows. Any insight would be appreciated. - Thanks
import java.util.regex.*;
import java.io.*;
public class GreekChars{
     public static void main(String [ ] args ) throws Exception{
          int c;
          createInputFile();
//          String input = new BufferedReader(new FileReader("GreekChars.txt")).readLine();
//          System.out.println(input);
          // FileReader decodes the file with the platform default encoding
          FileReader fr = new FileReader("GreekChars.txt");
          while( (c = fr.read()) != -1)
               System.out.println( (char)c );
          fr.close();
     }
     public static void createInputFile() throws Exception {
          // first attempt: write the Greek letters with a PrintStream
          PrintStream ps = new PrintStream(new FileOutputStream("GreekChars.txt"));
          ps.println("\u03A9\u0398\u03A0\u03A3"); // omega,theta,pi,sigma
          System.out.println("\u03A9\u0398\u03A0\u03A3"); // omega,theta,pi,sigma
          ps.flush();
          ps.close();
          // second attempt: overwrite the same file with a FileWriter
          FileWriter fw = new FileWriter("GreekChars.txt");
          fw.write("\u03A9\u0398\u03A0\u03A3",0,4);
          fw.flush();
          fw.close();
     }
}
/*
// using a printstream to create file ... and BufferedReader to read
C:> java GreekChars
// using a Filewriter to create files .. and FileReader to read
C:> java GreekChars
*/

Construct your file writer using a Unicode encoding. If
you don't, then the file is written using the platform
"default" encoding, which is probably not a Unicode one.
Example:
FileWriter fw = new FileWriter("GreekChars.txt",
"UTF-8");

I don't know what version of FileWriter you are using, but none that I know of takes two String parameters. You should try checking the API before trying to help someone, instead of just making things up.
To the OP:
The proper way to produce a file in UTF-8 format would be this:
OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream("filename"), "UTF-8");
Then to read the file, you would use:
InputStreamReader reader = new InputStreamReader(new FileInputStream("filename"), "UTF-8");
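
Putting those two lines together, a round trip might look like the sketch below. The class name Utf8RoundTrip and its write/read helpers are invented for this example, and it uses StandardCharsets.UTF_8 (available since Java 7) in place of the "UTF-8" string, which is an equivalent choice and avoids the checked UnsupportedEncodingException.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {

    // Writes the text as UTF-8, independent of the platform default encoding.
    static void write(File file, String text) throws IOException {
        try (Writer w = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8)) {
            w.write(text);
        }
    }

    // Reads the whole file back, decoding it as UTF-8.
    static String read(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new FileInputStream(file), StandardCharsets.UTF_8)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File file = new File("GreekChars.txt");
        String greek = "\u03A9\u0398\u03A0\u03A3"; // omega, theta, pi, sigma
        write(file, greek);
        String back = read(file);
        // Compare in memory instead of trusting what the console displays.
        System.out.println("round trip ok: " + greek.equals(back));
        System.out.println("bytes on disk: " + file.length());
    }
}
```

Checking equals() and the file length sidesteps the console: each of these four Greek letters takes two bytes in UTF-8, so a correctly written file is 8 bytes long even if the terminal shows question marks.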

Non latin characters in Safari search

In Safari 6 the search and URL fields are combined. That's fine, except...
We can no longer search using non-Latin characters, because the field accepts only Latin characters. I was trying to search for a Japanese term, and when I switch to Hiragana input and move to the search field, the input switches back to English.
What's the workaround??

The problem has gone away. I suspect it was a problem with corrupted prefs. I trashed the Safari prefs and rebooted to clear another problem and no longer have the problem with search using Japanese characters.
FWIW, the problem I had when I trashed the prefs was with trying to mail a Safari page. I ran into this originally a week or two ago and called Apple, who told me to delete the Safari prefs and reboot. (Actually they gave an alternate procedure to try first, but I didn't bother.) That appears to be a recurrent problem, and since there was no hesitation on the solution when I called, I would guess that it will be fixed in an early patch. I had already tried trashing the Mail prefs since that's where the problem actually appeared (an extra copy of Mail would open, and then hang), but it was in fact the Safari prefs that were causing the problem. I've had to do the delete-and-reboot routine every few days. Not sure why the reboot is required, but it obviously is, since just quitting Safari or even logging out doesn't fix it.

Non latin characters in .cfm filename

Hi - I have users who want to name files with non-Latin characters, e.g.
Логотип_БелРусь_2500x1.cfm
We get a file not found error. It is not an IIS issue; we have UTF-8 encoding and are running CF8.
Yes, we can rename the files, but for now we would like to know if non-Latin characters are allowed in .cfm file names.
Thank you!
Sapna

PaulH wrote:
en_US is the JRE locale. is that the same as the OS? and what file encoding?
(check via cfadmin).
i ask, because pretty sure you can't use non-ascii file names w/cf. there's an
open bug on that:
http://cfbugs.adobe.com/cfbugreport/flexbugui/cfbugtracker/main.html#bugId=77177
only can guess that file encoding isn't latin-1, etc. and/or OS locale equals
the same language as the file name.
cfadmin gives pretty much the same information. Here's a direct copy
Server Product: ColdFusion
Version: 9,0,0,241018
Edition: Developer
Serial Number:
Operating System: Windows 2000
OS Version: 5.0
Update Level: /C:/ColdFusion9/lib/updates/hf900-78588.jar
Adobe Driver Version: 4.0 (Build 0005)
JVM Details
Java Version: 1.6.0_12
Java Vendor: Sun Microsystems Inc.
Java Vendor URL: http://java.sun.com/
Java Home: C:\ColdFusion9\runtime\jre
Java File Encoding: Cp1252
Java Default Locale: en_US
File Separator:
Path Separator:
Line Separator: Chr(13)

We cannot type Polish (non-latin) characters in WebDynpro applications

We cannot type Polish (non-latin) characters in WebDynpro application (in runtime) because 'Browser Help Shortcuts' are fired.
To type a Polish character on a Polish keyboard you need to press AltGr + letter (i.e. AltGr + a/c/e/s/o/l/z/x/n). To type an uppercase Polish character you need to press AltGr + Shift + letter. This combination is in fact the same as pressing Alt + Ctrl + Shift + letter (because AltGr produces Alt + Ctrl), and it fires some of the 'Browser Help Shortcuts'. For example, AltGr + Shift + O should produce a letter O with a dash on its top, but instead it fires 'Show nesting of HTML containers'.
We tried to turn off sap-wd-lightspeed, but then other key combinations are reserved for 'Browser Help Shortcuts'.
We need to be able to use AltGr + Shift + a/c/e/s/o/l/z/x/n in runtime.
Product: SAP NW 7.11 SP04
WebDynpro for Java
I hope there is a hidden parameter somewhere that solves our problem. Maybe we're in some kind of debug mode?
Thanks for your help!!

The funny thing is that the bold font [used for an unread message in the message list] shows OK, i.e. in Greek, but when I click on the unread message, it is assumed to have been read, so it changes over to medium [non-bold] and the encoding changes as well, into one that is not Greek and thus unreadable. In ~/.sylpheed/sylpheedrc the fonts are:
widget_font=
message_font=-microsoft-sylfaenarm-medium-r-normal-*-*-160-*-*-p-*-iso8859-7
normal_font=-monotype-arial-medium-r-normal-*-12-*-*-*-*-*-iso8859-7
bold_font=-monotype-arial-bold-r-normal-*-12-*-*-*-*-*-iso8859-7
small_font=-monotype-arial-medium-r-normal-*-12-*-*-*-*-*-iso8859-7
In /etc/gtk, for gtk1.2 apps, the file referring to the Greek encoding [el] seems to be fine [exactly the same as in Slackware 9.1].

Non-Latin Characters lead to finder distress.

One of the nicest features of Macintosh from the time I first played on an SE30 is the capacity to quickly type non-Latin characters. While to many this might not seem like a big deal, for me being able to write Tetris™ without a second thought is a great convenience.
So I was very surprised when I began typing µ in the Finder under Snow Leopard and didn't end up on a file that started with µ but rather on m. As if that wasn't irritating enough µ is not treated as m so the file becomes completely unreachable by alphabetic selection. This is something I use almost constantly in Finder so having a file unreachable is even worse than merely having the character interpreted incorrectly.
Just to be certain that this wasn't merely a flaw with that one character I examined other common characters.
ƒ, é, π, ∑ all suffer from the same problem, misinterpreted when typed and interpreted correctly during the comparison.
So the question is, “Is this an error in how I set up my machine, an error in the string comparison system, or an error in the Finder program?”

Yes, I just double-checked, and I was in error: accented Latin characters do work as expected. I am certain that the inclusion of such in the prior list was a user error.
However, the fact that the Greek key layout works begins to suggest the root of the problem.
Interestingly enough, this also applies to the Greek layout's internal Option-modified keys.
I am strongly suspecting a bug here.

Non- Latin characters not available following iCloud problem

Following help from HKQ recovering my Reminders (it turned out there was a problem with iCloud), my non-Latin characters (and the little globe icon for them) have just vanished! This must have happened during the reset or when switching on iCloud. I need this really urgently to send emails to people. Please help,
thank you!

It's OK, I found and added the foreign keyboard. But the number keys give $ instead of £ even though it says it's on UK English.

Replacing non latin characters

Hi experts,
I have to check some fields for non-Latin characters.
When the fields include any non-Latin characters, I have to replace them
with a "Y".
Does someone have a code example for this case?
Thanks for help!
Alex

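This should give you an idea:
DATA lv_allowed TYPE string.
CONCATENATE sy-abcde `0123456789 ` INTO lv_allowed. " assumed allowed set (A-Z, digits, blank); extend as needed
WHILE p_faxno CN lv_allowed. " CN: p_faxno contains a character outside lv_allowed
  p_faxno+sy-fdpos(1) = 'Y'. " sy-fdpos = offset of the first such character
ENDWHILE.
CONDENSE p_faxno NO-GAPS.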

Non-Latin Characters

I am getting an error that my file name has non-Latin characters. The file is named 01h_backM.jpg, and I am using Save for Web.

Are you saving to a folder with symbols in the name? Try saving to a different folder.
Benjamin

Non-latin characters in photo books

Has anybody used symbols or non-latin alphabets when designing photo books? They come out on screen and on proofs just fine (well... it's an Apple, isn't it?) and I assume iPhoto just goes ahead and puts Unicode characters into .pdf verbatim before uploading, however, online printing services may have a problem when processing text.
I am particularly interested in using Greek/Cyrillic alphabets when printing in the UK.

Loading Non-English Characters using VBA and BAPI

Hi Experts,
I am trying to load Non-English characters (Chinese, Korean, Japanese, etc.) into a SAP Table using BAPI and VBA. I have set the connection language and codepage values but when I run the tool, the non-English characters display as ????? or #####. Do you know how to fix this issue?
Thanks!

If your language needs Unicode, then you need to change the encoding option in SAP: on the initial screen, go to Customize Local Layout (Alt+F12) > Options > I18N --> Encoding ....

Non-Latin characters not displaying properly in Safari on iPhone

I am having an issue where non-English (Chinese, Japanese etc) characters are displaying as squares on the iPhone's Safari browser. I've seen this issue on Windows when you haven't installed support for Asian languages.
It would appear that the version of OS-X on the iPhones does not have support for non-Latin character sets.
Has anyone else experienced this problem?

I am having an issue where non-English (Chinese,
Japanese etc) characters are displaying as squares on
the iPhone's Safari browser.
Could you provide the urls? It may be the pages have bad coding. Correctly coded pages will display according to tests I've seen:
http://homepage.mac.com/thgewecke/iphonesafarilang.jpg
But unlike OS X the iPhone has no way yet to manually correct for miscoded pages via a View > Text Encoding menu.

Can username, password have unicode(non english) characters

Does Oracle allow a username or password to have non-English characters?

I found the answer to my own question. In essence, it looks like the answer is yes for the user name and a little confusing for the password. The password has to be in a single-byte character set; it is not clear whether 7-bit or 8-bit. 8-bit, from my understanding, is ASCII plus some Western European characters.
Following is from Oracle 10G Database SQL reference
user
Specify the name of the user to be created. This name can contain only characters from your database character set and must follow the rules described in the section "Schema Object Naming Rules". Oracle recommends that the user name contain at least one single-byte character regardless of whether the database character set also contains multibyte characters.
Note:
Oracle recommends that user names and passwords be encoded in ASCII or EBCDIC characters only, depending on your platform.
BY password
The BY password clause lets you creates a local user and indicates that the user must specify password to log on to the database. Passwords can contain only single-byte characters from your database character set regardless of whether the character set also contains multibyte characters.
Passwords must follow the rules described in the section "Schema Object Naming Rules", unless you are using the Oracle Database password complexity verification routine. That routine requires a more complex combination of characters than the normal naming rules permit.

Filenames with non-latin characters aren't found by the filesystem [S]

This might be a bug, but I'm hoping it's just a config file problem.
I have a few files here and there on my NTFS drive that have Japanese characters in their filenames. Sometime recently (I don't have an exact date when they disappeared), they stopped showing up at all. If I browse to a folder that used to contain filenames with Japanese characters, it just appears empty in Gnome. Using ls from a terminal also says the directory is empty. They used to work just fine, but a recent upgrade must have broken them.
Does anyone have any ideas what I can do to get my files to appear again? Is there some way to enable unicode support for filenames or something?
Many thanks!
Edit: Rebooting the system fixed it, though I still think that was a pretty strange problem. Any ideas what was up?
Last edited by ColdPie (2007-11-11 02:07:11)

Non US characters in login and email generation

I have a design problem that I would like to check if anyone else has found a good solution to.
Once you leave the safe shores of the United States, your users start having names that include all kinds of funny characters. In the good old days this problem was resolved by the fact that the HR system only handled 7-bit US-ASCII characters, but today you are likely to face an HR system that supports Unicode or at least some kind of character set that includes lots of non-US-ASCII characters. I just ran some stats on my current enterprise population and it seems like about 5% of the users have names containing "strangeness".
These strange characters cause big problems if you aren't allowed to include non-US-ASCII characters in logins, email addresses and other generated fields. Exactly what a "strange character" is varies. RFC 5322 takes a quite liberal view towards special characters but explicitly disallows non-US letters.
The simplistic solution is to drop any character that isn't a US-ASCII letter. This works if the problem is names like "O'Malley", as the "'" really shouldn't be part of the user login and probably not part of an email address either (can be debated). This solution breaks down when you get to Germany or Scandinavia, where users who are called "Örjan Åhs" may not appreciate an email address of rjan.hs@your_company.com.
What you would like to do is to convert "Örjan Åhs" to either "Orjan Ahs" or (possibly) "Oerjan Aohs", but I haven't been able to find any Java library that does that conversion for you.
Anyone that has run into this problem before and solved it?
I wonder how certain characters in this post will be rendered on computers in different parts of the world :)
/Martin, who long ago converted his last name (Swedish) to be 7 bit ascii compliant

Thanks Daniel
The code above drops any non US ascii characters which is fine in some situations but doesn't work for me as that would result in (amongst other issues) unacceptable email addresses.
Example: The user "Jörgen Åhs" gets the email [email protected] (using drop strategy), what is needed is [email protected]
The solution to this problem is to write a transform function and as we have about 80 non US ascii characters in character set we are using this mapping can quite easily be externalized to a configuration file.
Good point about the preferred name. I have not seen this specific problem in my current system, but it is very common in certain parts of the world, e.g. people with Chinese heritage in South East Asia often have a Chinese legal name and a Western name that they actually use in day-to-day interactions. If you base the email address on their name in HR, much screaming ensues. The same thing should actually happen in the US, as you are supposed to enter the name on your social security card into the HR system, but that seems largely to be ignored.
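
As a rough sketch of such a transform function in Java: the class name AsciiFolder, the toAscii method and the five entries in the OVERRIDES table are all invented for illustration, and a real deployment would presumably load its own mapping from configuration as described above. The sketch applies the explicit mapping first, then uses java.text.Normalizer to split accented letters into base letter plus combining mark, removes the marks, and finally drops whatever non-ASCII characters remain.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

public class AsciiFolder {

    // Hypothetical overrides for letters that have no canonical decomposition.
    // In practice this table could be loaded from a configuration file.
    private static final Map<Character, String> OVERRIDES = new HashMap<>();
    static {
        OVERRIDES.put('\u00DF', "ss"); // sharp s
        OVERRIDES.put('\u00F8', "o");  // o with stroke
        OVERRIDES.put('\u00D8', "O");
        OVERRIDES.put('\u00E6', "ae"); // ae ligature
        OVERRIDES.put('\u00C6', "AE");
    }

    static String toAscii(String name) {
        // Step 1: apply the explicit mapping table.
        StringBuilder mapped = new StringBuilder();
        for (char ch : name.toCharArray()) {
            String replacement = OVERRIDES.get(ch);
            mapped.append(replacement != null ? replacement : String.valueOf(ch));
        }
        // Step 2: decompose accented letters into base letter + combining mark.
        String decomposed = Normalizer.normalize(mapped, Normalizer.Form.NFD);
        // Step 3: remove the combining marks, then anything still non-ASCII.
        return decomposed.replaceAll("\\p{M}", "").replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // The names are written with escapes so the source file stays ASCII.
        System.out.println(toAscii("\u00D6rjan \u00C5hs"));  // prints "Orjan Ahs"
        System.out.println(toAscii("J\u00F6rgen \u00C5hs")); // prints "Jorgen Ahs"
    }
}
```

Because the overrides run before normalization, a site that prefers "Oerjan" over "Orjan" could add its own entries for the accented letters without touching the rest of the function.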
