Identifiers in a Non-English Language

Hi guys,
I am new to Java. I tried to write a simple Java program to experiment a bit: I copied some text in a different language from a web page, pasted it into Notepad as an identifier for an int, and saved the file in a Unicode format.
However, this generates roughly 64 errors, and with very strange characters at that.
So can I conclude that a Java program (identifiers, I mean) can be written only in English? As far as I know, even if the code is written in ASCII, it is first converted to Unicode at compile time.
Thanks for your concern. Any help will be appreciated.

Incorrect assumption, my friend. You perhaps used the wrong encoding when compiling. Take this for example:
public class Test {
     public static String حرمان = "This is nice.";
     public static void main(String[] argv) { System.out.println(حرمان); }
}
The variable name there is Arabic. Here is the command used to compile it:
javac -encoding UTF-16 Test.java
Here is the output:
java Test
This is nice.
So, it does work. (Though I had never tried before just now...)
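For completeness, the language itself allows any Unicode letter in an identifier, and this can be checked programmatically. A small sketch (not from the thread; the class and method names are made up for illustration) using Character's identifier predicates:

```java
public class IdentifierCheck {
    // True if s is a legal Java identifier by the language rules
    // (reserved keywords are not considered here).
    static boolean isLegalIdentifier(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegalIdentifier("حرمان")); // true: Arabic letters are Unicode letters
        System.out.println(isLegalIdentifier("2fast")); // false: cannot start with a digit
    }
}
```

So the 64 errors in the original post point to an encoding mismatch between the saved file and what javac assumed, not to a language restriction.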

Similar Messages

  • How to Identify non-english characters in a Text

    Hi Experts,
    I have a text coming from KNA1-NAME1 which at times contains non-English characters/languages. I want to identify them in my code so that I can skip them.
    Can you please suggest a command / function module that helps to identify these non-English characters?
    Regards,
    Nirmal

    Hi,
    I am fine with English characters A-Z, a-z, 0-9 or special characters. But the text sometimes contains Chinese, Japanese or other non-English characters which I don't want.
    The logic explained by you above would expect me to list all the valid characters, and it would also be a performance constraint. Hence I wanted something like an FM or standard procedure. Can we use ASCII somehow?
    Regards,
    Nirmal
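The thread is about ABAP, but the underlying idea ("can we use ASCII somehow?") is language-independent: test each character's code against the ASCII range. A small illustrative sketch in Java (the class and method names are made up):

```java
public class AsciiFilter {
    // True if every character is printable ASCII (0x20..0x7E).
    static boolean isPlainAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 || c > 0x7E) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPlainAscii("ACME Corp 42")); // true
        System.out.println(isPlainAscii("München"));      // false: ü is outside ASCII
    }
}
```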

  • Identify Non English Character in a String

    All,
    We have a requirement to identify non-English characters in user-keyed-in data and return an error message saying only valid English, numeric and some special characters are allowed.
    For example, if the user enters data like "This is a Test data" then the return value should be true, but if he enters something like "My Native Language is inglés" then it should return false. Similarly, any Chinese, Russian or Japanese character entries should also return false.
    How can we achieve this?
    Thanks,
    Nagarajan.

    Hi Nagarajan,
    You could use Unicode character blocks or simply craft a regular expression that contains all the characters you need. The latter is easy to understand and gives you full control over which characters you want to allow. E.g. I assume you might want something like this:
    if(!"This is a proper input string".matches("[\\s\\w\\p{Punct}]+")) {
      // Issue error message and re-get input string
    }
    The String method matches() takes a regular expression as input parameter. If you haven't dealt with regular expressions before, check out the Java API help for class java.util.regex.Pattern. Here's a short breakdown of the pattern I used:
    <ol>
    <li>The square brackets [] enclose a list of allowed characters; here you can explicitly list all allowed characters.</li>
    <li>You can specify ranges like a-z as a character class, list individual characters like ;:| or utilize predefined character classes (\s for any whitespace character, \w for word characters a-z, A-Z, 0-9 and underscore, and the POSIX class \p{Punct} for punctuation symbols). For a complete list check the Java API help on java.util.regex.Pattern.</li>
    <li>The + at the end indicates that the characters listed can occur once or more.</li>
    </ol>
    There are other ways to achieve what you want, but I think this is an easy way to start.
    Cheers, harald
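To see harald's pattern in action against the two sample inputs from the question (a small sketch; the isAllowed helper is just an illustrative name — the key point is that \w and \p{Punct} match only ASCII characters by default, which is exactly why accented letters are rejected):

```java
public class MatchDemo {
    // Accepts whitespace, ASCII word characters and ASCII punctuation only.
    static boolean isAllowed(String s) {
        return s.matches("[\\s\\w\\p{Punct}]+");
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("This is a Test data"));          // true
        System.out.println(isAllowed("My Native Language is inglés")); // false: é is not ASCII
    }
}
```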

  • Non english characters in DN cannot be retrieved

    We are using Netscape Directory Server 4, protocol V3. We have a problem related to non-English characters appearing in the RDN.
    We publish entries to LDAP using values from a database. For example, we have published an entry which, based on the DB values, should have a DN like: ou=Liege BELGIUM ... LGG1a, <other components of DN>. However, when we call the Netscape search API (searching against the uid attribute, which does not contain non-English characters), the search returns the entry, but a further call to getDN() on the returned LDAP entry only returns "Li" instead of the complete DN value.
    It seems the entry is corrupted in LDAP. I wanted to delete the corrupted entry and re-create a new one to test. I tried many ways, but none of them worked; I think it is because the DN is corrupted, so there is no key value to identify the LDAP entry for any operation (modify, delete).
    Your help and insights are much appreciated.
    Thanks.
    Han Shen

    LDAP uses the UTF8 encoding. You must store data in the directory using the UTF8 encoding. This includes DN values. This also means that if you want to be able to view the values in your native character set and font, you must use an application that can convert the UTF8 LDAP data back to the native character encoding. The directory console by default should work for LATIN-1 (ISO 8859) languages if the LOCALE is set correctly.
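As a side note on what "store using the UTF8 encoding" means at the byte level, here is a plain-Java illustration (not the Netscape SDK; the RDN value is a made-up example with one accented character):

```java
import java.nio.charset.StandardCharsets;

public class Utf8Demo {
    public static void main(String[] args) {
        String rdn = "ou=Liège"; // accented character in an RDN value
        byte[] utf8 = rdn.getBytes(StandardCharsets.UTF_8);
        // 'è' needs two bytes in UTF-8, so the byte length exceeds the char length.
        System.out.println(rdn.length()); // 8
        System.out.println(utf8.length);  // 9
    }
}
```

If such bytes are written to the directory in a non-UTF-8 encoding (or decoded with the wrong charset on the way back), truncated or corrupted DN values like the "Li" above are the typical symptom.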

  • UTL_RAW.REVERSE for non english characters

    I'm trying to reverse a non-English word like the following and it does not work:
    SELECT '中国' Original, UTL_RAW.cast_to_varchar2(UTL_RAW.REVERSE (UTL_RAW.cast_to_raw ('中国'))) Not_correctly_Reversed FROM DUAL;
    ORIGINAL NOT_CORRECTLY_REVERSED
    中国 ���
    Any thoughts please ?
    Appreciate responses. Thanks !
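The likely cause: UTL_RAW.REVERSE reverses individual bytes, and each of those CJK characters occupies three bytes in UTF-8, so the reversed byte sequence is no longer valid UTF-8 (hence the replacement characters). Reversing at the character level avoids this; a sketch in Java for illustration (StringBuilder.reverse also keeps surrogate pairs intact):

```java
public class ReverseDemo {
    public static void main(String[] args) {
        String s = "中国";
        // Reverse characters, not bytes, so each multibyte character survives.
        String reversed = new StringBuilder(s).reverse().toString();
        System.out.println(reversed); // 国中
    }
}
```

In Oracle itself the equivalent is a character-wise technique such as the SUBSTR/LISTAGG approach discussed in the replies.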

    chris227 wrote:
    Works well for me
    No, it does not. It will fail if the table has duplicate strings of two-digit length but, what is even worse, it will produce wrong results if the table has small duplicate strings:
    SQL> with testdata as (
      2                    select  '19 character string' str from dual union all
      3                    select  '19 character string' str from dual
      4                   )
      5  select distinct
      6    listagg(substr(str,level,1))within group ( order by level desc) over (partition by str) r
      7  from testdata
      8  connect by
      9  level <= length(str)
    10  and str = prior str
    11  and prior sys_guid() is not null
    12  /
    from testdata
    ERROR at line 7:
    ORA-01489: result of string concatenation is too long
    Elapsed: 00:00:36.08
    SQL> with testdata as (
      2                    select  'ABC' str from dual union all
      3                    select  'ABC' str from dual
      4                   )
      5  select distinct
      6    listagg(substr(str,level,1))within group ( order by level desc) over (partition by str) r
      7  from testdata
      8  connect by
      9  level <= length(str)
    10  and str = prior str
    11  and prior sys_guid() is not null
    12  /
    R
    CCCCCCCCBBBBAA
    Elapsed: 00:00:00.00
    You need to identify rows uniquely. With a real table it is easy - rowid. With subquery factoring we need another view with, for example, ROW_NUMBER. However, here we use subquery factoring to create a sample table on-the-fly and assume the OP will have a real table. So I'd leave your example as is, but would let the OP know to use:
    select distinct
      listagg(substr(str,level,1))within group ( order by level desc) over (partition by str) r
    from testdata
    connect by
    level <= length(str)
    and rowid = prior rowid
    and prior sys_guid() is not null
    /
    But this still will not work. Why? Same answer - duplicates:
    SQL> create table testdata as (
      2      select  'ABC' str from dual union all
      3      select  'ABC' str from dual
      4     );
    Table created.
    Elapsed: 00:00:00.42
    SQL> select distinct
      2    listagg(substr(str,level,1))within group ( order by level desc) over (partition by str) r
      3  from testdata
      4  connect by
      5  level <= length(str)
      6  and rowid = prior rowid
      7  and prior sys_guid() is not null
      8  /
    R
    CCBBAA
    Elapsed: 00:00:00.01
    SQL>
    Again, partition by str doesn't identify rows uniquely. We need to partition by rowid. But even this will not help:
    SQL> select distinct
      2    listagg(substr(str,level,1))within group ( order by level desc) over (partition by rowid) r
      3  from testdata
      4  connect by
      5  level <= length(str)
      6  and rowid = prior rowid
      7  and prior sys_guid() is not null
      8  /
    R
    CBA
    Elapsed: 00:00:00.00
    SQL>
    We got one row back instead of two. You probably put DISTINCT there trying to resolve all these issues caused by building the hierarchy and partitions on a non-unique basis. So now, when we identify rows uniquely by rowid, DISTINCT is not needed and should be replaced by GROUP BY (along with using aggregate LISTAGG instead of analytic LISTAGG). So the final solution would be:
    select listagg(substr(str,level,1))within group ( order by level desc) r
    from testdata
    connect by
    level <= length(str)
    and rowid = prior rowid
    and prior sys_guid() is not null
    group by rowid
    R
    CBA
    CBA
    Elapsed: 00:00:00.00
    SQL>
    And with a 19 character string:
    SQL> insert
      2    into testdata
      3  select  '19 character string' str from dual union all
      4                    select  '19 character string' str from dual;
    2 rows created.
    Elapsed: 00:00:00.00
    SQL> select listagg(substr(str,level,1))within group ( order by level desc) r
      2  from testdata
      3  connect by
      4  level <= length(str)
      5  and rowid = prior rowid
      6  and prior sys_guid() is not null
      7  group by rowid
      8  /
    R
    CBA
    CBA
    gnirts retcarahc 91
    gnirts retcarahc 91
    Elapsed: 00:00:00.00
    SQL>
    SY.

  • Non-english input in Wine[SOLVED]

    Hi,
    Non-english input characters in Windows apps under Wine appear either as blanks or as question marks. Any how-to is welcome .
    Meanwhile, I've googled an interesting advice (for Cyrillic):
    $ sudo ln -s en_US.UTF-8 /usr/share/X11/locale/ru_RU.UTF-8
    The most interesting part is a mystery: where am I supposed to create such a link?
    Last edited by Llama (2009-05-29 19:23:46)

    Peanut wrote:
    In other words, that command would overwrite the Russian UTF-8 locale with an English UTF-8 locale. I can't see any reason why that should fix anything.
    For one thing, the ln command refused to overwrite anything. Yes, I've been curious about the reason.
    Peanut wrote:Have you:
    1) Installed and enabled fonts that support cyrillic characters?
    (1) ru_RU.UTF-8 alongside en_US.UTF-8:
    $ locale
    LANG=en_US.utf8
    LC_CTYPE="en_US.utf8"
    LC_NUMERIC="en_US.utf8"
    LC_TIME="en_US.utf8"
    LC_COLLATE=C
    LC_MONETARY="en_US.utf8"
    LC_MESSAGES="en_US.utf8"
    LC_PAPER="en_US.utf8"
    LC_NAME="en_US.utf8"
    LC_ADDRESS="en_US.utf8"
    LC_TELEPHONE="en_US.utf8"
    LC_MEASUREMENT="en_US.utf8"
    LC_IDENTIFICATION="en_US.utf8"
    LC_ALL=
    Peanut wrote:Have you:
    2) Tried using other encodings than UTF8?
    (2) No, I avoid non-UTF locales.
    Luckily, in this day and age the solution turned out to be fairly straightforward:
    Starting sequence:
    LANG=ru_RU.UTF-8 wine ...
    Aliasing, they say, is also possible :
    echo "alias wine='LANG=ru_RU.UTF-8 wine'" >> ~/.bashrc
    echo "alias wine='LANG=ru_RU.UTF-8 wine'" >> ~/.profile

  • Entry of non-English characters into the db

    Hi
    We are facing a problem in inserting non-English characters into the database. For example, we have a company name field which can accept German characters. This field has been defined as varchar2 of size 50 in the db. When we enter 49 English characters and then one German character, the database throws an error that the inserted value is too large for the column. Is the German character taken as equivalent to two English characters? Or is there any database-level setting that can be done for this? For the time being we have identified certain critical fields and doubled their sizes in the db. But I guess there has to be another solution to this....
    Please help.

    Indeed, your German character is using two bytes to store itself. Consult the Oracle JDBC Developer's Guide.
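Indeed: in a UTF-8 database, VARCHAR2(50) by default means 50 bytes, and the German character takes two of them. The byte math can be seen in a minimal Java sketch (the sample value is made up):

```java
import java.nio.charset.StandardCharsets;

public class ByteLengthDemo {
    public static void main(String[] args) {
        // 49 ASCII characters plus one German umlaut: 50 characters...
        String name = "A".repeat(49) + "ü";
        System.out.println(name.length());                                // 50
        // ...but 51 bytes in UTF-8, overflowing a 50-byte column.
        System.out.println(name.getBytes(StandardCharsets.UTF_8).length); // 51
    }
}
```

Rather than doubling column sizes, the usual Oracle-side fix is character-length semantics, e.g. declaring the column as VARCHAR2(50 CHAR) or setting NLS_LENGTH_SEMANTICS=CHAR.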


  • Predictive text non-English characters to be made ...

    I just filled an enhancement request on this feature, please vote for it HERE:
    Predictive text non-English characters to be made optional
    The story is: while using predictive text for non-English languages (Polish in my case), the dictionary words are grammatically correct, which includes special characters like ą,ę,ć,ś,ż,ź,ó,ł etc. For texting (SMS), operators count these as 3 characters, making a message much longer than it looks. Therefore I can tell you no one uses these characters while texting; people use EN-only characters instead: a,e,c,s,z,o,l... which makes predictive text useless for e.g. the Polish language.
    I'd like an option to switch off these non-EN chars in predictive text; it is grammatically incorrect, but in real life that's how people type.
    So basically, if there were an option to disable language-specific characters, I would get, for example, the suggestion 'Prosze' instead of the grammatically correct 'Proszę'. 'Prosze' is a 6 character word; 'Proszę' is a 5+3=8 character word. Considering a single SMS message at 300 chars, it really makes a difference.
    Simple solution would be to replace every char ą with a, ć with c, ó with o etc... in each word suggested for the ones who have this option enabled.
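The "replace every ą with a" idea maps directly onto Unicode normalization: decompose each accented letter, then drop the combining marks. A sketch in Java (illustrative only; note that a few letters such as ł have no canonical decomposition and would still need an explicit mapping):

```java
import java.text.Normalizer;

public class StripDiacritics {
    // Decompose accented letters (NFD), then remove all combining marks.
    // Caveat: letters like 'ł' do not decompose and pass through unchanged.
    static String strip(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                         .replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("Proszę")); // Prosze
    }
}
```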

    Hi,
    You can write code in the PAI of the main screen; there, by using LOOP AT SCREEN, you can make the field editable or disabled.
    Code sample:
    loop at screen.
    * condition for value check
      if screen-name = 'TEXT_EDIT_NAME'.
        screen-output = 1.
        screen-input = 0.
        modify screen.
      endif.
    endloop.
    Hope this will help you.

  • Reading .txt file and non-english chars

    I added .txt files to my app for translations of text messages.
    The problem is that when I read the translations, non-English characters are read wrongly on my Nokia. In the Sun Wireless Toolkit it works.
    The trouble is that I don't even know what encoding the phone expects:
    UTF-8, ISO Latin 2 or Windows CP1250?
    I'm using CLDC 1.0 and MIDP 1.0.
    What's the right way to do it?
    Here's what I have...
    String locale = System.getProperty("microedition.locale");
    String language = locale.substring(0, 2);
    InputStream r = getClass().getResourceAsStream("/lang/" + language + ".txt");
    byte[] filetext = new byte[2000];
    int len = 0;
    try {
        len = r.read(filetext);
    } catch (IOException e) {
        // handle read failure
    }
    Then I get the translation by:
    value = new String(filetext, start, i - start).trim();
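One likely cause: new String(byte[], int, int) uses the platform's default encoding, which differs between the Wireless Toolkit and the phone. Decoding with an explicit charset removes the guesswork, assuming you save the .txt files as UTF-8. A sketch (readAll is an illustrative helper name; String(byte[], int, int, String) is also available on CLDC/MIDP):

```java
import java.io.InputStream;

public class LoadTranslations {
    // Read up to 2000 bytes and decode them with an explicit charset
    // instead of relying on the platform default.
    static String readAll(InputStream in) throws Exception {
        byte[] buf = new byte[2000];
        int len = in.read(buf);
        return new String(buf, 0, len, "UTF-8");
    }
}
```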

    Not sure what the issue is with the runtime. How are you outputting the file and accessing the lists? Here is a more complete sample:
    import java.io.*;
    import java.util.*;

    public class Foo {
         final private List colons = new ArrayList();
         final private List nonColons = new ArrayList();

         static final public void main(final String[] args) throws Throwable {
              Foo foo = new Foo();
              foo.input();
              foo.output();
         }

         private void input() throws IOException {
             BufferedReader reader = new BufferedReader(new FileReader("/temp/foo.txt"));
             String line = reader.readLine();
             while (line != null) {
                 List target = line.indexOf(":") >= 0 ? colons : nonColons;
                 target.add(line);
                 line = reader.readLine();
             }
             reader.close();
         }

         private void output() {
              System.out.println("Colons:");
              Iterator itorColons = colons.iterator();
              while (itorColons.hasNext()) {
                   String current = (String) itorColons.next();
                   System.out.println(current);
              }
              System.out.println("Non-Colons");
              Iterator itorNonColons = nonColons.iterator();
              while (itorNonColons.hasNext()) {
                   String current = (String) itorNonColons.next();
                   System.out.println(current);
              }
         }
    }
    The output generated is:
    Colons:
    a:b
    b:c
    Non-Colons
    a
    b
    c
    My guess is that you are iterating through your lists incorrectly. But glad I could help.
    - Saish

  • [SOLVED] Non english chars kdemod 4 problem

    Hello, I have a little problem with KDE and non-English characters.
    If I open a file with non-English chars in its name I get something like this:
    (In this case kwrite opens a different file, but in other applications it fails with a "file not found" error.)
    Another symptom is that in the KDE menu my name has bad chars too:
    (It must be López)
    And the third symptom is that if I try to rename a file on the desktop, I can't write accented chars (á é í ó ú). At the beginning the keyboard in this rename dialog was totally in English, but I have got a semi-Spanish keyboard (I can write ñ letters) with the appropriate /etc/hal/fdi/policy/10-keymap.fdi file.
    But the strangest thing is that in general, in all KDE and non-KDE applications and even in the console, non-English chars work OK. I can go to the File->Open menu of an application and open a file with non-English chars in its name. The problem seems to reside in the part of KDE that passes the name of the file to the application (kwin?).
    My locale is es_ES@UTF8 and, as I said, I have configured the 10-keymap.fdi file correctly.
    I have read in some forums that something like this could be a KDE or Qt bug, but to me it's not clear, as I don't see general complaints about this.
    Any idea will be appreciated.
    Thanks in advance,
    Christian.
    Last edited by christian (2009-03-27 14:52:17)

    SanskritFritz wrote:
    That should be "es_ES.utf8"
    Sorry, I misspelled it in the post.
    Of course, my locale is es_ES.utf8:
    LANG=es_ES.utf8
    LC_CTYPE="es_ES.utf8"
    LC_NUMERIC="es_ES.utf8"
    LC_TIME="es_ES.utf8"
    LC_COLLATE=C
    LC_MONETARY="es_ES.utf8"
    LC_MESSAGES="es_ES.utf8"
    LC_PAPER="es_ES.utf8"
    LC_NAME="es_ES.utf8"
    LC_ADDRESS="es_ES.utf8"
    LC_TELEPHONE="es_ES.utf8"
    LC_MEASUREMENT="es_ES.utf8"
    LC_IDENTIFICATION="es_ES.utf8"
    LC_ALL=
    I don't think this could be the source of the problem because, except in the places I mentioned in the first post, the rest of my system works perfectly.

  • Parsing Non English characters from OPEN DATASET

    Hi Team,
    I'm trying to download a .txt file from the application server using OPEN DATASET in a Windows environment; the system language is set to English.
    The problem is that the .txt file sometimes contains non-English (Spanish) characters. When I try to download the data into SAP, it is not downloaded properly; e.g. ñ comes into SAP as something like A+_.
    My goal is to download the Spanish characters properly into SAP.
    Please advise how to solve this problem.
    Thanks,
    Selvaraj

    Hi Selvaraj!
    I haven't checked the situation for 4.5B, nor can I predict the exact behavior, but you can give the following statement a try:
    SET LOCALE LANGUAGE lg.
    This should even change the content of sy-langu for the whole roll area (internal session), so change the value back afterwards.
    Regards,
    Christian

  • Converting a given string to non-english language

    Hi, can anybody help me how to convert an entered string in a text field to French, Spanish or any other non-English language?

    Hi,
    I don't think you will find a language translator package.
    What you can do is store the phrases and words in a database.
    -- SQL Code
    CREATE TABLE [Language_Data] (
      [ID]    INT NOT NULL IDENTITY PRIMARY KEY,
      [Lang]  VARCHAR(30) NOT NULL,  -- Lang: English/French.....
      [Type]  CHAR(1) NOT NULL,      -- is Phrase or Word
      [Words] VARCHAR(100) NOT NULL  -- Phrase or Word data
    )
    GO
    CREATE TABLE [Translate] (
      [ID]       INT NOT NULL IDENTITY PRIMARY KEY,
      [FK_Orig]  INT NOT NULL REFERENCES [Language_Data]([ID]), -- ID of the original language
      [FK_Trans] INT NOT NULL REFERENCES [Language_Data]([ID])  -- IDs of all known translations
    )
    GO
    Create a stored procedure to add a new word/phrase to the [Language_Data] table, and another stored procedure to add a translation to the [Translate] table.
    Please note that to add a translation you will first insert into the [Language_Data] table, then insert the original's ID and the translation's ID into the [Translate] table. Also make provision for backwards translation.

  • Acrobat XI Std register issue on non-english os

    Hi,
    I'm deploying Acrobat XI via SCCM in unattended mode.
    MST has been created with all settings EULA, license key, etc and following language options:
    in "Installation Options" (silent mode):
    Application Language English (US). No other options marked.
    Also, I made some change in "Direct Editor" in "Property" table. "Lang list" has been changed to "en_US".
    Installation and registration work perfectly on all "English" systems (by "English" I mean the OS locale). But on all non-English systems the following error occurs after installation:
    Of course, no need to know Norwegian to understand that something is wrong with the license key or the installation itself. So I started googling and found this article:
    Error "Language mismatch between entered serial number and Acrobat launch language" | Windows
    But what is strange is that the default language is English. I checked this in the Acrobat options and in the Modify option.
    To clarify, I want to deploy Acrobat XI Std to different locations around the world with English as the default language.
    How do I solve this issue?

    I'm installing Adobe 11.0.0; the above solution is for 11.0.1.
    I checked this key and couldn't find any difference: HKLM\Software\Adobe\Adobe Acrobat\11.0\Language\current\ "acrobat.dll"
    Also, I checked Wow6432Node and acrobat.dll was missing, so I added it, but this didn't resolve the whole issue.
    I made additional tests today.
    Downloaded the package once again from: http://helpx.adobe.com/acrobat/kb/acrobat-downloads.html
    Created new MST files with different options:
    1. All languages marked for install, but I get an error during installation (msiexec /i AcroStan.msi /qb TRANSFORMS=AcrobatXIStandard3.mst)
    2. Only en_US chosen. Installation completed successfully, so I started to test Acrobat.
         a. Convert web page to PDF -> OK
         b. Print to PDF -> License box appeared! -> so I manually entered the license key there and got this freaky ERROR (S/N doesn't match) once again!
    How hard can it be?

  • Word Replacements for Non- English Characters

    Hi
    Does anyone have an idea on implementing word replacements for non-English characters in TCA-DQM 11i?
    We are trying to identify, capture and cleanse common accented characters like à, â, ê.
    However, the default language for replacement is American English, so even if we add these to the existing lists it will not take any effect.
    Is creating a new word replacement list for every language the solution? Any patch recommendations?
    Thanks in advance

    It seems that this is an issue that has popped up in various forums before, here's one example from last year:
    http://forum.java.sun.com/thread.jspa?forumID=16&threadID=490722
    This entry has some suggestions for handling mnemonics in resource bundles, and they would take care of translated mnemonics - as long as the translated values are restricted to the values contained in the VK_XXX keycodes.
    And since those values are basically the English (ASCII) character set + a bunch of function keys, it doesn't solve the original problem - how to specify mnemonics that are not part of the English character set. The more I look at this I don't really understand the reason for making setMnemonic (char mnemonic) obsolete and making setMnemonic (int mnemonic) the default. If anything this has made the method more difficult to use.
    I also don't understand the statement in the API about setMnemonic (char mnemonic):
    "This method is only designed to handle character values which fall between 'a' and 'z' or 'A' and 'Z'."
    If the type is "char", why would the character values be restricted to values between 'a' and 'z' or 'A' and 'Z'? I understand the need for the value to be restricted to one keystroke (eliminating the possibility of using ideographic characters), but why make it impossible to use all the Latin-1 and Latin-2 characters, for instance? (And is that in fact the case?) It is established practice on other platforms to be able to use accented characters, for instance.
    And if changes were made, why not enable the simple way of specifying a mnemonic that other platforms have implemented, by adding an '&' in front of the character?
    Sorry if this disintegrated into a rant - didn't mean to... :-) I'm sure there must be good reasons for the changes, would love to understand them.
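For reference, the int-based overload in practice: a minimal sketch (the button label is a made-up example; constructing a JButton does not require a visible display):

```java
import javax.swing.JButton;
import java.awt.event.KeyEvent;

public class MnemonicDemo {
    public static void main(String[] args) {
        JButton open = new JButton("Öffnen");
        // The int form takes a VK_ keycode; Alt+O becomes the shortcut.
        // Whether the accented 'Ö' gets underlined depends on the L&F's
        // character matching, which is part of what the post criticizes.
        open.setMnemonic(KeyEvent.VK_O);
        System.out.println(open.getMnemonic() == KeyEvent.VK_O); // true
    }
}
```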
