Native2ascii tool

I need to show Japanese text in my web application. This is how I am doing it.
First I develop my code on Windows NT, then prepare a jar file and migrate it to the Unix server that will host the application.
On Windows NT, I convert the native characters to ASCII using the native2ascii tool and use the resulting Unicode escapes in my properties file. Everything works fine: when I host the application on my Windows NT machine, I can see the Japanese text correctly.
Now when I create the jar file and migrate it to the Unix server, the Japanese characters get garbled. Why? Are the Unicode characters different on Windows NT and on the Unix server? Do I have to convert the characters on the Unix machine and use those Unicode characters? Assuming I did that, are these characters different from one version of Solaris to another? Should the JDK version be the same?
On my Windows machine:
java version "1.3.1_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_02-b02)
Java HotSpot(TM) Client VM (build 1.3.1_02-b02, mixed mode)
On my Unix server:
java version "Solaris_JDK_1.2.2_12"
Solaris VM (build Solaris_JDK_1.2.2_12, native threads, sunwjit)
Thanks,
bala

If I were you, I would try this:
1. Which encoding are you passing to native2ascii? If it is SJIS, I would first check whether I can browse other web pages that use SJIS encoding, to determine whether it is only my web application that can't show Japanese text correctly on the Unix box.
2. If your web browser cannot display Japanese text with SJIS encoding, then you need to install some fonts.
You can find some clues at the following link.
http://users.erols.com/eepeter/chinesecomputing/programming/java.html
You can test your Japanese text by replacing the Unicode sample string in the following code with one for your Japanese string.
Displaying Chinese
Finding Chinese Fonts
Java 2 allows the programmer to directly access the fonts on the machine. The code sample below gets a list of all the fonts on the system, and then checks each font to see if it can display a sample Chinese string. Matching fonts are printed. Variations of the below code can be used to automatically find Chinese fonts and set the font of the Swing components accordingly. A bug in the JVM currently also lists all the logical fonts, whether they support Chinese or not, so I remove them from the list.
// Determine which fonts support Chinese here ...
Font[] allfonts = GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts();
int fontcount = 0;
String chinesesample = "\u4e00";
for (int j = 0; j < allfonts.length; j++) {
    // canDisplayUpTo is documented to return -1 when the whole string can be
    // displayed; the length comparison from the original sample is kept as well
    int upTo = allfonts[j].canDisplayUpTo(chinesesample);
    if (upTo == -1 || upTo == chinesesample.length()) {
        String name = allfonts[j].getFontName();
        if (fontcount == 0) {
            // ctextarea is the text component that should show the Chinese text
            ctextarea.setFont(new Font(name, Font.PLAIN, 12));
        }
        // Skip the logical fonts, which are listed whether or not they support Chinese
        if (!name.startsWith("dialog") && !name.startsWith("monospaced") &&
            !name.startsWith("sansserif") && !name.startsWith("serif")) {
            cfontchooser.addItem(name);
            fontcount++;
        }
    }
}

Similar Messages

How to use native2ascii tool?

Hi Friends,
I created a property file in English named string.property for my application. I need to convert that property file for different languages such as Chinese, Arabic, Japanese, Spanish, etc.
I found that the native2ascii tool is used to convert property files, but I have no idea how to use it. Can anyone please explain it to me, with a proper example if possible?
I would appreciate your help with this.
Thank you

native2ascii [options] [inputfile [outputfile]]
-reverse
Perform the reverse operation: convert a file with Latin-1 and/or Unicode encoded characters to one with native-encoded characters.
-encoding encoding_name
Specify the encoding name which is used by the conversion procedure. The default encoding is taken from System property file.encoding. The encoding_name string must be taken from the first column of the table of supported encodings in the Supported Encodings document.
-Joption
Pass option to the Java virtual machine, where option is one of the options described on the reference page for the java application launcher. For example, -J-Xms48m sets the startup memory to 48 megabytes.

Using native2ascii

Hi,
We are using an Oracle 8 database. Our application uses XML, and while inserting data, illegal XML characters are somehow being inserted into the records. When the records are retrieved, the XML parser throws an exception about illegal XML characters.
I have read about the native2ascii tool, but it was not clear to me how to use it to find the illegal XML characters.
If I use
native2ascii -encoding 646 p1.txt p2.txt
and p1.txt contains illegal XML characters, how are they represented in p2.txt (the output file)? I need to know how the illegal characters are represented.
Please advise me.
Thanks

Hi Ron,
This looks like an interesting problem.
I am familiar with native2ascii but have very little knowledge of XML.
I'm not sure exactly what you're thinking of doing, but here's how native2ascii works:
native2ascii assumes that you have a file of correctly formed characters in some encoding and that you want the Java reader to be able to input this file (for example, a .java or .properties file). Suppose the file p1.txt is encoded in UTF8. Then you would call
native2ascii -encoding UTF8 p1.txt p2.txt
In file p2.txt, all ASCII characters (all characters from p1.txt with code points less than 0x0080) are represented as single byte ASCII characters. Any characters with code points >= 0x0080 are represented as a sequence of six single byte ASCII characters representing the Unicode code point. Thus, for example, if the input file contains the Chinese character for tea, this would be represented in the output file as \u8336 and would be correctly read in by the Java reader.
What is that 646 encoding you used in your example?
Regards,
Joe
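The conversion Joe describes can be sketched in a few lines of Java. This is only an illustration of the escaping rule (ASCII passes through, everything else becomes a backslash, the letter u and four hex digits), not the tool's actual source; the class name AsciiEscaper and the method name escape are made up for the example.

```java
final class AsciiEscaper {
    /** Leaves ASCII chars untouched and writes every other char as a six-character escape. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);                                   // plain ASCII: copy as-is
            } else {
                out.append(String.format("\\u%04x", (int) c));   // backslash, u, 4 hex digits
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The string literal below contains the Chinese character for tea
        System.out.println(escape("tea = \u8336"));
    }
}
```

An illegal XML character that is outside ASCII would come out as an escape in the same way, while an illegal ASCII control character would be copied through unchanged, so the escaped file alone may not reveal every offending character.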

Is it possible to read native encodings without using native2ascii

Hi All,
I have a text file containing Japanese characters saved in UTF-8 format. I can read this into my Java application after converting it using the native2ascii tool.
However, I'm wondering whether I can read it directly into my Java application without going through the native2ascii tool, since the file is already in UTF-8. I have tried, but it doesn't work. Please advise.
Thanks.

You missed reading the java.util.Properties documentation:
"When saving properties to a stream or loading them from a stream, the ISO 8859-1 character encoding is used. For characters that cannot be directly represented in this encoding, Unicode escapes are used; however, only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings."
What's so bad about using native2ascii?
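For what it's worth, the quoted restriction applies to loading from a byte stream. A possible way around it, assuming Java 6 or later (where Properties.load(Reader) exists) and Java 7 for StandardCharsets, is to wrap the stream in a reader with an explicit charset. The class and method names below are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class Utf8PropertiesLoader {
    /** Loads a properties stream whose bytes are UTF-8 rather than ISO 8859-1. */
    static Properties load(InputStream in) throws IOException {
        Properties props = new Properties();
        // The reader does the decoding, so no escapes are needed in the file
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a UTF-8 encoded file containing a Japanese value
        byte[] utf8 = "greeting=\u3053\u3093\u306b\u3061\u306f\n".getBytes(StandardCharsets.UTF_8);
        Properties props = load(new ByteArrayInputStream(utf8));
        System.out.println(props.getProperty("greeting"));
    }
}
```

On older JDKs without load(Reader), native2ascii remains the straightforward route.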

Cannot find native2ascii.exe (JDK 1.5.0_05 on Windows)

Hi,
I am sorry if this is not the right forum for this issue, but I have a small problem:
I recently installed JDK 1.5 on my Windows XP box, and I need to use the native2ascii tool. For JDK 1.3 and 1.4, I have found this tool in the bin/ folder of the jdk installation. However, I cannot find the native2ascii tool in my JDK 1.5 installation, even though it is described in the documentation...
Any ideas?
Should I try re-installing the 1.5 JDK?

Never mind, I reinstalled as soon as I got the chance, and there it is, in the bin directory...

Weird character filter

I used to code in RedHat. Something happened (something nasty) and I changed to Mandrake to give it a try. Among my source file I had a Class with a method that filtered characters, some usual characters (dollar, punctuation) and some not so usual (yen, euro, dot in the middle of the line, long hyphen, long underline,...). When i reopened my source file in Mandrake, all of the characters in the second group had turned into "?\200" , "?\202\2004" and things of the sort, sometimes even worse. I guess this has to do with encoding and Mandrake, but it's not only a question of display, my classes (which have compiled for a long time without any problems) are now unable to go through this source file, due to the character stuff. Is there a way of going around this? I've tried to find out what character set those escape sequences come from, but I've been unable to find out.
Any ideas?
Here's (a version of) the method:
    private String filterDoc(String in){
     char c;
     char prec;
     StringBuffer out = new StringBuffer();
     for (int i = 0; i < in.length(); i++){
         c = in.charAt(i);
         if ( ( c >= 'a' ) && ( c <= 'z' ) ) out.append(c);
         else switch ( c ) {
         case '.' : {
          prec = in.charAt( i > 1 ? i-2 : i+2);
          if ( (prec == '.') || (prec == ' ') ) out.append(".");
          else out.append("\n");
         } break;
         case '��' : out.append('.');break; // this line : mid-height dot
         case ',' : out.append('\n'); break;
         case ';' : out.append('\n'); break;
         case ':' : out.append('\n'); break;
         case '?' : out.append('\n'); break;
         case '$' : out.append('\n'); break;
         case '��' : out.append('\n'); break; // this line : pound symbol
         case '�,B,(B' : out.append('\n'); break; // this line : euro symbol
         case '��' : out.append('\n'); break; // this line : yen symbol
         case '��' : out.append('\n'); break; // this line: copyright symbol
         case '*' : out.append('\n'); break;
         case '-' : out.append('-'); break;
         case '��$(D' (B: out.append('-'); break; // this line : hyphen
         case '_' : out.append('-'); break;
         default : out.append(c); break;
         }
     }
     return out.toString();
    }
I say it's 'a version' of the method, because in Windows it looks just like in Mandrake, and when I copy and paste it here I can see 'some' things have changed. For instance, I see squares (or black squares) up here, but I don't see any squares in emacs; there I get the \20something.

The Java compiler uses the platform's default character encoding to read source files, so for portability you shouldn't write non-ASCII characters directly, instead you should use the \uXXXX unicode escape sequences, for instance \u20AC for euro.
It appears that in RedHat your default encoding was UTF-8 in which the characters you mention are encoded with multiple bytes, and when you transferred to Mandrake the default encoding changed and now it's some encoding where each character is encoded with a single byte.
There are a number of things you can do:
o Convert the source files to use unicode escape sequences. The easiest way is to use the native2ascii tool that comes with the SDK: run this command to all your source files:
    native2ascii -encoding UTF-8 inputfile outputfile
  Also, if you are looking for the escape sequence of a particular character you can find it from http://www.unicode.org/charts/
o Continue using UTF-8 and pass the option -encoding UTF-8 to the compiler:
    javac -encoding UTF-8 MyClass.java
o Change the platform's default character encoding by modifying locale settings, for instance
    export LANG=en_US.UTF-8
  You can use the command "locale -a" to list all available locales.
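As a rough illustration of the first option, here is how a few of the special cases from the question might look once written with escapes. The code points are the standard ones for the middle dot, pound, euro, yen and copyright signs; the class name SymbolFilter and the reduced set of cases are assumptions made for this sketch, not the poster's full method.

```java
final class SymbolFilter {
    /** Maps a handful of non-ASCII symbols the way the question's filter does. */
    static String filter(String in) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            switch (c) {
                case '\u00B7': out.append('.'); break;    // middle dot
                case '\u00A3':                            // pound sign
                case '\u20AC':                            // euro sign
                case '\u00A5':                            // yen sign
                case '\u00A9': out.append('\n'); break;   // copyright sign
                default: out.append(c); break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints "a.b", a line break, then "c"
        System.out.println(filter("a\u00B7b\u20ACc"));
    }
}
```

Because the source now contains only ASCII bytes, it compiles the same way whatever the platform's default encoding is.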

How to realize czech texts with ResourceBundles

Hi!
I have got a big problem trying to make one of our german applications available in czech.
We are using ResourceBundles with .properties files for our texts and have now translated them into czech. When loading the bundle by calling:
ResourceBundle rb = ResourceBundle.getBundle("x.x.x.BundleName", locale);
all seems to work and there is no exception thrown.
But when I try to get any text out of it, e.g. by calling:
rb.getString("name");
I get an exception:
java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key name
I tried to use native2ascii with several encodings (Cp1252, ISO-8859-1, ISO-8859-2, UTF-8, UTF-16, ...) but I never found one that really works the way I want. Most of the encodings I tried changed some of the special characters in the file but left others alone, so that I get some translated texts, but most of them with boxes or question marks in them. native2ascii with UTF-8 encoding only answers with a sun.io.MalformedInputException on the command line. UTF-16 initially looked fine because nearly all special characters were translated into \uXXXX escape sequences, but there too I had a problem with one special character: � (hope it will be displayed correctly here). This character seems to upset the native2ascii tool, because the characters following it are also translated into escape sequences even where this is not necessary. I also got some question marks in the GUI again (with the charset in the HTML/JSP file set to UTF-16).
I don't know why the original translated property file could not be read, with an exception thrown only when trying to get a text and not when loading the file. Another thing confusing me is that all the encodings I tried for the translated file (except UTF-16) corrupted the file by separating each character with a blank one (so, for example, name = jméno appeared as n a m e = j m \u00e9 n o).
Some of the special characters in the translation are: � � � � � (and also some other characters with those symbols).
Can anyone please help me, or does anyone have experience with translating a (web) project to Czech or any similar/related language?
Thanks a lot
Juergen

You also seem to have a problem with using native2ascii. You say that you tried to use it with Cp1252, ISO-8859-1 (among others), but how can you possibly have a Czech file in those code pages? They don't support the Czech characters set. You simply need to know which source code page your file is actually in, and then run native2ascii on the file specifying that encoding. I have never had a problem executing native2ascii on a Czech file, it seems to work just fine - as long as the correct source encoding is used.
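One cheap way to get a first hint about the source encoding is to look for a byte order mark at the start of the file. The sketch below is only a heuristic under that assumption: many files carry no BOM at all, and the class name BomSniffer is invented for the example. The "blank between every character" symptom described in the question might point at a UTF-16 file being read with a single-byte encoding, but that is a guess from the description.

```java
final class BomSniffer {
    /**
     * Returns an encoding name suggested by a leading byte order mark,
     * or null when the first bytes contain no BOM this sketch knows about.
     */
    static String guessEncoding(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";   // note: a UTF-32LE BOM starts with the same two bytes
        }
        return null;             // no BOM: the encoding has to be found out some other way
    }

    public static void main(String[] args) {
        byte[] sample = { (byte) 0xFF, (byte) 0xFE, (byte) 0x6E, (byte) 0x00 };
        System.out.println(guessEncoding(sample));
    }
}
```

In practice the first few bytes would be read from the properties file with a FileInputStream, and the result passed to native2ascii via -encoding.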

Problem with Non English Chars

OS : Mac OS
Java : 1.5.0_07
Hi,
I have a Swing application that reads data from a database and shows it in a Swing GUI. The text returned by the database is in Arabic and is put into a TextField object.
But once shown, the Arabic characters are garbled; actually, they are not Arabic characters at all!
For debugging i also write the result of the query in the console and in a log4j log file.
There, it is printed in the right form.
here the code:
System.out.println("D3"+java.nio.charset.Charset.defaultCharset().name());
System.out.println("singular "+dit.getData().getSingular());
log4j.debug("singular "+dit.getData().getSingular());
Font font = Font.decode("Geeza Pro");
textl.setFont(font);
textl.setText(dit.getData().getSingular());
The output in the console is (and log4j) :
D3MacRoman
singular صوف
The output in the Swing Textfield is
��
If I configure log4j to use UTF8, then the same garbled characters are written even into the log4j log file.
It looks like I have to tell Swing to use MacRoman, which is the default of the OS and the one used by the console and log4j, but I don't know how to.
Any clue?
Thanks,
Chris.

convert your strings to unicode:
example 1
// ApplicationFrame.java
import java.awt.*;
import java.awt.event.*;

public class ApplicationFrame extends Frame {
    public ApplicationFrame() { this("ApplicationFrame v1.0"); }

    public ApplicationFrame(String title) {
        super(title);
        createUI();
    }

    protected void createUI() {
        setSize(500, 400);
        center();
        // Close the window and exit when the user closes the frame
        addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent e) {
                dispose();
                System.exit(0);
            }
        });
    }

    // Centers the frame on the screen
    public void center() {
        Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();
        Dimension frameSize = getSize();
        int x = (screenSize.width - frameSize.width) / 2;
        int y = (screenSize.height - frameSize.height) / 2;
        setLocation(x, y);
    }
}

// BidirectionalText.java
import java.awt.*;

public class BidirectionalText {
    public static void main(String[] args) {
        Frame f = new ApplicationFrame("BidirectionalText v1.0") {
            public void paint(Graphics g) {
                Graphics2D g2 = (Graphics2D)g;
                g2.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                    RenderingHints.VALUE_ANTIALIAS_ON);
                Font font = new Font("Lucida Sans Regular", Font.PLAIN, 32);
                g2.setFont(font);
                // The Arabic word is written with Unicode escapes
                g2.drawString("Please \u062e\u0644\u0639 slowly.", 40, 80);
            }
        };
        f.setVisible(true);
    }
}
example2
Java Internationalization
By Andy Deitsch, David Czarnecki
ISBN: 0-596-00019-7
O'Reilly
import java.awt.event.*;
import java.awt.*;
import java.text.*;
import javax.swing.*;

public class ArabicDigits extends JPanel {
    static JFrame frame;

    public ArabicDigits() {
        NumberFormat nf = NumberFormat.getInstance();
        if (nf instanceof DecimalFormat) {
            DecimalFormat df = (DecimalFormat)nf;
            DecimalFormatSymbols dfs = df.getDecimalFormatSymbols();
            // set the beginning of the range to Arabic digits
            dfs.setZeroDigit('\u0660');
            df.setDecimalFormatSymbols(dfs);
        }
        // create a label with the formatted number
        JLabel label = new JLabel(nf.format(1234567.89));
        // set the font with a large enough size so we can easily
        // read the numbers
        label.setFont(new Font("Lucida Sans", Font.PLAIN, 22));
        add(label);
    }

    public static void main(String [] argv) {
        ArabicDigits panel = new ArabicDigits();
        frame = new JFrame("Arabic Digits");
        frame.addWindowListener(new WindowAdapter() {
            public void windowClosing(WindowEvent e) {System.exit(0);}});
        frame.getContentPane().add("Center", panel);
        frame.pack();
        frame.setVisible(true);
    }
}
To avoid having to type all the \u... notation manually, use the native2ascii tool (included with the SDK).
http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/

How to display unicode values in file to corresponding characters

Hello Java-ians !
Could you please clarify my doubt? I am able to generate Unicode values for Arabic and Russian characters. I did it by generating a UTF-8 format file and then using the native2ascii tool to generate the Unicode values. Now I am unable to display the characters. I read the file using FileReader and used the JTextField.setText method to redisplay the characters. Instead of displaying the corresponding characters, I am getting only the Unicode values. Why is this happening?
Also, could you please tell me how ResourceBundle works? It reads the Unicode values and displays the proper characters; how?
Please help me ! I need it desperately !
Martin Sunder Singh D.S.

you have to do like this:
FileInputStream fos = new FileInputStream(new File("japanies.out"));
BufferedReader bw = new BufferedReader(new InputStreamReader(fos,"UTF8"));
System.out.println(bw.readLine());
bw.close();
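On the ResourceBundle part of the question: a plain Reader hands back the backslash, the u and the hex digits as ordinary text, whereas the properties loader used by PropertyResourceBundle translates each escape into the character it names. A small sketch of that difference, assuming Java 6 or later for Properties.load(Reader) and single-line values only; the EscapeDecoder class and the "value" key are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class EscapeDecoder {
    /** Decodes escapes in a single-line string by running it through the properties parser. */
    static String decode(String escaped) throws IOException {
        Properties p = new Properties();
        // Feed the text to the parser as the value of a throwaway key
        p.load(new StringReader("value=" + escaped));
        return p.getProperty("value");
    }

    public static void main(String[] args) throws IOException {
        String fromFile = "\\u0410\\u0411";           // 12 literal chars, as a FileReader would return them
        System.out.println(fromFile.length());        // 12
        System.out.println(decode(fromFile).length()); // 2
    }
}
```

The parser also interprets other backslash sequences and trims leading whitespace in the value, so this is a demonstration of the mechanism rather than a general-purpose decoder.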

ResourceBundle in foreign languages

Hi!
I'm using a resource bundle file. Works great in many different languages.
However when I try to use russian letters (Cyrillic) it appears in my web page as ??????? - ?????.
Question- do resource bundle files freak out at foreign character sets ?
Do I need to do a meta-equiv to charset ISO-8859-5 ?
Help please!

I don't know about the encoding mapping stuff, but you shouldn't need to do that string getBytes translation at all. That would only be needed if your strings were read in as one encoding when they really should have been read as another. This happens with Tomcat (and maybe others) because by default it always uses ISO-8859-1 as the encoding. Since that's an 8-bit encoding, it doesn't really corrupt the bytes, but if a UTF-8 string containing several multi-byte characters was read that way, it would look like more characters (usually gibberish) than it should.
If the page's specified charset is UTF-8, then what the browser submits will be UTF-8 bytes. So you want the charset defined in the <%@ page %> directive, because it will be set in the response header that way. The meta tag is usually used for completeness, but probably isn't really needed. The setCharacterEncoding() call tells the server to use that encoding instead of ISO-8859-1 for reading request data. This page is something I had written to test Chinese (you should be able to see that on Win2K or XP, I think). Just copy/paste the text in the page to the form; I see no reason Russian or others won't work.
// _lang.jsp
<%@ page language="java" contentType="text/html; charset=UTF-8" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
     <title>Language Test</title>
     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body bgcolor="#ffffff" background="" text="#000000" link="#ff0000" vlink="#800000" alink="#ff00ff">
<%
request.setCharacterEncoding("UTF-8");
String str = "\u7528\u6237\u540d";
String name = request.getParameter("name");
// OR instead of setCharacterEncoding...
//if(name != null) {
//     name = new String(name.getBytes("ISO8859_1"), "UTF8");
//}
System.out.println(application.getRealPath("/"));
System.out.println(application.getRealPath("/src"));
%>
req enc: <%= request.getCharacterEncoding() %><br />
rsp enc: <%= response.getCharacterEncoding() %><br />
str: <%= str %><br />
name: <%= name %><br />
<form method="POST" action="_lang.jsp">
Name: <input type="text" name="name" value="" >
<input type="submit" name="submit" value="Submit POST" />
</form>
<br />
<form method="GET" action="_lang.jsp">
Name: <input type="text" name="name" value="" >
<input type="submit" name="submit" value="Submit GET" />
</form>
</body>
</html>
Using GET or POST should both work fine.
As for the resource bundles: the contents have to be ISO 8859-1 text (plain ASCII is the safe subset). Characters outside that range need to be converted to Unicode escapes (\uXXXX). So you can write the file in a Russian (or whatever) text encoding, then use the native2ascii tool to convert the file to the proper escapes. Then it will be read as Unicode, and there is no translation needed at all; just write the strings out to the JSP page.
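The mis-decoding described in this answer (UTF-8 bytes read as ISO-8859-1 turning into more characters than expected, yet without losing the bytes) can be reproduced in isolation. The sketch below assumes Java 7 or later for StandardCharsets; the class and method names are made up, and it only demonstrates the effect and the getBytes repair mentioned in the commented-out JSP lines.

```java
import java.nio.charset.StandardCharsets;

final class MisdecodeDemo {
    /** Simulates a server decoding UTF-8 request bytes as ISO-8859-1. */
    static String misread(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Undoes the wrong decoding: back to the raw bytes, then decode as UTF-8. */
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u7528\u6237\u540d";   // the three-character sample string from the JSP
        String garbled = misread(original);
        System.out.println(garbled.length());                    // 9: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(original));    // true
    }
}
```

The round trip works because ISO-8859-1 maps every byte value to exactly one character; with most other single-byte encodings some bytes would be lost.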

LoadBundle with utf8 problem

Hi everyone, I have a problem with f:loadBundle. I saved the properties file with UTF-8 encoding but cannot load the strings in it correctly (Asian characters). Help me please!

At a guess - loadBundle uses property file syntax.
Have a look at the API docs for java.util.Properties.
They state that characters outside ISO 8859-1 in property files have to be converted to Unicode escapes, and that the native2ascii tool that comes with the JDK will do it for you:
http://java.sun.com/javase/6/docs/technotes/tools/windows/native2ascii.html
If you are using ANT to build your project, it may make sense to keep your original properties file in UTF-8 and edit it with your favourite editor, and to convert it as part of the build process.

Force jvm to use UTF8 encoding for properties?

hi,
My problem is that I cannot display Turkish characters that are retrieved from a properties file [(key, value) pairs]. The value may contain Turkish characters.
reference : java toc
"..When saving properties to a stream or loading them from a stream, the ISO 8859-1 character encoding is used. For characters that cannot be directly represented in this encoding, Unicode escapes are used; however, only a single 'u' character is allowed in an escape sequence..."
Instead of ISO 8859-1 with escapes for the non-conforming characters, I want to use UTF-8. I don't know whether there is a way to force the encoding to UTF-8. Is it possible?
Note: javac -encoding and java -Djava.encoding have no effect.
Thanks in advance

Hi,
I did a lot of testing with the method of my last posting and ran into the following problem: each character, that is not available in iso 8859-1 is discarded and replaced by a '?'. So when e.g. loading arabic characters you get just "?????" as your property's value.
The only possibility to enable other encodings of Property Files is to replace the encoding of the reader to one that is better suited. So I created a customized class, that inherits from java.util.Properties, and supports loading and storing to any encoding. This class I give here:
/*
 * Properties.java
 *
 * Created on 11 June 2003, 14:08
 */
package xy;

/**
 * The <code>Properties</code> class represents a persistent set of
 * properties. The <code>Properties</code> can be saved to a stream
 * or loaded from a stream. Each key and its corresponding value in
 * the property list is a string.
 * <p>
 * A property list can contain another property list as its
 * "defaults"; this second property list is searched if
 * the property key is not found in the original property list.
 * <p>
 * Because <code>Properties</code> inherits from <code>Hashtable</code>, the
 * <code>put</code> and <code>putAll</code> methods can be applied to a
 * <code>Properties</code> object. Their use is strongly discouraged as they
 * allow the caller to insert entries whose keys or values are not
 * <code>Strings</code>. The <code>setProperty</code> method should be used
 * instead. If the <code>store</code> or <code>save</code> method is called
 * on a "compromised" <code>Properties</code> object that contains a
 * non-<code>String</code> key or value, the call will fail.
 * <p>
 * <a name="encoding"></a>
 * When saving properties to a stream or loading them from a stream, the
 * ISO 8859-1 character encoding can be used. For characters that cannot be directly
 * represented in this encoding,
 * <a href="http://java.sun.com/docs/books/jls/html/3.doc.html#100850">Unicode escapes</a>
 * are used; however, only a single 'u' character is allowed in an escape sequence.
 * The native2ascii tool can be used to convert property files to and from
 * other character encodings.
 * </p>
 * <p>
 * This Properties class is an extension of the default properties class and supports
 * loading from and saving into encodings other than ISO 8859-1.
 * </p>
 * @see <a href="../../../tooldocs/solaris/native2ascii.html">native2ascii tool for Solaris</a>
 * @see <a href="../../../tooldocs/win32/native2ascii.html">native2ascii tool for Windows</a>
 * @author Gregor Kappler, extended the class of JDK by
 * @author Arthur van Hoff
 * @author Michael McCloskey
 * @version 1.64, 06/26/00
 * @since   JDK1.0
 */
public class Properties extends java.util.Properties {
    private static final String keyValueSeparators = "=: \t\r\n\f";
    private static final String strictKeyValueSeparators = "=:";
    private static final String specialSaveChars = "=: \t\r\n\f#!";
    private static final String whiteSpaceChars = " \t\r\n\f";
    /** Creates a new instance of Properties */
    public Properties() {
    }

    /**
     * Reads a property list (key and element pairs) from the input stream.
     * The stream is assumed to be in the specified character encoding.
     * <p>
     * Every property occupies one line of the input stream. Each line
     * is terminated by a line terminator (<code>\n</code> or <code>\r</code>
     * or <code>\r\n</code>). Lines from the input stream are processed until
     * end of file is reached on the input stream.
     * <p>
     * A line that contains only whitespace or whose first non-whitespace
     * character is an ASCII <code>#</code> or <code>!</code> is ignored
     * (thus, <code>#</code> or <code>!</code> indicate comment lines).
     * <p>
     * Every line other than a blank line or a comment line describes one
     * property to be added to the table (except that if a line ends with \,
     * then the following line, if it exists, is treated as a continuation
     * line, as described
     * below). The key consists of all the characters in the line starting
     * with the first non-whitespace character and up to, but not including,
     * the first ASCII <code>=</code>, <code>:</code>, or whitespace
     * character. All of the key termination characters may be included in
     * the key by preceding them with a \.
     * Any whitespace after the key is skipped; if the first non-whitespace
     * character after the key is <code>=</code> or <code>:</code>, then it
     * is ignored and any whitespace characters after it are also skipped.
     * All remaining characters on the line become part of the associated
     * element string. Within the element string, the ASCII
     * escape sequences <code>\t</code>, <code>\n</code>,
     * <code>\r</code>, <code>\\</code>, <code>\"</code>, <code>\'</code>,
     * <code>\ </code> (a backslash and a space)
     * are recognized and converted to single
     * characters. Moreover, if the last character on the line is
     * <code>\</code>, then the next line is treated as a continuation of the
     * current line; the <code>\</code> and line terminator are simply
     * discarded, and any leading whitespace characters on the continuation
     * line are also discarded and are not part of the element string. <br>
     * Note:
     * <code>\u</code><i>xxxx</i> is not supported if the encoding is not
     * ISO 8859-1!
     * <p>
     * As an example, each of the following four lines specifies the key
     * <code>"Truth"</code> and the associated element value
     * <code>"Beauty"</code>:
     * <p>
     * <pre>
     * Truth = Beauty
     *     Truth:Beauty
     * Truth               :Beauty
     * </pre>
     * As another example, the following three lines specify a single
     * property:
     * <p>
     * <pre>
     * fruits                    apple, banana, pear, \
     *                                  cantaloupe, watermelon, \
     *                                  kiwi, mango
     * </pre>
     * The key is <code>"fruits"</code> and the associated element is:
     * <p>
     * <pre>"apple, banana, pear, cantaloupe, watermelon, kiwi, mango"</pre>
     * Note that a space appears before each <code>\</code> so that a space
     * will appear after each comma in the final result; the <code>\</code>,
     * line terminator, and leading whitespace on the continuation line are
     * merely discarded and are <i>not</i> replaced by one or more other
     * characters.
     * <p>
     * As a third example, the line:
     * <p>
     * <pre>cheeses
     * </pre>
     * specifies that the key is <code>"cheeses"</code> and the associated
     * element is the empty string.<p>
     * @param      inStream   the input stream.
     * @exception IOException if an error occurred when reading from the
     *               input stream.
    public synchronized void load(java.io.InputStream inStream, java.nio.charset.Charset encoding) throws java.io.IOException {
        if (encoding.equals (encoding.forName("8859_1"))) {
            super.load (inStream);
            return;
        java.io.BufferedReader in = new java.io.BufferedReader(new java.io.InputStreamReader(inStream, encoding));
     while (true) {
            // Get next line
            String line = in.readLine();
            if (line == null)
                return;
            if (line.length() > 0) {
                // Continue lines that end in slashes if they are not comments
                char firstChar = line.charAt(0);
                if ((firstChar != '#') && (firstChar != '!')) {
                    while (continueLine(line)) {
                        String nextLine = in.readLine();
                        if(nextLine == null)
                            nextLine = "";
                        String loppedLine = line.substring(0, line.length()-1);
                        // Advance beyond whitespace on new line
                        int startIndex=0;
                        for(startIndex=0; startIndex<nextLine.length(); startIndex++)
                            if (whiteSpaceChars.indexOf(nextLine.charAt(startIndex)) == -1)
                                break;
                        nextLine = nextLine.substring(startIndex,nextLine.length());
                        line = new String(loppedLine+nextLine);
                    // Find start of key
                    int len = line.length();
                    int keyStart;
                    for(keyStart=0; keyStart<len; keyStart++) {
                        if(whiteSpaceChars.indexOf(line.charAt(keyStart)) == -1)
                            break;
                    // Blank lines are ignored
                    if (keyStart == len)
                        continue;
                    // Find separation between key and value
                    int separatorIndex;
                    for(separatorIndex=keyStart; separatorIndex<len; separatorIndex++) {
                        char currentChar = line.charAt(separatorIndex);
                        if (currentChar == '\\')
                            separatorIndex++;
                        else if(keyValueSeparators.indexOf(currentChar) != -1)
                            break;
                    }
                    // Skip over whitespace after key if any
                    int valueIndex;
                    for (valueIndex=separatorIndex; valueIndex<len; valueIndex++)
                        if (whiteSpaceChars.indexOf(line.charAt(valueIndex)) == -1)
                            break;
                    // Skip over one non whitespace key value separators if any
                    if (valueIndex < len)
                        if (strictKeyValueSeparators.indexOf(line.charAt(valueIndex)) != -1)
                            valueIndex++;
                    // Skip over white space after other separators if any
                    while (valueIndex < len) {
                        if (whiteSpaceChars.indexOf(line.charAt(valueIndex)) == -1)
                            break;
                        valueIndex++;
                    }
                    String key = line.substring(keyStart, separatorIndex);
                    String value = (separatorIndex < len) ? line.substring(valueIndex, len) : "";
                    // Convert then store key and value
                    key = loadConvert(key);
                    value = loadConvert(value);
                    put(key, value);
                }
            }
        }
    }

    /**
     * Writes this property list (key and element pairs) in this
     * <code>Properties</code> table to the output stream in a format suitable
     * for loading into a <code>Properties</code> table using the
     * <code>load</code> method.
     * The stream is written using the given character encoding.
     * <p>
     * Properties from the defaults table of this <code>Properties</code>
     * table (if any) are <i>not</i> written out by this method.
     * <p>
     * If the header argument is not null, then an ASCII <code>#</code>
     * character, the header string, and a line separator are first written
     * to the output stream. Thus, the <code>header</code> can serve as an
     * identifying comment.
     * <p>
     * Next, a comment line is always written, consisting of an ASCII
     * <code>#</code> character, the current date and time (as if produced
     * by the <code>toString</code> method of <code>Date</code> for the
     * current time), and a line separator as generated by the Writer.
     * <p>
     * Then every entry in this <code>Properties</code> table is written out,
     * one per line. For each entry the key string is written, then an ASCII
     * <code>=</code>, then the associated element string. Each character of
     * the element string is examined to see whether it should be rendered as
     * an escape sequence. The ASCII characters <code>\</code>, tab, newline,
     * and carriage return are written as <code>\\</code>, <code>\t</code>,
     * <code>\n</code>, and <code>\r</code>, respectively. Unlike
     * <code>java.util.Properties</code>, this method does not replace
     * characters outside printable ASCII with Unicode escapes; they are
     * written unchanged in the given encoding. Leading space characters,
     * but not embedded or trailing space characters, are written with a
     * preceding <code>\</code>. The key and value characters <code>#</code>,
     * <code>!</code>, <code>=</code>, and <code>:</code> are written with a
     * preceding slash to ensure that they are properly loaded.
     * <p>
     * After the entries have been written, the output stream is flushed. The
     * output stream remains open after this method returns.
     * @param   out      an output stream.
     * @param   encoding the character encoding used to write the stream.
     * @param   header   a description of the property list.
     * @exception IOException if writing this property list to the specified
     *             output stream throws an <tt>IOException</tt>.
     * @exception ClassCastException if this <code>Properties</code> object
     *             contains any keys or values that are not <code>Strings</code>.
     * @exception NullPointerException if <code>out</code> is null.
     * @since 1.2
     */
    public synchronized void store(java.io.OutputStream out, java.nio.charset.Charset encoding, String header)
    throws java.io.IOException
    {
        // ISO 8859-1 is what the standard Properties class already handles
        if (encoding.equals(java.nio.charset.Charset.forName("ISO-8859-1"))) {
            super.store(out, header);
            return;
        }
        java.io.BufferedWriter awriter;
        awriter = new java.io.BufferedWriter(new java.io.OutputStreamWriter(out,encoding));
        if (header != null)
            writeln(awriter, "#" + header);
        writeln(awriter, "#" + new java.util.Date().toString());
        for (java.util.Enumeration e = keys(); e.hasMoreElements();) {
            String key = (String)e.nextElement();
            String val = (String)get(key);
            key = saveConvert(key, true);
            /* No need to escape embedded and trailing spaces for value, hence
             * pass false to flag.
             */
            val = saveConvert(val, false);
            writeln(awriter, key + "=" + val);
        }
        awriter.flush();
    }

    /*
     * changes special saved chars to their original forms
     */
    private String loadConvert (String theString) {
        char aChar;
        int len = theString.length();
        StringBuffer outBuffer = new StringBuffer(len);
        for(int x=0; x<len; ) {
            aChar = theString.charAt(x++);
            if (aChar == '\\') {
                aChar = theString.charAt(x++);
                if (aChar == 't') aChar = '\t';
                else if (aChar == 'r') aChar = '\r';
                else if (aChar == 'n') aChar = '\n';
                else if (aChar == 'f') aChar = '\f';
                else if ("\\\"' #!=:".indexOf(aChar) == -1)
                    // Any other escape, including Unicode escapes, is rejected
                    throw new IllegalArgumentException("error in Encoding: '\\" + aChar + "' not supported");
                // Backslash, quotes, space and the separators escaped by saveConvert stand for themselves
                outBuffer.append(aChar);
            } else
                outBuffer.append(aChar);
        }
        return outBuffer.toString();
    }

    /*
     * writes out any of the characters in specialSaveChars
     * with a preceding slash
     */
    private String saveConvert(String theString, boolean escapeSpace) {
        int len = theString.length();
        StringBuffer outBuffer = new StringBuffer(len*2);
        for(int x=0; x<len; x++) {
            char aChar = theString.charAt(x);
            switch(aChar) {
                case ' ':
                    if (x == 0 || escapeSpace)
                        outBuffer.append('\\');
                    outBuffer.append(' ');
                    break;
                case '\\':outBuffer.append('\\'); outBuffer.append('\\');
                          break;
                case '\t':outBuffer.append('\\'); outBuffer.append('t');
                          break;
                case '\n':outBuffer.append('\\'); outBuffer.append('n');
                          break;
                case '\r':outBuffer.append('\\'); outBuffer.append('r');
                          break;
                case '\f':outBuffer.append('\\'); outBuffer.append('f');
                          break;
                default:
                    // Characters outside printable ASCII are deliberately written
                    // unchanged; the Writer's encoding takes care of them.
                    if (specialSaveChars.indexOf(aChar) != -1)
                        outBuffer.append('\\');
                    outBuffer.append(aChar);
            }
        }
        return outBuffer.toString();
    }

    /*
     * Returns true if the given line is a line that must
     * be appended to the next line
     */
    private boolean continueLine (String line) {
        int slashCount = 0;
        int index = line.length() - 1;
        while((index >= 0) && (line.charAt(index--) == '\\'))
            slashCount++;
        return (slashCount % 2 == 1);
    }

    private static void writeln(java.io.BufferedWriter bw, String s) throws java.io.IOException {
        bw.write(s);
        bw.newLine();
    }
}

I hope you can use this class for your needs, as I do. It has supported every character I have tried so far. If you find any bugs in it, let me know.
Regards,
Gregor Kappler

Internationalization for Hebrew language

Hi,
I wrote an application that is internationalized for several languages.
Until now I had localized it only for languages written in Latin letters (ISO/IEC 8859-1),
and that works fine.
Recently I was asked to add Hebrew, which does not use Latin characters.
Following the documentation, I implemented the Hebrew localization in the
following way:
Someone wrote the words in Hebrew characters in a properties file (file_iw_IL.properties);
afterwards I converted the Hebrew characters to Unicode escapes of the form "\uxxxx", using the
native2ascii tool that comes with the JDK.
But when the application runs in Hebrew, it does not work.
I would like to know whether the procedure I followed for Hebrew
internationalization is right and, if it is not, what the right procedure is
for internationalizing languages that do not use Latin characters.
Thank you in advance for your kind help.
Regards,
tonyMrsangelo

No one answered, so I am trying again and explaining my problem in more detail.
I use this class to get strings in different languages:
import java.util.Locale;
import java.util.ResourceBundle;

public class SupplierOfInternationalizedStrings {
    Locale localUsedIt;
    Locale localUsedEn;
    Locale localUsedHe;
    Locale localizedCurrencyFormat;
    Locale localUsedHere;
    public SupplierOfInternationalizedStrings() { // constructor
        localUsedIt = new Locale("it", "IT"); // selects the file belonging to the bundle family
        localUsedEn = new Locale("en", "US"); // selects the file belonging to the bundle family
        localUsedHe = new Locale("he", "HE"); // selects the file belonging to the bundle family
    } // constructor
    void setInternationalizationCountry(String langToUse) {
        if (langToUse.compareToIgnoreCase("Italiano") == 0) {
            localUsedHere = localUsedIt;
            localizedCurrencyFormat = new Locale("it", "IT");
        } else if (langToUse.compareToIgnoreCase("English") == 0) {
            localUsedHere = localUsedEn;
            localizedCurrencyFormat = new Locale("iw", "IL");
        } else if (langToUse.compareToIgnoreCase("Hebrew") == 0) {
            localUsedHere = localUsedHe;
            localizedCurrencyFormat = new Locale("iw", "IL");
        }
        System.out.println("language set = " + localUsedHere);
    } // setInternationalizationCountry()
    public String getInternationalString(String keyForTheWord) {
        ResourceBundle resourceBund = ResourceBundle.getBundle("properiesFile", // base name of the .properties files, i.e. the bundle family
                localUsedHere);
        String word = resourceBund.getString(keyForTheWord);
        return word;
    } // getInternationalString()
} // class SupplierOfInternationalizedStrings

Of course I have a properties file for iw_IL, Hebrew (Israel), in which each key has a value written in Hebrew characters.
The Hebrew properties file looks like this:
keyForLabel1= ה עברית מילה שתימ \n
keyForLabel2=ספה עברית מילה שתימ
The class SupplierOfInternationalizedStrings works for the English translation and for other languages that use Latin letters,
but when it is set to Hebrew (localUsedHere = localUsedHe)
the method getInternationalString() does not return Hebrew words.
This did not worry me, because I read in the documentation that properties files cannot
hold characters outside Latin-1, and that in this case the letters must be converted
to Unicode escapes.
I did this by putting the Hebrew words in an .rtf file and running the native2ascii utility
with the command native2ascii -encoding UTF8 file.rtf textdoc.txt to get the escaped form.
Now the properties file for Hebrew looks like this:
keyForLabel1     =       \u00d4 \u00e2\u00d1\u00e8\u00d9\u00ea...
keyForLabel2     =       \u00e1\u00e4\u00d4 \u00e2\u00d1.....
At this point I expected the Hebrew translation to come out right, but I still cannot get the Hebrew words.
I would like to know why this still does not work,
and I would appreciate some help getting the program to work.
Thank you.
Regards,
tonyMrsangelo
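
One thing that can be checked in the escaped file above: Hebrew letters occupy the Unicode range U+05D0 to U+05EA, so correctly converted Hebrew text should consist of escapes beginning with \u05, whereas the values shown begin with \u00 (Latin-1 characters). That suggests the input was not read in the encoding native2ascii was told to expect. The sketch below is only a rough stand-in for the tool, written for illustration; the class and method names are hypothetical and it makes no claim about how native2ascii works internally.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class EscapeSketch {

    // Rough stand-in for native2ascii: keep ASCII, escape every other UTF-16 unit.
    static String toEscapes(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Loads "key=<escaped value>" through Properties and returns the decoded value.
    static String roundTrip(String value) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader("key=" + toEscapes(value) + "\n"));
        return p.getProperty("key");
    }

    public static void main(String[] args) throws IOException {
        // One Hebrew word from the question, written with escapes so this source stays ASCII.
        String word = "\u05e2\u05d1\u05e8\u05d9\u05ea";
        System.out.println(toEscapes(word));
        System.out.println(roundTrip(word).equals(word));
    }
}
```

Separately, the bundle lookup depends on the Locale matching the file suffix. The class above builds the Hebrew locale as ("he", "HE") while the file is described as iw_IL, which may be worth checking as well.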

Non-Latin-1 characters in properties files?

We are having substantial difficulties translating a properties file into Turkish (which uses 6 characters outside the Latin 1 code page). The only way we've found to encode them that seems to work is to encode them with backslash escapes. Is there a better way?

No, what you did is what you have to do. From the API documentation for Properties:
"When saving properties to a stream or loading them from a stream, the ISO 8859-1 character encoding is used. For characters that cannot be directly represented in this encoding, Unicode escapes are used; however, only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings."

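A minimal sketch of the escape approach quoted above, next to one alternative that assumes Java 6 or later, where Properties.load(Reader) lets the caller choose the encoding (StandardCharsets additionally assumes Java 7). The class name, the key and the sample Turkish word are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class TurkishPropertiesSketch {

    // Route 1: the file is pure ASCII with Unicode escapes, read by the stream-based load().
    static String loadEscaped() throws IOException {
        byte[] file = "greeting=G\\u00fcnayd\\u0131n\n".getBytes(StandardCharsets.ISO_8859_1);
        Properties p = new Properties();
        p.load(new ByteArrayInputStream(file));
        return p.getProperty("greeting");
    }

    // Route 2: the file holds the real characters in UTF-8 and is read through a Reader.
    static String loadUtf8() throws IOException {
        byte[] file = "greeting=G\u00fcnayd\u0131n\n".getBytes(StandardCharsets.UTF_8);
        Properties p = new Properties();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(file), StandardCharsets.UTF_8)) {
            p.load(in);
        }
        return p.getProperty("greeting");
    }

    public static void main(String[] args) throws IOException {
        // Both routes should yield the same decoded string.
        System.out.println(loadEscaped().equals(loadUtf8()));
    }
}
```

The Reader route only helps code that loads the file itself; a ResourceBundle lookup on the JDK versions discussed in this thread still expects the escaped form.
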
Using Properties to read characters from a file, but some characters cannot be decoded

I use the Properties class to load a file that contains Big5 characters, but some characters are not displayed properly.
Sample code:
import java.io.*;
import java.util.*;

public class Frankie {
    public static void main(String[] arg) {
        try {
            Properties p = new Properties();
            // Properties.load always decodes the file as ISO 8859-1
            p.load(new FileInputStream("file.ini"));
            Enumeration e = p.propertyNames();
            while (e.hasMoreElements()) {
                String name = (String) e.nextElement();
                String value = p.getProperty(name);
                // Try to recover the Big5 text from the ISO 8859-1 decoded value
                String coded_value = new String(value.getBytes("iso-8859-1"), "Big5");
                System.out.println(name + " : " + coded_value);
                byte[] bBytes2 = coded_value.getBytes();
                for (int k = 0; k < bBytes2.length; k++) {
                    System.out.println("byte " + "iso1" + "[" + k + "] = " + bBytes2[k]);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
==================
file.ini
people=你
a=餐
==================
result:
a : ?
byte iso1[0] = 63
people : 你
byte iso1[0] = -89
byte iso1[1] = 65
==================
The correct bytes for "餐" should be (-64, 92).
If I store this character in a variable in the source code and call value.getBytes("big5"), the bytes come out correctly.
How can I solve this? Thanks a lot!

The Properties class javadoc says:
"The load and store methods load and store properties in a simple line-oriented format specified
below. This format uses the ISO 8859-1 character encoding. Characters that cannot be directly
represented in this encoding can be written using Unicode escapes; only a single 'u' character
is allowed in an escape sequence. The native2ascii tool can be used to convert property files to
and from other character encodings."
This means you are not supposed to use "big5" encoding in your properties text file
directly. There is a command-line tool, "native2ascii", bundled with the JDK that
you can use to convert your "big5" encoded properties file into a text file of Unicode
escapes; then you no longer need to play the trick
String coded_value = new String(value.getBytes("iso-8859-1"), "Big5");
p.getProperty(name) will give you exactly the value defined in your properties
file.
-x
By the way, when using "native2ascii" outside a "big5" environment, use the "-encoding big5" option
to force the encoding.
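
To see why the round trip through ISO 8859-1 fails for this particular character, note that its second Big5 byte, 92, is the ASCII backslash, which the properties format treats as an escape or line-continuation character. The sketch below feeds the bytes quoted in the question to Properties.load and then loads the escaped form that native2ascii would be expected to produce. The class and method names are made up, and the code point U+9910 for the character is my own lookup rather than something stated in the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Properties;

class Big5BackslashSketch {

    // Loads a one-line properties "file" given as raw bytes and returns the value of key "a".
    static String load(byte[] fileBytes) throws IOException {
        Properties p = new Properties();
        p.load(new ByteArrayInputStream(fileBytes));
        return p.getProperty("a");
    }

    // The line "a=<char>" saved as Big5: the character is the bytes -64, 92 (0xC0, 0x5C).
    static String loadRawBig5() throws IOException {
        byte[] raw = { 'a', '=', (byte) 0xC0, (byte) 0x5C, '\n' };
        return load(raw);
    }

    // The same line after conversion to a Unicode escape, which is pure ASCII.
    static String loadEscaped() throws IOException {
        byte[] escaped = "a=\\u9910\n".getBytes("ISO-8859-1");
        return load(escaped);
    }

    public static void main(String[] args) throws IOException {
        String raw = loadRawBig5();
        // The trailing backslash byte is consumed by the parser, so one byte is lost.
        System.out.println("raw length = " + raw.length() + ", first char = " + (int) raw.charAt(0));
        String ok = loadEscaped();
        System.out.println("escaped length = " + ok.length() + ", code point = " + Integer.toHexString(ok.charAt(0)));
    }
}
```

With only the first byte left in the value, decoding it as Big5 cannot succeed, which is consistent with the "?" in the output shown in the question.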
