Japanese JIS encoding conversion in Linux

My program reads Japanese data from a MySQL database, where it is stored in UTF-8, converts it to EUC-JP on Linux, and finally writes it to a text file. However, when I open the text file, the contents are garbled.
I do not have any problem when working with "Shift_JIS" or "SJIS" on Windows. As I understand it, both are Windows/DOS-specific encodings, and "EUC-JP" should be used on Linux.
Can somebody point me in the right direction?
Thanks

"However, when I open the Text File, the contents have been garbled." How do you open it? Could the problem be in the program you use to open the file rather than in the file itself?
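One way to follow up on that question is to take the viewer out of the picture and look at the bytes that were actually written. The following is a minimal sketch, assuming the program is written in Java (the post does not say) and that the text is already a correctly decoded String when it leaves the database. The class name EucJpFileCheck and its methods are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

final class EucJpFileCheck {
    static final Charset EUC_JP = Charset.forName("EUC-JP");

    // Writes the text with an explicit charset instead of the platform default.
    static void write(Path file, String text) throws IOException {
        try (Writer out = Files.newBufferedWriter(file, EUC_JP)) {
            out.write(text);
        }
    }

    // Formats raw bytes as space-separated upper-case hex.
    static String hexDump(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Writes the text to a temporary file and returns a hex dump of what landed on disk.
    static String writeAndDump(String text) {
        try {
            Path file = Files.createTempFile("eucjp", ".txt");
            try {
                write(file, text);
                return hexDump(Files.readAllBytes(file));
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+65E5 U+672C U+8A9E is the word "nihongo" (Japanese language), three kanji.
        System.out.println(writeAndDump("\u65E5\u672C\u8A9E"));
    }
}
```

If the dump shows two bytes per kanji, each in the range A1 to FE, the file is valid EUC-JP and the garbling most likely comes from the program used to view it, for example a terminal or editor that assumes UTF-8.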

Similar Messages

Japanese email encoding setting possible?

In Japan, the traditional encoding for mail transport has been JIS encoding. And the traditional encoding for web content has been ShiftJIS. Typically a site would receive the email in JIS, convert to ShiftJIS and store the contents in the database.
Lately this has moved more towards UTF-8, but there are still a lot of web sites which are ShiftJIS and JIS is considered a standard encoding for email transport.
At a particular network I post at, they run their site in ShiftJIS and assume email arrives in JIS - the traditional standard.
But email sent from the iPhone ends up corrupted when posted to the site because it seems that the iPhone Mail app sends its messages using UTF-8 encoding.
My question is, does anybody know a way to change the default Japanese message sending encoding from UTF-8 to JIS?
Thanks,
doug

ISO-2022-JP and Shift-JIS are quite different.
That is true. ISO-2022-JP is JIS and is used for the mail transport. This particular server receives ISO-2022-JP email and then converts it to Shift-JIS for storage in the database. Receiving JIS and converting it to Shift-JIS is pretty common. That's why I wanted to convert UTF-8 to Shift-JIS - for ultimate storage in the database.
I think that conversion might not be possible via algorithm but require tables.
I'm afraid that might be true also. I've been looking around for some JavaScript code that might do that conversion, using tables, but haven't been able to find any.
The easiest thing would be if there was a way of getting the iPhone to send Japanese email using ISO-2022-JP.
The next easiest (and actually more robust) solution would be modifying the server to accept UTF-8 messages as well, but converting them to Shift-JIS when storing in the database. Then the server could receive both JIS and UTF-8 emails.
The best long-term solution would be converting the entire server system to UTF-8, but that would be a huge project, and conversion of all the existing content would be required as well. And the whole emailing system in the server would need to be modified.
doug
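The conversion does rely on mapping tables, but many runtimes ship them. As a rough sketch of the "accept both JIS and UTF-8" idea, assuming the server side could run Java (the thread only mentions JavaScript) and that the declared charset of each message is available from its headers, something like the following would do. MailBodyTranscoder and toShiftJis are hypothetical names.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

final class MailBodyTranscoder {
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    // Decodes the body with the charset the message declares, then re-encodes it for storage.
    static byte[] toShiftJis(byte[] body, String declaredCharset) {
        String text = new String(body, Charset.forName(declaredCharset));
        return text.getBytes(SHIFT_JIS);
    }

    public static void main(String[] args) {
        String sample = "\u65E5\u672C\u8A9E"; // three kanji, "nihongo"
        byte[] fromUtf8 = toShiftJis(sample.getBytes(Charset.forName("UTF-8")), "UTF-8");
        byte[] fromJis = toShiftJis(sample.getBytes(Charset.forName("ISO-2022-JP")), "ISO-2022-JP");
        // Both inputs should end up as the same Shift_JIS byte sequence.
        System.out.println(Arrays.equals(fromUtf8, fromJis));
    }
}
```

One caveat: String.getBytes silently substitutes a replacement byte for characters that have no Shift_JIS mapping, so UTF-8 mail containing such characters would still lose them on storage.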

Character encoding conversion for marshall/unmarshall?

Hello, Java Web Services gurus,
I am wondering if there is an easy/plugin-able way to do character encoding conversion transparently in the process of marshall/unmarshall.
Basically, my input/output will always be these UTF-8 XMLs. As the backend database is ISO encoded, I hope the result of unmarshall will give me ISO strings. And when it comes to marshall, the ISO strings can be transparently turned to UTF-8 XML response. Right now I'm using JAXB's annotations to parse XML into objects.
I understand there may be characters in the input file that cannot be converted. If so, I'd expect an error/exception that flags the failure.
Hope I sound clear. This has been a headache for a while. Really hope someone may help out a bit. Thanks a million in advance

[Duplicate Post|http://forums.sun.com/thread.jspa?messageID=10971554&tstart=0#10971554]
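On the "flag the failure" part of the question: a Java String carries no encoding of its own, so the check has to happen where the text is turned into bytes for the database. A minimal sketch follows, assuming "ISO encoded" means ISO-8859-1 (the post does not say which ISO charset). StrictLatin1 is a made-up name; such a check could be called from a JAXB XmlAdapter or from the persistence layer.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictLatin1 {
    // Encodes the text as ISO-8859-1 and throws instead of substituting '?'.
    static byte[] encode(String text) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    // True when every character of the text exists in ISO-8859-1.
    static boolean fits(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(fits("caf\u00E9"));  // e with acute accent is part of ISO-8859-1
        try {
            encode("\u65E5");                   // a kanji, not part of ISO-8859-1
        } catch (CharacterCodingException e) {
            System.out.println("cannot be stored: " + e);
        }
    }
}
```

The default String.getBytes behaviour is to replace unmappable characters quietly, which is why the encoder is configured with CodingErrorAction.REPORT here.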

Can I use US keyboard as Japanese (jis) keyboard?

Hi,
I'm new to these forums, and I speak and do business in Japanese. Because of this, I need Japanese input on my MacBook, and I am familiar with the Kotoeri setup in Mac OS X and use it often. However, after living in Japan for several years with an Intel Core2Duo iMac with a Japanese JIS keyboard (US keyboard also with Japanese characters on it), I would like to use my US keyboard built into the MacBook as a JIS keyboard. By this I am asking if it is possible to set up a new keyboard map for the keyboard (so when I press the "a" key for example, the Japanese character "ち" is input).
I would be extremely grateful for any help.
Thank you,
Tom

I don't know of any way to make your keyboard pretend it is a full JIS keyboard, but there are two things you can do:
+Use Kotoeri Preferences to switch to Kana Typing (Operation > Input Style). That means the a key generates ち
+Use the app Ukelele to create a custom keyboard layout.

TS3711 I have changed the keyboard on my iBook G4 (OS 10.5) to a Japanese JIS Keyboard but cannot get the Keyboard Setup Assistant to run and recognise it.

I have changed the keyboard on my iBook G4 (OS 10.5) to a Japanese JIS Keyboard but cannot get the Keyboard Setup Assistant to run and recognise it. Some symbols don't match. What can I do to get the new keyboard recognised/installed? The Keyboard Setup Assistant looks as if it will open when clicked on, but it does not.

You may need to reset the pmu. See this article:
http://m10lmac.blogspot.com/2009/12/fixing-keyboard-type-problems.html

Problem with URL encoding conversion

Hi all,
I am working on an I18N application in which one component sends an HTTP request to another component, which then fetches that request and extracts the query parameters from it.
The problem is that the input to the first component can be given in any one of five character encodings:
UTF-8
Shift_JIS
EUC_JP
Windows-31J
ISO-2022-JP
I have created a test program that converts an encoded URL from one character encoding to another. It works for the first four encodings above, but it fails for the last one, "ISO-2022-JP". The test program is:
import java.net.URLDecoder;
import java.net.URLEncoder;

class JPtoUTF8 {
     public static void main(String[] args) {
          try {
               String shift_jis = "%82%C8%82%A4%82%8B%82%E8";          // This is Shift_JIS encoded URL
               String iso2022jp = "%1B%24B%24J%24%26%23k%24j%1B%28B"; // This is ISO-2022-JP encoded URL
               String utf8 = "%E3%81%AA%E3%81%86%EF%BD%8B%E3%82%8A";   // This is the result that should be obtained
               // Decode each value with its source charset, then re-encode it as UTF-8.
               String decodedShift_jis = URLDecoder.decode(shift_jis, "Shift_JIS");
               String decodedIso2022jp = URLDecoder.decode(iso2022jp, "ISO-2022-JP");
               String encodedShift_JIS = URLEncoder.encode(decodedShift_jis, "UTF-8");
               String encodedIso2022jp = URLEncoder.encode(decodedIso2022jp, "UTF-8");
               System.out.println("shift_jis        = " + shift_jis);
               System.out.println("encodedShift_JIS = " + encodedShift_JIS);
               System.out.println("iso2022jp        = " + iso2022jp);
               System.out.println("encodedIso2022jp = " + encodedIso2022jp);
               System.out.println("expected         = " + utf8);
          } catch (Exception e) {
               e.printStackTrace();
          }
     }
}
I am using JDK 5 for this application.
Please give your valuable suggestions.
Thanks in advance.

Could the cause be that ISO-2022-JP is not a single encoding but a family of variants?
http://www.w3.org/TR/japanese-xml/#AEN28427904
Maybe what you are getting is one of the flavors, while the Java URLDecoder uses another flavor? Or maybe the string you are getting is incorrectly encoded to begin with (it might have been incorrectly converted from Shift_JIS)?
With the shift-in shift-out design it is a difficult encoding to deal with under the best of circumstances, so you have my sympathy.
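Another possible explanation, offered as a hypothesis rather than a confirmed diagnosis: the URLDecoder documentation says that each run of consecutive %xy escapes is converted to characters on its own. In the ISO-2022-JP sample, unescaped letters such as B, J, k and j are part of the byte stream, so the escape sequence ESC $ B and the two-byte characters are split across several runs and the decoder never sees them whole. The sketch below collects escaped and literal characters into one byte array and decodes it in a single pass. PercentDecoder is a made-up name, and the input is assumed to be pure ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.net.URLEncoder;
import java.nio.charset.Charset;

final class PercentDecoder {
    // Turns the whole value into bytes first, then decodes those bytes in one pass.
    static String decode(String encoded, String charsetName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                if (i + 2 >= encoded.length()) {
                    throw new IllegalArgumentException("Incomplete escape at index " + i);
                }
                bytes.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c == '+') {
                bytes.write(' ');   // form encoding uses '+' for a space
            } else {
                bytes.write(c);     // literal ASCII character, kept as a raw byte
            }
        }
        return new String(bytes.toByteArray(), Charset.forName(charsetName));
    }

    public static void main(String[] args) throws Exception {
        String iso2022jp = "%1B%24B%24J%24%26%23k%24j%1B%28B";
        String decoded = decode(iso2022jp, "ISO-2022-JP");
        // Re-encode as UTF-8 for comparison with the utf8 value in the test program above.
        System.out.println(URLEncoder.encode(decoded, "UTF-8"));
    }
}
```

With this approach the ISO-2022-JP sample decodes to the same four characters as the Shift_JIS sample, so the re-encoded value can be compared directly with the utf8 string in the original test program.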

Some japanese character encoded to "?". Please help me..

My system is below listed.
J2SE 1.4.1
MySql Ver 11.18 Distrib 3.23.52,
Resin 2.1.4
The Java application loads an HTML document encoded in 'SHIFT_JIS' using HttpURLConnection,
and reads the document as 'SHIFT_JIS'.
Most of it appears properly, but some characters are printed as '?'.
hm....
I will show my source code.
private String readDocument(URL url) {
    String METHOD_NM = ".readDocument()";
    try {
        HttpURLConnection URLCon = (HttpURLConnection) url.openConnection();
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "SJIS"));
        String inputLine;
On my web server, if I type the character (the one printed as '?') into a text box in IE6 (with the Japanese language pack) and submit it, the character is translated into a code (e.g. &#4575;).
That code is inserted into the DB, and I can select the tuple from the DB
and display it in IE6. That works.
But when the character is loaded from the Japanese HTML document, it is inserted into the DB as '?'
and displayed as '?' in IE6.
I want to translate the character into a code (e.g. &#5455;).
Is that possible?
Please reply...
P.S. I'm sorry for my poor English.

Thank you for reply.
But I've already tried that.
I have tried every Japanese encoding listed under "Supported Encodings" for Java.
http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html
What I want is a conversion like this:
ex)
?fe? => & # 4532; fe & # 3456; (in the real output, without the blanks)
That is, each '?' is converted to its code number.
In this case '?' stands for a Japanese character.
Please let me know how to do this.
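For the conversion asked about here, a short sketch is given below. It assumes the character was decoded correctly in the first place: once the reader has already produced '?', the original character is gone and nothing can be recovered from it. It also assumes Java 5 or later for codePointAt, which is newer than the J2SE 1.4.1 mentioned in the post. NumericCharRefs is a made-up name.

```java
final class NumericCharRefs {
    // Replaces every non-ASCII code point with a decimal reference such as &#26085;.
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 0x80) {
                sb.append((char) cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);  // skip both halves of a surrogate pair
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeNonAscii("a\u65E5b"));
    }
}
```

If specific characters still arrive as '?', one thing that may be worth checking is whether the page really uses the Windows variant of Shift_JIS (known to Java as "MS932" or "Windows-31J"), which contains vendor extension characters that plain "SJIS" does not.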

Japanese Characters Encoding Problem

Hi All,
I have been looking at the problems posted in this forum and quite a few describe the issue I am facing currently but none has been able to provide a solution.
The problem I am facing is as follows:
Step 1: I am retrieving Japanese data from Oracle DB 9i (Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit) using standard JDBC API calls. [NLS_CHARACTERSET : AL32UTF8, NLS_NCHAR_CHARACTERSET : AL16UTF16]
byte[] title = resultSet.getBytes("COLUMN_NAME");
Step 2: I pass the retrieved bytes to a method that returns SJIS encoded String.
private String getStringSJIS(byte[] bytesToBeEncoded) {
          StringBuffer sb = new StringBuffer();
          try {
               // Guard against a null column value.
               if (bytesToBeEncoded != null) {
                    ByteArrayInputStream bais = new ByteArrayInputStream(bytesToBeEncoded);
                    InputStreamReader isr = new InputStreamReader(bais, "SJIS");
                    // Decode the bytes as SJIS, one character at a time.
                    for (int c = isr.read(); c != (-1); c = isr.read()) {
                         sb.append((char) c);
                    }
               }
          } catch (Exception ex) {
               // ignored
          }
          return sb.toString();
}
Step 3: I am using an HTML Parser JAR to print the decimal value of the Encoded String.
String after = getStringSJIS(title);
System.out.println(Translate.encode(after));
I get an output of String 1: ﾂ禿ｺﾂ本ﾂ古ｪﾂサﾂイﾂト
which contains 14 decimal character codes.
The same data is being read by another application that uses JDBC again and connects to the same DB and returns the decimal values as: String 2: 日本語サイト
The display of these two Strings differ significantly when viewed in the browser.
It seems String 1 contains single-byte half-width characters and String 2 does not. Does anyone know why the bytes are getting modified while being retrieved from the database for the same column value?

The encoding for the bytes being returned from the database is Cp1252 but this encoding, I understand, depends on the underlying platform I am using.
If indeed the data from the DB is in UTF-8 or 16, shouldn't it be displayed correctly in the browser? No encoding/decoding should be required on the data then. In the browser it gets displayed as ÂÃºÂ{ÂÃªÂTÂCÂg. (The encoding of the JSP page is set to UTF-8.)
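The two garbled outputs quoted above are consistent with one particular history, offered here as a guess: the text was originally Shift_JIS bytes, those bytes were treated as ISO-8859-1 characters when inserted, and the result was stored as UTF-8. The sketch below simulates that path for 日本語サイト and shows how it could be undone. It assumes getBytes returned the column's UTF-8 bytes; DoubleEncodingDemo and its methods are made-up names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DoubleEncodingDemo {
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Simulates the suspected storage path: Shift_JIS bytes mistaken for ISO-8859-1 text, saved as UTF-8.
    static byte[] storedBytes(String original) {
        String misread = new String(original.getBytes(SJIS), StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    // What a getStringSJIS-style method does: decodes the stored bytes directly as Shift_JIS.
    static String decodeAsSjis(byte[] stored) {
        return new String(stored, SJIS);
    }

    // Undoes the two steps in reverse order.
    static String repair(byte[] stored) {
        String misread = new String(stored, StandardCharsets.UTF_8);
        return new String(misread.getBytes(StandardCharsets.ISO_8859_1), SJIS);
    }

    public static void main(String[] args) {
        // Six characters: three kanji followed by three katakana (the site title from the post).
        String original = "\u65E5\u672C\u8A9E\u30B5\u30A4\u30C8";
        byte[] stored = storedBytes(original);
        System.out.println(decodeAsSjis(stored).length()); // number of characters in the garbled form
        System.out.println(repair(stored).equals(original));
    }
}
```

Under this assumption the garbled form has 14 characters and begins with a half-width katakana character, which matches the shape of String 1 in the post. If that is what happened, the lasting fix would be to correct how the data is inserted rather than to repair it on every read.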

Japanese Character Encoding

Hi,
My application currently uses the following configuration:
1. WebLogic Server 7.0 (running on a Windows machine with a Japanese OS)
2. JDK 1.3
I am retrieving some data from the database in my JSP (say JSP1). Some of this data is in Japanese. I then form a URL using some of the Japanese data and forward it to the next JSP (say JSP2).
The Japanese data retrieved in JSP1 appears fine on the browse screen. However, when I try to get these parameters (in Japanese) from the request in JSP2 (i.e. request.getParameter(..)), they come up in mangled form.
All the JSPs of my applications include the following page directive:-
<%@page contentType="text/html; charset=Shift_JIS"%>
Please help

Hey,
Thanks a lot for the help,it worked ( and sorry for the delay in the response).
In the second jsp (JSP B), I have the following code
request.setCharacterEncoding("Shift_JIS");
strFile = request.getParameter("FileType");
Although the code seems to be working, I am not very clear about the logic. Could you please explain the following points:
1. Why was there a specific need to call setCharacterEncoding()? I had already set <%@page contentType="text/html; charset=Shift_JIS"%> in both JSPs. Doesn't that imply that the response encoding and the request decoding are both done using 'Shift_JIS'?
2. The same initial code (without the line request.setCharacterEncoding("Shift_JIS")) worked on my colleague's PC (he was NOT accessing my server, though). How was that possible? Are there any browser- or server-level settings that provide a default?
3. In Internet Explorer, under 'Tools' -> 'Internet Options' -> 'Advanced' tab, there is an option 'Always send URLs as UTF-8'. What is the significance of this setting?
4. How would the browser encoding setting (under 'View' -> 'Encoding') affect JSPs that use a specific encoding?
Thanking you in Anticipation
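On the first point, a general note on servlet behaviour, not verified against WebLogic 7.0: the page directive sets the charset of the response, while request parameters are decoded with the charset the request itself declares, falling back to ISO-8859-1 when none is declared. Browsers usually do not declare one, which is why request.setCharacterEncoding() has to be called before the first getParameter(). The effect can be reproduced outside a container with URLDecoder. RequestDecodingDemo is a made-up name, and the raw value is a sample chosen for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class RequestDecodingDemo {
    // Decodes a raw, percent-encoded parameter value with the given charset.
    static String decodeParam(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "%93%FA%96%7B"; // two kanji as a browser would send them from a Shift_JIS page
        // Container default: four unrelated Latin-1 characters.
        System.out.println(decodeParam(raw, "ISO-8859-1").length());
        // What setCharacterEncoding("Shift_JIS") asks for: the two original characters.
        System.out.println(decodeParam(raw, "Shift_JIS").length());
    }
}
```

Whether the same setting also applies to parameters in the URL query string depends on the container, so GET requests may need a server-level option instead.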

Issue with Japanese character encoding

Hi
I have a Web Service client (in AXIS 1.4) that calls a web service with some data, and this data contains Japanese characters.
When I call a Web Service in AXIS 2.0, the Web Service gets the proper Japanese characters. But when I call a Web Service developed in JCAPS, it does not get the proper Japanese characters; it gets ???????? instead.
Has anyone dealt with such a situation?
Any ideas?

Is the encoding you are using in AXIS 1.4 UTF-8? I think JCAPS supports UTF-8.

Speed up Adobe Media Encoder conversion time?

Hello, I started a topic, now answered, about After Effects not being able to convert H.264 (MP4) without using Adobe Media Encoder. That's fine with me; I just want to be able to convert the videos I use with the software that After Effects gives you. I started this topic to ask whether there is a way to speed up conversion time in Media Encoder. Thanks, and sorry if I don't answer every time: I only reply to answers that help, not to comments about the same topic.

I don't think you can create such a thing for the general populace. There are just too many variables.
You can take a 1 minute segment and export at various settings on your computer to get such a chart, but that will apply only to you with that material.

Programming a japanese onscreen keyboard in Labview (Linux)

Hi,
I would like to know whether it is possible to program an onscreen keyboard for a touch screen in Japanese.
I found a lot of threads on onscreen keyboards programmed for Windows but none for Linux.
Also, which is the best IME for Japanese on Linux, and how do I access this IME in LabVIEW?
Thanks in advance.
Pavitra

Hey, so I am still trying to visualize what the end goal is. Could you post the articles you are referring to? Is it a bunch of Boolean controls that are arranged to look like a keyboard, with Japanese text on each button?
Daniel Eaton
National Instruments
Systems Engineering
Embedded and Industrial Control

Encoding conversion in Mail adapter

Hi,
we have a problem with the Mail adapter...
We are trying to send an email out of XI to some service providers. The outgoing data is stored in an XML structure that follows the conventions of the Mail Package format, which is used for dynamic mail generation. The content of this Mail Package structure is a semicolon-separated string, which should be attached as a CSV file to the outgoing email. Up to this point everything works fine. We get the email with the attachment out of the system, send it to an SMTP server and transfer it to a previously defined email address.
But when we open the attachment (with WordPad, Excel, ...) all German umlauts have been lost. The problem is that when the Mail Package content is transferred into a file, this file is UTF-8 encoded.
Can anyone give us a hint on how to convert the encoding of the attached file from UTF-8 to ISO-8859-1 (Latin-1)?
Currently we have set the following parameters on the module page:
Work sequence
1. localejbs/AF_Modules/MessageTransformBean      Local Enterprise Bean      XML2Plain
2. localejbs/sap.com/com.sap.aii.adapter.mail.app/XIMailAdapterBean       Local Enterprise Bean      mail
Module configuration
XML2Plain     Transform.ContentDisposition     attachment;filename="ABC.csv"
XML2Plain     Transform.ContentType      text/plain;charset=latin-1
XML2Plain     Transform.ContentDescription      "ABC"
I hope we get some help...
Regards,
Lars

Hi,
We handled a similar requirement in our project.
To convert the target file encoding from UTF-8 to ISO-8859-1, I used an XSLT mapping and changed the output encoding, as shown in the code.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:a="urn:abc.com:pi:ab:cd:FileToMail:Mail">
<xsl:variable name="vfileName" select="/a:MT_Mail/FileName"/>
                <xsl:output method="text" indent="yes" encoding="iso-8859-1" media-type="TYP"/>
Also set the charset value to ISO-8859-1, as shown in the XSLT mapping:
                 <xsl:text>----mime-boundary
Content-Type: text/html; charset="ISO-8859-1"
Content-Disposition: inline
In Adapter module, add XMLAnonymizerBean
Link:http://help.sap.com/saphelp_nwpi71/helpdata/en/2e/bf37423cf7ab04e10000000a1550b0/frameset.htm
Hope this will resolve your issue.
Regards,
Divya
Edited by: Divya_10 on Jun 14, 2011 10:20 AM

Encoding conversion from UTF-16 to UTF-8

Hi all
I have a simple Java String, which is in UTF-16 format. I have to convert it to UTF-8 because my application expects UTF-8 only.
Can anybody tell me what I should do for this?
thanks

But isn't it true that Java supports only UTF-16 strings?
When I create a new String from a UTF-8 byte array, my feeling is that it will be converted to UTF-16 data again.
Suppose I have a String s1="dd" in UTF-16 format,
and another one, s2="ddd", in UTF-8 format.
If I perform the s1+s2 operation, how will the JVM come to know the encoding of each string? How will it perform the operation?
Please respond and clear up my doubt.
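A short sketch may help with this doubt. A Java String has no selectable encoding; only byte arrays do. Bytes are decoded into a String on the way in and encoded into bytes on the way out, so by the time s1 and s2 are concatenated they are both ordinary Strings. Utf8Bytes is a made-up helper name.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Bytes {
    // Encodes a String as UTF-8 bytes, for output to something that expects UTF-8.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes into a String; the result is an ordinary String like any other.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s1 = "dd";
        String s2 = fromUtf8(new byte[] {'d', 'd', 'd'}); // bytes that arrived as UTF-8
        System.out.println(s1 + s2);                       // plain String concatenation
        System.out.println(toUtf8("\u65E5").length);       // one kanji needs several UTF-8 bytes
    }
}
```

So the answer to "how do I convert the String to UTF-8" is to encode it at the point where the application needs bytes, for example when writing to a stream or socket.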

Encoding conversion in Swing?

Hi,
When working with a text file, there is a default encoding used to convert the file's charset into a Java String. We can check it with System.getProperty("file.encoding").
What about Swing components? Is there another default encoding used to convert the text typed into a JTextField into a Java String? I have searched a lot of forum topics but could not find the answer. Please help. Thanks.
Best Regards,
Leslie

Hi Leslie,
that's because there isn't one. Java Strings will always be in UTF-16 Unicode, period.
The thing that will change encoding is what you input into Java, be it through a text file, a database, or input into text components.
When you type into a textfield like you mention, the characters you type in will likely arrive in whatever encoding is your current OS default. Java will convert the encoding over to UTF-16 as it gets inputted.
I imagine what you'd really like to do is input international characters into a textfield, right? Search on the Input Method Framework within this forum - there should be a fair bit of literature on it. It is the Java package that allows you to input I18N data into text components.
Hope that helps!
Martin Hughes
