Converting a character to Unicode

Hello, I'm a first-year Java student, currently taking a class where my teacher refuses to teach...........
Anyway, enough about my sob story. I was wondering what line of code is used to convert a single character into its Unicode value. Can anyone help?
(Yes, I'm well aware I suck at Java, that's why I'm still learning ;) )
Thanks in advance.

I wrote this a while ago...see if it helps
/*
 * Written by: Evans Anyokwu
 * Date      : 28-Feb-2002
 * Purpose   : Helps convert characters to various formats
 */

// Import java.io to enable the use of the streams.
import java.io.*;

public class Converter {
    public static void main(String[] args) throws IOException {
        // Use a BufferedReader to get input from the keyboard.
        BufferedReader keyBoard = new BufferedReader(
                new InputStreamReader(System.in));

        // Prompt the user to enter something to be converted.
        System.out.print("Enter a string: ");

        // Get the user input from the keyboard.
        String entered = keyBoard.readLine();

        // Loop through the input and select the char at every position.
        for (int i = 0; i < entered.length(); i++) {
            char c = entered.charAt(i);
            // Print the char at (i) as a Unicode escape; %04x pads to
            // four hex digits, so it also works for values above U+00FF.
            System.out.printf("\\u%04x%n", (int) c);
            //System.out.println(Integer.toBinaryString(c));
            //System.out.println(Integer.toOctalString(c));
        }
    }
}

Similar Messages

  • Convert Chinese character to Unicode

    Hello, I have a problem.
    How can I find the Unicode value of a Chinese character in a file?
    For example, the character ' &#31216; ' in a file a.txt.
    How can I read the a.txt file and then get the Unicode value of the character?
    I really need help.

    I want to know the algorithm and the coding of that tool.
    Thanks.

    You might consider downloading the open source project and looking directly at the code for that tool. I imagine that you could create something similar in...oh, maybe 25 lines of code:
    retrieve the command line args
    open the specified file using the charset encoding specified
    for (all characters in the file) {
        if character is > 0xFF, convert to \uXXXX
        output character to new file
    }
    John O'Conner
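    A minimal Java sketch of the pseudocode above (the file name "a.txt" and the UTF-8 input encoding are assumptions; adjust both to your case):

    ```java
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class ToUnicodeEscapes {
        // Replace every character above U+00FF with its \uXXXX escape.
        static String escape(String text) {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                if (c > 0xFF) {
                    out.append(String.format("\\u%04x", (int) c));
                } else {
                    out.append(c);
                }
            }
            return out.toString();
        }

        public static void main(String[] args) throws IOException {
            // Read the whole file with an explicit encoding, then escape it.
            String text = new String(
                    Files.readAllBytes(Paths.get("a.txt")), StandardCharsets.UTF_8);
            System.out.println(escape(text));
        }
    }
    ```

    Running this on a file containing &#31216; would print \u79f0, which is that character's Unicode code point.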

  • Approach to converting database character set from Western European to Unicode

    Hi All,
    EBS:12.2.4 upgraded
    O/S: Red Hat Linux
    I am looking for the below information. If anyone could help provide would be great!
    INFORMATION NEEDED: Approach to converting database character set from Western European to Unicode for source systems with large data exceptions
    DETAIL: We are looking to convert Oracle EBS database character set from Western European to Unicode to support Kanji characters. Our scan results show
    both “lossy (110K approx.)” and “truncation (26K approx.)” exceptions in the database which needs to be fixed before the database is converted to Unicode.
    Oracle Support has suggested to fix all open and closed transactions in the source Production instance using forms and scripts.
    We’re looking for information/creative approaches from teams who have performed similar exercises without having to manipulate data in the source instance.
    Any help in this regard would be greatly appreciated!
    Thanks for your time!
    Regards,

    There are two aspects here:
    1. Why do you have such large number of lossy characters? Is this data coming from some very old eBS release, i.e. from before the times of the Java applet interface to Oracle Forms?  Have you analyzed the nature of this lossy data?
    2. There is no easy way around truncation issues as you cannot modify eBS metadata (make columns wider). You must shorten or remove the data manually through the documented eBS interfaces. eBS does not support direct manipulation of data in the database due to complex consistency rules enforced by the application itself (e.g. forms).
    Thanks,
    Sergiusz

  • How to convert UTF-8 to Unicode

    Hi guys,
    For instance, I have a textbox and a user enters some Chinese characters in it. My page will need to pass this string as a parameter to another page. How should I convert these Chinese characters to Unicode? On my servlet side, how should I convert this Unicode back to UTF-8 in order to pass it to the database for searching? Thanks in advance!
    regards,
    Mark

    Hi,
    For instance, if I enter &#36335; in a Google search, I see one of the URL parameters looks like this:
    q=%E8%B7%AF
    which I guess is UTF-8, percent-encoded in hex.
    Does anyone know any JavaScript that can convert &#36335; -> %E8%B7%AF and vice versa?
    When this %E8%B7%AF string is sent to the servlet side, how do I convert it back to text for the database to search? Please help, thanks!
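    A sketch of the servlet-side conversion, assuming the %XX escapes have not already been decoded by the container (URLEncoder/URLDecoder are standard java.net classes):

    ```java
    import java.io.UnsupportedEncodingException;
    import java.net.URLDecoder;
    import java.net.URLEncoder;

    public class ParamCodec {
        // Percent-encode a string as UTF-8, as a browser does for URLs.
        static String encode(String s) throws UnsupportedEncodingException {
            return URLEncoder.encode(s, "UTF-8");
        }

        // Decode %XX UTF-8 escapes back into characters.
        static String decode(String s) throws UnsupportedEncodingException {
            return URLDecoder.decode(s, "UTF-8");
        }

        public static void main(String[] args) throws Exception {
            // U+8DEF is the character from the post above.
            System.out.println(encode("\u8def")); // %E8%B7%AF
            System.out.println(decode("%E8%B7%AF"));
        }
    }
    ```

    On the browser side, JavaScript's encodeURIComponent and decodeURIComponent perform the same conversion.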

  • Problem : "You cannot convert the character set"..Any suggestions?

    Hi All,
    I have this character inside my internal table, KOÇTA&#350; YAPI MARK.T&#304;C A.&#350;. and it created a short dump on the program and it says "You cannot convert the character set". But this customer is already maintained inside the table KNA1..
    The code is like this one:
    REPLACE ALL OCCURRENCES OF '€' IN linebuffer WITH 'EUR'.
    TRANSFER linebuffer TO filename. "LENGTH bytes_to_transfer.
    the LINEBUFFER here is an internal table.
    I think conversion is the key here. Does anyone know how to convert this thing? Or is it the Unicode system? Could someone please help me with my problem?
    Thanks so much guys!
    Regards,
    Mackoy

    Hi,
    What I feel here is that you are trying to pass the whole internal table at once. An internal table may have more than one record, while the file line is a string that can contain only one row.
    So instead, put the internal table into a LOOP. Note that your internal table may have non-character fields; that is the main problem. While concatenating, all non-character fields of LINEBUFFER should first be moved to a temporary character field and then concatenated.
    DATA: v_string(200).
    LOOP AT linebuffer.
      CONCATENATE linebuffer-field1 linebuffer-field2
        INTO v_string.
    ENDLOOP.

  • Convert Hexadecimal NCRs to unicode characters

    I have hexadecimal NCRs in comments when importing comments from the 3b2 application. How can these be converted to the appropriate characters using Acrobat JavaScript?
    For example, a comment contains (&#x000D) which should be converted to the appropriate Unicode character.

    I may be wrong, but I think you would use String.fromCharCode to convert a UCS-2 code into a string. You would then have to parse your string to process any escapes in it and call that method.
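    The thread asks about Acrobat JavaScript, but the parsing logic is the same in any language. Here is the idea in Java, assuming well-formed NCRs of the form &#xHHHH; with a trailing semicolon:

    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class NcrDecoder {
        private static final Pattern HEX_NCR = Pattern.compile("&#x([0-9A-Fa-f]+);");

        // Replace each hexadecimal NCR with the character it names.
        static String decode(String input) {
            Matcher m = HEX_NCR.matcher(input);
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                int codePoint = Integer.parseInt(m.group(1), 16);
                m.appendReplacement(out,
                        Matcher.quoteReplacement(new String(Character.toChars(codePoint))));
            }
            m.appendTail(out);
            return out.toString();
        }
    }
    ```

    For example, decode("&#x000D;") yields a carriage return character.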

  • Why are summaries converted to character when exported to Excel?

    Hi,
    When I export my Discoverer workbooks to Excel, all of my summaries are converted to character type, and Excel gives an error message for each cell asking to convert them back to numbers. It's not practical to do so, because Discoverer Plus is a client tool and no client would accept that rework. They'll simply ask why use Discoverer at all.
    Any suggestions appreciated.
    Thanks.

    Using Acrobat DC and the Acrobat arrow tool: by right-clicking on an image, a submenu pops up. One of the items in that list is "Edit Image". Ordinarily, or historically, this would open the image in Photoshop as a temp file; edits could be made and saved, and the image would be automatically updated back into the PDF. In DC, when selecting "Edit Image", the document goes into an edit mode and the entire file gets converted into RGB color. From there one could again right-click on the same image and get the option to edit it using another program (like Photoshop). When returning to document mode, the RGB conversion remains.
    I determined the files are converted to RGB by using the inspector tool in Acrobat as well as looking at the colorspace in Photoshop.
    I determined the files were CMYK by building them that way and again verifying using Pitstop inspector.
    Following screen grabs show image before / and after selecting "Edit Image"

  • Is it possible to convert the 'WE8MSWIN1252' character set to a Chinese character set?

    Hi All,
    Does anyone know how to convert the "WE8MSWIN1252" character set to a Chinese character set in order to display Chinese words in Oracle APEX?
    My problem is I can't display Chinese characters in Oracle APEX. The Chinese field shows up like °×ѪÇò¼ÆÊý. I'm using the WE8MSWIN1252 database character set.
    I'm wondering, is it possible to show the characters correctly?
    I'm appreciating if anyone have a good solution to share with me.
    Thanks a lot in advance!
    Edited by: Apex Junior on Jul 16, 2010 2:18 PM

    WE8MSWIN1252 is a Western European character set. If you wish to store and access a globalized multibyte character set, you must have a database that supports it, and you don't have one at the moment.
    Given this is APEX, I'd suggest you read the docs and reinstall.
    Alternatively, you could try CSSCAN and CSALTER; perhaps you can make the change, but be very careful and have a good backup before you try.
    http://www.morganslibrary.org/reference/character_sets.html

  • Invalid XML character. (Unicode: 0x7)

    Hi There,
    i get this Exception when i parse an XML file.
         Invalid XML character. (Unicode: 0x7)
    at com.ibm.xml.framework.XMLParser.handleError(XMLParser.java)
    at com.ibm.xml.framework.XMLParser.error1(XMLParser.java)
    at com.ibm.xml.internal.UTF8CharReader.skipInvalidChar(UTF8CharReader.java)
    at com.ibm.xml.internal.DefaultScanner.scanContent(DefaultScanner.java)
    at com.ibm.xml.internal.DefaultScanner.scanDocument(DefaultScanner.java)
    at com.ibm.xml.framework.XMLParser.parse(XMLParser.java)
    Basically, the data for the XML is picked from my 8i database, and when I use a SAX parser to validate the formed XML, I get the exception above. The characters do not appear to be outside the ASCII range to me.
    Any idea how I could overcome this problem or identify which/what character is causing it? I am using an xml4j 2.x parser.
    Thanks in advance.
    -Sakthi

    ASCII has nothing to do with it. XML is a text format, so an XML file may only include text characters. 0x7 isn't a text character, it's a control character (BEL), and it isn't allowed to occur in an XML file.
    As for how to identify which character is causing the problem, the error message tells you that: it's U+0007.
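    One common workaround, sketched here, is to strip forbidden characters before parsing. The ranges below are the XML 1.0 Char production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF):

    ```java
    public class XmlSanitizer {
        // True if the code point is allowed by the XML 1.0 Char production.
        static boolean isXmlChar(int c) {
            return c == 0x9 || c == 0xA || c == 0xD
                    || (c >= 0x20 && c <= 0xD7FF)
                    || (c >= 0xE000 && c <= 0xFFFD)
                    || (c >= 0x10000 && c <= 0x10FFFF);
        }

        // Remove every character that may not appear in an XML document.
        static String stripInvalid(String s) {
            StringBuilder out = new StringBuilder(s.length());
            s.codePoints().filter(XmlSanitizer::isXmlChar).forEach(out::appendCodePoint);
            return out.toString();
        }
    }
    ```

    Whether stripping is acceptable depends on the data; fixing the producer so it never emits control characters is the cleaner solution.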

  • Converting a character to ASCII

    Hi ,
    I have a string and I need to convert it into ASCII values.
    Is there a method in Java to convert a character (something like "A") to its ASCII value ("65")?
    Thanks,
    Anna

    Hello Abikkina,
    This is how you can do it:
    String s = "Example";
    int[] ascii = new int[s.length()];
    for (int i = 0; i < s.length(); i++)
        ascii[i] = (int) s.charAt(i);
    for (int i = 0; i < ascii.length; i++)
        System.out.println("Ascii output: " + ascii[i]);
    Just cast the character to int and you will get its ASCII (code point) value.

  • Convert SAPscript documents to unicode

    Hi,
    we want to convert our system to Unicode. In testing we found that SAPscript documents created before the conversion can no longer be displayed, and also cannot be converted after the Unicode conversion.
    We have SAPscript documents as spool files and in the folders of the SAP Business Workplace.
    Is there any way to convert or save these documents before the conversion?
    Thanks in advance.

    Hi Manuela,
    please have a look at note 842767 - it outlines the possible methods to save spool files containing SAPScript.
    I do not know about alternatives, which are not listed there.
    Best regards,
    Nils Buerckel
    Solution Management
    Globalization Services
    SAP AG

  • Risk involved converting Oracle character set to Unicode (AL32UTF8 or UTF8)

    Hi All -
    I am a PL/SQL developer and quite new to database administration, with very little knowledge base on this.
    Currently I am working on a project where we have a requirement to store data in multiple languages in the database.
    After my findings via Google, I am clear that our database character set needs to be changed to Unicode (AL32UTF8 or UTF8). Before moving forward I would like to know what risks are involved in doing this.
    A few questions:
    Would this change take a long time and involve a lot of effort?
    Can we revert once this change is done, with no data loss?
    Will there be any changes required while writing SQL on tables having multi-language data?
    As of now the requirement to store multi-language data is specific to some tables only, not the whole DB. Are there any other options for storing data in different languages (Spanish, Japanese, Chinese, Italian, German, and French) in just one specific table?
    Thanks...
    Edited by: user633761 on Jun 7, 2009 9:15 PM

    >
    Will there be any changes required while writing SQL on tables having multi-language data?
    >
    If you move from a single-byte character set to a multi-byte character set, you should take into account that one character may use 1, 2, 3 or 4 bytes of storage: http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch2charset.htm#i1006683
    This may impact SQL or PL/SQL code that works on character string lengths.
    Note also that using exp/imp to change the database character set is not so simple; see the following message:
    Re: charset conversion from WE8ISO8859P1 (8.1.7.0) to AL32UTF8(9.0.1.1)
    >
    As of now the requirement to store multi-language data is specific to some tables only, not the whole DB. Are there any other options?
    >
    Using NCHAR character types is another possibility:
    http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch11charsetmig.htm#sthref1493
    Edited by: P. Forstmann on Jun 8, 2009 9:10 AM
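    The byte-versus-character distinction the answer warns about can be seen directly in Java (a small illustration, not part of the original thread):

    ```java
    import java.nio.charset.StandardCharsets;

    public class ByteLengths {
        public static void main(String[] args) {
            String[] samples = { "A", "é", "\u79f0" };
            for (String s : samples) {
                // One character can occupy 1, 2 or 3 bytes in UTF-8,
                // which is why column widths declared in bytes can overflow.
                System.out.println(s + ": " + s.length() + " char(s), "
                        + s.getBytes(StandardCharsets.UTF_8).length + " UTF-8 byte(s)");
            }
        }
    }
    ```

    This is the reason PL/SQL code using byte semantics (and any VARCHAR2 columns sized in bytes) needs review before the conversion.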

  • Error converting XSTRING to STRING (unicode, codepage)

    Hi all,
    I have a problem converting data from an external file into SAP.
    The file is uploaded via an application created in Web Dynpro, where I use the upload functionality. This returns the file in XSTRING format, and I then use the following to convert (where l_xstring is the file and l_string is how I want the file):
    DATA: l_string  TYPE string,
          l_xstring TYPE xstring.
    convt = cl_abap_conv_in_ce=>create( input = l_xstring ).
    convt->read( IMPORTING data = l_string ).
    This worked perfectly, until I received a file containing Russian characters.
    The SAP system (BI) is Unicode, so this should be OK.
    I get a:
    CONVT_CODEPAGE
    CX_SY_CONVERSION_CODEPAGE
    Error, when trying to run it.
    Also, the following might be helpful:
    At the conversion of a text from codepage '4110' to codepage '4102':
    - a character was found that cannot be displayed in one of the two
    codepages;
    - or it was detected that this conversion is not supported
    The running ABAP program 'CL_ABAP_CONV_IN_CE============CP' had to be
    terminated as the conversion
    would have produced incorrect data.
    The number of characters that could not be displayed (and therefore not
    be converted), is 18141. If this number is 0, the second error case, as
    mentioned above, has occurred.
    I have tried setting the codepage parameter of the READ method, but with no success.
    Anyone ??
    -Tonni

    Friend,
    Call the FM like below:
    CALL FUNCTION 'ECATT_CONV_XSTRING_TO_STRING'
      EXPORTING
        im_xstring  = l_xstring
        im_encoding = 'UTF-8'
      IMPORTING
        ex_string   = l_string.
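    The failure mode here is general, not ABAP-specific: bytes that are invalid in the assumed encoding cannot be converted. For illustration, a Java sketch of the same strict behavior, where invalid input raises an exception instead of silently producing corrupt text (the UTF-8 choice mirrors the FM call above):

    ```java
    import java.nio.ByteBuffer;
    import java.nio.charset.CharacterCodingException;
    import java.nio.charset.CodingErrorAction;
    import java.nio.charset.StandardCharsets;

    public class StrictDecode {
        // Decode bytes as UTF-8, failing loudly on invalid sequences.
        static String decodeUtf8(byte[] raw) throws CharacterCodingException {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        }
    }
    ```

    If a file saved in a legacy Cyrillic codepage is decoded as UTF-8 (or vice versa), this kind of decoder reports the error immediately, which is essentially what the CONVT_CODEPAGE dump is doing.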

  • Character encoding (unicode to utf-8) conversion problem

    I have run into a problem that I can't seem to find a solution to.
    my users are copying and pasting from MS-Word. My DB is Oracle with its encoding set to "UTF-8".
    Using Oracle's thin driver it automatically converts to the DB's default character set.
    When Java tries to encode Unicode to UTF-8 and it runs into an unknown character (typically a character that is in the High Ascii range) it substitutes it with '?' or some other wierd character.
    How do I prevent this.

    >
    my users are copying and pasting from MS-Word. My DB is Oracle with its encoding set to "UTF-8".
    >
    Pasting where? Into the database? If they are pasting into the database (however they might do that) and getting bad results, then that's nothing to do with Java.
    >
    Using Oracle's thin driver it automatically converts to the DB's default character set.
    >
    Okay, I will assume that is correct.
    >
    When Java tries to encode Unicode to UTF-8 and it runs into an unknown character (typically a character that is in the High Ascii range) it substitutes it with '?' or some other weird character.
    >
    This is false. When converting from Unicode to UTF-8 there are no "unknown characters". I don't know what you mean by the "High ASCII range", but if your users are pasting MS stuff into your Java program somehow, then a conversion from something into Unicode is done at that time. If "something" isn't the right encoding, then you have the problems already, before you try to write to the DB.
    >
    How do I prevent this.
    >
    First identify the problem. You have input coming from somewhere, then you are writing to the database. Two different steps. Either of them could have a problem. Test them separately so you know which one is the problem.
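    To illustrate the answer's point: the '?' substitution happens when encoding into a charset that cannot represent the character (ISO-8859-1, for example), never when encoding into UTF-8. A small demonstration:

    ```java
    import java.nio.charset.StandardCharsets;

    public class ReplacementDemo {
        public static void main(String[] args) {
            String smartQuote = "\u201C"; // a Word "smart quote", U+201C
            // UTF-8 can encode every Unicode character: 3 bytes here.
            byte[] utf8 = smartQuote.getBytes(StandardCharsets.UTF_8);
            // ISO-8859-1 cannot, so String.getBytes substitutes '?'.
            byte[] latin1 = smartQuote.getBytes(StandardCharsets.ISO_8859_1);
            System.out.println(utf8.length);      // 3
            System.out.println((char) latin1[0]); // ?
        }
    }
    ```

    So if '?' appears in the database, some step in the pipeline is encoding through a charset other than UTF-8, and that is the step to find.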

  • How to retrive Original Character in Unicode Format from UTF8

    Our database is enabled for the UTF8 format. A user entered some data through the UI (HTML forms), which is stored as â?? (whose values are 50082, 49792, 49817 respectively; as the data displays differently here, I am giving the values) in the database. When we retrieve the data into the UI or into a text file (.txt), it is displayed/stored as ’ (value 15712189). How can I check whether that character is correct or not?

    Ashok,
    What is your NLS_LANG setting on the client machine where FORMS is running?
    To see if the value is stored properly in the database you can use the DUMP command. You can find this in the SQL Reference. But here is a desription:
    The syntax of the function call is:
    DUMP( <value> [, <format> [, <offset> [, <length> ] ] ] )
    where:
    value - the value to be displayed
    format - a number describing the format in which the bytes of the value are displayed: 8 means octal, 10 means decimal, 16 means hexadecimal; other values between 0 and 16 mean decimal. Values greater than 16 are a little confusing: bytes are printed as ASCII characters if they correspond to printable ASCII codes, as "^x" if they correspond to ASCII control codes, and in hexadecimal otherwise. Adding 1000 to the format number adds character set information to the return value for character data type values.
    offset - the offset of the first byte of the value to display; negative values mean counting from the end
    length - the number of bytes to display.
    So for example,
    SQL> SELECT DUMP(col,1016)FROM table ;
    Typ=1 Len=39 CharacterSet=UTF8: 227,131,143,227,131,170
    returns the value of a column consisting of Japanese characters in UTF8 encoding; for example, the first character is encoded by the bytes 227, 131, 143. You will probably need to convert this to UCS-2 to verify the code point value against the Unicode Standard version 3.0.
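    The same check can be done client-side in Java. This sketch prints a string's UTF-8 bytes as unsigned decimals, in the same style as the DUMP output above, so the two can be compared directly:

    ```java
    import java.nio.charset.StandardCharsets;

    public class Utf8Dump {
        // Return the UTF-8 bytes of s as comma-separated unsigned decimals.
        static String dump(String s) {
            StringBuilder out = new StringBuilder();
            for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
                if (out.length() > 0) out.append(',');
                out.append(b & 0xFF); // mask to get the unsigned byte value
            }
            return out.toString();
        }

        public static void main(String[] args) {
            System.out.println(dump("\u30cf")); // Katakana HA: 227,131,143
        }
    }
    ```

    If the bytes printed here match what DUMP shows for the column, the value survived the round trip intact; if not, a conversion step in between is mangling it.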
