GUI_DOWNLOAD unicode characters turn into ##

Hello,
we have a Unicode-enabled system and I have some Unicode dummy data in a field; the content is 知道,我看.
In the program the data stays like that until the file is downloaded. Then the characters are ########.
My program downloads like this:
    CALL METHOD CL_GUI_FRONTEND_SERVICES=>GUI_DOWNLOAD
      EXPORTING
        FILENAME                = IM_PC_FILE
      CHANGING
        DATA_TAB                = im_text_file.
Any idea what is missing here?
thanks a lot
Koen Van Loocke

Do you open the downloaded file with Notepad? In any case, you have to use the WRITE_BOM parameter of the method (the value should be 'X').
Here you can find some more information regarding SAP and Unicode:
https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/e0928b44-b811-2a10-7599-cc4bb6585c46
https://websmp107.sap-ag.de/~sapidb/012003146900000190272007E/ConversionErrors.htm (this is OSS, so will require your OSS userid and password)
Unicode File Handling in ABAP
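
For illustration, here is the same BOM idea outside ABAP: a minimal Java sketch (the file name and sample text are made up) of writing a UTF-8 byte order mark so that editors such as Notepad detect the encoding instead of guessing and showing # characters.

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;

    public class BomExample {
        public static void main(String[] args) throws IOException {
            try (Writer w = new OutputStreamWriter(
                    new FileOutputStream("dummy.txt"), StandardCharsets.UTF_8)) {
                w.write('\uFEFF');     // the BOM, the counterpart of WRITE_BOM = 'X'
                w.write("知道,我看");   // the Unicode payload survives the round trip
            }
        }
    }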

Similar Messages

  • Unicode Characters Turn To Garbage Depending On Length of Preceding Text

    Hey,
    I wrote a script that creates a bunch of text frames, fills some text and styles it.
    The problem is, sometimes, unicode characters come out as garbage: e.g. "3M™ Blenderm™" turns to "3Mâ„¢ Blendermâ„¢".
    I was playing around with four text frames to see what causes it, and if I add a line of text in the second frame, all subsequent unicode chars turn to garbage only if that line of text is larger than 6 characters.
    If I add a ™ character to the first line of the first text frame, then the problem fixes itself.
    Has anyone encountered something like this?
    Let me know if you need more info (my whole script is rather large...)

    Hey,
    Thanks for the idea!
    I think it has something to do with the way InDesign tries to read my data file (or script).
    I placed a "™" character inside a comment right at the top of the file, and everything works.
    I would play around and try to find some saner solution, but the deadline for my project is way too close
    Thanks!

  • Double byte characters turn into squares at PDF export use Unicode font

    Hi all,
    We are developing an international Windows application with Visual Studio 2008, .NET 2.0 and Crystal Reports XI Release 2 SP5. We use the font Arial Unicode MS in the rpt file, and we translate the static texts with the Crystal Translator (3.2.2.299).
    On the distributed installation of our software, the printout and preview display the double-byte characters properly (Japanese, Korean, Chinese), but when we export the report as PDF the characters get displayed as squares. This also happens when the font Arial Unicode MS is installed on the distributed installation on Windows XP Professional.
    I searched for hours for a solution in the knowledge base articles and in the Crystal Reports forum. I found one thread which describes our problem exactly:
    Crystal XI R2 exporting issues with double-byte character sets
    But we already introduced the suggested solution of using a Unicode font, and I also linked the font Lucida Sans Unicode to Arial Unicode MS, but we still face the problem.
    Due to our release on Thursday we are under a lot of pressure to solve this problem asap.
    We appreciate your help very much!
    Ronny

    Your searches should have also come up with the fact that CR XI R2 is not supported in .NET 2008. Only CR 2008 (12.x) and Crystal Reports Basic for Visual Studio 2008 (10.5) are supported in .NET 2008. I realize this is not good news given the release timeline, but the support (or non-support) of CR XI R2 in .NET 2008 is well documented - see Supported Platforms (https://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/7081b21c-911e-2b10-678e-fe062159b453), the KBases (http://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/oss_notes_boj/sdn_oss_boj_dev/sap(bD1lbiZjPTAwMQ==)/bc/bsp/spn/scn_bosap/notes.do), and the Wiki (https://wiki.sdn.sap.com/wiki/display/BOBJ/WhichCrystalReportsassemblyversionsaresupportedinwhichversionsofVisualStudio+.NET).
    Best I can suggest is to try SP6:
    https://smpdl.sap-ag.de/~sapidp/012002523100015859952009E/crxir2win_sp6.exe
    MSM:
    https://smpdl.sap-ag.de/~sapidp/012002523100000634042010E/crxir2sp6_net_mm.zip
    MSI:
    https://smpdl.sap-ag.de/~sapidp/012002523100000633302010E/crxir2sp6_net_si.zip
    Failing that, you will have to move to a supported environment...
    Ludek
    Follow us on Twitter http://twitter.com/SAPCRNetSup

  • Japanese dynamic text (_sans) turns into squares

    I load dynamically created XML (any language, including Japanese and Chinese). The text (a textField in a MC with an onRelease function) shows fine in Japanese at first. But on certain PCs (Windows 2000), when you click to trigger the onRelease function, all the Japanese (or Chinese) characters turn into little squares!
    The font is _sans and the font is not embedded, so it is a device font.

    The thing is, my client can see the Japanese the first time.
    But then she clicks on one button (which contains a mix of English/Japanese text), and the Japanese turns into small squares.
    It never happens on my PC with a Japanese OS (XP), nor on my MacOS (English).

  • Insert Unicode Characters Into Oracle 8.1.5

    Hello,
    First off, here are the specs:
    Oracle 8.1.5
    JDK 1.2.1
    Oracle8i 8.1.6.2.0 JDBC Drivers for use with JDK 1.2.x for Solaris
    I'm running into a problem with inserting Unicode characters into Oracle via the JDBC driver. As you can see above, I am using the Oracle 8.1.6.2.0 JDBC driver because it is the first driver which supports JDK 1.2.x. So I think I should be okay.
    I can retrieve data with special characters from Oracle by calling the getBytes() method on the ResultSet, with all special characters intact. I am using getBytes because calling getString() would throw the following exception: "java.sql.SQLException: Fail to convert between UTF8 and UCS2: failUTF8Conv". However, the value I just retrieved, or any other data with special (unicode) characters that I try to insert into Oracle, does not get converted properly.
    What appears to be happening is that data with special (unicode) characters is not being treated as a single double-byte character, but rather as two single-byte characters. Thus, R|ckschlagventil becomes RC<ckschlagventil once it is inserted. (Hopefully, my example will be rendered properly.)
    According to all documentation that I have found, the JDBC driver should not have any problem with converting UCS2 Java Strings to Oracle's UTF8 character set.
    I have set Oracle's NLS_NCHAR_CHARACTERSET to UTF8. I am also setting the environment variable NLS_LANG to AMERICAN_AMERICA.UTF8. Perhaps there is some other environment setting in which I am missing?
    Any help would be appreciated,
    Christian
    null

    Import has a lot of options, so it depends on what you want to do.
    C:\> imp help=y
    will show you all possible options. An example of full import :
    C:\> imp <username>/<password>@<TNS alias> file=<DMP file> full=y log=<LOG file>
    Paul M.
    ...and there is always the documentation: http://download-uk.oracle.com/docs/cd/F49540_01/DOC/index.htm

  • How do I get unicode characters out of an oracle.xdb.XMLType in Java?

    The subject says it all. Something that should be simple and error free. Here's the code...
    String xml = new String("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<x>\u2026</x>\n");
    XMLType xmlType = new XMLType(conn, xml);
    conn is an oci8 connection.
    How do I get the original string back out of xmlType? I've tried xmlType.getClobVal() and xmlType.getString() but these change my \u2026 to 191 (question mark). I've tried xmlType.getBlobVal(CharacterSet.UNICODE_2_CHARSET).getBytes() (and substituted CharacterSet.UNICODE_2_CHARSET with a number of different CharacterSet values), but while the unicode characters are encoded correctly the blob returned has two bytes cut off the end for every unicode character contained in the original string.
    I just need one method that actually works.
    I'm using Oracle release 11.1.0.7.0. I'd mention NLS_LANG and file.encoding, but I'm setting the PrintStream I'm using for output explicitly to UTF-8 so these shouldn't, I think, have any bearing on the question.
    Thanks for your time.
    Stryder, aka Ralph

    I created an analogous test case and executed it with DB 11.1.0.7 (Linux x86), which seems to work fine.
    Please refer to the execution procedure below:
    * I used AL32UTF8 database.
    1. Create simple test case by executing the following SQL script from SQL*Plus:
    connect / as sysdba
    create user testxml identified by testxml;
    grant connect, resource to testxml;
    connect testxml/testxml
    create table testtab (xml xmltype) ;
    insert into testtab values (xmltype('<?xml version="1.0" encoding="UTF-8"?>'||chr(10)||'<x>'||unistr('\2026')||'</x>'||chr(10)));
    -- chr(10) is a linefeed code.
    commit;
    2. Create QueryXMLType.java as follows:
    import java.sql.*;
    import oracle.sql.*;
    import oracle.jdbc.*;
    import oracle.xdb.XMLType;
    import java.util.*;
    public class QueryXMLType {
         public static void main(String[] args) throws Exception, SQLException {
              DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
              OracleConnection conn = (OracleConnection) DriverManager.getConnection("jdbc:oracle:oci8:@localhost:1521:orcl", "testxml", "testxml");
              OraclePreparedStatement stmt = (OraclePreparedStatement) conn.prepareStatement("select xml from testtab");
              ResultSet rs = stmt.executeQuery();
              OracleResultSet orset = (OracleResultSet) rs;
              while (rs.next()) {
                   XMLType xml = XMLType.createXML(orset.getOPAQUE(1));
                   System.out.println(xml.getStringVal());
              }
              rs.close();
              stmt.close();
         }
    }
    3. Compile QueryXMLType.java and execute QueryXMLType.class as follows:
    export PATH=$ORACLE_HOME/jdk/bin:$PATH
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
    export CLASSPATH=.:$ORACLE_HOME/jdbc/lib/ojdbc5.jar:$ORACLE_HOME/jlib/orai18n.jar:$ORACLE_HOME/rdbms/jlib/xdb.jar:$ORACLE_HOME/lib/xmlparserv2.jar
    javac QueryXMLType.java
    java QueryXMLType
    -> Then you will see U+2026 character (horizontal ellipsis) is properly output.
    My Java code came from "Oracle XML DB Developer's Guide 11g Release 1 (11.1) Part Number B28369-04" with some modification of:
    - Example 14-1 XMLType Java: Using JDBC to Query an XMLType Table
    http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28369/xdb11jav.htm#i1033914
    and
    - Example 18-23 Using XQuery with JDBC
    http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28369/xdb_xquery.htm#CBAEEJDE

  • Scanning files for non-unicode characters.

    Question: I have a web application that allows users to take data, enter it into a webapp, and generate an XML file on the server's filesystem containing the entered data. The code of this application cannot be altered (outside vendor). I have a second webapp, written by yours truly, that has to parse through these XML files to build a dataset used elsewhere.
    Unfortunately I'm having a serious problem. Many of the web application's users are apparently cutting and pasting their information from other sources (frequently MS Word), and in the process are embedding non-unicode characters in the XML files. When my application attempts to open these files (using DocumentBuilder), I get a SAXParseException "Document root element is missing".
    I'm sure others have run into this sort of thing, so I'm trying to figure out the best way to tackle this problem. Obviously I'm going to have to start pre-scanning the files for invalid characters, but finding an efficient method for doing so has proven to be a challenge. I can load the file into a String array and search it character by character, but that is both extremely slow (we're talking thousands of LONG XML files) and would require that I predefine the invalid characters (so anything new would slip through).
    I'm hoping there's a faster, easier way to do this that I'm just not familiar with or have found elsewhere.

    "require that I predefine the invalid characters"
    This isn't hard to do, and it isn't subject to change. The XML recommendation tells you exactly which characters are valid in XML documents.
    However, if your problems extend to the sort of case where users paste code including the "&" character into a text node without escaping it properly, or they drop in MS Word "smart quotes" in the incorrect encoding, then I think you'll just have to face up to the fact that allowing naive users to generate uncontrolled wannabe-XML documents is not really a viable idea.
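
    As an illustration of those ranges, here is a minimal Java sketch (the class and method names are mine) of pre-filtering input against the XML 1.0 character table before it reaches DocumentBuilder:

    public class XmlCharFilter {
        // remove characters that are not legal in XML 1.0 documents,
        // per the character ranges in the XML recommendation
        static String stripInvalidXmlChars(String in) {
            StringBuilder out = new StringBuilder(in.length());
            for (int i = 0; i < in.length(); ) {
                int cp = in.codePointAt(i);           // handles surrogate pairs
                boolean valid = cp == 0x9 || cp == 0xA || cp == 0xD
                        || (cp >= 0x20 && cp <= 0xD7FF)
                        || (cp >= 0xE000 && cp <= 0xFFFD)
                        || (cp >= 0x10000 && cp <= 0x10FFFF);
                if (valid) out.appendCodePoint(cp);   // keep legal characters only
                i += Character.charCount(cp);
            }
            return out.toString();
        }

        public static void main(String[] args) {
            // control character 0x0B is illegal in XML 1.0 and gets dropped
            System.out.println(stripInvalidXmlChars("ok\u000Bok"));  // prints okok
        }
    }

    Note that this silently drops the offending characters; whether that is acceptable depends on the data.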

  • Special Unicode characters in RSS XML

    Hi,
    I'm using an adapted version of Husnu Sensoy's solution (http://husnusensoy.wordpress.com/2007/11/17/o-rss-11010-on-sourceforgenet/ - thanks, Husnu) to consume RSS feeds in an Apex app.
    It works a treat, except in cases where the source feeds contain special unicode characters such as [right double quotation mark - 0x92 0x2019] (thank you, http://www.nytimes.com/services/xml/rss/nyt/GlobalBusiness.xml)
    These cases fail with
    ORA-31011: XML parsing failed
    ORA-19202: Error occurred in XML processing
    LPX-00217: invalid character 8217 (U+2019) Error at line 19
    Any ideas on how to translate these characters, or replace them with something innocuous (UNISTR?), so that the XML transformation succeeds?
    Many thanks,
    jd
    The relevant code snippet is:
    procedure get_rss
    (  p_address                 in httpuritype
    ,  p_rss                    out t_rss
    )
    is
       l_sqlerrm    varchar2(4000);
       function oracle_transformation
          return       xmltype is
          l_result     xmltype;
       begin
          select xslt
          into   l_result
          from   rsstransform
          where  rsstransform = 0;
          return l_result;
       exception
       when no_data_found then
          raise_application_error(-20000, 'Transformation XML not found');
       when others then
          l_sqlerrm := sqlerrm;
          insert into errorlog...
       end oracle_transformation;
    begin
       xmltype.transform(p_address.getXML()
                        ,oracle_transformation
                        ).toobject(p_rss);
    exception
    when others then
      l_sqlerrm := sqlerrm;
      insert into errorlog....
    end get_rss;
    My environment:
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
    PL/SQL Release 11.2.0.1.0 - Production
    CORE 11.2.0.1.0 Production
    TNS for Linux: Version 11.2.0.1.0 - Production
    NLSRTL Version 11.2.0.1.0 - Production
    PARAMETER VALUE
    NLS_LANGUAGE AMERICAN
    NLS_CHARACTERSET WE8ISO8859P1
    NLS_NCHAR_CHARACTERSET AL16UTF16
    NLS_COMP BINARY
    NLS_LENGTH_SEMANTICS BYTE
    NLS_NCHAR_CONV_EXCP FALSE

    environment
    Oracle 10g R2 x86 10.2.0.4 on RHEL4U8 x86.
    db NLS_CHARACTERSET WE8ISO8859P1
    After following this note:
    Changing US7ASCII or WE8ISO8859P1 to WE8MSWIN1252 [ID 555823.1]
    the nls_charset was changed:
    Database character set WE8ISO8859P1
    FROMCHAR WE8ISO8859P1
    TOCHAR WE8MSWIN1252
    And the error:
    ORA-31011: XML parsing failed
    ORA-19202: Error occurred in XML processing
    LPX-00217: invalid character 8217 (U+2019)
    was no longer generated.
    A Unicode database charset was not required in this case.
    hth.
    Paul

  • Illustrator CS6 is not displaying unicode characters properly.

    I need to access certain unicode characters (namely the x/8 fractions). I have the appropriate font installed, and the characters display in every other program on the machine, but they are not displaying correctly in Illustrator. If I type the characters in using alt codes, the 3/8 character displays as "\", the 1/8 character displays as "[", etc. Is there some setting I need to change to enable support for Unicode characters?
    *UPDATE*
    If I type the character in another program, copy it, and paste it into Illustrator, it shows up as a box with an X through it. If I highlight the box and change the font to one that supports the character, then the character does display correctly. This is, however, quite inconvenient. Is there a way to type the character directly into Illustrator?
    Message was edited by: NovakDamien

    I use the following method...
    Mike

  • What table column size is needed to accomodate Unicode characters

    Hi guys,
    I have encountered something which I don't understand, and I hope the gurus here will shed some light on it.
    I am running a non-unicode database and I decided to port the data over to a unicode database.
    So:
    1) I exported the schema --> data.dmp
    2) then I created the unicode database + created a user
    3) then I imported the schema into the database
    During the imp I can see that character conversion will take place.
    While importing the data into the unicode database I encountered an error saying a column size is too small.
    So I went to check the row that has the column value that is too large to fit in the table.
    I realised it has some [][][][] data, so I went to the live non-unicode database and found the row. Indeed it has some [][][][] rubbish data, which makes me feel that someone has inserted a language other than English into the database.
    But regardless, I went to modify the column size to a larger size, and now the row can be accommodated. However the data is still [][][].
    q1) Why so? Since my database is now unicode, this column data [][][] should have been converted to unicode during the import, but I still have problems seeing what language it is.
    q2) Why does the [][][] data fit into the table column on the non-unicode database, but on the unicode database the same column size needs to be increased?
    q3) While doing more research on unicode, I read that a unicode character takes up 2 bytes per character. A lot of my table data is exactly the same size as the table column size.
    E.g. Name VARCHAR2(5);
    value - 'Peter'
    Now if converting to unicode, characters will take 2 bytes instead of 1; isn't 'Peter' going to take up 10 bytes (2 bytes per character)?
    Why is it that I can still accommodate the data in the table column?
    q4) Now with the unicode database up, I will be supporting different language characters from around the world. How big should I set my column sizes? The longest a name can get? Or?
    Thanks guys!

    /// does oracle automatically "look" at the each and individual characters in a word and determine how much byte it should take.
    Characters usually originate from a keyboard, which has an associated keyboard layout and an associated character set encoding (a.k.a. code page, a.k.a. encoding). This means the keyboard driver knows that when a key with the letter "á" on it is pressed on a French keyboard, and the associated character set encoding is MS Code Page 1252 (Oracle name WE8MSWIN1252), then one byte with the value 225 is generated. If the associated character set encoding is UTF-16LE (the standard internal Windows encoding), two bytes 225 and 0 are generated. When the generated bytes travel through APIs, they may undergo character set conversions from one encoding to another. The conversion algorithms use translation tables to find out how to translate a given byte sequence from one encoding to another. In the case of translation from WE8MSWIN1252 to AL32UTF8, Oracle will know that the byte sequence resulting from conversion of the code 225 should be 195 followed by 161. For a Chinese character, for example when converting it from ZHS16GBK, Oracle knows the resulting sequence as well, and this sequence is usually 3 bytes long.
    This is how AL32UTF8 data gets into a database. Now, when Oracle processes a multibyte string, and needs to look at individual characters, for example to count them with LENGTH, or take a substring with SUBSTR, it uses information it has about the structure of the character set. Multibyte character sets are of two type: fixed-width and variable-width. Currently, Oracle supports only one fixed-width multibyte character set in the database: AL16UTF16, which is Oracle's name for Unicode UTF-16BE encoding. It supports this character set for NCHAR/NVARCHAR2/NCLOB data types only. This character set uses two bytes per each character code. To find the next code, 2 is simply added to the string pointer.
    All other Oracle multibyte character sets are variable-width character sets, including AL32UTF8. In most cases, the length of each character code can be determined by looking at its first byte. In AL32UTF8, the number of 1-bits in the most significant positions in the first byte before the first 0-bit tells how many bytes a character has. 0 such bits means 1 byte (such codes are identical to 7-bit ASCII), 2 such bits mean two bytes, 3 bits mean 3 bytes, 4 bits mean four bytes. 1 bit (e.g. the bit sequence 10) starts each second, third or fourth byte of a code.
    In other ASCII-based multibyte character sets, the number of bytes is usually determined by the value range of the first byte. Bytes below 128 means a one-byte code, bytes above 128 begin a two- or three-byte sequence, depending on the range.
    There are also EBCDIC-based (mainframe) multibyte character sets, a.k.a shift-sensitive character sets, where a sequence of two-byte codes is introduced by inserting the SO character (code 14=0x0e) and ended by inserting the SI character (code 15=0x0f). There are also character sets, like ISO-2022-JP, which use more complicated byte sequences to define the length and meaning of byte sequences but Oracle supports them only in limited number of places.
    /// e.g i have a word with 4 character. the 3rd character will be a chinese character..the rest are ascii character
    /// will oracle use 4 byte per character regardless its ascii(english) or chinese
    No.
    /// or it will use 1 byte per english character then 3 byte for the chinese character ? e.g.total - 6 bytes taken
    It will use 6 bytes.
    Thnx,
    Sergiusz
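
    Sergiusz's byte arithmetic is easy to check from Java, since AL32UTF8 corresponds to UTF-8; a small sketch for illustration:

    import java.nio.charset.StandardCharsets;

    public class Utf8Lengths {
        public static void main(String[] args) {
            // five ASCII letters take five bytes, so 'Peter' still fits VARCHAR2(5)
            System.out.println("Peter".getBytes(StandardCharsets.UTF_8).length);      // 5
            // three ASCII letters plus one Chinese character (3 bytes) take 6 bytes
            System.out.println("ab\u6587c".getBytes(StandardCharsets.UTF_8).length);  // 6
        }
    }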

  • FOI Servlet non-unicode characters cannot be processed

    Hello,
    I'm using Oracle MapViewer 10.1.3.1 quickstart kit to test some map features
    my database is in CL8MSWIN1251 charset
    I made a simple map application to display some data using JavaScript API
    when I define a theme based FOI layer in the map and the predefined theme has some non-Unicode characters in the labeling or in hidden info fields I get the folowing error:
    Cannot process the following response from FOI server:
    {"foiarray":[{"id":"AAARiqAAEAAAzFgAAA","name":"\u422\u414","gtype":"2001","imgurl":"http://localhost:8888/mapviewer/images/foi/p_16_13_MVDEMO_M.IMAGE131_BW.png","x":"50.0","y":"50.0","width":"16","height":"13","attrs":["987654321","100"]}],"attrnames":["BBB","Osn"]}
    As you can see, "\u422\u414" should be "\u0422\u0414", otherwise JavaScript cannot display the characters in the right way. I think the FOI servlet is the problem here.
    Does anyone have the same problem or a solution for it, please?
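
    The diagnosis in the question looks right: a JavaScript \u escape needs exactly four hex digits. As a workaround sketch (this helper is mine, not part of MapViewer), the escapes can be produced zero-padded on the server side:

    public class JsEscaper {
        // emit \uXXXX escapes padded to four hex digits ("\u0422", not "\u422"),
        // which is what a JavaScript client can actually decode
        static String toJsEscapes(String s) {
            StringBuilder sb = new StringBuilder();
            for (char c : s.toCharArray()) {
                if (c < 128) {
                    sb.append(c);                                  // ASCII passes through
                } else {
                    sb.append(String.format("\\u%04x", (int) c));  // zero-padded escape
                }
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(toJsEscapes("\u0422\u0414"));  // prints \u0422\u0414
        }
    }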


  • Terminal.app and the European Unicode characters?

    Does anyone have the unicode characters working properly in Terminal.app?
    If I try to write in GNU nano 1.2.4 for instance "örrör" it translates into:
    (one empty line)
    örr
    ör
    which certainly isn't right. This is especially awkward when editing a unicode text file, where the text then easily becomes more or less garbled. Usually more.
    It doesn't seem to make any difference whether or not I use the Finnish extended (unicode) keyboard layout or the conventional one in nano. If the Terminal.app window preferences are set as UTF-8, it says:
    ?rr
    ?r
    which looks even more garbled.
    In plain bash the characters print like this:
    å = \345
    ä = \344
    ö = \366
    so my mighty apple translates the example string "örrör" as "\366rr\366r".
    Any ideas, anyone?
    PowerBook G4 @ 1.5 GHz   Mac OS X (10.4.4)   1.25 GB DDR SDRAM
    Debian Sarge 3.1 as a slave fetchmail server.

    Hi solarflare,
       My first (and essentially only) language is English as well. However enough folks have asked that I have experimented with multibyte characters. There are so many apps and options involved, it's difficult to get consistent results. However, I'll recount as many settings as I can recall.
       To begin with, you are right about the LC settings. It helps many apps to have:
    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8
    set in your shell startup scripts. Then the system should be set to produce unicode when you type. In the "Input Menu" tab of the "International" pane of "System Preferences", you should select a unicode keyboard layout, such as U.S. Extended.
       To configure the Terminal, you need to open the "Terminal Inspector" by selecting "Window Settings..." in the "Terminal" menu. To type many multibyte characters, you need the option key. To use it, you must have the "Use option key as meta key" checkbox unchecked, although I find the meta key too important in UNIX to leave that unchecked. In the dropdown menu in the "Display" pane of the "Terminal Inspector", you should set the "Character Set Encoding" to "Unicode (UTF-8)". In the "Emulation" pane of the same window, you must uncheck the "Escape non-ASCII characters" checkbox. That is important as I've read that it is checked by default and that can produce some pretty strange results.
       Now it's helpful to use a very modern shell. For instance, the latest beta version of zsh-4.3 has the best unicode support of all versions of zsh. After you've chosen a good shell, you're at the mercy of the application that you're using. As I gather you've noticed, vim has excellent unicode support and picks up on the LC settings. I have no idea about nano but it is meant to be a minimal text editor.
       I know that my settings allow me to type extended characters and the "Character Palette" lets me insert more. As far as other command line utilities go, the best you can do is to choose well and keep your apps as up-to-date as possible. Fink or Darwin Ports can often help in that regard.
    Gary
    ~~~~
       This generation doesn't have emotional baggage.
       We have emotional moving vans.
             -- Bruce Feirstein

  • Java class names which contain unicode characters

    I need to create, compile and load java classes which have class names that contain unicode characters.
    I am using Win2k but will neet to support unix* in future.
    When I try to create a fooXbar.jar where X is a unicode character which is not ascii I get an error when creating the file.
    My question is: how do I map java class names and package names which contain non ascii characters into
    names that the files systems will like AND that the java VM will use when trying to load .class file from the class path.
    for example what would the .java and .class file be for the following class?
    class \u6587\u66f8 { }

    You could make names for the .java and .class files that are understandable by the filesystem. E.g. you could escape each unicode character with % followed by its hex digits. The problem is then how to compile the class, and how to load it.
    You can load the class with a custom classloader, which translates the unicode class name to the escaped file name (using %).
    The problem is then reduced to how you can compile your code (you have to map the file name to the class name somehow). I think it can be done, but I don't know the solution to that.
    Alternatively you can use meaningful names for the classes, and then make an obfuscator that changes the bytecodes so the class names are changed to some obscure unicode names. Perhaps there are already obfuscators out there that you can use that will produce unicode names.
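
    A rough sketch of that "%" escaping idea (the class name and the exact scheme are mine, just to make the mapping concrete):

    public class ClassNameEscaper {
        // map a class name with non-ASCII characters to a filesystem-safe name,
        // e.g. \u6587\u66f8 becomes %6587%66f8
        static String escape(String className) {
            StringBuilder sb = new StringBuilder();
            for (char c : className.toCharArray()) {
                if (c < 128) {
                    sb.append(c);                                    // ASCII stays
                } else {
                    sb.append('%').append(String.format("%04x", (int) c));
                }
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(escape("\u6587\u66f8"));  // prints %6587%66f8
        }
    }

    A custom classloader would then apply the reverse mapping in findClass() before reading the .class file from disk.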

  • Unicode Characters in Label/JLabels

    Hi All,
    Does anyone know how and when unicode escape sequences within a String get transformed into the characters they represent? I ask because I'm getting conflicting behaviour depending on whether the String is hard-coded or read from a file at runtime.
    For instance, the following code works fine and produces a label on the GUI containing the infinity character:
    String name = "100 to \u221E";
    JLabel label = new JLabel(name);
    However, if <name> is read from an XML file, the label produced shows "100 to \u221E" verbatim.
    Has anyone else seen this effect?
    Thanks in advance for any advice,
    Andy Chamberlain

    "Thanks for that. If I understand correctly, is it therefore the case that by the time the JLabel constructor gets called, the String object ("name", in this case) already has any unicode characters encoded within it?"
    Exactly. The compiled .class file already has the unicode characters in it; JLabel has nothing to do with it.
    "If so, then when debugging, any such characters must get decoded again back to ASCII when the value of "name" is inspected within the debugger environment (JDeveloper in this case)."
    Depends on the unicode-awareness of JDeveloper; I don't know anything about it.
    "And the finger would certainly then point to when the String was created by the XML parser (I'm using org.dom4j.io.SAXReader). I'll investigate this further."
    If you have a text editor that can save a file in UTF-8, you could try saving the xml with the infinity symbol as plain text and specify the encoding of the file with <?xml encoding='UTF-8'?>... Or does your parser accept the &#some-decimal-number; way?
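
    To make the distinction concrete: the compiler only translates \uXXXX sequences that appear in source code, so text read from a file keeps them verbatim. If fixing the XML file itself is not an option, a small decoder can be applied to the parsed string. A sketch (my own helper, with no error handling for malformed escapes):

    public class EscapeDecoder {
        // turn literal "\uXXXX" sequences in runtime data into the characters
        // they name, e.g. "100 to \u221E" becomes "100 to ∞"
        static String decodeUnicodeEscapes(String s) {
            StringBuilder out = new StringBuilder(s.length());
            int i = 0;
            while (i < s.length()) {
                if (s.startsWith("\\u", i) && i + 6 <= s.length()) {
                    out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                    i += 6;   // skip the whole \uXXXX sequence
                } else {
                    out.append(s.charAt(i));
                    i++;
                }
            }
            return out.toString();
        }

        public static void main(String[] args) {
            System.out.println(decodeUnicodeEscapes("100 to \\u221E"));  // 100 to ∞
        }
    }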

  • International Characters Turn to ?'s

    I'm using Kodo 2.5.4 with MySQL. Any international characters (such as
    "__") in MySQL get translated to a "?". Thanks for any clues on how to
    prevent this -
    Sam

    Sam-
    There are 3 possible places where something might be going wrong:
    1. the database might not be storing the international characters
    correctly
    2. the JDBC driver might not be handling the international characters
    properly
    3. Kodo might be doing something wrong with the international characters
    Since Kodo just uses the JDBC driver's String handling, I think #1 and #2
    are more likely.
    Some quick searching on the internet reveals that MySQL did not support
    unicode until version 4.1:
    http://www.mysql.com/doc/en/Charset-Unicode.html
    You might want to try upgrading to see if your problems are magically
    solved.
    Otherwise, you are going to be limited by the capabilities of the JDBC driver, and Kodo doesn't do any special handling for unicode characters (beyond what is provided by the Java language itself). One solution would be to perform your own encoding into the String field, and then perform the decoding when you retrieve the field (see the sketch after this post).
    In article <bn0bdj$eec$[email protected]>, Sam wrote:
    I forgot to mention that I'm also using MySQL version 3.23.58.
    I'm using Kodo 2.5.4 with MySQL. Any international characters (such as
    "__") in MySQL get translated to a "?". Thanks for any clues on how to
    prevent this -
    Sam
    Marc Prud'hommeaux [email protected]
    SolarMetric Inc. http://www.solarmetric.com
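
    A sketch of that encode-before-store workaround (Base64 over UTF-8 bytes is my choice of encoding here, and java.util.Base64 needs Java 8+, so treat it purely as an illustration of the idea):

    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class FieldCodec {
        // encode Unicode text to plain ASCII before persisting it through a
        // driver that mangles it, and decode it again after loading
        static String encode(String value) {
            return Base64.getEncoder()
                         .encodeToString(value.getBytes(StandardCharsets.UTF_8));
        }

        static String decode(String stored) {
            return new String(Base64.getDecoder().decode(stored),
                              StandardCharsets.UTF_8);
        }

        public static void main(String[] args) {
            String stored = encode("日本語");     // only ASCII reaches the driver
            System.out.println(decode(stored));  // the original text comes back
        }
    }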
