UTF8, UTF-16 & Cloudscape
Hi everyone,
I have a problem when I'm trying to store and retrieve Big5 characters in Cloudscape
(which claims to support Unicode, just as Java does). These are the steps:
1. Prepare my Big5 chars
2. Run native2ascii -encoding UTF-16 (UTF8 gives MalformedInputException)
3. Running native2ascii -reverse perfectly reverses everything and I could see the Big5 chars.
So I guess this should be right.
4. Inserted into Cloudscape by SQL: insert into theuser values('\uXXXX', 'elite02'); just to try out.
SQL select shows ? though.
5. I retrieve the String with normal rs.getString();
6. I output the string to a file to check if it is correct by doing:
new OutputStreamWriter() with UTF-16 (I got some other character), UTF8 (gives a ?), and even Big5, and I got garbage.
Correct me if I'm wrong but my rationale is:
Java stores String as Unicode UTF-16 and since Cloudscape claims that I can just get
the Unicode String without any manual conversion, logically my String should contain
the right thing. Now when I output the String to file, wouldn't using the UTF-16 solve my
problem? I mean since String is UTF-16 and my output is also UTF-16, right?
Or do I still need to convert the String from Cloudscape to UTF8 (bear in mind this actually
gives exception with native2ascii tool) or UTF-16 before I can output it to UTF-16 file?
Or am I totally goofed up with this encoding stuff?
thank you guys in advance!
Hi,
the -encoding parameter of native2ascii is the encoding of your source file (the target file is ASCII-encoded).
So step 2 should be native2ascii -encoding Big5. Running native2ascii -reverse with the same -encoding as in step 2 should always lead back to your original input, regardless of which encoding that is (as long as it is the same in both directions).
I don't know Cloudscape, but I guess inserting your Strings from step 2 via a Java program should work.
But you can't insert the values directly via SQL (because \uXXXX only has meaning in Java source).
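To make the point concrete, here is a minimal sketch (plain JDK, hypothetical class name) showing why: the Java compiler resolves \u9875 into the real character before the string ever reaches JDBC, while the same six characters typed into an SQL console stay literal.

```java
public class EscapeDemo {
    public static void main(String[] args) {
        // In Java source, \u9875 is resolved by the compiler into ONE char:
        String fromJava = "\u9875";
        System.out.println(fromJava.length());     // 1

        // Typed literally into an SQL console, the six characters
        // \ u 9 8 7 5 stay six separate characters -- nothing decodes them:
        String asTypedInSql = "\\u9875";
        System.out.println(asTypedInSql.length()); // 6
    }
}
```

So a string inserted from Java (for example via PreparedStatement.setString) carries the real character; a '\uXXXX' typed into an SQL script never does.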
Regards
Jan
Similar Messages
-
I noticed that in the beginner tutorial, the locale-gen section gives you some options.
EDIT!!! Every es_US below is really en_US. In bash I never typed es_US, so I changed my code examples from es_US to en_US.
NOTE!! I am in learning mode, so I am purposely messing with things. I have already got Arch Linux up and running; I am purposely breaking things in order to fix them.
First, I ran nano and looked at the locale.gen file and saw that it had a huge list.
Then I ran
echo LANG=en_US.UTF-8 > /etc/locale.gen
Aside from being wowed by learning what echo does, I saw that this replaced the entire locale.gen file with "LANG=en_US.UTF-8", and on top of that locale-gen then reported an error.
I would prefer this
echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen
So does the beginner guide have an error? Because "LANG=" was never in the locale.gen file to begin with, and just uncommenting a line leaves "en_US.UTF-8 UTF-8", NOT "en_US.UTF-8".
Thank you for your time!
Last edited by AcousticBruce (2015-03-08 15:21:10)
I did not tell you what the error was... that was my fault. The error went away when I changed locale.gen to locale.conf.
Ranman: I have gone over the locale.conf wiki. What is really hard is that when the wiki explains one thing, it explains it in almost another language, so that I, yet again, have to look up something else. So I am hoping my question can just be treated as a simple question.
I can appreciate the intense number of questions you get from people who do not read or care to take the time. However, I am not that guy, so can you please quit telling me to do research. Please read my whole post; I went through the beginner guide many times! It does not tell me what the difference between these two files is.
Yes locale.conf is in the wiki, but locale.gen is NOT. So if it is ok, see my next question...
That brings me to the next question. What is the difference between locale.conf and locale.gen?
If I run locale-gen, will that pull the uncommented line and set it as the language?
Or, if I choose to echo to locale.conf
and export LANG=en_US... does this do the same thing?
Last edited by AcousticBruce (2015-03-08 18:54:46) -
I must not understand something fundamental about UTF-8, because no matter what I try, I can't get any conversion from English to Japanese or Simplified Chinese or any other language. I've tried the Java tutorials, but the output I get is either blanks (in the GUI example), question marks, or English. I also tried changing my font.properties.ja to font.properties and setting the file.encoding system property to "UTF8". Still no workie.
Here is one of my tests; can someone please explain what I'm doing wrong.
public java.lang.String getUTFString(String testString) {
    String mystring = null;
    try {
        byte[] utf8 = testString.getBytes("UTF-8");
        // just to see what the bytes look like;
        // I assume these should be DBCS bytes because
        // my font.properties is the Japanese version.
        for (int ndx = 0; ndx < utf8.length; ndx++)
            System.out.println(utf8[ndx]);
        mystring = new String(utf8, "UTF-8");
        // thought this would print Japanese characters to my terminal
        System.out.println("utf8 string: " + mystring);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return mystring;
}
No, System.out.println is a hopeless tool for testing that sort of thing. The minimum you need is a GUI, including fonts that can actually render the characters you are interested in. It's not entirely clear to me what you expected to happen in that method, either, as it should return its parameter unchanged. "Converting from Japanese to English" -- you didn't expect a translation, did you?
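The "returns its parameter unchanged" point can be verified at the byte level, which is a more reliable test than println. A minimal sketch using the java.nio charset API (Java 7+):

```java
import java.nio.charset.StandardCharsets;

public class RoundTrip {
    public static void main(String[] args) {
        String original = "\u9875\u9762";  // two CJK characters
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);   // 6: three UTF-8 bytes per character
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        // The round trip is lossless -- the String comes back unchanged.
        // Whether a terminal DISPLAYS it correctly is a separate
        // font/terminal-encoding issue, as the reply says.
        System.out.println(original.equals(decoded)); // true
    }
}
```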
-
I need to convert a triple byte unicode value (UTF-8) to UTF-16.
Does anyone have any code to do this. I have tried some code like:
String original = new String("\ue9a1b5");
byte[] utf8Bytes = original.getBytes("UTF8");
byte[] defaultBytes = original.getBytes();
but this does not seem to process the last byte (b5). Also, when I try to convert the hex values to utf-16, it is vastly off.
-Lou
Good question. Answer is, it does.
Oops, sorry, I think I left my brain in the kitchen :)
I was somehow thinking that "hmmm, e is not a hexadecimal digit so that must result in an error"... but of course it is...
Am I representing the triple byte unicode character
wrong? How do I get a 3 byte unicode character into
Java (for example, the utf-16 9875)?
It's simply "\u9875".
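The correspondence can be checked both ways with a short test (plain JDK; the three bytes below are the UTF-8 encoding of U+9875):

```java
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    public static void main(String[] args) {
        // Decode the three UTF-8 bytes E9 A1 B5:
        byte[] utf8 = {(byte) 0xE9, (byte) 0xA1, (byte) 0xB5};
        String s = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(s.equals("\u9875"));           // true
        // And encode U+9875 back to UTF-8:
        byte[] again = "\u9875".getBytes(StandardCharsets.UTF_8);
        System.out.println(again.length);                 // 3
    }
}
```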
If you have byte data in UTF-8 encoding, this is what you can do:
try {
byte[] utf8 = {(byte) 0xE9, (byte) 0xA1, (byte) 0xB5};
String stringFromUTF8 = new String(utf8, "UTF-8");
} catch (UnsupportedEncodingException uee) {
// UTF-8 is guaranteed to be supported everywhere
} -
It looks to me like Oracle Lite doesn't support the UTF8 character set for storing data. Is this correct?
I have an Oracle 8.1.7 database using UTF8 to store English, Thai, Chinese and Filipino data, and I want to synchronise that with Oracle Lite on Windows 2000 clients. The sync seems to work OK, but any non-English characters are lost in a character set conversion (they display as "?").
Regards
Steve
I'm a bit confused by your reply. Can Oracle Lite store data in a Unicode character set such as UTF8? It looks to me like the "UTF-8 support" is limited to the drivers, so that it can load and extract UTF-8 data, but not store it. From section 2.2.1 of the release notes:
"Oracle Lite Database is NOT a NLS component. In order to reduce the kernel size, it is built for each language which supports native character sets for Windows. Which means, each language has each kernel. Here are the character sets supported by this release:
- Chinese: MS936 CodePage (Simplified Chinese GBK, ZHS)
- Taiwanese: MS950 CodePage (Traditional Chinese BIG5, ZHT)
- Japanese: MS932 CodePage (Japanese Shift-JIS, JA)
- Korean: MS949 CodePage (Korean, Ko)
The database kernel for each language in this list only supports its corresponding character set. Other multibyte character sets are not supported."
Also the documentation on the DBCharEncoding parameter you mention suggests that it only affects the UTF translation for java programs. Section A.2.3:
"... Specifies the UTF translation performed by Oracle Lite. If set to NATIVE, no UTF translation is performed. If set to UTF8, UTF translation is performed. If this parameter is not specified, the default is UTF8. This applies to Java programs only."
I've tried playing with these parameters, as well as changing the NLS_LANG parameter on the client, and for the mobile server, for the Oracle Lite home, all to no avail. I'm still losing the non-English data during synchronisation, and it does look like it's being lost in a character set conversion rather than just being garbled, as each Thai character is being replaced by the correct number of "?"s. As an example, the Thai string "บริษัท บราเดอร์ คอมเมอร์เชี่ยล (ประเทศไทย) จำกัด" on the 8.1.7 database server appears as "?????? ???????? ?????????????? (?????????) ?????" on the Oracle Lite database.
Am I missing something here? If I can get this data synchronising correctly then Oracle Lite looks like it will support all our requirements, so any assistance would be greatly appreciated. (Should I post this to the globalization forum, or does that focus only on Oracle's enterprise editions?)
BTW, thanks for the info on the sorting. Obviously the character set issue is the more fundamental problem at this stage, but if we can fix this then it's good to know about the sorting abilities. -
Xfce4 and gnome-network-manager [solved]
I just made a clean install of Arch Linux, installed xorg and Xfce4, xfce-goodies, gnome-panel, gnome-desktop and gnome-network-manager. I also enabled GNOME services in Startup and Sessions. How do I start that network manager now? It is nowhere in the settings, nor is there an icon to add to the taskbar (I have xfapplets installed too).
Last edited by anarxi (2009-01-14 00:22:42)
Quoted from a previous thread:
I solved it by uncommenting my <locale><charset> in /etc/locale.gen.
First you have to enable the locales you want being supported by your system. To enable or disable them, the file /etc/locale.gen is used. It contains every locale you can enable, and you have just to uncomment lines you want to do so.
As we want to setup an english UTF-8 conform system, we want to enable en_US.UTF-8. But for compatibility to programs that don't support UTF-8 yet, it's recommended to support any other locale, prefixed with en_US as well. Having this in mind, we enable this set of locales:
en_US.UTF-8 UTF-8
en_US ISO-8859-1
After you've enabled the necessary locales, you have to run locale-gen as root to update them:
# locale-gen
Generating locales...
en_US.UTF-8... done
en_US.ISO-8859-1... done
Generation complete. -
Need a difficult script, can't do it myself.
Hello,
I'll try to keep it short.
We have a new idea for a customer of ours. I don't know if it's possible, and I hope there is someone who wants to try it and give it a go. I can work it out on paper, although my AppleScript / programming skills won't do the job.
We will work with an excel file. We want to keep it simple for our customer.
They have (more or less) 60 different leaflets and from 100 to 200 agencies.
We would like to print all leaflets for 1 agency from 1 line in excel.
Normally 1 person will get the orders from all agencies (central) and will write it down in excel.
Once a week we will get the database for printing.
But in this case I can't print unless there is an automated process that fixes the database the way I need it.
I first run an AppleScript to convert the database to tab-delimited and do a Unicode check.
So the starting point for this script will be a TXT file, tab delimited, Unicode (Mac).
So the database must have the basic customer information at the beginning, such as company name, street, delivery address, ...
From A to N is the agency information. This information must be copied onto every line.
From column O the database will have PROD1-PROD2-PROD3.
We will have 70 PROD at start.
So the PRODUCTS run from column O to CF.
Under each PROD they will write the quantity they need of this leaflet.
Important: here I write zeros for visibility, but zero needs to be a blank field (for the printing software).
So this is the tricky part: the script needs to duplicate the records in a new database AND
it needs to begin with PROD 1 to PROD 70 AND may only copy the quantity of PROD1, then leave the other PRODUCTS' quantities blank, and go on. Please see the example.
Customer info - PROD1 - PROD 2 - PROD 3
Customer 1 - 5-6-2
Customer 2 - 0-3-5
Customer 200
MUST BECOME:
Customer info - PROD1 - PROD 2 - PROD 3
Customer 1 - 5-0-0
Customer 1 - 5-0-0
Customer 1 - 5-0-0
Customer 1 - 5-0-0
Customer 1 - 5-0-0
Customer 1 - 0-6-0
Customer 1 - 0-6-0
Customer 1 - 0-6-0
Customer 1 - 0-6-0
Customer 1 - 0-6-0
Customer 1 - 0-6-0
Customer 1 - 0-0-2
Customer 1 - 0-0-2
Customer 2 - 0-3-0
Customer 2 - 0-3-0
Customer 2 - 0-3-0
Customer 2 - 0-0-5
Customer 2 - 0-0-5
Customer 2 - 0-0-5
Customer 2 - 0-0-5
Customer 2 - 0-0-5
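In other words, each row is expanded into quantity-many copies per nonzero product column, with every other column blanked. A minimal sketch of that rule in Java (hypothetical, quantities only; the real job also carries the A..N customer columns along):

```java
import java.util.ArrayList;
import java.util.List;

public class ExpandRows {
    // For each product column with quantity q > 0, emit q copies of the
    // row in which only that column keeps its quantity; the others stay 0
    // (to be printed as blanks later for the printing software).
    static List<int[]> expand(int[] quantities) {
        List<int[]> out = new ArrayList<>();
        for (int col = 0; col < quantities.length; col++) {
            int q = quantities[col];
            for (int copy = 0; copy < q; copy++) {
                int[] row = new int[quantities.length];
                row[col] = q;
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Customer 1 orders 5, 6 and 2 of PROD1..PROD3:
        // that expands to 5 + 6 + 2 = 13 rows.
        System.out.println(expand(new int[]{5, 6, 2}).size()); // 13
    }
}
```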
want to give it a try?
Many thanks in advance.
Hello Colin,
Ah, I didn't take the field names' line into consideration...
Here's the revised code to handle it (hopefully). I assumed the field names' line is the 1st line of the input file and the rest are data lines.
Also in my previous code, customerRange and quantityRange are being set to {1, 1} and {2, 6} respectively to process the reduced sample data. However, according to your original description of input data, they should be changed as follows:
property customerRange : {1, 14} -- field range for customer info (e.g. {1, 14} for A..N)
property quantityRange : {15, 84} -- field range for quantities (e.g. {15, 84} for O..CF)
Hope this may help,
Hiroto
-- SCRIPT
(*
v0.2
Modified such that
- it treats the 1st line (field names line) in the input text file properly.
- it now preserves the empty lines in the input text file.
Usage:
Set the following four properties to fit your need and run the script.
property fp : "HFS:path:to:input:file"
property fp1 : "HFS:path:to:output:file"
property customerRange : {1, 14} -- field range for customer info (e.g. {1, 14} for A..N)
property quantityRange : {15, 84} -- field range for quantities (e.g. {15, 84} for O..CF)
It will
- read the input text file (assumed in UTF-8, currently),
- convert text to array,
- expand each record according to given fields' ranges,
- convert processed array back to text,
- write (or overwrite) output text file (in UTF-8, currently).
E.g. (customerRange = {1, 1}, quantityRange = {2, 6})
INPUT TEXT (tab delimited)
Field Names (Column Title) Line
Customer1 0 5 1 0 0
Customer2 2 0 0 0 1
Customer3 0 0 0 0 0
Customer4 1 1 0 0 3
Customer5 0 0 4 0 0
OUTPUT TEXT (tab delimited)
Field Names (Column Title) Line
Customer1 5
Customer1 5
Customer1 5
Customer1 5
Customer1 5
Customer1 1
Customer2 2
Customer2 2
Customer2 1
Customer4 1
Customer4 1
Customer4 3
Customer4 3
Customer4 3
Customer5 4
Customer5 4
Customer5 4
Customer5 4
*)
main()
on main()
script o
-- input file path
property fp : "HFS:path:to:input:file" -- #
-- output file path
property fp1 : "HFS:path:to:output:file" -- #
-- field ranges
property customerRange : {1, 14} -- # field range for customer info (e.g. {1, 14} for A..N)
property quantityRange : {15, 84} -- # field range for quantities (e.g. {15, 84} for O..CF)
-- working lists
property aa : {} -- original array
property dd : {} -- original quantity list
property ee : {} -- modified array (expanded quantity lists)
property ee0 : {}
property ee1 : {}
local c1, c2, d1, d2, c, t, t1
-- (0) preparation
set {c1, c2} to customerRange
set {d1, d2} to quantityRange
-- make empty data list (ee0)
repeat with i from d1 to d2
set end of my ee0 to ""
end repeat
-- (1) read input file (text of paragraphs of tab delimited values)
set t to read file fp as «class utf8» -- UTF-8
--set t to read file fp as Unicode text -- UTF-16
--set t to read file fp -- plain text (in System's primary encoding)
-- (2) convert text to 2d array
set my aa to text2array(t, tab)
-- (3) process each record
set end of my ee to my aa's item 1 -- get the 1st record that is field names record
set aa to my aa's rest -- exclude the 1st record from subsequent processing
repeat with a in my aa -- for each record entry
set a to a's contents
if a is {""} then -- ignore empty record (i.e. empty line in input text if any)
set end of my ee to a -- just leave the empty record alone
else
set c to a's items c1 thru c2 -- customer info parts
set my dd to a's items d1 thru d2 -- quantities
repeat with i from 1 to count my dd -- for each quantity entry
set d to my dd's item i as number
if d > 0 then
copy my ee0 to my ee1 -- make copy of empty data list
set my ee1's item i to d
repeat d times
set end of my ee to (c & ee1)
end repeat
end if
end repeat
end if
end repeat
-- (4) convert 2d array to text
set t1 to array2text(my ee, tab, return)
-- (5) write output file (text of paragraphs of tab delimited values)
writeData(fp1, t1, {_append:false, _class:«class utf8»}) -- UTF-8
--writeData(fp1, t1, {_append:false, _class:Unicode text}) -- UTF-16BE
--writeData(fp1, t1, {_append:false, _class:string}) -- plain text (in System's primary encoding)
-- (*) for test
--return {t, t1}
return t1
end script
tell o to run
end main
on text2array(t, cdelim) -- v1.1, with column delimiter as parameter
(*
text t: text of which paragraphs consist of values delimited by cdelim. e.g. (Let cdelim = tab)
"11 12
21 22
31 32"
string cdelim : column delimiter
return list: two dimensional array. e.g. {{11, 12}, {21, 22}, {31, 32}}
*)
script o
property aa : t's paragraphs
property xx : {}
property astid : a reference to AppleScript's text item delimiters
local astid0
try
set astid0 to astid's contents
set astid's contents to {cdelim}
repeat with a in my aa
set end of my xx to (a's text items)
end repeat
set astid's contents to astid0
on error errs number errn
set astid's contents to astid0
error "text2array(): " & errs number errn
end try
return my xx's contents
end script
tell o to run
end text2array
on array2text(dd, cdelim, rdelim) -- v1.1, with column delimiter and row delimiter as parameter
(*
list dd: two dimensional array. e.g. {{11, 12}, {21, 22}, {31, 32}}
string cdelim : column delimiter
string rdelim : row delimiter (e.g. CR, LF, CRLF, etc)
return string : paragraphs of items delimited by cdelim.
e.g.
(Let cdelim = tab, rdelim = CR)
"11 12
21 22
31 32"
i.e.
11 [tab] 12 [CR]
21 [tab] 22 [CR]
31 [tab] 32 [CR]
*)
script o
property aa : dd's contents
property xx : {}
property astid : a reference to AppleScript's text item delimiters
local t, astid0
try
set astid0 to astid's contents
set astid's contents to {cdelim}
repeat with a in my aa
set end of my xx to ("" & a)
end repeat
set astid's contents to {rdelim}
set t to "" & my xx
set astid's contents to astid0
on error errs number errn
set astid's contents to astid0
error "array2text(): " & errs number errn
end try
return t
end script
tell o to run
end array2text
on writeData(fp, x, {_append:_append, _class:_class})
(*
text fp: output file path
data x: anything to be written to output file
boolean _append: true to append data, false to replace data
type class _class: type class as which the data is written (_class = "" indicates x's class as is)
*)
local a
try
open for access (file fp) with write permission
set a to fp as alias
if not _append then set eof a to 0
if _class = "" then
write x to a starting at eof
else
write x to a as _class starting at eof
end if
close access a
on error errs number errn
try
close access file fp
on error --
end try
error "writeData(): " & errs number errn
end try
end writeData
-- END OF SCRIPT
Message was edited by: Hiroto -
How to do this? More words in one field - split the words
Hello Hello,
I have a simple question about a simple database arrangement.
But the family name and first name are in the same field (column).
I would like to have a database with the first name - family name(s) in different fields.
So when for example the database has Willem De Wortel in one field
I would like to get a database with 3 more fields First name - Family 1 - Family 2
So the first word is the first name : Willem
and the second field is the second name: De, and the third field is the third name: Wortel. For Willem Van De Wortel I get 4 columns.
So the script should see at 1 column in the database.
And make more columns for every word in that column.
So I think most of the time I get a database with 4 extra columns
First name - Second name - third name - fourth name
I prefer to make extra columns starting next to the selected column
so I get a new database, tab delimited, Mac OS Roman, with (most of the time) 4 extra columns where the name is separated into more columns. And the new columns are next to the original column.
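The requested split amounts to: break the name field on spaces and pad with empty columns up to a fixed width. A sketch of that rule in Java (hypothetical names, for illustration only):

```java
import java.util.Arrays;

public class SplitName {
    // Split a full name on runs of spaces and pad to a fixed column count.
    static String[] splitToColumns(String fullName, int columns) {
        String[] parts = fullName.trim().split(" +");
        String[] out = new String[columns];
        Arrays.fill(out, "");                  // empty string = blank column
        for (int i = 0; i < parts.length && i < columns; i++) {
            out[i] = parts[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // "Willem Van De Wortel" -> 4 filled columns, the rest blank:
        String[] cols = splitToColumns("Willem Van De Wortel", 4);
        System.out.println(String.join(" | ", cols));
    }
}
```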
I hope someone understands me.
Hello Colin,
Here it goes. Try this one.
May this work well.
Hiroto
P.S. Sorry I misspelled your name in previous posts...
--SCRIPT 2
(*
E.g.
infile.txt (; denotes tab)
name;field2;field3;field4
d11;d12;d13;d14
d21;d22;d23;d24
d31;d32;d33;d34
d41;d42;d43;d44
* Here, the 1st field is for name
outfile.txt (; denotes tab)
name;name 1;name 2;...;name 9;name 10;field2;field3;field4
d11;d11[1];d11[2];...;d11[9];d11[10];d12;d13;d14
d21;d21[1];d21[2];...;d21[9];d21[10];d22;d23;d24
d31;d31[1];d31[2];...;d31[9];d31[10];d32;d33;d34
d41;d41[1];d41[2];...;d41[9];d41[10];d42;d43;d44
* X[i] denotes X's i'th substring delimited by space
(If X is in double quotes, each X[i] is also enclosed in double quotes)
e.g.,
Given X = "Willem Van De Wortel",
X[1] = "Willem"
X[2] = "Van"
X[3] = "De"
X[4] = "Wortel"
X[5]..X[10] = (empty)
*)
on run
open (choose file with prompt "Choose input file") as list
--open (choose file with prompt "Choose input file(s)" with multiple selections allowed) as list -- AS1.9.2 or later
end run
on open aa
repeat with a in aa
set infile to a as string
set {p, m, x} to {|parent|, |name stem|, |name extension|} of getPathComponents(infile)
set outfile to p & m & "-converted" & x
main(infile, outfile)
end repeat
end open
on main(infile, outfile)
(*
string infile : HFS path of input file
string outfile : HFS path of output file
*)
script o
property targetfield : 1 -- # target field index
property subfields : {¬
"name 1", "name 2", "name 3", "name 4", "name 5", ¬
"name 6", "name 7", "name 8", "name 9", "name 10"} -- # additional sub field's names
property sublen : count subfields
property text_class : string -- # input & output text class [1]
--property text_class : «class utf8» -- UTF-8
(*
[1] Input and output text class (: text encoding)
string : System's primary encoding; e.g. Mac-Roman
«class utf8» : UTF-8
Unicode text : UTF-16BE
*)
property ORS : return -- # output record separator
--property ORS : linefeed
property pp : {}
property qq : {}
property rr : {}
property mm : {}
-- (0) read input file
set t to read file infile as text_class
set pp to t's paragraphs
-- (1) build new header row
set h1 to my pp's item 1 -- original header row (from line 1 of infile)
set hh1 to text2list(h1, tab)
set hh1's item targetfield to {hh1's item targetfield} & subfields -- add new sub fields
set h to list2text(hh1, tab) -- new header row
set qq to {h}
-- (2) process each data row (line 2 .. line -1)
repeat with i from 2 to count my pp --pp's item i is data row (i - 1)
set p to my pp's item i
if p = "" then -- skip any empty row
set end of qq to p
else
-- get row data, name and name components
set rr to text2list(p, tab) -- current row data
set n to my rr's item targetfield -- full name
set mm to text2list(n, space) -- name components delimited by space
-- special treatment in case name is enclosed in double quotes
if n starts with "\"" then -- original name is in double quotes
set mm to text2list(list2text(my mm, "\"" & tab & "\""), tab) -- quote every component
end if
-- adjust name components' length to match the given field length
set delta to sublen - (count my mm)
repeat delta times
set end of my mm to "" -- pad "" to end
end repeat
if delta < 0 then set mm to my mm's items 1 thru sublen -- truncate any extra, just in case
-- build new row data
set my rr's item targetfield to {n} & my mm
set end of my qq to list2text(my rr, tab)
end if
end repeat
-- (3) build output text and write it to output file
set t1 to list2text(qq, ORS)
writeData(t1, outfile, {_append:false, _class:text_class})
return t1
end script
tell o to run
end main
on list2text(aa, delim)
(*
list aa : source list
text delim : text item delimiter in list-text coercion
*)
local t, astid, astid0
set astid to a reference to AppleScript's text item delimiters
try
set astid0 to astid's contents
set astid's contents to {delim}
set t to "" & aa
set astid's contents to astid0
on error errs number errn
set astid's contents to astid0
error "list2text(): " & errs number errn
end try
return t
end list2text
on text2list(t, delim)
(*
text t : source text
text delim : text item delimiter in text-list conversion
*)
local tt, astid, astid0
set astid to a reference to AppleScript's text item delimiters
try
set astid0 to astid's contents
set astid's contents to {delim}
set tt to t's text items
set astid's contents to astid0
on error errs number errn
set astid's contents to astid0
error "text2list(): " & errs number errn
end try
return tt
end text2list
on writeData(x, fp, {_append:_append, _class:_class})
(*
data x: anything to be written to output file
string fp: output file path
boolean _append: true to append data, false to replace data
type class _class: type class as which the data is written
*)
local fref
try
set fref to open for access (file fp) with write permission
if not _append then set eof fref to 0
write x as _class to fref starting at eof
close access fref
on error errs number errn
try
close access file fp
on error --
end try
error "writeData(): " & errs number errn
end try
end writeData
on getPathComponents(a)
(*
alias or HFS path string : a
return record : {|parent|:p, |name|:n, |name stem|:m, |name extension|:x}, where -
p = parent path (trailing colon inclusive)
n = node name (trailing colon not inclusive)
m = node name without name extension (trailing period not inclusive)
x = name extension (leading period inclusive; i.e. n = m & x)
*)
local astid, astid0, fp, p, n, m, x
set astid to a reference to AppleScript's text item delimiters
set astid0 to astid's contents
try
-- (0) preparation (strip trailing ":")
set fp to a as Unicode text
if fp ends with ":" and fp is not ":" then set fp to fp's text 1 thru -2
-- (1) get node's parent path and node name
set astid's contents to {":"}
tell fp's text items
if (count) ≤ 1 then
set {p, n} to {"", fp}
else
set {p, n} to {(items 1 thru -2 as Unicode text) & ":", item -1}
end if
end tell
-- (2) get node name stem and extension
set astid's contents to {"."}
tell n's text items
if (count) ≤ 1 then
set {m, x} to {n, ""}
else
set {m, x} to {items 1 thru -2 as Unicode text, "." & item -1 as Unicode text}
end if
end tell
set astid's contents to astid0
on error errs number errn
set astid's contents to astid0
error "getPathComponents(): " & errs number errn
end try
return {|parent|:p, |name|:n, |name stem|:m, |name extension|:x}
end getPathComponents
--END OF SCRIPT 2
Message was edited by: Hiroto (fixed the code a bit) -
[Upgrade] XML & Encoding
Hi,
<u><b>Context</b></u>
I developed an AJAX BSP application in a BW 3.0 environment. I use <i>CALL TRANSFORMATION id</i> to return an XML string to the calling page, where some Javascript parses it (<i>getElementsByTagName</i>, ...) and updates input fields.
<b>In BW 3.0B</b>
The BSP pages were encoded in <b>iso8859-1</b>
The XML returned was also <b>iso8859-1</b> encoded.
<b>In BI 7</b>
Now, we upgraded to BI 7 with Unicode.
The BSP pages are now encoded in <b>utf-8</b> (when I look at the HTTP Header Fields)
The XML returned is <b>utf-16</b> encoded (it is the responseText attribute that I refer to)
The problem is that the responseXML object which I use in Javascript seems to be either corrupt or unreadable (I bet on encoding reasons).
I tried to specify the encoding like this :
CALL METHOD response->set_header_field( name = 'Content-Type'
value = 'text/xml; charset=utf-16' ).
but without success.
<u><b>Questions</b></u>
Do I have to :
1) Re-encode the XML string
2) Use binary instead of character
Besides, is there a way to inspect the responseXML? (Today I launch alert(responseXML.childNodes.length);)
Thanks in advance.
Best regards,
Guillaume
Hi Raja,
Many thanks for these valuable insights !
I was able to <b>correct my code partly</b>.
Indeed, the responseXML object is back to normal with the following addition to CALL TRANSFORMATION id :
OPTIONS xml_header = 'WITHOUT_ENCODING'
But, as I am dealing with several tables that I gather into one XML string, <b>I use the cl_xml_document_base class</b> like this :
CREATE OBJECT wcl_xml_doc_inputs.
CALL METHOD wcl_xml_doc_inputs->parse_string( stream = w_xml_inputs ).
CREATE OBJECT wcl_xml_doc_except_msg.
CALL METHOD wcl_xml_doc_except_msg->parse_string( stream = w_xml_except_msg ).
wcl_node_inputs = wcl_node_inputs->get_parent( ).
wcl_node_inputs->append_child( new_child = wcl_node_except_msg ).
CALL METHOD wcl_xml_doc_inputs->render_2_string
EXPORTING
pretty_print = 'X'
IMPORTING
stream = o_xml.
This might also affect the final encoding because the responseXML is unreadable yet again.
I tried to set some :
CALL METHOD wcl_xml_doc_inputs->set_encoding( charset = ... ).
with iso8859-1, utf8, utf-16 but still unsuccessful.
Thanks in advance.
Best regards,
Guillaume -
Hi,
In JDeveloper, I can edit the sources using Turkish characters, but when I try to save the source code it appears to save the characters correctly; after closing and reopening the file, I see that the Turkish characters have been replaced by ?. I set the Project -> Project Settings -> Compiler -> Character Encoding setting to UTF8, UTF-8, and Windows-1254, but nothing changed. Does anyone know what I'm missing in the settings? My regional settings are also set to Turkish/Turkey.
Thanks,
Fatih ER
Gnome install problem ? is mirror out of sync ? [Solved]
Hi, I did pacman -S gnome and extra, but when I want to start a gnome-session I just get errors, while my Xfce4 works fine.
I don't know why GNOME won't start.
Is the mirror out of sync?
What is your locale set to, and what do you use to start GNOME?
The gnome-session binary should get started from .xinitrc, while KDE and Xfce4 have a start script that launches X for you instead. GNOME will come with such a script in a while, because the dbus-launch things that are required for many GNOME apps nowadays cause many problems for people who don't add them to their .xinitrc.
About the locale: make sure you pick one from /etc/locale.gen and have it enabled there. Take note of the UTF8/UTF-8 differences. Some locales work with both write methods, most locales fail with these warnings if you don't pick the locale from /etc/locale.gen. -
I want to create an email attachment encoded in Unicode, sent from a servlet. I've tried creating a temporary file and attaching that file. I can see that the temporary file on my server is encoded correctly, but after I receive the attachment, it's a bunch of "?". I've also tried using the (Unicode) string directly, with the same problem. Has anybody done this before? Many thanks!
Here is my code:
Properties props = new Properties();
props.put("mail.smtp.host", mymailhost);
Session s = Session.getInstance(props,null);
MimeMessage message = new MimeMessage(s);
message.setFrom(new InternetAddress("[email protected]"));
message.addRecipient(Message.RecipientType.TO,
new InternetAddress(req.getParameter("emailAddr")));
message.setSubject(subject);
// send a multipart message// create and fill the first message part
MimeBodyPart mbp1 = new MimeBodyPart();
String msgText = new String(mystr.getBytes("8859_1"),"UTF-8");
mbp1.setText(msgText, "UTF-8");
//String file = getServletContext().getRealPath("/"); + "/temp/" + request.getSession().getId() + String.valueOf((new Date()).getTime())+ ".txt";
//writeOutput(msgText, file);
MimeBodyPart mbp2 = new MimeBodyPart();
// DataSource fds = new FileDataSource(file);
//mbp2.setDataHandler(new DataHandler(fds));
mbp2.setDataHandler(new DataHandler(msgText, "text/html;charset=\"UTF-8\""));
//mbp2.setText(msgText, "UTF-8");
Multipart mp = new MimeMultipart();
mp.addBodyPart(mbp1);
mp.addBodyPart(mbp2);
message.setContent(mp);
message.setSentDate(new Date());
private void writeOutput(String str, String file) {
    try {
        FileOutputStream fos = new FileOutputStream(file);
        Writer out = new OutputStreamWriter(fos, "UTF8");
        out.write(str);
        out.close();
    } catch (IOException ex) {
        ex.printStackTrace();
    }
}
For me it worked when I used
setText(msgText, "utf8");
UTF-8 and utf-8 did not work!
complete:
MimeBodyPart bodyPartAttachment = new MimeBodyPart();
bodyPartAttachment.setHeader("Content-Transfer-Encoding", "quoted-printable");
bodyPartAttachment.setHeader("Content-Type", "text/html; charset=utf8");
bodyPartAttachment.setText(sAttachment, "utf8");
bodyPartAttachment.setDisposition(Part.ATTACHMENT);
bodyPartAttachment.setFileName(...);
bodyPartAttachment.setDescription("File Attachment");
multipart.addBodyPart(bodyPartAttachment); -
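The new String(mystr.getBytes("8859_1"), "UTF-8") line in the question is the classic re-decode trick for servlet containers that decode UTF-8 request bytes as Latin-1. A small self-contained sketch of when it works (plain JDK, no JavaMail needed):

```java
import java.nio.charset.StandardCharsets;

public class RedecodeDemo {
    public static void main(String[] args) {
        String original = "Caf\u00e9"; // é
        // Simulate a container that decoded UTF-8 bytes as Latin-1:
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        String misdecoded = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        // The trick re-encodes as Latin-1 (recovering the raw bytes)
        // and then decodes them as the UTF-8 they really were:
        String repaired = new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
        System.out.println(repaired.equals(original)); // true
    }
}
```

Note that if the parameter was decoded correctly in the first place, the same line corrupts it, so it only belongs where the container's misdecoding is known.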
PBE with blowfish : is this solution secure
Hi,
I have designed a solution to do PBE with Blowfish. Assuming there are no weak keys in Blowfish, and using a 160-bit key (due to SHA-1), is this code secure?
MessageDigest md = MessageDigest.getInstance("SHA-1");
byte[] password = "here is the password I use !".getBytes();
byte[] key = md.digest(password); // hash the password on 160bits
SecretKeySpec skspec = new SecretKeySpec(key, "Blowfish"); // prepare the key for the cipher
Cipher cipher = Cipher.getInstance("Blowfish/ECB/PKCS5Padding"); // create the cipher
cipher.init(Cipher.ENCRYPT_MODE,skspec);
byte[] someCipherText = cipher.doFinal(somePlainText);
Thanks
Oliver.
Also, Sun should really add public static final constants to the String class for the names of the encodings supported on all platforms:
class String {
public static final String ASCII = "US-ASCII";
public static final String ISOLAT1 = "ISO8859_1";
public static final String UTF8 = "UTF-8";
}
This would be a strong sign that at least these minimum encodings will be supported. Alternatively, these constants could be placed in a separate new class or interface (named, for example, Encoding).
Maybe there should also exist a value that explicitly specifies the native platform encoding, if we still need it:
public static final String NATIVE;Then there would be no more excuse to not use an explicit encoding:
byte[] buffer = new byte[] { (byte)65, (byte)66, (byte)67 };
String abc = new String(buffer, Encoding.ASCII); // returns a string equal to "ABC"
buffer = new byte[] { (byte)0xc4, (byte)0x80, (byte)0 };
abc = new String(buffer, Encoding.UTF8); // returns a string equal to "\u0100\u0000" -
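For what it's worth, later JDKs added exactly this kind of constant: java.nio.charset.StandardCharsets (Java 7+) guarantees US_ASCII, ISO_8859_1, UTF_8, UTF_16 and friends on every platform. The \u0100 example above, rewritten with those constants:

```java
import java.nio.charset.StandardCharsets;

public class CharsetConstantsDemo {
    public static void main(String[] args) {
        // UTF-8 bytes 0xC4 0x80 decode to U+0100 (LATIN CAPITAL LETTER A WITH MACRON)
        byte[] buffer = { (byte) 0xC4, (byte) 0x80 };
        String s = new String(buffer, StandardCharsets.UTF_8);
        System.out.println((int) s.charAt(0)); // 256
        // Guaranteed constants: no magic strings, and the Charset overloads
        // don't throw the checked UnsupportedEncodingException:
        byte[] ascii = "ABC".getBytes(StandardCharsets.US_ASCII);
        System.out.println(ascii.length); // 3
    }
}
```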
Why won't the accents show up?
I have a page (utf-8) that calls a list of Spanish cities
from a MySQL table (utf8) via PHP.
The database table has the correct names (with accents).
The PHP page will show the correct names if I type them in
(also in the source I see letters with accents).
But the PHP that grabs the names and displays them will only
show [squares] instead of accented letters.
Please, anyone, what's going on?
Thanks for the help - but I still have a problem.
My database details:
version: 5.0.45-community-nt
char set: utf8 (utf-8 unicode)
collation: utf8_general_ci
My table 'places' has records like Cataluña (the n has a tilde).
1)
Connection script: //$q="SET NAMES 'utf8' ";
$r=mysqli_query($DBlink,$q) or die(mysqli_error($DBlink)." Q=".$q);
Page header: <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
= the drop-down list of places shows correctly
2)
Page header: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
= the drop-down has squares
3)
Using SET NAMES and page=utf-8
= the drop-down displays correctly
So I am thinking that setup (3) is the best for my English/Spanish website.
I have the data display done; next I will build the user-input form.
I have another problem...
I have an include page, require_once 'newdropdown.php';, which is just PHP (triggered by JavaScript):
SELECT moreplaces...
echo '<select name="new">...
Strangely, this always displays incorrectly, no matter what I try.
How can I make this second drop-down display correctly? -
Multiline JLabel in Japanese?
Hi.
On Windows NT 4.0, I am trying to get a JLabel to display multiple lines in Japanese.
With English or other Latin-based languages, this can be done easily enough using HTML tags. However, when I try to mix HTML tags with Japanese, the Japanese will not be rendered. Rather, it will display the infamous square box.
Below is sample code which renders the two cases (uncomment the appropriate case).
Any ideas?
import javax.swing.*;
import java.awt.*;
public class test {
    public static void main(String[] args) {
        JFrame frame = new JFrame("HelloWorldSwing");
        final JLabel label = new JLabel();
        Font f = new Font("MS Mincho", Font.PLAIN, 24);
        label.setFont(f);
        // This doesn't render the Japanese, but it renders Cyrillic
        label.setText("<html><body>\u0414<br>\u3059</body></html>");
        // This renders both Japanese and Cyrillic properly
        //label.setText("\u0414\u3059");
        frame.getContentPane().add(label);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.pack();
        frame.setVisible(true);
    }
}
One thing I forgot to mention...
I tried setting the following as well:
label.setText("<html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF8\"></head><body>\u0414<br>\u3059</body></html>");
For "charset", I tried various settings (shift_jis, utf8, utf-8, utf-16). However, I may have gotten this wrong, so I am mentioning it in case it helps anyone find the correct answer more easily.
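One workaround sometimes suggested for this (an assumption here, not verified against the original poster's JDK): older Swing HTML renderers pick their own default font and can ignore setFont(), so a default font lacking the Japanese glyphs produces the square boxes. Naming the font inside the HTML itself keeps both the line break and the glyphs:

```java
import javax.swing.JLabel;

public class JapaneseHtmlLabel {
    public static void main(String[] args) {
        // Name the font inside the HTML instead of relying on setFont();
        // "MS Mincho" is the font from the post above. Actual rendering
        // needs a display; this only demonstrates the markup.
        JLabel label = new JLabel(
                "<html><body><font face=\"MS Mincho\" size=\"5\">"
                + "\u0414<br>\u3059</font></body></html>");
        System.out.println(label.getText().contains("\u3059")); // true
    }
}
```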