SVN Merge - Another Unicode/UTF-8 problem

When merging conflicts in Unicode/UTF-8 files (Java, XML, JSPX) the resulting file loses UTF-8 characters.
This is yet another serious Unicode/UTF-8 bug in JDev. What is going on with Unicode/UTF-8 support? Currently, developing any fully "Globalized" application with JDev is "mission impossible"...

Hi Steve!
Thank you for the info. I noticed that Trinidad 1.2.2, released 3 months ago, is not included in TP 2 either. So I understand that something committed on the 20th of September also did not make it into TP 2.
Anyhow, good to know that we can expect better Unicode support in TP 3 or so.

Similar Messages

More Cf + MySQL 5 + Unicode/UTF-8 Problems

Here is the problem:
I am using a MySQL database that stores data in Unicode/UTF-8
(the website/database are in Lao).
Settings:
CF 7.0.2
MySQL 5.0.26
MySQL Defaults: latin1_swedish_ci collation, latin1 encoding
Database Defaults: utf8_general_ci collation, utf8 encoding
These are the same on my local computer and on the host
(HostNexus)
The only difference is that my CF uses
mysql-connector-java-3.1.10 driver while the host uses MySQL 3.x
driver (org.gjt.mm.mysql.Driver class).
On my local computer everything works just fine, even without
any extra CF DSN settings (i.e. connection string and/or JDBC URL
do not need the useUnicode=true&characterEncoding=UTF-8 strings
added to show Lao text properly).
On the host, even with the
useUnicode=true&characterEncoding=UTF-8 added (I have even
tried adding
&connectionCollation=utf8_unicode_ci&characterSetResults=utf8
to the line as well), I only get ??????? instead of Lao text from
the db.
The cfm pages have <cfprocessingdirective> and
<cfcontent> tags set to utf-8 and also have html <meta>
set to utf-8. All static Lao text in cfm pages shows just fine.
Is this a problem with the MySQL driver used by the host?
Has anyone encountered this before? Is there some other setting I
have to employ with the org.gjt.mm.mysql.Driver class on the host?
Please help!

Thanks for your reply/comments, Paul!
I also think it must be the db driver used on the host... I
just don't understand why the DSN connection string
(useUnicode=true&characterEncoding=UTF-8 [btw, it doesn't really
matter whether it is utf8 or UTF-8 - it works with both; I think the
proper way is actually UTF-8, since that is the encoding's name used
in Java...]) wouldn't work with it. I have the hosting tech support
totally puzzled over this.
Don't know if you can help any more, but I have added answers
to your questions in the quoted text below.
quote:
>> Sabaidee wrote:
>> Here is the problem:
>> I am using a MySQL database that stores data in Unicode/UTF-8
>> (the website/database are in Lao).
> well that's certainly different.
I mean, they are in the Lao language, not that they are hosted in
Laos.
>> Database Defaults: utf8_general_ci collation, utf8 encoding
> how was the data entered? how was it uploaded to the host?
> could the data have been corrupted loading or uploading to the host?
The data was entered locally, then dumped into a .sql file using
the utf8 charset, and then the dump was imported into the db on the
host, once again with the utf8 charset. I know the data in the
database is correct: when I browse the db tables with phpMyAdmin,
all Lao text in the db is displayed in proper Lao...
>> The only difference is that my CF uses mysql-connector-java-3.1.10 driver
>> while the host uses MySQL 3.x driver (org.gjt.mm.mysql.Driver class).
> and does that driver support mysql 5 and/or unicode?
I am sure it does support MySQL5, as I have other MySQL5
databases hosted there and they work fine. I am not sure if it
supports Unicode, though... I am actually more and more sure it
does not... The strange thing is, I am not able to find the Java
class that driver is stored in to try and test using it on my local
server... I have asked the host to email me the .jar file they are
using, but have not heard back from them yet...
>> On my local computer everything works just fine, even without any extra CF DSN
>> settings (i.e. connection string and/or JDBC URL do not need the
>> useUnicode=true&characterEncoding=UTF-8 strings added to show Lao text
>> properly).
> and what happens if you do use those? what locale for the local server?
If I use just that line, nothing changes (apart from the 2 mysql
variables which then default to utf8 instead of latin1) -
everything works just fine locally.
The only difference I have noticed between the MySQL setup on my
local comp and on the host is that on my comp the
character_set_results var is not set (shows [empty string]), but on
the host it is set to latin1. When I set it to latin1 on my local
comp using &characterSetResults=ISO8859_1 in the JDBC URL
string, I get exactly the same problem as on the host: all ???????
instead of Lao text from the db. If it is not set, or set to utf8,
everything works just fine.
For some reason, we are unable to make it work on the host:
whatever you add to the JDBC URL string or enter in the Connection
String box in CF Admin is simply ignored...
Do you know if this is a particular problem of the driver
used on the host?
>> The cfm pages have <cfprocessingdirective> and <cfcontent> tags set to utf-8
>> and also have html <meta> set to utf-8. All static Lao text in cfm pages
>> shows just fine.
> db driver then.
I think so too...
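
The ??????? symptom described above is what you would expect whenever Lao text is pushed through a latin1 (ISO-8859-1) conversion somewhere between the database and the page. The sketch below only simulates that conversion inside Java; it does not talk to MySQL, and the class and method names are made up for illustration. It assumes the result set is being converted to latin1 before it reaches CF, which matches the character_set_results=latin1 observation but is not confirmed for the host.

```java
import java.nio.charset.StandardCharsets;

public class Latin1ResultsDemo {

    // Simulates text being encoded as latin1 on its way to the client.
    // Characters that latin1 cannot represent are replaced with '?'.
    static String roundTripThroughLatin1(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Seven Lao characters, written as escapes so the source file
        // encoding does not matter.
        String lao = "\u0EAA\u0EB0\u0E9A\u0EB2\u0E8D\u0E94\u0EB5";
        System.out.println(roundTripThroughLatin1(lao)); // prints "???????"
    }
}
```

Once the text has been reduced to question marks, no setting further down the line (cfprocessingdirective, cfcontent, meta tags) can bring it back, which is why the static Lao text in the pages is unaffected while the db text is lost.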

Quotation marks display as &quot in web pages, I'm using Unicode UTF-8 character encoding.

On many web pages, where a quotation mark character should appear, instead the page displays the text &quot. I believe this happens with other punctuation characters as well such as apostrophes although the text displayed in these other cases is different, of course. I'm guessing this is a problem with character encoding. I'm currently set to Unicode (UTF-8) encoding. Have tried several others without success.

Here's a link where the problem occurs. Note the second line of the main body of text.
http://www.sierratradingpost.com/lp2/snowshoes.html
BTW, I never use IE, but I checked this site in IE and it shows the same problem, so maybe it is the page encoding after all rather than what I thought.
In any case, my thanks for your help and would appreciate any solution you can suggest.

Character encoding of Unicode (UTF-8) is what seems to be the default for printing. The page looks funny and the print is spread out.

The layout of the page is different than it appears on the screen. I printed the confirmation of my VISA payment online and it is not concise. The information is correct, but the layout isn't.
Should I use another encoding? One of the Western ones? Maybe unicode (UTF-16)?

There is a known bug involving printing in beta 12 that has been fixed in the Firefox 4 release candidate which is due out soon.

Bug - SQL Developer 3.0.04 SVN Merge

Hello.
I've found a bug when trying to merge files.
The problem occurs when there is a conflict on file content and an update on properties.
The SVN plugin prepares merge (diff) files but the GUI shows that the file is MERGED instead of CONFLICTED, and there is no way to start resolving the conflicts on that file.
Example log from SVN:
--- Merging r19952 through r20348 into D:/test_sqlDev_svn/us10591/sql/AAA.sql
CG D:/test_sqlDev_svn/us10591/sql/AAA.sql
C -There is a conflict on file content
G - Properties were merged successfully without conflicts
If there is a conflict during merge but properties of file have not changed then everything works fine.
Can you please fix this bug, as it makes the SQL Developer SVN features unusable in my environment.
Kind Regards
Jacek Gębal
Edited by: user2242149 on 2011-11-25 07:21
Edited by: user2242149 on 2011-11-25 07:27

bump

How to save a file in unicode (UTF-8)

Hello,
I'm trying to save an XML file in Unicode (UTF-8) on a 4.6C system. I tried OPEN DATASET 'file' IN TEXT MODE FOR OUTPUT ENCODING UTF-8, but this is not available in 4.6C. Does anybody have an idea how to do this?
Thanks in advance
Kind regards
Roel

Hi Roel,
There is a workaround for this issue.
Use code below:
encoding = 'utf-8'.
data: codepage            type cpcodepage.
call function 'SCP_CODEPAGE_BY_EXTERNAL_NAME'
    exporting
      external_name = encoding
    importing
      sap_codepage = codepage
    exceptions
      not_found     = 1
      others        = 2.
if sy-subrc <> 0.
endif.
call function 'SCP_TRANSLATE_CHARS'
    exporting
      inbuff           = sourcedata_xml
      inbufflg         = length
      incode           = codepage
      outcode          = codepage
      substc_space     = 'X'
      substc           = '00035'
    importing
      outbuff          = custom_data
    exceptions
      invalid_codepage = 1
      internal_error   = 2
      cannot_convert   = 3
      fields_bad_type = 4
      others           = 5.
Now write this custom_data onto application server by using open dataset and transfer.
Also have a look at this weblog, there is a code sample in it.
/people/thomas.jung3/blog/2004/08/31/bsp-150-a-developer146s-journal-part-x--igs-charting
Hope it'll help.
Cheers
Ankur

Read/Write file in Unicode (UTF-16)

Hi, I have an issue writing a file in Unicode (UTF-16).
I have to read a file with LabView, change some parameters and write the new data into the same file. The file uses Unicode UTF-16.
I downloaded some library here: https://decibel.ni.com/content/docs/DOC-10153
I can read the file, convert the data to UNI/ASCII/UNI, and then write the file. But when I open the new file with an editor like Notepad++ there are some unexpected characters at the end of the line.
Even reading the file and writing exactly the same data doesn't work.
I attached an example.
Thanks for your kind support.

Right-click on your Read and Write Text File functions. There is an option to "Convert End Of Line". Turn that off on both functions.
As a side note, you don't need the Close File functions. The Read and Write Text File functions will close the file if the file reference output is not wired.
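
Why end-of-line conversion can damage UTF-16 data can be sketched outside LabVIEW. The Java example below assumes the conversion works on single bytes (inserting a CR byte before every LF byte); that is an assumption about the behaviour, not something confirmed in this thread, and the class and method names are invented for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class Utf16EolDemo {

    // Byte-wise LF -> CRLF conversion, as a text-mode writer that knows
    // nothing about UTF-16 might perform it (assumed behaviour).
    static byte[] naiveLfToCrLf(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : input) {
            if (b == 0x0A) {
                out.write(0x0D); // a single extra byte, not a 2-byte UTF-16 unit
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "a\nb".getBytes(StandardCharsets.UTF_16LE);
        byte[] converted = naiveLfToCrLf(original);

        // The odd byte count shifts every following 2-byte unit by one byte.
        System.out.println(original.length + " -> " + converted.length + " bytes");

        String decoded = new String(converted, StandardCharsets.UTF_16LE);
        System.out.println("still readable: " + decoded.equals("a\r\nb"));
    }
}
```

In UTF-16 every character takes at least two bytes, so one inserted byte misaligns the rest of the line. That would be consistent with unexpected characters showing up at line ends.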

JDBC-ODBC Bridge does not support Unicode UTF-16

Hi
I'm using Jdeveloper 10.0.3 IDE in order to develop an application for data transformation between MS Access 2003 (source) and Oracle 10g (destination). Clients use Windows XP.
The JDBC-ODBC Bridge still does not support Unicode UTF-16, which is the charset used by MS Access 2000/2003.
Note that when I changed locale in regional setting, destination Connection to Ora10g failed to open a connection, it works only with English locale, so I can't change my locale information.
How can I read Unicode from source DB?
Any help would be appreciated. I look forward to seeing your response.
Thanks,

> i also heard that JDBC-ODBC Bridge still does not support Unicode UTF-16,
> but i guess this is not in my case.
That's the key, in fact. The JDBC-ODBC Bridge does not support UTF-16, which is the charset used by MS Access 2000/2003.
> or do i need to use a third party driver for jdbc odbc bridge?
Free library at http://jackcess.sourceforge.net/
Commercial JDBC driver at http://www.hxtt.com/access.html
Yonghong Zhao
System Analyst
www.hxtt.com

I have problems printing with a HP Officejet 6500 printer directly from my iMac. The printer itself does function well, as I can print from another PC without any problems. Does anyone have a clue?


Hello:
I must apologize. I should have checked here first:
http://support.apple.com/kb/HT3669#HP
According to the list, there is no driver for your printer included in OS X 10.7.
There are a couple of options:
1. Contact HP to see if they have a workaround.
2. If the model of printer is not listed but your printer is a PostScript printer or PCL Laser printer, try the "Generic PostScript" or "PCL Laser printer" drivers. Generic printer drivers may not let you access all the features of your printer.
I have not tried #2, but some people have had success.
Barry

[UPDATED WORKAROUND] SEVERE unicode/ UTF-8 ADFm binding/invocation problem

I have a BIG problem with a very simple search page use-case (with one text input field, a search button and an af:table for results). It looks like the Unicode input value is somehow ruined during the PPR and model update cycle (the Unicode value is internally converted to ASCII while being transferred to the EJB method)!!!
Here is the scenario (please note that all was created by pure drag-and-drop from Data Controls palette):
I have one inputText field on page, value of which is bound to simple attribute binding (say #{bindings.key.inputValue}) which is bound to vKey variable.
In the PageDef I have a methodIterator:
<methodIterator Binds="XYZ.result" Refresh="always" ...
(XYZ is a method in some EJB), and I have a corresponding methodAction XYZ defined with a named parameter:
<NamedData NDName="key" NDValue="${bindings.vKey}" NDType="java.lang.String"/>
The table is bound to a tree binding, which is bound to the methodIterator above. The button is the PPR trigger for the table (the only thing not done by drag-and-drop).
The JSPX page is XML with encoding="UTF-8" and:
<jsp:directive.page contentType="text/html;charset=UTF-8"
pageEncoding="UTF-8"/>
There are no locale settings in faces-config.xml (the default config).
In inputText I entered some unicode input text like "ЧШЩЪ".
When button is clicked, the methodAction XYZ is invoked.
The debugger breakpoint is set inside the EJB method.
Now, during the PPR after the button click, the EJB method breakpoint is hit twice (I assume because of Refresh="always" on the methodIterator). On the first hit, the value of the key parameter is OK (correct Unicode value visible in Inspect...). BUT, the second time (during the same PPR) the method is invoked WITH a totally ruined value of "????"! Of course, the search didn't find anything...
Thus, not only that problem of unicode is related to localization of pages/resources but something strange is happening with value binding also.
Can someone help?
Message was edited by:
PaKo

Another way around:
Instead of using processScope or pageFlowScope (which does not release memory automatically, so it may cause you problems further on), I discovered an alternative workaround:
instead of binding to #{bindings.someAttribBinding.inputValue} (which suffers from the UTF-8 bug, as concluded), you can bind your text inputs directly to #{bindings.someIterator.currentRow.dataProvider.someAttribute}, which binds directly to your underlying data source property.
In my case, I use EJBs, so the underlying data source is an Entity bean, and this way I bind directly to the setter method, bypassing any ADFm interference.
This turns out to be more reliable and also MORE EFFICIENT! With indirect binding (via the buggy attribute binding), the getter method in the entity is called twice, while with direct binding (through .currentRow.dataProvider.someAttribute) the getter is called only once per page lifecycle (the setter is called once in both cases).
I would, thus, suggest that the ADF team consider introducing some sort of better support for direct binding to the underlying data sources instead of going through Iterators and Attribute bindings. For example, introduce an Entity Binding (like the Tree binding, but with direct support for access to entity attributes, including parent/children collections). This also applies to list bindings, where it is NECESSARY to enable object binding from a list to an entity attribute (EJB entities don't know about foreign keys, only about related entities, so the attribute mapping supported by the current list bindings is totally useless).
Regards,
Pavle

Problem setting Unicode (utf-8) in http header using tomcat

Hi:
I am trying to set a UTF-8 file name in an HTTP header using the following code:
response.setContentType("text/html; charset=utf-8");
response.setHeader("Content-disposition", "attachment; filename=解決.zip");
// I actually have the file name in UTF-8 here to set in the header, and I know that the name is correct.
// I also looked into the response object's MimeHeaders object and saw that the header is correctly there.
Then I write the content of the zip file using ServletOutputStream.
The problem I have is that the file name is not displayed correctly in the save/open prompt that pops up next. Using Fiddler, I found that the header on the wire is wrong:
Content-disposition: attachment; filename=�zn��.zip
I am using Tomcat 5.0.28. Any idea how to get this working?
Thanks in advance!

You are setting the charset for the content to be UTF-8. (That is why the method is called setContentType.) But HTTP headers are not part of the content and so that has no effect on the header.
The original specification for HTTP only allowed US-ASCII characters in headers. It is possible that more recent versions have features that allow for non-ASCII header data, but I don't know if that is the case or how you would use those features if they exist.
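
One mechanism that does exist for non-ASCII file names is the `filename*` parameter from RFC 6266 / RFC 5987, which carries the name as percent-encoded UTF-8 inside an otherwise ASCII header. Below is a minimal sketch of building such a value; the class name, method name and the ASCII fallback name are made up for the illustration, and whether a given browser of the Tomcat 5.0 era honours `filename*` is not something this thread confirms.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ContentDispositionDemo {

    // Builds a Content-Disposition value with a plain ASCII fallback name
    // and an RFC 5987 style UTF-8 name. The fallback is assumed to contain
    // no quotes or backslashes.
    static String attachmentHeader(String asciiFallback, String fileName) {
        try {
            // URLEncoder does form encoding, so a space comes out as '+';
            // rewrite it as %20 for use in a header parameter.
            String encoded = URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
            return "attachment; filename=\"" + asciiFallback + "\"; filename*=UTF-8''" + encoded;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // "\u89E3\u6C7A.zip" is the file name from the question.
        System.out.println(attachmentHeader("download.zip", "\u89E3\u6C7A.zip"));
        // prints: attachment; filename="download.zip"; filename*=UTF-8''%E8%A7%A3%E6%B1%BA.zip
    }
}
```

In a servlet the result would be passed to response.setHeader("Content-Disposition", ...). The header then contains only ASCII bytes, so the container's header encoding no longer matters.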

Problem while sending unicode (utf-8) xml to IE.

Hi,
I have an encoding problem while sending UTF-8 XML from a servlet to IE (the client), where I am parsing the XML using Ajax.
In the log I can see the proper special characters being sent from the servlet, but on the client end it shows ? symbols instead of the special characters.
This is the code that sends the XML from the servlet:
ByteArrayOutputStream stream = new ByteArrayOutputStream(2000);
transformer.transform(new DOMSource(document), new StreamResult(new OutputStreamWriter(stream, "iso-8859-1")));
_response.setContentType("text/xml; charset=UTF-8");
_response.setHeader("Cache-Control", "no-cache");
_response.getWriter().println(new String(stream.toByteArray(), "UTF-8"));
In the log I can see:
<response status="success" value="1154081722531" hasNextPage="false" hasPreviousPage="false" ><row row_id="PARTY_test_asdasd" column_0="PARTY_test_asdasd" column_1="asdasd �" mode="edit" column_en_US="asdasd �" column_de_DE="? xyz" column_fr_FR="" ></row></response>
But on the client side I see:
<?xml version = '1.0' encoding = 'UTF-8'?>
<response status="success" value="1154082795061" hasNextPage="false" hasPreviousPage="false"><row row_id="PARTY_test_asdasd" column_0="PARTY_test_asdasd" column_1="asdasd ?" mode="edit" column_en_US="asdasd ?" column_de_DE="? xyz" column_fr_FR=""/></response>
I am getting ? instead of �.
I would be grateful if somebody could tell me how to send UTF-8 XML from a servlet for Ajax purposes.
Thanks,
Siva1

> This is the code that sends the xml from servlet.
> ByteArrayOutputStream stream = new ByteArrayOutputStream(2000);
> transformer.transform(new DOMSource(document), new StreamResult(new OutputStreamWriter(stream, "iso-8859-1")));
Here you produce XML that's encoded in ISO-8859-1. (!!!)
> _response.setContentType("text/xml; charset=UTF-8");
Here you tell the browser that it's encoded in UTF-8.
> _response.getWriter().println(new String(stream.toByteArray(), "UTF-8"));
Here you convert the XML to a String, assuming that it was encoded in UTF-8, which it wasn't.
Besides shooting yourself in the foot by choosing ISO-8859-1 for no good reason, you're also doing a lot of translating from bytes to chars and back again. Not only is that a waste of time, it introduces errors if you don't do it right. Try this instead:
_response.setContentType("text/xml; charset=UTF-8");
_response.setHeader("Cache-Control", "no-cache");
transformer.transform(new DOMSource(document),
        new StreamResult(_response.getOutputStream()));
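
The effect of encoding with one charset and decoding with another can be reproduced in a few lines without a servlet container. This is only an illustration of the mismatch described above; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Encodes text as ISO-8859-1 and then (wrongly) decodes the bytes as UTF-8,
    // which is what the original servlet code does with the XML.
    static String encodeLatin1DecodeUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00E9"; // "cafe" with an acute accent on the e
        String damaged = encodeLatin1DecodeUtf8(original);

        // The single byte 0xE9 is not a complete UTF-8 sequence, so the decoder
        // substitutes U+FFFD, the replacement character.
        System.out.println(damaged.equals(original));        // prints "false"
        System.out.println(damaged.equals("caf\uFFFD"));     // prints "true"
    }
}
```

Plain ASCII survives the mismatch because ISO-8859-1 and UTF-8 agree on those bytes, which is why only the special characters in the XML are affected.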

Unicode kernel upgrade problem in XI server

Hi
I'm trying to upgrade the Brtools in XI server and getting the following problem:
rx2n0v4:xdvadm 23> SAPCAR -xvf DBATL640O92_45-10002837.SAR
stderr initialized
processing archive DBATL640O92_45-10002837.SAR...
--- Unicode interface [u16_get.c line 233] pid = 6963 :
Invalid UTF-8 encountered by fgetsU16 (fileno 0x4)
fd
Characters previously read:
0043 0041 0052 0020 0032 002e 0030 0031
0052 0047 030 0031
--- Unicode interface -
End of message -
Illegal byte sequence DBATL640O92_45-10002837.SAR
I downloaded the kernel a couple of times today and tried again, but I get the same error. XI (6.40) here is a Unicode server, and I downloaded the Unicode kernel from sapnet (brtools and SAPCAR kernel). I tried with the version 7.00 kernel but get the same problem.
Is there any solution to this problem?
Regards
Amar

Confusion About SP16 Unicode Kernel Patch/Upgrade
Problem with updating XI 3.0 (Kernel etc.)
Check these threads; they might be useful.

Unicode, UTF-8 and java servlet woes

Hi,
I'm writing a content management system for a website about Russian music.
One problem I'm having is trying to get a Java servlet to talk Unicode to the content management client.
The client makes a request for a band, the server then sends the XML to the client.
The XML reading works fine and the client displays the unicode fine from an XML file read locally (so the XMLReader class works fine).
The servlet unmarshals the request perfectly (its just a filename).
I then find the correct class, and pass it through the XML writer. That returns the XML as a string, which I simply put into the output stream.
out.write(XMLWrite(selectedBand));
I have set the correct header property
response.setContentType("text/xml; charset=UTF-8");
And to read it I
         //Make our URL
         URL url = new URL(pageURL);
         HttpURLConnection conn = (HttpURLConnection)url.openConnection();
         conn.setRequestMethod("POST");
         conn.setDoOutput(true); // want to send
         conn.setRequestProperty( "Content-type", "application/x-www-form-urlencoded" );
         conn.setRequestProperty( "Content-length", Integer.toString(request.length()));
         conn.setRequestProperty("Content-Language", "en-US");
         //Add our paramaters
         OutputStream ost = conn.getOutputStream();
         PrintWriter pw = new PrintWriter(ost);
         pw.print("myRequest=" + URLEncoder.encode(request, "UTF-8")); // here we "send" our body!
         pw.flush();
         pw.close();
         //Get the input stream
         InputStream ois = conn.getInputStream();
            InputStreamReader read = new InputStreamReader(ois);
         //Read
         int i;
         String s="";
         Log.Debug("XMLServerConnection", "Responce follows:");
         while((i = read.read()) != -1 ){
          System.out.print((char)i);
          s += (char)i;
         }
         return s;
Now when I print
read.getEncoding()
it claims:
ISO8859_1
Something's wrong there, so if I force it to accept UTF-8:
InputStreamReader read = new InputStreamReader(ois,"UTF-8");
it now claims it's
UTF8
However, all of the data has lost its Unicode: any Unicode character is replaced with a question mark character! This happens even when I don't force the input stream to be UTF-8.
More so if I view the page in my browser, it does the same thing.
I've had a look around and I can't see a solution to this. Have I set something up wrong?
I've set, "-encoding utf8" as a compiler flag, but I don't think this would affect it.

I don't know what your problem is but I do have a couple of comments -
1) In conn.setRequestProperty( "Content-length", Integer.toString(request.length())); the length of your content is not request.length(). It is the length of the URL encoded data.
2) Why do you need to send URL encoded data? Why not just send the bytes?
3) If you send bytes then you can write straight to the OutputStream and you won't need to convert to characters to write to PrintWriter.
4) Since you are reading from the connection you need to setDoInput() to true.
5) You need to get the character encoding from the response so that you can specify the encoding in InputStreamReader read = new InputStreamReader(ois, characterEncoding);
6) Reading a single char at a time from an InputStream is very inefficient.
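
Points 5 and 6 can be sketched roughly as follows. This is an illustrative helper, not code from the thread: the class and method names are invented, the Content-Type parsing is deliberately simple, and in the real client the header value would come from conn.getContentType() and the stream from conn.getInputStream(). The main method uses an in-memory stream so the sketch runs without a server.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ResponseReaderDemo {

    // Picks the charset parameter out of a Content-Type value such as
    // "text/xml; charset=UTF-8"; returns the fallback if none is usable.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // illegal or unsupported charset name
                }
            }
        }
        return fallback;
    }

    // Reads the whole stream in blocks instead of one char at a time.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a server response: Cyrillic text encoded as UTF-8.
        byte[] body = "<band>\u041A\u0438\u043D\u043E</band>".getBytes(StandardCharsets.UTF_8);
        Charset cs = charsetOf("text/xml; charset=UTF-8", StandardCharsets.ISO_8859_1);
        String xml = readAll(new ByteArrayInputStream(body), cs);
        System.out.println(xml.equals("<band>\u041A\u0438\u043D\u043E</band>")); // prints "true"
    }
}
```

If the text is already question marks when it arrives with the correct charset declared, the damage happened on the server side before the bytes were written, and no client-side decoding choice can undo it.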

SciTE 1.60 UTF-8 problem

I can't write Russian or Hebrew when I set "File-> Encoding->UTF-8"
In Kate, for example, it works properly.

@leX wrote:
> I can't write Russian or Hebrew when i set "File-> Encoding->UTF-8"
> In Kate for example it's work properly.
I can write German umlauts like ü, ä, ö without problems. I also copied a text
from the Russian Pravda ("Truth") into it and saw the text correctly.
My crystal ball says you haven't enabled the Pango engine in SciTE. Instead
you stuck with the ordinary GTK text area. To enable it you need to specify
your font settings in /usr/share/scite/SciTEGlobal.properties (or your
local copy of it in ~) with an exclamation mark in front of the font name.
Here is a copy of mine:
# Give symbolic names to the set of fonts used in the standard styles.
if PLAT_WIN
font.base=font:Verdana,size:10
font.small=font:Verdana,size:8
font.comment=font:Comic Sans MS,size:9
font.code.comment.box=$(font.comment)
font.code.comment.line=$(font.comment)
font.code.comment.doc=$(font.comment)
font.text=font:Times New Roman,size:11
font.text.comment=font:Verdana,size:9
font.embedded.base=font:Verdana,size:9
font.embedded.comment=font:Comic Sans MS,size:8
font.monospace=font:Courier New,size:10
font.vbs=font:Lucida Sans Unicode,size:10
if PLAT_GTK
font.base=font:!bitstream vera sans mono,size:10
font.small=font:!bitstream vera sans mono,size:9
font.comment=font:!bitstream vera sans mono,italics,size:10
font.code.comment.box=$(font.comment)
font.code.comment.line=$(font.comment)
font.code.comment.doc=$(font.comment)
font.text=font:!bitstream vera sans mono,size:10
font.text.comment=font:!bitstream vera sans mono,size:10
font.embedded.base=font:!bitstream vera sans mono,size:9
font.embedded.comment=font:!bitstream vera sans mono,size:9
font.monospace=font:!andale mono,size:10
font.vbs=font:!bitstream vera sans mono,size:10
font.js=$(font.comment)
Try this, HTH
bye neri
