JSF and Character Sets (UTF-8)

Hi all,
This question might have been asked before, but I'm going to ask it anyway because I'm completely puzzled by how this works in JSF.
Let's begin with the basics. I have an application running on an OC4J servlet container, using JSF 1.1 (MyFaces). The problem with this setup is that the character encoding I want the server and client to use is not coming across correctly. I'm trying to force the application to use UTF-8, but once the response is rendered to my client it has reverted to ISO-8859-1, which is the main character set for the Netherlands. However, I'm building the application to support proper internationalization, which means I NEED to use UTF-8.
I've executed the following steps to reach this goal:
- All JSP files contain page directives specifying the character set:
<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %>
I've checked the generated source that comes from the JSPs, and it looks as expected.
- I've created a servlet filter to set the character set directly on the request and response objects:
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain) throws IOException, ServletException {
        // Set the character encoding for the request and the content type for the response.
        req.setCharacterEncoding("UTF-8");
        res.setContentType("text/html; charset=UTF-8");
        // Complete (continue) the processing chain.
        chain.doFilter(req, res);
    }
I've debugged the code, and this works fine, except where JSF comes in. If I request a page without going through JSF, it comes back as UTF-8. When I go through JSF, my pages come back as ISO-8859-1. I'm baffled as to what is causing this. Several forums propose writing a filter as the solution, but that doesn't do it for me.
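To illustrate the symptom only, here is a minimal sketch of what happens to a non-ASCII character when UTF-8 bytes are read as ISO-8859-1. It assumes the mismatch is between the bytes written and the charset the client applies; the class and method names are made up for the example and are not part of the application.

```java
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    /**
     * Encodes the text as UTF-8 and then decodes those bytes as ISO-8859-1,
     * which is what a client does when UTF-8 content is labeled ISO-8859-1.
     */
    static String readUtf8AsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8, so it shows up as two characters.
        System.out.println(readUtf8AsLatin1("\u00e9"));
    }
}
```

Each accented character turning into two characters on the page is a typical sign of this kind of mismatch.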
It looks like JSF changes the character set to ISO somewhere internally. I've been through the sources and found several pieces of code that support that theory, including portions where the character set for the response is set to that of the request, which in my case, coming from a Dutch system, will be ISO.
How can this be prevented? Can anyone give some good insight into the inner workings of JSF with regard to character sets specifically? Could this be a servlet container problem?
Many thanks in advance for your assistance,
Jarno

Jarno,
I've been investigating JSF and character encodings a bit this weekend. And I have to say it's more than a little confusing. But I may have a little insight as to what's going on here.
I have a post here:
http://forum.java.sun.com/thread.jspa?threadID=725929&tstart=45
where I have a number of open questions regarding JSF 1.2's intended handling of character encodings. Please feel free to comment, as you're clearly struggling with some of the same questions I have.
In MyFaces JSF 1.1 and JSF-RI 1.2 the handling appears to be dependent on the raw Content-Type header. Looking at the MyFaces implementation here -
http://svn.apache.org/repos/asf/myfaces/legacy/tags/JSF_1_1_started/src/myfaces/org/apache/myfaces/application/jsp/JspViewHandlerImpl.java
(which I'm not sure is the correct code, but it's the best I've found) it looks like the raw Content-Type header is being parsed in handleCharacterEncoding. The resulting value (if not null) is used to set the request character encoding.
The JSF-RI 1.2 code is similar - calculateCharacterEncoding(FacesContext) in ViewHandler appears to parse the raw header, as opposed to using the CharacterEncoding getter on ServletRequest. This is understandable, as this code should be able to handle PortletRequests as well as ServletRequests. And PortletRequests don't have set/getCharacterEncoding methods.
My first thought is that calling setCharacterEncoding on the request in the filter may not update the raw Content-Type header (I haven't checked whether this is the case). If it doesn't, then the raw header may be getting reparsed and the request encoding reset in the ViewHandler. I'd suggest that you check the state of the Content-Type header before and after your call to req.setCharacterEncoding("UTF-8"). If the header's charset value is unset or unchanged after this call, you may want to update it manually in your Filter.
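To make the header check concrete, here is a rough sketch of pulling the charset parameter out of a raw Content-Type value. This is not the MyFaces or JSF-RI code; the class and method names are hypothetical, and the regular expression is only an approximation of the header grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContentTypeCharset {
    // Matches a charset parameter such as: text/html; charset=UTF-8 or charset="utf-8"
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([^\\s;\"]+)\"?", Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a Content-Type header value, or null if there is none. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }
}
```

Logging the result of something like charsetOf(request.getHeader("Content-Type")) before and after the filter runs would show whether the header ever carries a charset at all. Browsers often send form posts without one, in which case a header-based lookup returns nothing.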
If that doesn't work, I'd suggest writing a simple ViewHandler which prints out the request's character encoding and the value of the Content-Type header to your logs before and after the calls to the underlying ViewHandler for each major method (i.e. renderView, etc.)
Not sure if that's helpful, but it's my best advice based on the understanding I've reached to date. And I definitely agree - documentation on this point appears to be lacking. Good luck.
Regards,
Peter

Similar Messages

Database Encoding and character set

Hi,
Is it possible to change [in some easy, straightforward way] the encoding and character set for a database [DB11 in this case]?
I have a database which has UTF16 and MSWIN1252 as the encoding and character set, and I want to change it to UTF8 for testing and other reasons.
Thanks,
Anupam

You should detail what you mean by encoding and character set.
A database has 2 character sets:
- the character set for CHAR, VARCHAR2 and CLOB
- the national character set for NCHAR, NVARCHAR2 and NCLOB.
You have 2 ways to change the database character set:
- with ALTER DATABASE statement
- with export/import
See http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96529/ch10.htm#1656

How to know the existing NLS_LANG and Character Set

Dear all
How can I find out the existing NLS_LANG and character set settings for Oracle 8 (Unix platform)?
Thank you
Kwan

You can see this from the following views:
NLS_DATABASE_PARAMETERS (I think this reflects how the database was created)
NLS_INSTANCE_PARAMETERS (this instance)
NLS_SESSION_PARAMETERS (this you can set just for your login session).
HTH
Srinivasa Medam

Database character set = UTF-8, but mismatch error on XML file upload

Dear experts,
I am having problems trying to upload an XML file into an XMLType table. The Database is 9.2.0.5.0, with the character set details:
SELECT *
FROM SYS.PROPS$
WHERE name like '%CHA%';
Query results:
NLS_NCHAR_CHARACTERSET   UTF8    NCHAR Character set
NLS_SAVED_NCHAR_CS       UTF8
NLS_NUMERIC_CHARACTERS   .,      Numeric characters
NLS_CHARACTERSET         UTF8    Character set
NLS_NCHAR_CONV_EXCP      FALSE   NLS conversion exception
To upload the XML file into the XMLType table, I am using the command:
insert into XMLTABLE
values(xmltype(getClobDocument('ServiceRequest.xml','UTF8')));
However, I get the error:
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00200: could not convert from encoding UTF-8 to UCS2
Error at line 1
ORA-06512: at "SYS.XMLTYPE", line 0
ORA-06512: at line 1
Why does it mention UCS2, when I can't see that anywhere in the database character set settings?
Many thanks for your help,
Mark

UCS2 is known as AL16UTF16(LE/BE) by Oracle...
Try using AL32UTF8 as the character set name.
AFAIK the main difference between Oracle's UTF8 and AL32UTF8 character sets is that the UTF8 character set does not support those UTF-8 characters that require 4 bytes.
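The characters that need 4 bytes in standard UTF-8 are the ones above U+FFFF (the supplementary planes). As a small sketch of how to tell whether a given piece of text contains any, outside the database and with made-up class and method names:

```java
import java.nio.charset.StandardCharsets;

class Utf8Width {
    /** Number of bytes standard UTF-8 uses to encode the given text. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the text contains a code point above U+FFFF, which takes 4 bytes in UTF-8. */
    static boolean hasSupplementary(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }
}
```

For example, "\u00e9" takes 2 bytes, the euro sign "\u20ac" takes 3, and the musical G clef U+1D11E (written "\uD834\uDD1E" in Java source) takes 4.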
-Mark

Utl_file and character sets

hello,
we are using the AL32UTF8 character set in our database. I have a PL/SQL routine that reads a csv file into the database with utl_file; the client (Toad) uses UTF-8. When I have a UTF-8 csv file, everything works fine, but I want to be able to read ANSI files as well. Is there a way to read ANSI files with utl_file?
ps: I used CONVERT(v_line, 'UTF8', 'WE8MSWIN1252') and it worked. Is there a way I can read out the character set of the csv file with PL/SQL, so I don't need to use a parameter?
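A file does not record its own character set, so any detection is a heuristic. One common approach, sketched here in Java rather than PL/SQL, is to test whether the bytes are valid UTF-8 and fall back to the ANSI code page otherwise. The class name, method names and the windows-1252 fallback are assumptions for illustration, not a utl_file feature.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class CsvCharsetGuess {
    // Assumption: anything that is not valid UTF-8 is Windows "ANSI" (windows-1252).
    static final Charset FALLBACK = Charset.forName("windows-1252");

    /** True if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Decodes as UTF-8 when the bytes are valid UTF-8, otherwise with the fallback charset. */
    static String decode(byte[] data) {
        Charset cs = isValidUtf8(data) ? StandardCharsets.UTF_8 : FALLBACK;
        return new String(data, cs);
    }
}
```

Pure ASCII files pass the UTF-8 check and decode identically either way. A windows-1252 file that happens to form valid UTF-8 byte sequences would be misread, which is rare in practice but is the reason this stays a guess.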
Ilja
Edited by: Ikrischer on Sep 25, 2009 2:32 PM

Too bad your Oracle installation doesn't have a version number or use any DDL so we would know what you are doing.
Are you using GET_LINE, GET_LINE_NCHAR, or something else?

MySQL5 and Character Sets

Hi Everyone.
We are evaluating switching from MySQL4.0.x (native support via CF) to MySQL5.0.x (support via JDBC ConnectorJ) and we are having some character set issues on our evaluation server. When we had it configured with MySQL4.0.x using the built-in MySQL driver, we always used the connection string to select the UTF-8 character set:
useUnicode=true&characterEncoding=utf-8
We have tried using this with the JDBC driver but it doesn't appear to have any effect: all the special characters come out as mangled multi-character strings, which is the same as we see if we connect to the server from the command prompt using the default "Latin1" character set. If we connect from the command prompt using UTF-8 everything looks OK, so I'm guessing the connection string has changed syntax. I've checked the ConnectorJ documentation and it appears the connection string should now be:
characterEncoding=UTF-8
However, this did not seem to make any difference.
Any ideas?

andrewdixon wrote:
> Hi Everyone.
>
> We are evaluating switching from MySQL4.0.x (native support via CF) to
> MySQL5.0.x (support via JDBC ConnectorJ) and we are having some character set
> issues on our evaluation server. When we had it configured with MySQL4.0.x
> using the built-in MySQL driver we always used the connection string to use the
> UTF-8 character set:
>
> useUnicode=true&characterEncoding=utf-8
>
> We have tried using this with the JDBC driver but it doesn't appear to have
> any effect: all the special characters come out as mangled multi-character
> strings, which is the same as we see if we connect to the server from
> the command prompt using the default "Latin1" character set. If we connect from
> the command prompt using UTF-8 everything looks OK, so I'm guessing the
> connection string has changed syntax. I've checked the ConnectorJ documentation
> and it appears the connection string should now be:
>
> characterEncoding=UTF8
>
> However, this did not seem to make any difference.
>
> Any ideas?
>
> Kind regards,
>
> Andrew.
>
Try:
1) Add the following to the end of the JDBC URL in the CF Admin DSN config screen for your db:
?useUnicode=true&characterEncoding=utf8&characterSetResults=UTF-8
(note: NOT in the "connection string" box, but at the end of the JDBC URL!)
2) In your Application.cfm file, add the following lines right after the <cfapplication> tag:
<cfscript>
SetEncoding("form","utf-8");
SetEncoding("url","utf-8");
</cfscript>
<cfcontent type="text/html; charset=utf-8">
3) On every cfm page in your application, add the line:
<cfprocessingdirective pageencoding="utf-8">
as the first line of code.
All three, or a combination of two of the above, usually solve problems with displaying utf-8/unicode encoded text from the db. Which combination works depends on your db setup...
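For step 1, a small sketch of what the resulting JDBC URL looks like once the parameters are appended. The host, port and database name in the usage note are placeholders, and the helper class is hypothetical; only the three parameters come from the advice above.

```java
class MySqlUrl {
    // Encoding parameters exactly as given in step 1 above.
    static final String ENCODING_PARAMS =
            "useUnicode=true&characterEncoding=utf8&characterSetResults=UTF-8";

    /** Appends the encoding parameters to a base JDBC URL, using '?' or '&' as needed. */
    static String withUtf8(String baseUrl) {
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + ENCODING_PARAMS;
    }
}
```

For instance, withUtf8("jdbc:mysql://localhost:3306/mydb") gives jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=utf8&characterSetResults=UTF-8.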
Azadi Saryev
Sabai-dee.com
Vientiane, Laos
http://www.sabai-dee.com

Document character set UTF-8

Hi, sorry for my bad English. I am Brazilian.
I have a problem when I upload my Muse site to the hosting: the document character set. The files are configured as UTF-8, but characters such as "á" and "é" don't display correctly. This happens only on this hosting, "lgplasticos.com.br"; I uploaded the same files to another hosting and they are fine: "weking.com.br/clientes/lg". I contacted the hosting support and they said: "is no problem in our server configuration , probably some wrong configuration file when sending files ...", but on the weking.com/clientes/lg server it worked. How can I solve this? Please help me.

Hi,
Did you check this thread: "How to change charset from UTF-8 to ISO-8859-1 in Muse?"

Reports 9i and character set

We have reports in character set WINDOWS 1251,
but iAS produces them in UTF-8.
We tried to set the character set in uifont.ali, but it had no effect.
How can we change the default character set from UTF8 to WINDOWS 1251?
Environment:
iAS 9i R2
nls_lang=AMERICAN.CL8MSWIN1251
uifont.ali:
COURIER....UTF8=COURIER....CL8MSWIN1251

Laura,
First, you can run the 6i reports with OGD in Reports 9i, provided your Oracle 6i home is in the system path and the registry variable ORACLE_GRAPHICS6I_HOME is set to point to your 6i Oracle home.
Now, if you want to open these 6i reports with OGD and modify them in the 9i builder, the OGDs are lost when the reports are saved from the 9i builder, so you would need to recreate the graphs using the graph wizard.
Thanks
The Oracle Reports Team

Region type URL and character set encoding

Hello,
I'd like to include a static html page using a URL region, but some translation is done between the encoding of the input static html page and the output of HTML DB. Does anyone know how the file encoding is translated when the page is rendered?
I have tried some encodings of input file (CP1250, UTF-8, Unicode) but it did not work.

DarkFiBrE72 wrote:
It's AL16UTF16.
From metalink
Starting in Oracle 9i the National Characterset (NLS_NCHAR_CHARACTERSET) will be
limited to UTF8 and AL16UTF16.
For more details refer to The National Character Set in Oracle 9i and 10g
Any other NLS_NCHAR_CHARACTERSET will no longer be supported.
When upgrading to 10g the value of NLS_NCHAR_CHARACTERSET is based
on value currently used in the Oracle8 version.
If the NLS_NCHAR_CHARACTERSET is UTF8 then it will stay UTF8.
In all other cases the NLS_NCHAR_CHARACTERSET is changed to AL16UTF16
and -if used- N-type data (= data in columns using NCHAR, NVARCHAR2 or NCLOB)
may need to be converted.
Edited by: DarkFiBrE72 on Sep 24, 2008 7:12 PM
I'm not sure if the OP was referring to the national character set? Is this implied by the corresponding SQL Server character set mentioned?
Otherwise I would assume we are talking about the Database character set, which allows numerous different character sets and types (single-byte, multi-byte, Unicode etc. depending on Oracle release).
Regards,
Randolf
Oracle related stuff blog:
http://oracle-randolf.blogspot.com/
SQLTools++ for Oracle (Open source Oracle GUI for Windows):
http://www.sqltools-plusplus.org:7676/
http://sourceforge.net/projects/sqlt-pp/

Oracle XE and character set

Hello all,
I installed Oracle XE on RHEL 4 Linux and I found out that the database character set is AL32UTF8. Does anyone know why Oracle chose this character set? Maybe because of the NLS_LANG env variable? Is it possible to change it to EE8ISO8859P2? Since the database is still empty I can drop it and create a new database.
Do you think it is possible to set some env variables and do a new Oracle XE installation that includes a database with the ISO charset?
I want the EE8ISO8859P2 charset because I will be doing exp/imp from another Oracle ISO db to Oracle XE, and it is much easier to do this without charset conversion.
Any help will be appreciated.
regards,
Miha

When you download XE, you have a choice - take the 'western european' character set download, or the 'unicode' download.
No other choices.
Join us over in the XE forum where people have discussed this and found workarounds. Info about finding that forum at Re: Oracle XE Installation failed

Database Links and Character Sets

Can I link an Arabic character set database to an English character set one using database links? The application running on the Arabic database needs to read from and write to the English one using a database link.

Shouldn't be a problem, assuming that all the English characters you want to represent can be properly encoded in the source system's character set. Shouldn't be a problem unless you start dealing with things like Microsoft's curly quotes.
Justin
Distributed Database Consulting, Inc.
http://www.ddbcinc.com/askDDBC

Additional language and character sets in R/3

Our SAP for Retail R/3 system (GUI 470 Basis 620) is about to incorporate some new stores we are opening in the Czech Republic. With English (EN) and German (DE) languages installed, we are able to type, display and save to the database all of the Czech characters apart from two of them. It seems a bit silly to install a whole new language (CZ) for these two characters alone. Is it possible to somehow add an additional character set to SAP that would contain the Czech character set? We are happy for Czech users to log on in English so we don't need to see Czech characters in standard field descriptions in the GUI, but users will want to type these characters into items such as material/article description fields.
Any advice as to the best way to go with this would be greatly appreciated.
Kind regards,
Stuart Richards

Hi Stuart
CS belongs to code page 1401 and EN/DE belong to 1100. A few characters might match, others might not.
Best practice is to install the language.
You cannot know in advance which CS words the users might type in.
Regards
Madhu

Dbassist and character set's

Hello everyone. I just installed Oracle8i on my Slackware7 linux box and I cannot seem to start the dbassist tool that comes with it. When I attempt to start the assistant it throws a JNLS exception, which reads as shown below.
"JNLS Exception:oracle.ntpg.jnls.JNLSException Unable to find any National Character Sets. Please check your Oracle installation."
Now my question is (obviously) how do I go about fixing this annoyance? Could one of my environment vars be causing this to occur? Or am I barking up the wrong tree?
Any help would be greatly appreciated...
thanx in advance james...

The JNLS error is a bug... What you have to do is just ignore the error and go ahead with the database creation...
I think this should help you out...
Edwin

Problem with character set UTF-16 LE

Hello.
I am having difficulties changing character sets with the convert() function.
The character set AL16UTF16LE does not appear in the v$nls_valid_values list, yet at the same time this query runs successfully:
convert([some-national-characters], 'CL8MSWIN1251', 'AL16UTF16LE').
But when data stored in a CLOB is passed as input, I get the error "a character set is not supported".
What are possible ways to solve this problem?

You can try to use DBMS_LOB.SUBSTR to access LOB data, as in the following example:
SQL> select * from v$version;
BANNER
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Product
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production
SQL>
SQL> declare
2 v_i clob;
3 v_o clob;
4 begin
5 v_i := 'a';
6 v_o:=convert(dbms_lob.substr(v_i,1,1),'AL16UTF16LE', 'CL8MSWIN1251');
7 end;
8 /
PL/SQL procedure successfully completed.

Search and character sets?

I have noticed that the iTunes search box does not see the difference between an 'e' and an 'e with an accent over it', or any other foreign character, such as letters with an umlaut like ū.
Is there any way to get iTunes to see the difference in the search box?
Thanks!

Hello,
These are the encoding types:
ISO-8859-1
ISO-8859-13
ISO-8859-15
ISO-8859-2
ISO-8859-4
ISO-8859-5
ISO-8859-7
ISO-8859-9
KOI8-R
US-ASCII
UTF-16
UTF-16BE
UTF-16LE
UTF-8
UTF8
UnicodeBigUnmarked
UnicodeLittleUnmarked
windows-1250
windows-1251
windows-1252
windows-1253
windows-1254
windows-1257
Check this
https://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79fb42
