Problems with LPX-00245 (character set problem?)

Hi all,
I've got a problem with ORA-19202 and LPX-00245 ("extra data after end of document") when querying my XMLType table. The table contains one large XML document. This XML document is valid; I've checked it against the corresponding XSD (using JDeveloper and also Notepad++, no validation errors).
I guess it has something to do with the encoding of the document. The original encoding is ISO-8859-1 (<?xml version="1.0" encoding="ISO-8859-1"?>). When I load the document into the database it is automatically changed to UTF-8 (<?xml version="1.0" encoding="UTF-8"?>), probably because the character set of my database is AL32UTF8.
I use the following statement to store my XML:
      insert into my_table
      values( my_seq_spp.nextval,
              r_get_files.file_name,
              xmltype(
                bfilename(p_directory, r_get_files.file_name), -- p_directory is the name of an Oracle directory object
                nls_charset_id('WE8ISO8859P1')
              )
            );
Nevertheless, the resulting character set id (31) is ignored. It also doesn't work if I use csid = 0...
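For reference, the two values involved can be checked directly (a minimal sketch):
      -- 31 is the id Oracle reports for WE8ISO8859P1
      select nls_charset_id('WE8ISO8859P1') as csid from dual;
      -- the database character set that XMLType storage converts to
      select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';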
Any idea how to enforce using ISO-8859-1 instead of UTF-8 as the character set?
Best regards
Matthias

Hi Marco,
I don't think it has anything to do with encoding (client-side or not).
I'd be more inclined to say it's related to XML fragments manipulation.
@Matthias:
Does this work better?
select m.version
     , sp.Betriebsstelle
     , spa.Betriebsstellenfahrwege
from imp_spurplan t
   , xmltable('/XmlIssDaten'
       passing t.xml_document
       COLUMNS
         Version                  varchar2(6) path 'Version/Name'
       , Spurplanbetriebsstellen  xmltype     path 'Spurplanbetriebsstellen'
     ) m
   , xmltable('/Spurplanbetriebsstellen/Spurplanbetriebsstelle'
       passing m.Spurplanbetriebsstellen
       COLUMNS
         Betriebsstellenfahrwege_xml xmltype      path 'Betriebsstellenfahrwege'
       , Betriebsstelle              varchar2(6)  path 'Betriebsstelle'
     ) sp
   , xmltable('/Betriebsstellenfahrwege'
       passing sp.Betriebsstellenfahrwege_xml
       COLUMNS
         Betriebsstellenfahrwege  xmltype path '.'
     ) spa
where sp.Betriebsstelle = 'NWH'
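If the error persists, it may also be worth checking whether the stored document really has trailing data after the root element (a sketch, assuming xml_document is the XMLType column used above):
      -- serialize the tail of the stored document and inspect it by eye
      select substr(xmlserialize(document xml_document as clob), -200)
        from imp_spurplan;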

Similar Messages

  • ORACLE invoices with a Japanese character set

    We are having trouble printing ORACLE invoices with a Japanese character set.
    The printer we are using is a Dell W5300. Do I need to configure the printer, or is it something that needs to be configured in the software? Please help.

    What is the "trouble"? Are you seeing the wrong output? It may not be the printer, but the software that is sending the output to the printer.
    If you are using an Oracle client (SQL*Plus, Forms, Reports etc.), ensure you set NLS_LANG to JAPANESE_JAPAN.WE8MSWIN1252 or JAPANESE_JAPAN.JA16SJIS
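    To see which language settings a client session actually negotiated, a query like this can help (a sketch; NLS_LANG itself is a client-side environment variable and is not directly visible from SQL):
        select parameter, value
          from nls_session_parameters
         where parameter in ('NLS_LANGUAGE', 'NLS_TERRITORY');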

  • Import dump from a database with a different character set

    My database has this character set:
    select * from database_properties;
    NLS_CHARACTERSET          AL32UTF8     Character set
    NLS_NCHAR_CHARACTERSET    AL16UTF16    NCHAR character set
    I need to import a dump from a database with the WE8MSWIN1252 character set.
    After the import I have seen that some characters in the tables are wrong:
    I see the symbol " " instead of "à".
    How can I solve the problem?
    The nls_lang variable on my OS is: NLS_LANG=ITALIAN_ITALY.AL32UTF8
    I work with Oracle 10.0.4 on Linux.

    I have read this doc on Metalink: Note 227332.1.
    I also tried to set NLS_LANG=ITALIAN_ITALY.WE8MSWIN1252
    and then executed the import command, but it didn't work.
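    One way to tell whether the stored bytes themselves were corrupted by the import (a sketch; table and column names are hypothetical):
        -- 1016 = hexadecimal dump of the stored bytes plus the character set name
        select dump(customer_name, 1016)
          from customers
         where rownum <= 5;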

  • Crystal XI R2 exporting issues with double-byte character sets

    NOTE: I have also posted this in the Business Objects General section with no resolution, so I figured I would try this forum as well.
    We are using Crystal Reports XI Release 2 (version 11.5.0.313).
    We have an application that can be run using multiple cultures/languages, chosen at login time. We have discovered an issue when exporting a Crystal report from our application while using a double-byte character set (Korean, Japanese).
    The original text when viewed through our application in the Crystal preview window looks correct:
    性能 著概要
    When exported to Microsoft Word, it also looks correct. However, when we export to PDF or even RPT, the characters are not being converted. The double-byte characters are rendered as boxes instead. It seems that the PDF and RPT exports are somehow not making use of the linked fonts Windows provides for double-byte character sets. This same behavior is exhibited when exporting a PDF from the Crystal report designer environment. We are using Tahoma, a TrueType font, in our report.
    I did discover some new behavior that may or may not have any bearing on this issue. When a text field containing double-byte characters is just sitting on the report in the report designer, the box characters are displayed where the Korean characters should be. However, when I double click on the text field to edit the text, the Korean characters suddenly appear, replacing the boxes. And when I exit edit mode of the text field, the boxes are back. And they remain this way when exported, whether from inside the design environment or outside it.
    Has anyone seen this behavior? Is SAP/Business Objects/Crystal aware of this? Is there a fix available? Any insights would be welcomed.
    Thanks,
    Jeff

    Hi Jeff,
    I searched on the forums and got the following information:
    1) If font linking is enabled on your device, you can examine the registry by enumerating the subkeys of the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink to determine the mappings of linked fonts to base fonts. You can add links by using Regedit to create additional subkeys. Once you have located that registry key, highlight the face name of the font you want to link to and then, from the Edit menu, click Modify. On a new line in the "Value data" field of the Edit Multi-String dialog box, enter "path and file to link to," "face name of the font to link".
    2) Fonts in general, especially TrueType and OpenType, are "Unicode".
    Since you are using a TrueType font, it may be a Unicode type already. However, if Bud's suggestion works then nothing better than that.
    Also, could you please check the output from the Crystal designer with a different PDF version than the current one?
    Meanwhile, I will look out for any additional/suitable information on this issue.

  • Using Document Filters with the Japanese character sets

    Not sure if this belongs here or on the Swing topic, but here goes:
    I have been requested to restrict entry in a JTextField to English alphanumerics and full-width Katakana.
    The East Asian language support also allows Hiragana and half-width Katakana.
    I have tried to attach a DocumentFilter. The filter employs a validateString method which strips all non-Latin alphanumerics as well as anything in the Hiragana or half-width Katakana ranges. The code is pretty simple (most of the code below is dedicated to debugging):
    import java.awt.EventQueue;
    import java.awt.GridLayout;
    import javax.swing.JFrame;
    import javax.swing.JLabel;
    import javax.swing.JTextField;
    import javax.swing.text.AbstractDocument;
    import javax.swing.text.AttributeSet;
    import javax.swing.text.BadLocationException;
    import javax.swing.text.Document;
    import javax.swing.text.DocumentFilter;

    public class KatakanaInputFilter extends DocumentFilter {
        private static int LOW_KATAKANA_RANGE = 0x30A0;
        private static int LOW_HALF_KATAKANA_RANGE = 0xFF66;
        private static int HIGH_HALF_KATAKANA_RANGE = 0xFFEE;
        private static int LOW_HIRAGANA_RANGE = 0x3041;

        public KatakanaInputFilter() {
            super();
        }

        @Override
        public void replace(FilterBypass fb, int offset, int length, String text,
                AttributeSet attrs) throws BadLocationException {
            super.replace(fb, offset, length, validateString(text, offset), null);
        }

        @Override
        public void remove(FilterBypass fb, int offset, int length)
                throws BadLocationException {
            super.remove(fb, offset, length);
        }

        @Override
        public void insertString(FilterBypass fb, int offset, String string,
                AttributeSet attr) throws BadLocationException {
            // Debugging: dump the incoming code points.
            String newString = new String();
            for (int i = 0; i < string.length(); i++) {
                int unicodePoint = string.codePointAt(i);
                newString += String.format("[%x] ", unicodePoint);
            }
            // Debugging: dump the document contents before the insert.
            String oldString = new String();
            int len = fb.getDocument().getLength();
            if (len > 0) {
                String fbText = fb.getDocument().getText(0, len);
                for (int i = 0; i < len; i++) {
                    int unicodePoint = fbText.codePointAt(i);
                    oldString += String.format("[%x] ", unicodePoint);
                }
            }
            System.out.format("insertString %s into %s at location %d\n",
                    newString, oldString, offset);
            super.insertString(fb, offset, validateString(string, offset), attr);
            // Debugging: dump the document contents after the insert.
            len = fb.getDocument().getLength();
            if (len > 0) {
                String fbText = fb.getDocument().getText(0, len);
                for (int i = 0; i < len; i++) {
                    int unicodePoint = fbText.codePointAt(i);
                    oldString += String.format("[%x] ", unicodePoint);
                }
            }
            System.out.format("document changed to %s\n\n", oldString);
        }

        public String validateString(String text, int offset) {
            if (text == null) {
                return new String();
            }
            String validText = new String();
            for (int i = 0; i < text.length(); i++) {
                int unicodePoint = text.codePointAt(i);
                boolean acceptChar = false;
                if (unicodePoint < LOW_KATAKANA_RANGE) {
                    // Below the Katakana block: accept only Latin alphanumerics.
                    if ((unicodePoint < 0x30 || unicodePoint > 0x7a)
                            || (unicodePoint > 0x3a && unicodePoint < 0x41)
                            || (unicodePoint > 0x59 && unicodePoint < 0x61)) {
                        acceptChar = false;
                    } else {
                        acceptChar = true;
                    }
                } else {
                    // Reject half-width Katakana and Hiragana. Note: both bounds of the
                    // second range use LOW_HIRAGANA_RANGE, which looks like a typo for a
                    // HIGH bound; Hiragana code points are below LOW_KATAKANA_RANGE
                    // anyway, so in practice they are rejected by the branch above.
                    if ((unicodePoint >= LOW_HALF_KATAKANA_RANGE && unicodePoint <= HIGH_HALF_KATAKANA_RANGE)
                            || (unicodePoint >= LOW_HIRAGANA_RANGE && unicodePoint <= LOW_HIRAGANA_RANGE)) {
                        acceptChar = false;
                    } else {
                        acceptChar = true;
                    }
                }
                if (acceptChar == true) {
                    System.out.format("     Accepted code point = %x\n", unicodePoint);
                    validText += text.charAt(i);
                } else {
                    System.out.format("     Rejected code point = %x\n", unicodePoint);
                }
            }
            String newString = "";
            for (int i = 0; i < validText.length(); i++) {
                int unicodePoint = validText.codePointAt(i);
                newString += String.format("[%x] ", unicodePoint);
            }
            System.out.format("ValidatedString = %s\n", newString);
            return validText;
        }

        /**
         * @param args
         */
        public static void main(String[] args) {
            Runnable runner = new Runnable() {
                public void run() {
                    JFrame frame = new JFrame("Katakana Input Filter");
                    frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                    frame.setLayout(new GridLayout(2, 2));
                    frame.add(new JLabel("Text"));
                    JTextField textFieldOne = new JTextField();
                    Document textDocOne = textFieldOne.getDocument();
                    DocumentFilter filterOne = new KatakanaInputFilter();
                    ((AbstractDocument) textDocOne).setDocumentFilter(filterOne);
                    textFieldOne.setDocument(textDocOne);
                    frame.add(textFieldOne);
                    frame.setSize(250, 90);
                    frame.setVisible(true);
                }
            };
            EventQueue.invokeLater(runner);
        }
    }
    I run this code, use the language bar to switch to full-width Katakana, and type "y" followed by "u", which forms a valid Katakana character. I then use the language bar to switch to Hiragana and retype the "y" followed by "u". When the code sees the Hiragana code point generated by this key combination, it rejects it. My debugging statements show that the document is properly updated. However, when I type the next character, I find that the previously rejected code point is being sent back to my insert method. It appears that the text somehow got cached in the composedTextContent of the JTextField.
    Here is the output of the program when I follow the steps I just outlined:
    insertString [ff59] into at location 0 <== typed y (Katakana)
    Accepted code point = ff59
    ValidatedString = [ff59]
    document changed to [ff59]
    insertString [30e6] into at location 0 <== typed u (Katakana)
    Accepted code point = 30e6
    ValidatedString = [30e6]
    document changed to [30e6]
    insertString [30e6] [ff59] into at location 0 <== typed y (Hiragana)
    Accepted code point = 30e6
    Accepted code point = ff59
    ValidatedString = [30e6] [ff59]
    document changed to [30e6] [ff59]
    insertString [30e6] [3086] into at location 0 <== typed u (Hiragana)
    Accepted code point = 30e6
    Rejected code point = 3086
    ValidatedString = [30e6]
    document changed to [30e6]
    insertString [30e6] [3086] [ff59] into at location 0 <== typed u (Hiragana)
    Accepted code point = 30e6
    Rejected code point = 3086
    Accepted code point = ff59
    ValidatedString = [30e6] [ff59]
    document changed to [30e6] [ff59]
    As far as I can tell, the data in the document looks fine, but the JTextField does not have the same data as the document. At this point it is not displaying the ff59 code point as a "y" (as it does when first entering the Hiragana character); it has somehow combined it with another code point to form a complete Hiragana character.
    Can anyone see what it is that I am doing wrong? Any help would be appreciated as I am baffled at this point.

    You have a procedure called "remove", but I don't see you calling it from anywhere in your program. When the validation fails, call remove to remove the bad character.
    V.V.

  • Character set problem with transportable tablespace

    Hi,
    I'm trying to import a transportable tablespace with Data Pump into a database with a different character set compared to the source database. I know this is by default not possible, but there's no violating data in the tablespace that could make it a problem when transferring the TS. So I issued 'ALTER SYSTEM SET "_tts_allow_nchar_mismatch"=true;' on the target to force the import. However, I still get the error:
    ORA-29345: can not plug a tablespace into a database using a different character set
    How can I fix this?

    Hi,
    What are the character sets of the source and target databases?
    A general restriction of transportable tablespaces is that the source and target databases must use the same database character set.
    Regards
    Nat
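    A quick way to compare the two databases (a sketch, to be run on both source and target):
        select property_value
          from database_properties
         where property_name = 'NLS_CHARACTERSET';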

  • Foreign character set problem

    Hi Sergiusz, it looks like you are a character set guru. Maybe you would know how to solve my problem? I've got a working application under Oracle XE, Apache and the embedded listener. I would like to switch to the new APEX Listener with Tomcat, but there is a problem with a foreign character set. Existing pages with such characters are displayed correctly, but if I type them into an input field they do not show up correctly on the next page. This happens without even saving the information in the database: I type text into one field which is the source of an item on another page. These are the character settings in my database.
    SQL> select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
    VALUE
    AL32UTF8
    SQL> select value from nls_database_parameters where parameter = 'NLS_NCHAR_CHARACTERSET';
    VALUE
    AL16UTF16
    Thanks,
    Art

    Hi Sergiusz, thank you for your help. After setting URIEncoding to UTF-8 and some further research I was able to fix my problem. Here is the entire solution in case someone else needs it.
    1.) Change $CATALINA_HOME/conf/server.xml and add URIEncoding="UTF-8" to the connector:
    <Connector port="8090" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" URIEncoding="UTF-8"/>
    2.) Copy $CATALINA_HOME/webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.class => $CATALINA_HOME/webapps/apex/WEB-INF/classes/filters
    3.) Add the following into $CATALINA_HOME/webapps/apex/WEB-INF/web.xml file after the last </servlet-mapping> tag.
    <filter>
      <filter-name>Set Character Encoding</filter-name>
      <filter-class>filters.SetCharacterEncodingFilter</filter-class>
      <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF8</param-value>
      </init-param>
    </filter>
    <filter-mapping>
      <filter-name>Set Character Encoding</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>
    Thanks,
    Art

  • Character set conversion problem during upgrade

    Dear Friends,
    I am trying to upgrade one of my Windows databases from version 9.2.0.5 to 10.2.0.4 on Unix. I am using exp/imp. During import I see the following errors for a couple of tables:
    IMP-00019: row rejected due to ORACLE error 12899
    IMP-00003: ORACLE error 12899 encountered
    ORA-12899: value too large for column
    IMP-00058: ORACLE error 1461 encountered
    ORA-01461: can bind a LONG value only for insert into a LONG column
    This may be due to a character set issue, since the database on Windows has WE8MSWIN1252 and the one on Unix has UTF8.
    Please let me know how I can resolve this issue.
    Regards.
    Mahdu

    Hello,
    It's better that your target database is created with the same character set as the source one.
    This is an option you can choose at database creation.
    If you have to stay with UTF8 on your target database, then you'll have to extend the column sizes or use the CHAR option (as Unicode may use up to 4 bytes for one character instead of 1 byte for WE8MSWIN1252).
    To use the CHAR option you may specify it on the column datatype, for instance:
    col1 VARCHAR2(100 CHAR)
    Otherwise, without this option, VARCHAR2(100) means 100 bytes (which is only guaranteed to store 25 characters in Unicode).
    You also have the parameter NLS_LENGTH_SEMANTICS that you can set to CHAR, but the export/import utility doesn't handle it well.
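    For illustration, byte versus character semantics look roughly like this (a sketch):
        alter session set nls_length_semantics = CHAR;
        create table t1 (col1 varchar2(100));       -- 100 characters under CHAR semantics
        create table t2 (col1 varchar2(100 BYTE));  -- 100 bytes, the classic default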
    So, the safest way is to create your target database with the same character set as the source one, unless you want to migrate to Unicode.
    Hope this helps.
    Best regards,
    Jean-Valentin

  • Query distributed database with different character sets.

    Hello experts, this is my situation:
    I have two databases, A and B, the same version 11.1.0.7, the same OS (SUSE Linux Enterprise 10), but with different character sets: A has WE8MSWIN1252 while B has AL32UTF8. Database B is my XML DB repository, so there I have some XMLType tables. I need to query these tables from database A using a dblink, and in fact I have done that, but the XML content is transformed due to the different character sets of the databases. Sometimes there is data loss and sometimes a data mismatch.
    Is there any way to query the tables stored in database B without problems? I do not know if the following is correct: maybe I can set the character set for the session in database A while it queries database B, that is, change the character set on the fly at session level.
    Do you have any special suggestion?
    I hope you can help me, thank you in advance.

    The Globalization Support Guide for 11.1.0.7 has a chapter on character set migration that should be helpful. AL32UTF8 is a superset of WE8MSWIN1252, but it is not a strict superset; that is, it doesn't meet the second prong of the test in the documentation:
    The new character set is a strict superset of the current character set if:
    - Each and every character in the current character set is available in the new character set.
    - Each and every character in the current character set has the same code point value in the new character set. For example, many character sets are strict supersets of US7ASCII.
    Exporting the data from A, changing the character set (or creating a new database with the AL32UTF8 character set), and then importing the data may be the easiest approach in your case.
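    The Euro sign illustrates the second point nicely: it is a single byte (0x80) in WE8MSWIN1252 but a three-byte sequence in AL32UTF8, so its code point value differs between the two. A sketch to see the stored bytes (TO_CHAR converts the UNISTR result into the database character set):
        -- on an AL32UTF8 database this shows the byte sequence e2,82,ac
        select dump(to_char(unistr('\20ac')), 1016) from dual;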
    Justin

  • Oracle Character sets with PeopleSoft - AL32UTF8 vs. UTF8

    We currently have PeopleSoft Financials v8.8 with PeopleTools 8.45 running on Oracle 9.2.0.8 with the UTF8 character set.
    We plan to upgrade to Oracle 10.2, and want to know if we can and should also convert the character set to AL32UTF8.
    Any issues?
    (A couple of years ago, we were told that AL32UTF8 was not yet supported in PeopleSoft).

    Right now, something strange: Oracle recommends not using UTF8 anymore, and PeopleSoft does not recommend AL32UTF8 yet.
    You can read solution id #719906, but anyway, AL32UTF8 on PT8.4x should work fine.
    Nicolas.

  • Can't Create Database with SJISYEN character set

    I am trying to create a new database in Oracle 10g using the SJISYEN character set. The summary page of the Database Configuration Assistant shows the following:
    Character Sets
    Database Character Set JA16SJISYEN
    National Character Set AL16UTF16
    The database gets created successfully with no errors, but when I query the V$NLS_PARAMETERS view, NLS_CHARACTERSET is reported as US7ASCII.
    Should I be able to create a database with the SJISYEN character set with Oracle 10g?

    Can you please run the following SQL statement and see what you get:
    select * from nls_database_parameters
    where parameter like '%CHARACTERSET';

  • Build a new database through scripts that must understand Spanish character sets

    Hello Gurus,
    I need some simple advice, a good chance for some quick points for you.
    I have never built a database to understand any character set other than American English. I now have to build a database that will be used with Spanish characters, keyboards, etc. But I will be using English for the 11g software install. I only wish to be able to show Spanish characters in the data for customers' names.
    I will be creating the database with scripts I have made to make the standard template for database files, control files, etc.
    Then I will be importing from a dump I have made that was created with the American English character set.
    System is 11g (11.2.0.3.0) on Linux Enterprise Server 5.8.
    I was thinking to use the AL32UTF8 character set, but I am unsure where to use it.
    My original test did not show Spanish characters in customers' names, like the 'tilde' or 'sueano' (pardon my spelling). But in this case I did not make the exception for Spanish; I only used the standard American English build (no changes in the init.ora file or the initial database build script).
    How can I adjust my parameter file for the initial creation of the database template so that it understands the Spanish character set, while still being able to import my dump file without error?
    EXAMPLE of a build script:
    CREATE DATABASE mynewdb
    USER SYS IDENTIFIED BY sys_password
    USER SYSTEM IDENTIFIED BY system_password
    LOGFILE GROUP 1 ('/u01/app/oracle/oradata/mynewdb/redo01.log') SIZE 100M,
    GROUP 2 ('/u01/app/oracle/oradata/mynewdb/redo02.log') SIZE 100M,
    GROUP 3 ('/u01/app/oracle/oradata/mynewdb/redo03.log') SIZE 100M
    MAXLOGFILES 5
    MAXLOGMEMBERS 5
    MAXLOGHISTORY 1
    MAXDATAFILES 100
    CHARACTER SET US7ASCII
    NATIONAL CHARACTER SET AL16UTF16
    If I change NATIONAL CHARACTER SET AL16UTF16 to AL32UTF8, will it work to show Spanish characters?
    Sorry for the long winded question, any advice will be great.
    Thankfully,
    Shawn

    Hello,
    the national character set is for column types like NVARCHAR2, not for normal VARCHAR2 data types, so if your dump file contains such column types you will also need to set it. The database character set is for the normal column types like VARCHAR2. Using Unicode is best practice if you use multiple languages, but keep in mind that a multibyte character set can be a problem during the import, because VARCHAR2(10) means 10 bytes and not 10 characters, so errors like 'value too large for column' can occur during import.
    You can create the database.
    Check this documentation:
    http://docs.oracle.com/cd/B28359_01/server.111/b28298/ch2charset.htm
    You can use a character set like WE8MSWIN1252, which also covers Spanish (as far as I know) and is a superset of US7ASCII.
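    Put differently, relative to the build script in the question, it is the database character set clause, not the national one, that controls ordinary VARCHAR2 data. A sketch of the two relevant lines:
        CHARACTER SET AL32UTF8           -- governs VARCHAR2/CHAR/CLOB data
        NATIONAL CHARACTER SET AL16UTF16 -- governs NVARCHAR2/NCHAR only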
    regards
    Peter

  • Different Character sets?

    Oracle version: 11.1.0.7.0
    Are there any problems if I configure Oracle Streams replication between two databases with different character sets? I have configured a propagation between two databases, A and B. The A character set is WE8MSWIN1252 and the B character set is WE8ISO8859P1, but about 30 minutes after the propagation is configured, ORA-07445 errors begin to appear. When I do the same configuration between two databases with the same character set, the problems do not appear.
    Here I write the error in the alert log:
    ORA-07445: exception encountered: core dump [kohrsmc()+1484] [SIGSEGV] [ADDR:0x18] [PC:0x7CCC92E] [Address not mapped to object] []
    Incident details in: /opt/oracle/diag/rdbms/bdmdic/bdmdic2/incident/incdir_110004/bdmdic2_j000_18185_i110004.trc
    Fri Aug 06 16:57:31 2010
    Trace dumping is performing id=[cdmp_20100806165731]
    Exception [type: SIGILL, Illegal operand] [ADDR:0x2ACE05EEFE60] [PC:0x2ACE05EEFE60, {empty}]
    Errors in file /opt/oracle/diag/rdbms/bdmdic/bdmdic2/trace/bdmdic2_j000_18185.trc (incident=110005):
    ORA-07445: exception encountered: core dump [PC:0x2ACE05EEFE60] [SIGILL] [ADDR:0x2ACE05EEFE60] [PC:0x2ACE05EEFE60] [Illegal operand] []
    ORA-07445: exception encountered: core dump [kohrsmc()+1484] [SIGSEGV] [ADDR:0x18] [PC:0x7CCC92E] [Address not mapped to object] []
    Incident details in: /opt/oracle/diag/rdbms/bdmdic/bdmdic2/incident/incdir_110005/bdmdic2_j000_18185_i110005.trc

    Where did I say that the character set is interfering with propagation? It will not interfere. A character set difference will cause the target database to be populated with data it may not understand, which will show up as garbage data. As your replication works fine for some time before failing, it looks like you are hitting some bug for specific data. To confirm this you may have to work with Oracle to find the exact cause, or which bug you may be hitting.
    All my suggestions up to this point are made without knowing your environment properly (server A and B OS and database versions; also, everything in the error description except the English words is foreign to me). As suggested, you will be better off working with Oracle Support to find the resolution. Please let us know what fixed your issue when you work with Oracle, to help all forum members.
    Regards

  • JSF and Character Sets (UTF-8)

    Hi all,
    This question might have been asked before, but I'm going to ask it anyway because I'm completely puzzled by how this works in JSF.
    Let's begin with the basics: I have an application running on an OC4J servlet container, and am using JSF 1.1 (MyFaces). The problem I am having with this setup is that the character encodings I want the server/client to use are not coming across correctly. I'm trying to force the application to be UTF-8, but after the response is rendered to my client, I've magically been reverted to ISO-8859-1, which is the default character set for the Netherlands. However, I'm building the application to support proper internationalization, which means I NEED to use UTF-8.
    I've executed the following steps to reach this goal:
    - All JSP files contain page directives, noting the character set:
    <%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %>
    I've checked the generated source that comes from the JSPs; it looks as expected.
    - I've created a servlet filter to set the character set directly on the request and response objects:
        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain) throws IOException, ServletException {
            // Set the character encoding for the request and response streams.
            req.setCharacterEncoding("UTF-8");
            res.setContentType("text/html; charset=UTF-8");
            // Complete (continue) the processing chain.
            chain.doFilter(req, res);
        }
    I've debugged the code, and this works fine, except for where JSF comes in. If I use the above setup without going through JSF, my pages come back UTF-8. When I go through JSF, my pages come back as ISO-8859-1. I'm baffled as to what is causing this. On several forums, writing a filter was proposed as the solution; however, this doesn't do it for me.
    It looks like somewhere internally in JSF the character set is changed to ISO. I've been through the sources, and I've found several pieces of code that support that theory. I've seen portions of code where the character set for the response is set to that of the request, which in my case, coming from a Dutch system, will be ISO.
    How can this be prevented? Can anyone give some good insight on the inner workings of JSF with regards to character sets in specific? Could this be a servlet container problem?
    Many thanks in advance for your assistance,
    Jarno

    Jarno,
    I've been investigating JSF and character encodings a bit this weekend. And I have to say it's more than a little confusing. But I may have a little insight as to what's going on here.
    I have a post here:
    http://forum.java.sun.com/thread.jspa?threadID=725929&tstart=45
    where I have a number of open questions regarding JSF 1.2's intended handling of character encodings. Please feel free to comment, as you're clearly struggling with some of the same questions I have.
    In MyFaces JSF 1.1 and JSF-RI 1.2 the handling appears to be dependent on the raw Content-Type header. Looking at the MyFaces implementation here -
    http://svn.apache.org/repos/asf/myfaces/legacy/tags/JSF_1_1_started/src/myfaces/org/apache/myfaces/application/jsp/JspViewHandlerImpl.java
    (which I'm not sure is the correct code, but it's the best I've found) it looks like the raw Content-Type header is being parsed in handleCharacterEncoding. The resulting value (if not null) is used to set the request character encoding.
    The JSF-RI 1.2 code is similar - calculateCharacterEncoding(FacesContext) in ViewHandler appears to parse the raw header, as opposed to using the CharacterEncoding getter on ServletRequest. This is understandable, as this code should be able to handle PortletRequests as well as ServletRequests. And PortletRequests don't have set/getCharacterEncoding methods.
    My first thought is that calling setCharacterEncoding on the request in the filter may not update the raw Content-Type header. (I haven't checked whether this is the case.) If it doesn't, then the raw header may be getting reparsed and the request encoding reset in the ViewHandler. I'd suggest that you check the state of the Content-Type header before and after your call to req.setCharacterEncoding("UTF-8"). If the header charset value is unset or unchanged after this call, you may want to update it manually in your filter.
    If that doesn't work, I'd suggest writing a simple ViewHandler which prints out the request's character encoding and the value of the Content-Type header to your logs before and after the calls to the underlying ViewHandler for each major method (i.e. renderView, etc.)
    Not sure if that's helpful, but it's my best advice based on the understanding I've reached to date. And I definitely agree - documentation on this point appears to be lacking. Good luck
    Regards,
    Peter

  • DBCA doesn't save character set information in the template!!

    Hello,
    I have found a problem using the DBCA when creating a template with a database character set of AL32UTF8.
    When I create a database using the aforementioned template from the command line (dbca -silent), it reverts the character set to WE8ISO8859P1.
    If I try and alter the character set after creation using:
    ALTER DATABASE CHARACTER SET AL32UTF8;
    it complains that this cannot be done, so I must set it in the template.
    Am I missing something obvious? Why is this information not being stored in the .dbc file?
    Many thanks
    Nic Hemley

    Well, having patched the template XML file I then had to do:
    dbca -silent -createDatabase -templateName advisor.dbc -gdbName advisor.invocom -sid advisor -sysPassword j3r3m1aha -continueOnNonFatalErrors false
    apparently the .dbc extension is required on the template name
    However.....
    NLS_DATE_LANGUAGE
    NLS_DATE_FORMAT
    NLS_LANGUAGE
    NLS_TERRITORY
    NLS_ISO_CURRENCY
    are not set in the template either... why is the template missing so much of what I thought I had configured?
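    One possible workaround (a sketch; this assumes your dbca release accepts the -characterSet and -nationalCharacterSet silent-mode flags) is to pass the character sets explicitly instead of relying on the template:
        dbca -silent -createDatabase -templateName advisor.dbc \
             -gdbName advisor.invocom -sid advisor \
             -characterSet AL32UTF8 -nationalCharacterSet AL16UTF16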
