Character set change? Help!

Hi,
I am doing imports, and the export client uses the ZHS16GBK character set (Simplified Chinese), while my imports are done in
US7ASCII, which is not a superset of that character set. I want to change to ZHS32GB18030, which is a superset
of both US7ASCII and ZHS16GBK. Is it possible to change my character set in SQL*Plus?
Also, could this character set conversion cause an IMP-00008 error (unrecognized statement in export file)?
-Ashley
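
Before changing anything, it is worth confirming which character sets are actually in play; a minimal check against the standard NLS_DATABASE_PARAMETERS dictionary view:
select parameter, value
from nls_database_parameters
where parameter in ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');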

This is one of my logs. It shows the ORA-00942 error, which I have found in most of the files I am importing:
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export file created by EXPORT:V10.01.00 via conventional path
Warning: the objects were exported by SMARTGPS2006, not by you
import done in US7ASCII character set and AL16UTF16 NCHAR character set
import server uses AL32UTF8 character set (possible charset conversion)
export client uses ZHS16GBK character set (possible charset conversion)
export server uses UTF8 NCHAR character set (possible ncharset conversion)
. importing SMARTGPS2006's objects into RPI
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200706" 0 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200802" 0 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200803" 0 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200804" 0 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200903" 22939070 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200904" 0 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200905" 0 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200906" 0 rows imported
. . importing partition "MCC_ASYN_POS":"MCC_ASYN_POS200907" 0 rows imported
IMP-00017: following statement failed with ORACLE error 942:
"ALTER TABLE "MCC_ASYN_POS" ADD CONSTRAINT "MCC_ASYN_POS_OEMCODE_NEW" FOREIG"
"N KEY ("OEMCODE") REFERENCES "OEM" ("OEMCODE") ENABLE NOVALIDATE"
IMP-00003: ORACLE error 942 encountered
ORA-00942: table or view does not exist
About to enable constraints...
Import terminated successfully with warnings.
From doing research I have found that IMP-00008 means the export file is corrupt. Is this always the case?
Also, you said these errors would not be caused by a character set conversion issue; does that mean I don't need to change my character set?
Ashley

Similar Messages

  • Database Character Set Change

    Hi,
    I am trying to change my database's (9i) character set from US7ASCII to UTF8.
    There are some tables with CLOB columns that are stopping this from happening, so I exported all of those tables and truncated them.
    But there are also some views with CLOB columns that are still preventing the character set change.
    Any help will be appreciated.
    Thanks

    Hi,
    I used the following after exporting and truncating the tables containing CLOB columns:
    STARTUP MOUNT;
    ALTER SYSTEM ENABLE RESTRICTED SESSION;
    ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
    ALTER SYSTEM SET AQ_TM_PROCESSES=0;
    ALTER DATABASE OPEN;
    ALTER DATABASE CHARACTER SET UTF8;
    SQL> ALTER DATABASE CHARACTER SET UTF8;
    ALTER DATABASE CHARACTER SET UTF8
    ERROR at line 1:
    ORA-12716: Cannot ALTER DATABASE CHARACTER SET when CLOB data exists
    Please help me out.
    Thanks
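
    Since ORA-12716 fires while any CLOB data remains, it can help to list everything that still exposes a CLOB column before retrying; a minimal check, assuming access to the DBA views (note that internal tables such as SYS.METASTYLESHEET can hold CLOB data too):
    select owner, table_name, column_name
    from dba_tab_columns
    where data_type = 'CLOB'
    order by owner, table_name, column_name;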

  • Character set changed

    I have a big problem.
    I set the character set to WE8ISO8859P1 by mistake,
    and thousands of records in Japanese were inserted.
    Are these records saved in Japanese correctly?
    I have now rebuilt the database with JA16SJIS.
    How can I migrate the Japanese records from the old
    database to the new one?
    What should I do? Waiting for your help.

    You have to recreate the database because the national character set and the database character set cannot be changed after the database is created. If you want detailed information about this, you can read this white paper:
    http://otn.oracle.com/pls/db92/db92.to_pdf?partno=a96529&remark=docindex
    [email protected]
    Joel Pérez

  • Arabic Character set conversion-help needed

    We have our main database running on 10g (Solaris) and are planning to move it to 11g RAC.
    One of our old Oracle databases (8.0.5 on Solaris), which was not used until recently, needs to be upgraded to 10g Release 2.
    I know the supported direct upgrade paths are 8.0.6/8.1.7/9i -> 10g.
    Current DB: 8.0.5 (character set: AR8ISO8859P6)
    Target DB: 10g Release 2 (character set: AR8MSWIN1256)
    I am thinking of going the following way using exp/imp:
    8.0.5 (AR8ISO8859P6) -> 8.1.7 (AR8ISO8859P6) -> 10g (AR8MSWIN1256)
    OR
    8.0.5 (AR8ISO8859P6) -> 8.1.7 (AR8MSWIN1256) -> 10g (AR8MSWIN1256)
    Please advise.
    Thanks

    (1) At source db 8.0.5 (solaris)
    PARAMETER VALUE
    NLS_LANGUAGE AMERICAN
    NLS_TERRITORY AMERICA
    NLS_CALENDAR GREGORIAN
    NLS_DATE_FORMAT DD-MON-YY
    NLS_DATE_LANGUAGE AMERICAN
    NLS_CHARACTERSET AR8ISO8859P6
    NLS_SORT BINARY
    NLS_NCHAR_CHARACTERSET AR8ISO8859P6
    $set NLS_LANG=AMERICAN_AMERICA.AR8ISO8859P6
    $exp sys/dba file=full251109.dmp full=y
    (2) At target db 10g R2 (solaris)
    PARAMETER VALUE
    NLS_LANGUAGE AMERICAN
    NLS_TERRITORY AMERICA
    NLS_CALENDAR GREGORIAN
    NLS_DATE_FORMAT DD-MON-RRRR
    NLS_DATE_LANGUAGE AMERICAN
    NLS_CHARACTERSET AR8ISO8859P6
    NLS_SORT BINARY
    NLS_NCHAR_CHARACTERSET UTF8
    NLS_COMP BINARY
    NLS_LENGTH_SEMANTICS BYTE
    NLS_NCHAR_CONV_EXCP FALSE
    $export NLS_LANG=AMERICAN_AMERICA.AR8ISO8859P6
    $imp testdba/testdba file=full251105.dmp fromuser=PROFINAL touser=PROFINAL
    $csscan testdba/testdba FULL=Y FROMCHAR=AR8ISO8859P6 TOCHAR=AR8ISO8859P6 LOG=P6check CAPTURE=Y ARRAY=100000 PROCESS=2
    There is EXCEPTIONAL DATA in the .err file.
    Clients accessing the 8.0.5 database use character set AR8ISO8859P6, which is the SAME as the 8.0.5 database.
    CSSCAN result: [Database Scan Parameters]
    Parameter Value
    CSSCAN Version v2.1
    Instance Name MIG1
    Database Version 10.2.0.1.0
    Scan type Full database
    Scan CHAR data? YES
    Database character set AR8ISO8859P6
    FROMCHAR AR8ISO8859P6
    TOCHAR AR8ISO8859P6
    Scan NCHAR data? NO
    Array fetch buffer size 100000
    Number of processes 2
    Capture convertible data? YES
    [Scan Summary]
    Some character type data in the data dictionary are not convertible to the new character set.
    Some character type application data are not convertible to the new character set.
    [Data Dictionary Conversion Summary]
    Datatype Changeless Convertible Truncation Lossy
    VARCHAR2 2,235,403 0 0 1,492
    CHAR 1,097 0 0 0
    LONG 155,188 0 0 6
    CLOB 24,643 0 0 0
    VARRAY 21,352 0 0 0
    Total 2,437,683 0 0 1,498
    Total in percentage 99.939% 0.000% 0.000% 0.061%
    The data dictionary can not be safely migrated using the CSALTER script
    [Application Data Conversion Summary]
    Datatype Changeless Convertible Truncation Lossy
    VARCHAR2 16,986,594 0 0 1,240,383
    CHAR 164,114 0 0 0
    LONG 7 0 0 0
    CLOB 1 0 0 0
    VARRAY 1,436 0 0 0
    Total in percentage 93.256% 0.000% 0.000% 6.744%
    [Distribution of Convertible, Truncated and Lossy Data by Table]
    USER.TABLE Convertible Truncation Lossy
    PROFINAL.BASE_MASTER_DATAS 0 0 362,003
    PROFINAL.CODE_ALLOW 0 0 53
    PROFINAL.CODE_ALLOWANCE_TYPES 0 0 1
    PROFINAL.CODE_BONUS_TYPES 0 0 2
    PROFINAL.CODE_BRANCHES 0 0 2
    PROFINAL.CODE_CERTIFICATES 0 0 94
    Kindly help.
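
    Note that the scan above used FROMCHAR=TOCHAR=AR8ISO8859P6, which only checks for codes that are invalid in the current character set. To see what the actual migration will do, the scan should target the destination character set; a sketch of that invocation (same flags as above, log name arbitrary):
    $csscan testdba/testdba FULL=Y FROMCHAR=AR8ISO8859P6 TOCHAR=AR8MSWIN1256 LOG=P1256check CAPTURE=Y ARRAY=100000 PROCESS=2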

  • How to review implication of database character set change on PL/SQL code?

    Hi,
    We are converting our database character set from WE8ISO8859P1 to AL32UTF8. Before the conversion, I want to check the implications for PL/SQL code that uses byte-based SQL functions.
    What points should be considered when checking the implications for PL/SQL code?
    From searching, I found three byte-based functions: SUBSTRB, LENGTHB, and INSTRB. What do I check if these functions are used in the PL/SQL code?
    What do we check if the SUBSTR and LENGTH functions are being used?
    What other functions should I check?
    What do I check in PL/SQL if VARCHAR2 and CHAR declarations exist in the code?
    Thanks in advance.
    Regards,
    Rashmi

    There is no quick answer.  Generally, the problem with PL/SQL code is that once you migrate from a single-byte character set (like WE8ISO8859P1) to a multibyte character set (like AL32UTF8), you can no longer assume that one character is one byte. Traditionally, column and PL/SQL variable lengths are expressed in bytes. Therefore, the same string of Western European accented letters may no longer fit into a column or variable after migration, as it may now be longer than the old limit (2 bytes per accented letter compared to 1 byte previously). Depending on how you dealt with column lengths during the migration, for example, if you migrated them to character length semantics, and depending on how relevant columns were declared (%TYPE vs explicit size), you may need to adjust maximum lengths of variables to accommodate longer strings.
    The use of SUBSTR, INSTR, and LENGTH and their byte equivalents needs to be reviewed. You need to understand what the functions are used for. If the SUBSTR function is used to truncate a string to a maximum length of a variable, you may need to change it to SUBSTRB, if the variable's length constraint is still declared in bytes.  However, if the variable's maximum length is now expressed in characters, SUBSTR needs to be used.  However, if SUBSTR is used to extract a functional part of a string (e.g. during parsing), possibly based on result from INSTR, then you should use SUBSTR and INSTR independently of the database character set -- characters matter here, not bytes. On the other hand, if SUBSTR is used to extract a field in a SQL*Loader-like fixed-format input file (e.g. read with UTL_FILE), you may need to standardize on SUBSTRB to make sure that fields are extracted correctly based on defined byte boundaries.
    As you can see, there is no universal recipe for handling these functions. Their use needs to be reviewed and understood, and it should be decided whether they are fine as-is or need to be replaced with other forms.
    Thanks,
    Sergiusz
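
    To make the byte-versus-character distinction concrete, here is a small illustration (hypothetical literal; the counts assume an AL32UTF8 database character set):
    -- 'ü' is one character but two bytes in AL32UTF8, so the two functions diverge:
    select length('Müller')  as char_count,   -- 6 characters
           lengthb('Müller') as byte_count    -- 7 bytes
    from dual;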

  • Character replacement before character set change

    We are preparing to change our character set from US7ASCII to AL32UTF8 on 11.2.0.3 (HP-UX). The majority of the "lossy" rows identified contain accented and umlauted vowels. I have tried to change the accented and umlauted characters to their "plain" counterparts (e.g., accented a to 'a'). I've tried several variations of the REPLACE and TRANSLATE functions with no success. One of the statements I was "sure" would work follows; what am I doing wrong?
    update tbl1 set desc_fld = regexp_replace(desc_fld, '[=a=]', 'a');

    Hi,
    bynummike wrote:
    We are preparing to change our character set from US7ASCII to AL32UTF8 on 11.2.0.3 (HP-UX). The majority of the "lossy" rows identified contain accented and umlauted vowels. I have tried to change the accented and umlauted characters to their "plain" counterparts (e.g., accented a to 'a'). I've tried several variations of the REPLACE and TRANSLATE functions with no success. One of the statements I was "sure" would work follows; what am I doing wrong?
    update tbl1 set desc_fld = regexp_replace(desc_fld, '[=a=]', 'a');
    Try:
    update tbl1
    set desc_fld = regexp_replace(desc_fld, '[[=a=]]', 'a');
    [=a=] means "any variation on the letter a" only when it's inside square brackets; otherwise, it looks like you want the set of characters consisting of '=', 'a' and '='.
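
    A quick way to see the equivalence-class syntax in action (hypothetical string literal):
    -- [[=e=]] matches 'e' and its accented variants, but nothing else:
    select regexp_replace('fête à la crème', '[[=e=]]', 'e') from dual;
    -- returns 'fete à la creme' -- the 'à' is left alone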

  • Oracle 8i us7ascii character set problem - help required urgent.

    Hi friends,
    I have an Oracle 8i database server installed on Sun Solaris. The database character set is US7ASCII. In one of the tables, TIFF images are stored in a LONG column. I am trying to fetch these images using an Oracle 9i client and Visual Basic (Oracle ODBC drivers), but I am unable to do so: I cannot fetch special characters.
    Is this because of a character set problem? When I run my code on the server itself, I am able to fetch the images. I also tried fetching the images using an Oracle 8i client on a Windows XP machine but could not. Are there any special settings I have to make on the client side?

    Indeed, it's an ODBC issue. Read this statement from Oracle:
    From ODBC 8.1.7.2.0 drivers onwards it's NOT possible any more to
    "disable" Characterset conversion by specifying for the NLS_LANG
    the same characterset as the database characterset. There is now
    ALWAYS a check to see if a codepoint is valid for that characterset.
    Typically you will encounter problems if you upgrade an environment
    that has NO NLS_LANG set on the client (or US7ASCII) and the database
    was also US7ASCII. This incorrect setup allowed you to store characters
    like èçàé in an US7ASCII database, with the new 8i drivers this is not possible
    any more.
    The basic problem is the 'wrong' character set, US7ASCII, in the database. As long as no character set conversion happens (which is the case on the Unix server), the special characters are no problem.
    Werner
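
    One way to see whether such illegal bytes are actually stored is to dump the code points directly (hypothetical table and column names; format 1016 shows the declared character set plus the bytes in hex, and any byte above 7F is illegal in US7ASCII):
    select dump(description, 1016) from image_catalog where rownum <= 10;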

  • Oracle character set problem - help reqed urgent !!

    Hello friends,
    I have an Oracle 8i database server installed on Sun Solaris. The database character set is US7ASCII. In one of the tables, TIFF images are stored in a LONG column. I am trying to fetch these images using an Oracle 9i client and Visual Basic (Oracle ODBC drivers), but I am unable to do so: I cannot fetch special characters.
    Is this because of a character set problem? When I run my code on the server itself, I am able to fetch the images. I also tried fetching the images using an Oracle 8i client on a Windows XP machine but could not. Are there any special settings I have to make on the client side?

    i run my code on the server itself, i m able to fetch the images. I tried to fetch the images using oracle 8i client on windows XP machine but could not do so.
    You are able to fetch the images on the server, so it is not because of the character set.
    The first thing you need to consider is to use only a certified combination of OS, client, and database server. Check Certify - Oracle's Certification Matrices.
    Virag

  • Character SET Change

    Hi,
    I am migrating a database from 9i to 10g. The 9i database is on Windows and the 10g database will be on Solaris. We have some encrypted data, and the character set is changing from WE8MSWIN1252 (Windows) to AL32UTF8 (Solaris). Could anyone please let me know how I can go about this?
    Thanks!

    Is the encrypted data stored in VARCHAR2 columns? Or did you store it in RAW columns?
    One of the (many) reasons that I would strongly advocate RAW for encrypted data is that you don't have to worry about character set transforms.
    If the data is stored in VARCHAR2 columns, you would generally have to decrypt it in the source database, copy it over to the new database in the clear, and re-encrypt it in the destination database. Unless you happen to have chosen an encryption algorithm that guarantees the output to have the same representation in both Windows-1252 and UTF-8 character sets, which would seem exceptionally unlikely.
    Justin
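
    To illustrate the RAW approach Justin recommends, here is a minimal sketch using DBMS_CRYPTO (the SECURE_NOTE table and the hard-coded key are hypothetical, for demonstration only, not real key management):
    create table secure_note (id number primary key, payload raw(2000));
    declare
      l_key raw(16) := utl_raw.cast_to_raw('0123456789abcdef'); -- demo key only
      l_enc raw(2000);
    begin
      -- Encrypt an explicit UTF-8 encoding of the string; the RAW bytes stored
      -- in PAYLOAD are untouched by any later character set conversion.
      l_enc := dbms_crypto.encrypt(
                 src => utl_i18n.string_to_raw('sensitive text', 'AL32UTF8'),
                 typ => dbms_crypto.encrypt_aes128 + dbms_crypto.chain_cbc + dbms_crypto.pad_pkcs5,
                 key => l_key);
      insert into secure_note values (1, l_enc);
    end;
    /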

  • How to change character set to arabic in Develper suite forms 10g

    Dear all,
    Our company wants to migrate Oracle Forms 4.5/6i applications to Oracle Developer Suite 10g.
    They also want their database upgraded from 9i to 10g.
    They gave me a test machine with Windows XP installed, and I did the following:
    1. Installed the Oracle 10g XE database.
    2. Installed Oracle Developer Suite 10g (Oracle Forms, Oracle Reports).
    3. Configured the connection from Oracle Developer Suite 10g to the Oracle 10g database.
    4. Loaded data into the 10g database. (There are a few columns, like DEPARTMENT_NAME_ARB and FUNCTION_NAME_ARB, which are supposed to display in Arabic, as they do in the 9i database; now they are showing as special characters.)
    What I would like to know is: is there a way I can set the character set?
    Is it the database in which I have to make the character set change?
    Is it the Oracle Developer Suite application in which I have to make the character set change?
    Is it the registry in which I have to make changes?
    Please help.

    Hi friends,
    It is very encouraging to see your replies; I apologize for the late response. I still have had no success with updating PROPS$, even though I religiously followed all the instructions given to me by all of you, as you can see in my previous posts.
    Luckily, I am able to insert one row at a time manually, in Arabic and in English, by pressing ALT+SHIFT.
    When I create data blocks in Forms Builder, I do see the output in Arabic.
    When I create a report in group style, I do see the output in Arabic.
    I have thousands of rows (GBs of data) which need to be inserted into this new 10g XE database (downloaded from Oracle).
    I have attempted the insertion multiple times, by running a script or by simply copying numerous INSERT statements from Notepad into SQL*Plus; unluckily, it always retrieved special characters rather than Arabic characters.
    Is there a way to insert data into this new Oracle 10g XE database via Oracle Developer Suite 10g Forms/Reports?
    Do I have to use the built-in data load/unload utilities in Oracle 10g XE?
    Do I have to install SQL*Loader separately to load the data?
    Do you think TOAD can help with this?
    Could you please tell me how to add snapshots to this post? (user10947262)
    Here are the National Language Parameter values:
    Before
    NLS_CALENDAR     GREGORIAN
    NLS_CHARACTERSET     AL32UTF8 (Is this the multibyte (UTF-8) character set, Sarah?)
    NLS_COMP     BINARY
    NLS_CURRENCY     $
    NLS_DATE_FORMAT     DD-MON-RR
    NLS_DATE_LANGUAGE     AMERICAN
    NLS_DUAL_CURRENCY     $
    NLS_ISO_CURRENCY     AMERICA
    NLS_LANGUAGE     AMERICAN
    NLS_LENGTH_SEMANTICS     BYTE
    NLS_NCHAR_CHARACTERSET     AL16UTF16
    NLS_NCHAR_CONV_EXCP     FALSE
    NLS_NUMERIC_CHARACTERS     .,
    NLS_SORT     BINARY
    NLS_TERRITORY     AMERICA
    NLS_TIMESTAMP_FORMAT     DD-MON-RR HH.MI.SSXFF AM
    NLS_TIMESTAMP_TZ_FORMAT     DD-MON-RR HH.MI.SSXFF AM TZR
    NLS_TIME_FORMAT     HH.MI.SSXFF AM
    NLS_TIME_TZ_FORMAT     HH.MI.SSXFF AM TZR
    After
    NLS_CALENDAR     GREGORIAN
    NLS_CHARACTERSET     AL32UTF8
    NLS_COMP     BINARY
    NLS_CURRENCY     ر.س.
    NLS_DATE_FORMAT     DD/MM/RR
    NLS_DATE_LANGUAGE     ARABIC
    NLS_DUAL_CURRENCY     ر.س.
    NLS_ISO_CURRENCY     SAUDI ARABIA
    NLS_LANGUAGE     ARABIC
    NLS_LENGTH_SEMANTICS     BYTE
    NLS_NCHAR_CHARACTERSET     AL16UTF16
    NLS_NCHAR_CONV_EXCP     FALSE
    NLS_NUMERIC_CHARACTERS     .,
    NLS_SORT     ARABIC
    NLS_TERRITORY     SAUDI ARABIA
    NLS_TIME_FORMAT     HH12:MI:SSXFF PM
    NLS_TIMESTAMP_FORMAT     DD/MM/RR HH12:MI:SSXFF PM
    NLS_TIMESTAMP_TZ_FORMAT     DD/MM/RR HH12:MI:SSXFF PM TZR
    NLS_TIME_TZ_FORMAT     HH12:MI:SSXFF PM TZR
    Certainly, Christian, I don't want to break my Oracle 10g XE installation, and I agree and hope that creating a new database and doing an export and import will work for me. (The XE edition doesn't give the option to create a new database; I would need to install the 10g Release 2 media pack from edelivery.)
    However, with the above information provided, is that really needed?
    Please help me.
    Thanks and Regards
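
    As an aside: when scripted INSERTs come back as junk while interactive entry works, the usual suspect is the client-side NLS_LANG not matching the encoding of the script file. One thing worth trying from a Windows command prompt before running the script (assuming the file is saved in the Arabic Windows code page; the script name here is hypothetical):
    set NLS_LANG=AMERICAN_AMERICA.AR8MSWIN1256
    sqlplus user/password @load_arabic_data.sql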

  • Change character set

    Hi
    can anyone tell me how to change the character set?
    I tried ALTER SESSION but it doesn't work.
    Thanks

    Article from Metalink
    Doc ID: Note:66320.1
    Subject: Changing the Database Character Set or the Database National Character Set
    Type: BULLETIN
    Status: PUBLISHED
    Content Type: TEXT/PLAIN
    Creation Date: 23-OCT-1998
    Last Revision Date: 12-DEC-2003
    PURPOSE
    =======
    To explain how to change the database character set or national character set of an existing Oracle8(i) or Oracle9i database without having to recreate the database.

    1. SCOPE & APPLICATION
    ======================
    The method described here is documented in the Oracle 8.1.x and Oracle9i documentation. It is not documented but it can be used in version 8.0.x. It does not work in Oracle7.

    The database character set is the character set of CHAR, VARCHAR2, LONG, and CLOB data stored in the database columns, and of SQL and PL/SQL text stored in the Data Dictionary. The national character set is the character set of NCHAR, NVARCHAR2, and NCLOB data. In certain database configurations the CLOB and NCLOB data are stored in the fixed-width Unicode encoding UCS-2. If you are using CLOB or NCLOB please make sure you read section "4. HANDLING CLOB AND NCLOB COLUMNS" below in this document.

    Before changing the character set of a database make sure you understand how Oracle deals with character sets. Before proceeding please refer to [NOTE:158577.1] "NLS_LANG Explained (How Does Client-Server Character Conversion Work?)". See also [NOTE:225912.1] "Changing the Database Character Set - an Overview" for a general discussion of the various methods of migration to a different database character set. If you are migrating an Oracle Applications instance, read [NOTE:124721.1] "Migrating an Applications Installation to a New Character Set" for the specific steps that have to be performed. If you are migrating from 8.x to 9.x please have a look at [NOTE:140014.1] "ALERT: Oracle8/8i to Oracle9i Using New "AL16UTF16"" and the other referenced notes below.

    Before using the method described in this note it is essential to do a full backup of the database and to use the Character Set Scanner utility to check your data. See the section "2. USING THE CHARACTER SET SCANNER" below.

    Note that changing the database or the national character set as described in this document does not change the actual character codes, it only changes the character set declaration. If you want to convert the contents of the database (character codes) from one character set to another you must use the Oracle Export and Import utilities. This is needed, for example, if the source character set is not a binary subset of the target character set, i.e. if a character exists in the source and in the target character set but not with the same binary code. All binary subset-superset relationships between character sets recognized by the Oracle Server are listed in [NOTE:119164.1] "Changing Database Character Set - Valid Superset Definitions".

    Note: The varying width character sets (like UTF8) are not supported as national character sets in Oracle8(i) (see [NOTE:62107.1]). Thus, changing the national character set from a fixed width character set to a varying width character set is not supported in Oracle8(i). NCHAR types in Oracle8 and Oracle8i were designed to support special Oracle-specific fixed-width Asian character sets that were introduced to provide higher performance processing of Asian character data. Examples of these character sets are: JA16EUCFIXED, JA16SJISFIXED, ZHT32EUCFIXED. For a definition of varying width character sets see also section "4. HANDLING CLOB AND NCLOB COLUMNS" below.

    WARNING: Do not use any undocumented Oracle7 method to change the database character set of an Oracle8(i) or Oracle9i database. This will corrupt the database.

    2. USING THE CHARACTER SET SCANNER
    ==================================
    Character data in the Oracle 8.1.6 and later database versions can be efficiently checked for possible character set migration problems with the help of the Character Set Scanner utility. This utility is included in the Oracle Server 8.1.7 software distribution and the newest Character Set Scanner version can be downloaded from the Oracle Technology Network site, http://otn.oracle.com

    The Character Set Scanner on OTN is available for a limited number of platforms only but it can be used with databases on other platforms in the client/server configuration -- as long as the database version matches the Character Set Scanner version and the platforms are either both ASCII-based or both EBCDIC-based. It is recommended to use the newest Character Set Scanner version available from the OTN site.

    The Character Set Scanner is documented in the following manuals:
    - "Oracle8i Documentation Addendum, Release 3 (8.1.7)", Chapter 3
    - "Oracle9i Globalization Support Guide, Release 1 (9.0.1)", Chapter 10
    - "Oracle9i Database Globalization Support Guide, Release 2 (9.2)", Chapter 11

    Note: The Character Set Scanner coming with Oracle 8.1.7 and Oracle 9.0.1 does not have a separate version number. It reports the database release number in its banner. This version of the Scanner does not check for illegal character codes in a database if the FROMCHAR and TOCHAR (or FROMNCHAR and TONCHAR) parameters have the same value (i.e. you simulate migration from a character set to itself). The Character Set Scanner 1.0, available on OTN, reports its version number as x.x.x.1.0, where x.x.x is the database version number. This version adds a few bug fixes and it supports FROMCHAR=TOCHAR provided it is not UTF8. The Character Set Scanner 1.1, available on OTN and with Release 2 (9.2) of the Oracle Server, reports its version number as v1.1 followed by the database version number. This version adds more bug fixes and full support for FROMCHAR=TOCHAR.

    None of the above versions of the Scanner can correctly analyze CLOB or NCLOB values if the database or the national character set, respectively, is multibyte. The Scanner reports such values randomly as Convertible or Lossy. Version 1.2 of the Scanner will mark all such values as Changeless (as they are always stored in the Unicode UCS-2 encoding and thus they do not change when the database or national character set is changed from one multibyte set to another). Character Set Scanner 2.0 will correctly check CLOBs and NCLOBs for possible data loss when migrating from a multibyte character set to its subset.

    To verify that your database contains only valid codes, specify the new database character set in the TOCHAR parameter and/or the new national character set in the TONCHAR parameter. Specify FULL=Y to scan the whole database. Set the ARRAY and PROCESS parameters depending on your system's resources to speed up the scanning. FROMCHAR and FROMNCHAR will default to the original database and national character sets.

    The Character Set Scanner should report only Changeless data in both the Data Dictionary and in application data. If any Convertible or Exceptional data are reported, the ALTER DATABASE [NATIONAL] CHARACTER SET statement must not be used without further investigation of the source and type of these data. In situations in which the ALTER DATABASE [NATIONAL] CHARACTER SET statement is used to repair an incorrect database character set declaration rather than to simply migrate to a new wider character set, you may be advised by Oracle Support Services analysts to execute the statement even if Exceptional data are reported. For more information see also [NOTE:225912.1] "Changing the Database Character Set - a short Overview".

    3. CHANGING THE DATABASE OR THE NATIONAL CHARACTER SET
    ======================================================
    Oracle8(i) introduces a new documented method of changing the database and national character sets. The method uses two SQL statements, which are described in the Oracle8i National Language Support Guide:

    ALTER DATABASE [<db_name>] CHARACTER SET <new_character_set>
    ALTER DATABASE [<db_name>] NATIONAL CHARACTER SET <new_NCHAR_character_set>

    The database name is optional. The character set name should be specified without quotes, for example:

    ALTER DATABASE CHARACTER SET WE8ISO8859P1

    To change the database character set perform the following steps. Note that some of them have been erroneously omitted from the Oracle8i documentation:

    1. Use the Character Set Scanner utility to verify that your database contains only valid character codes -- see "2. USING THE CHARACTER SET SCANNER" above.
    2. If necessary, prepare CLOB columns for the character set change -- see "4. HANDLING CLOB AND NCLOB COLUMNS" below. Omitting this step can lead to corrupted CLOB/NCLOB values in the database. If SYS.METASTYLESHEET (STYLESHEET) is populated (9i and up only) then see [NOTE:213015.1] "SYS.METASTYLESHEET marked as having convertible data (ORA-12716 when trying to convert character set)" for the actions that need to be taken.
    3. Make sure the parallel_server parameter in INIT.ORA is set to false or is not set at all.
    4. Execute the following commands in Server Manager (Oracle8) or sqlplus (Oracle9), connected as INTERNAL or "/ AS SYSDBA":
    SHUTDOWN IMMEDIATE; -- or NORMAL
    <do a full database backup>
    STARTUP MOUNT;
    ALTER SYSTEM ENABLE RESTRICTED SESSION;
    ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
    ALTER SYSTEM SET AQ_TM_PROCESSES=0;
    ALTER DATABASE OPEN;
    ALTER DATABASE CHARACTER SET <new_character_set>;
    SHUTDOWN IMMEDIATE; -- or NORMAL
    STARTUP RESTRICT;
    5. Restore the parallel_server parameter in INIT.ORA, if necessary.
    6. Execute the following commands:
    SHUTDOWN IMMEDIATE; -- or NORMAL
    STARTUP;
    The double restart is necessary in Oracle8(i) because of an SGA initialization bug, fixed in Oracle9i.
    7. If necessary, restore CLOB columns -- see "4. HANDLING CLOB AND NCLOB COLUMNS" below.

    To change the national character set replace the ALTER DATABASE CHARACTER SET statement with ALTER DATABASE NATIONAL CHARACTER SET. You can issue both statements together if you wish.

    Error Conditions
    ----------------
    A number of error conditions may be reported when trying to change the database or national character set. In Oracle8(i) the ALTER DATABASE [NATIONAL] CHARACTER SET statement will return:

    ORA-01679: database must be mounted EXCLUSIVE and not open to activate

    - if you do not enable restricted session
    - if you start up the instance in PARALLEL/SHARED mode
    - if you do not set the number of queue processes to 0
    - if you do not set the number of AQ time manager processes to 0
    - if anybody is logged in apart from you.

    This error message is misleading. The command requires the database to be open but only one session, the one executing the command, is allowed. For the above error conditions Oracle9i will report one of the errors:

    ORA-12719: operation requires database is in RESTRICTED mode
    ORA-12720: operation requires database is in EXCLUSIVE mode
    ORA-12721: operation cannot execute when other sessions are active

    Oracle9i can also report:

    ORA-12718: operation requires connection as SYS

    if you are not connected as SYS (INTERNAL, "/ AS SYSDBA"). If the specified new character set name is not recognized, Oracle will report one of the errors:

    ORA-24329: invalid character set identifier
    ORA-12714: invalid national character set specified
    ORA-12715: invalid character set specified

    The ALTER DATABASE [NATIONAL] CHARACTER SET command will only work if the old character set is considered a binary subset of the new character set. Oracle Server 8.0.3 to 8.1.5 recognizes US7ASCII as the binary subset of all ASCII-based character sets. It also treats each character set as a binary subset of itself. No other combinations are recognized. Newer Oracle Server versions recognize additional subset/superset combinations, which are listed in [NOTE:119164.1]. If the old character set is not recognized as a binary subset of the new character set, the ALTER DATABASE [NATIONAL] CHARACTER SET statement will return:

    - in Oracle 8.1.5 and above: ORA-12712: new character set must be a superset of old character set
    - in Oracle 8.0.5 and 8.0.6: ORA-12710: new character set must be a superset of old character set
    - in Oracle 8.0.3 and 8.0.4: ORA-24329: invalid character set identifier

    You will also get these errors if you try to change the character set of a US7ASCII database that was started without a (correct) ORA_NLSxx parameter. See [NOTE:77442.1].

    It may be necessary to switch off the superset check to allow changes between formally incompatible character sets to solve certain character set problems or to speed up migration of huge databases. Oracle Support Services may pass the necessary information to customers after verifying the safety of the change for the customers' environments.

    If in Oracle9i an ALTER DATABASE NATIONAL CHARACTER SET is issued and there are N-type columns that contain data, then this error is returned:

    ORA-12717: Cannot ALTER DATABASE NATIONAL CHARACTER SET when NCLOB data exists

    The error only mentions NCLOB, but NCHAR and NVARCHAR2 are also checked; see [NOTE:2310895.9] for bug [BUG:2310895].

    4. HANDLING CLOB AND NCLOB COLUMNS
    ==================================
    Background
    ----------
    In a fixed width character set the codes of all characters have the same number of bytes. Fixed width character sets are: all single-byte character sets and those multibyte character sets which have names ending with 'FIXED'. In Oracle9i the character set AL16UTF16 is also fixed width. In a varying width character set the codes of different characters may have a different number of bytes. All multibyte character sets except those with names ending with FIXED (and except the Oracle9i AL16UTF16 character set) are varying width.

    Single-byte character sets are character sets with names of the form xxx7yyyyyy and xxx8yyyyyy. Each character code of a single-byte character set occupies exactly one byte. Multibyte character sets are all other character sets (including UTF8). Some -- usually most -- character codes of a multibyte character set occupy more than one byte.

    CLOB values in a database whose database character set is fixed width are stored in this character set. CLOB values in an Oracle 8.0.x database whose database character set is varying width are not allowed. They have to be NULL. CLOB values in an Oracle >= 8.1.5 database whose database character set is varying width are stored in the fixed width Unicode UCS-2 encoding. The same holds for NCLOB values and the national character set. The UCS-2 storage format of character LOB values, as implemented in Oracle8i, ensures that calculation of character positions in LOB values is fast. Finding the byte offset of a character stored in a varying width character set would require reading the whole LOB value up to this character (possibly 4GB). In fixed width character sets the byte offsets are simply the character offsets multiplied by the number of bytes in a character code. In UCS-2 byte offsets are simply twice the character offsets. As the Unicode character set contains all characters defined in any other Oracle character set, there is no data loss when a CLOB/NCLOB value is converted to UCS-2 from the character set in which it was provided by a client program (usually the NLS_LANG character set).

    CLOB Values and the Database Character Set Change
    -------------------------------------------------
    In Oracle 8.0.x CLOB values are invalid in varying width character sets. Thus you must delete all CLOB column values before changing the database character set to a varying width character set.

    In Oracle 8.1.5 and later CLOB values are valid in varying width character sets but they are converted to Unicode UCS-2 before being stored. But the UCS-2 encoding is not a binary superset of any other Oracle character set. Even the codes of the basic ASCII characters are different, e.g. the single-byte code for "A"=0x41 becomes the two-byte code 0x0041. This implies that even if the new varying width character set is a binary superset of the old fixed width character set, and thus VARCHAR2/LONG character codes remain valid, the fixed width character codes in CLOB values will no longer be valid in UCS-2. As mentioned above, the ALTER DATABASE [NATIONAL] CHARACTER SET statement does not change character codes. Thus, before changing a fixed width database character set to a varying width character set (like UTF8) in Oracle 8.1.5 or later, you first have to export all tables containing non-NULL CLOB columns, then truncate these tables, then change the database character set and, finally, import the tables back into the database. The import step will perform the required conversion.

    If you omit the steps above, the character set change will succeed in Oracle8(i) (Oracle9i disallows the change in such a situation) and the CLOBs may appear to be correctly legible, but as their encoding is incorrect, they will cause problems in further operations. For example, CREATE TABLE AS SELECT will not correctly copy such CLOB columns. Also, after installation of the 8.1.7.3 server patchset the CLOB columns will no longer be legible.

    LONG columns are always stored in the database character set and thus they behave like CHAR/VARCHAR2 with respect to the character set change. BLOBs and BFILEs are binary raw datatypes and their processing does not depend on any Oracle character set setting.

    NCLOB Values and the National Character Set Change
    --------------------------------------------------
    The above discussion about changing the database character set and exporting and importing CLOB values is theoretically applicable to the change of the national character set and to NCLOB values. But as varying width character sets are not supported as national character sets in Oracle8(i), changing the national character set from a fixed width character set to a varying width character set is not supported at all.

    Preparing CLOB Columns for the Character Set Change
    ---------------------------------------------------
    Take a backup of the database. If using Advanced Replication or deferred transactions functionality, make sure that there are no outstanding deferred transactions with CLOB parameters, i.e. the DEFLOB view must have no rows with a non-NULL CLOB_COL column; to make sure that the replication environment remains consistent use only recommended methods of purging the deferred transaction queue, preferably quiescing the replication environment. Then:

    - If changing the database character set from a fixed width character set to a varying width character set in Oracle 8.0.x, set all CLOB column values to NULL -- you are not allowed to use CLOB columns after the character set change.
    - If changing the database character set from a fixed width character set to a varying width character set in Oracle 8.1.5 or later, perform a table-level export of all tables containing CLOB columns, including SYSTEM's tables. Set NLS_LANG to the old database character set for the Export utility. Then truncate these tables.

    Restoring CLOB Columns after the Character Set Change
    -----------------------------------------------------
    In Oracle 8.1.5 or later, after changing the character set as described above (steps 3. to 6.), restore the CLOB columns exported in step 2. by importing them back into the database. Set NLS_LANG to the old database character set for the Import utility to avoid IMP-16 errors and data loss.

    RELATED DOCUMENTS:
    ==================
    [NOTE:13856.1] V7: Changing the Database Character Set -- This note has limited distribution, please contact Oracle Support
    [NOTE:62107.1] The National Character Set in Oracle8
    [NOTE:119164.1] Changing Database Character Set - Valid Superset Definitions
    [NOTE:118242.1] ALERT: Changing the Database or National Character Set Can Corrupt LOB Values
    [NOTE:158577.1] NLS_LANG Explained (How Does Client-Server Character Conversion Work?)
    [NOTE:140014.1] ALERT: Oracle8/8i to Oracle9i Using New "AL16UTF16"
    [NOTE:159657.1] Complete Upgrade Checklist for Manual Upgrades from 8.X / 9.0.1 to Oracle9i (incl. 9.2)
    [NOTE:124721.1] Migrating an Applications Installation to a New Character Set
    Oracle8i National Language Support Guide
    Oracle8i Release 3 (8.1.7) Readme - Section 18.12 "Restricted ALTER DATABASE CHARACTER SET Command Support (CLOB and NCLOB)"
    Oracle8i Documentation Addendum, Release 3 (8.1.7) - Chapter 3 "New Character Set Scanner Utility"
    Oracle8i Application Developer's Guide - Large Objects (LOBs), Release 2 - Chapter 2 "Basic Components"
    Oracle8 Application Developer's Guide, Release 8.0 - Chapter 6 "Large Objects (LOBs)", Section "Introduction to LOBs"
    Oracle9i Globalization Guide, Release 1 (9.0.1)
    Oracle9i Database Globalization Guide, Release 2 (9.2)

    For further NLS / Globalization information you may start here: [NOTE:150091.1] Globalization Technology (NLS) Library index
    Joel Pérez

  • Fixing a US7ASCII - WE8ISO8859P1 Character Set Conversion Disaster

    In hopes that it might be helpful in the future, here's the procedure I followed to fix a disastrous unintentional US7ASCII (9i) to WE8ISO8859P1 (10g) migration.
    BACKGROUND
    Oracle has multiple character sets, ranging from US7ASCII to AL32UTF8.
    US7ASCII, of course, is a cheerful 7 bit character set, holding the basic ASCII characters sufficient for the English language.
    However, it also has a handy feature: character fields under US7ASCII will accept characters with values above 127. If you have a web application, users can type (or paste) Us with umlauts, As with macrons, and quite a few other funny-looking characters.
    These will be inserted into the database, and then -- if appropriately supported -- can be selected and displayed by your app.
    The problem is that while these characters can be present in a VARCHAR2 or CLOB column, they are not actually legal. If you try within Oracle to convert from US7ASCII to WE8ISO8859P1 or any other character set, Oracle recognizes that these characters with values greater than 127 are not valid, and will replace them with a default "unknown" character. In the case of a change from US7ASCII to WE8ISO8859P1, it will change them to 191, the upside down question mark.
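
    The substitution is easy to demonstrate, since CONVERT performs exactly this replacement (a minimal example):
    -- CHR(174) is not a legal US7ASCII code point; converting to
    -- WE8ISO8859P1 yields the default replacement character, 191.
    select dump(convert(chr(174), 'WE8ISO8859P1', 'US7ASCII')) from dual;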
    Oracle has a native utility, introduced in 8i, called csscan, which assists in migrating to different character sets. It has been superseded in newer versions by the Database Migration Assistant for Unicode (DMU), which is the recommended tool for 11.2.0.3 and later.
    These tools, however, do no good unless they are run. For my particular client, the operations team took a database running 9i and upgraded it to 10g, and as part of that process the character set was changed from US7ASCII to WE8ISO8859P1. The database had a large number of special characters inserted into it, and all of these abruptly turned into upside-down question marks. The users of the application didn't realize there was a problem until several weeks later, by which time they had put a lot of new data into the system. Rollback was not possible.
    FIXING THE PROBLEM
    How fixable this problem is and the acceptable methods which can be used depend on the application running on top of the database. Fortunately, the client app was amenable.
    (As an aside note: this approach does not use csscan -- I had done something similar previously on a very old system and decided it would take less time in this situation to revamp my old procedures and not bring a new utility into the mix.)
    We will need two separate approaches -- one to fix the VARCHAR2 and CHAR fields, and a second for the CLOBs.
    In order to set things up, we created two environments. The first was a clone of production as it is now, and the second a clone from before the upgrade & character set change. We will call these environments PRODCLONE and RESTORECLONE.
    Next, we created a database link, OLD6. This allows PRODCLONE to directly access RESTORECLONE. Since they were cloned with the same SID, establishing the link needed the global_names parameter set to false.
    alter system set global_names=false scope=memory;
    CREATE PUBLIC DATABASE LINK OLD6
    CONNECT TO DBUSERNAME
    IDENTIFIED BY dbuserpass
    USING 'restoreclone:1521/MYSID';
    Testing the link...
    SQL> select count(1) from users@old6;
      COUNT(1)
           454
    Here is a row in a table which contains illegal characters. We are accessing RESTORECLONE from PRODCLONE via our link.
    PRODCLONE> select dump(title) from my_contents@old6 where pk1=117286;
    DUMP(TITLE)
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,174,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    By comparison, a dump of that row on PRODCLONE's my_contents gives:
    PRODCLONE> select dump(title) from my_contents where pk1=117286;
    DUMP(TITLE)
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,191,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    Note that the "174" on RESTORECLONE was changed to "191" on PRODCLONE.
    We can manually insert CHR(174) into our PRODCLONE and have it display successfully in the application.
    However, I tried a number of methods to copy the data from RESTORECLONE to PRODCLONE through the link, but entirely without success. Oracle would recognize the character as invalid and silently transform it.
    Eventually, I located a clever workaround at this link:
    https://kr.forums.oracle.com/forums/thread.jspa?threadID=231927
    It works like this:
    On RESTORECLONE you create a view, vv, with UTL_RAW:
    RESTORECLONE> create or replace view vv as select pk1,utl_raw.cast_to_raw(title) as title from my_contents;
    View created.
    This exposes TITLE as RAW on RESTORECLONE.
    You can now convert from RAW to VARCHAR2 on the PRODCLONE database:
    PRODCLONE> select dump(utl_raw.cast_to_varchar2 (title)) from vv@old6 where pk1=117286;
    DUMP(UTL_RAW.CAST_TO_VARCHAR2(TITLE))
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,174,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    The above works because Oracle on PRODCLONE never knew that our TITLE string on RESTORECLONE was originally in US7ASCII, so it was unable to do its transparent character set conversion.
    PRODCLONE> update my_contents set title=( select utl_raw.cast_to_varchar2 (title) from vv@old6 where pk1=117286) where pk1=117286;
    PRODCLONE> select dump(title) from my_contents where pk1=117286;
    DUMP(TITLE)
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,174,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    Excellent! The "174" character has survived the transfer and is now in place on PRODCLONE.
    Now that we have a method to move the data over, we have to identify which columns and tables have character data that was damaged by the conversion. We decided we could ignore anything with a length smaller than 10 -- such fields in our application would be unlikely to have data with invalid characters.
    RESTORECLONE> select count(1) from user_tab_columns where data_type in ('CHAR','VARCHAR2') and data_length > 10;
       COUNT(1)
        533
    By converting a field to WE8ISO8859P1, and then comparing it with the original, we can see if the characters change:
    RESTORECLONE> select count(1) from my_contents where title != convert (title,'WE8ISO8859P1','US7ASCII') ;
      COUNT(1)
         10568
    So 10568 rows have characters which were transformed  into 191s as part of the original conversion.
    [ As an aside, we can't use CONVERT() on LOBs -- for them we will need another approach, outlined further below.
    RESTOREDB> select count(1) from my_contents where main_data != convert (convert(main_DATA,'WE8ISO8859P1','US7ASCII'),'US7ASCII','WE8ISO8859P1') ;
    select count(1) from my_contents where main_data != convert (convert(main_DATA,'WE8ISO8859P1','US7ASCII'),'US7ASCII','WE8ISO8859P1')
    ERROR at line 1:
    ORA-00932: inconsistent datatypes: expected - got CLOB
    ]
    Anyway, now that we can identify VARCHAR2 fields which need to be checked, we can put together a PL/SQL stored procedure to do it for us:
    create or replace procedure find_us7_strings
    (table_name varchar2,
    fix_col varchar2 )
    authid current_user
    as
    orig_sql varchar2(1000);
    begin
    orig_sql:='insert into cnv_us7(mytablename,myindx,mycolumnname)  select '''||table_name||''',pk1,'''||fix_col||''' from '||table_name||' where '||fix_col||' !=  CONVERT(CONVERT('||fix_col||',''WE8ISO8859P1''),''US7ASCII'') and '||fix_col||' is not null';
    -- Uncomment if debugging:
    -- dbms_output.put_line(orig_sql);
      execute immediate orig_sql;
    end;
    And create a table to store the information as to which tables, columns, and rows have the bad characters:
    drop table cnv_us7;
    create table cnv_us7 (mytablename varchar2(50), myindx number,      mycolumnname varchar2(50) ) tablespace myuser_data;
    create index list_tablename_idx on cnv_us7(mytablename) tablespace myuser_indx;
    With a SQL-generating SQL script, we can iterate through all the tables/columns we want to check:
    --example of using the data: select title from my_contents where pk1 in (select myindx from cnv_us7)
    set head off pagesize 1000 linesize 120
    spool runme.sql
    select 'exec find_us7_strings ('''||table_name||''','''||column_name||'''); ' from user_tab_columns
          where
              data_type in ('CHAR','VARCHAR2')
              and table_name in (select table_name from user_tab_columns where column_name='PK1' and  table_name not  in ('HUGETABLEIWANTTOEXCLUDE','ANOTHERTABLE'))
              and char_length > 10
              order by table_name,column_name;
    spool off;
    set echo on time on timing on feedb on serveroutput on;
    spool output_of_runme
    @./runme.sql
    spool off;
    Which eventually gives us the following inserted into CNV_US7:
    20:48:21 SQL> select count(1),mycolumnname,mytablename from cnv_us7 group by mytablename,mycolumnname;
             4 DESCRIPTION                                        MY_FORUMS
         21136 TITLE                                              MY_CONTENTS
    Out of 533 VARCHAR2s and CHARs, we only had five or six columns that needed fixing.
    We create our views on RESTOREDB:
    create or replace view my_forums_vv as select pk1,utl_raw.cast_to_raw(description) as description from my_forums;
    create or replace view my_contents_vv as select pk1,utl_raw.cast_to_raw(title) as title from my_contents;
    And then we can fix it directly via sql:
    update my_contents taborig1 set TITLE= (select utl_raw.cast_to_varchar2 (TITLE) from my_contents_vv@old6 where pk1=taborig1.pk1)
    where pk1 in (
    select tabnew.pk1 from my_contents@old6 taborig,my_contents tabnew,cnv_us7@old6
          where taborig.pk1=tabnew.pk1
              and myindx=tabnew.pk1
              and mycolumnname='TITLE'
              and mytablename='MY_CONTENTS'
              and convert(taborig.TITLE,'US7ASCII','WE8ISO8859P1') = tabnew.TITLE );
    Note this part:
          "and convert(taborig.TITLE,'US7ASCII','WE8ISO8859P1') = tabnew.TITLE "
    This checks to verify that the TITLE field on the PRODCLONE and RESTORECLONE are the same (barring character set issues). This is there  because if the users have changed TITLE  -- or any other field -- on their own between the time of the upgrade and now, we do not want to overwrite their changes. We make the assumption that as part of the process, they may have changed the bad character on their own.
    We can also create a stored procedure which will execute the SQL for us:
    create or replace procedure fix_us7_strings
    (TABLE_NAME varchar2,
    FIX_COL varchar2 )
    authid current_user
    as
    orig_sql varchar2(1000);
    TYPE cv_type IS REF CURSOR;
    orig_cur cv_type;
    begin
    orig_sql:='update '||TABLE_NAME||' taborig1 set '||FIX_COL||'= (select utl_raw.cast_to_varchar2 ('||FIX_COL||') from '||TABLE_NAME||'_vv@old6 where pk1=taborig1.pk1)
    where pk1 in (
    select tabnew.pk1 from '||TABLE_NAME||'@old6 taborig,'||TABLE_NAME||' tabnew,cnv_us7@old6
          where taborig.pk1=tabnew.pk1
              and myindx=tabnew.pk1
              and mycolumnname='''||FIX_COL||'''
              and mytablename='''||TABLE_NAME||'''
              and convert(taborig.'||FIX_COL||',''US7ASCII'',''WE8ISO8859P1'') = tabnew.'||FIX_COL||')';
    dbms_output.put_line(orig_sql);
    execute immediate orig_sql;
    end;
    exec fix_us7_strings('MY_FORUMS','DESCRIPTION');
    exec fix_us7_strings('MY_CONTENTS','TITLE');
    commit;
    To validate this before and after, we can run something like:
    select dump(description) from my_forums where pk1 in (select myindx from cnv_us7@old6 where mytablename='MY_FORUMS');
    The above process fixes all the VARCHAR2s and CHARs. Now what about the CLOB columns?
    Note that we're going to have some extra difficulty here, not just because we are dealing with CLOBs, but because we are working with CLOBs in 9i, which offers less CLOB-related functionality in its built-in functions.
    This procedure finds invalid US7ASCII strings inside a CLOB in 9i:
    create or replace procedure find_us7_clob
    (table_name varchar2,
    fix_col varchar2)
    authid current_user
    as
      orig_sql varchar2(1000);
      type cv_type is REF CURSOR;
      orig_table_cur cv_type;
      my_chars_read NUMBER;
      my_offset NUMBER;
      my_problem NUMBER;
      my_lob_size NUMBER;
      my_indx_var NUMBER;
      my_total_chars_read NUMBER;
      my_output_chunk VARCHAR2(4000);
      my_problem_flag NUMBER;
      my_clob CLOB;
      my_total_problems NUMBER;
      ins_sql VARCHAR2(4000);
    BEGIN
       DBMS_OUTPUT.ENABLE(1000000);
       orig_sql:='select pk1,dbms_lob.getlength('||FIX_COL||') as cloblength,'||fix_col||' from '||table_name||' where dbms_lob.getlength('||fix_col||') >0 and '||fix_col||' is not null order by pk1';
       open orig_table_cur for orig_sql;
       my_total_problems := 0;
       LOOP
            FETCH orig_table_cur INTO my_indx_var,my_lob_size,my_clob;
                    EXIT WHEN orig_table_cur%NOTFOUND;
            my_offset :=1;
            my_chars_read := 512;
            my_problem_flag :=0;
            WHILE my_offset < my_lob_size and my_problem_flag =0
                    LOOP
                    DBMS_LOB.READ(my_clob,my_chars_read,my_offset,my_output_chunk);
                    my_offset := my_offset + my_chars_read;
                    IF my_output_chunk != CONVERT(CONVERT(my_output_chunk,'WE8ISO8859P1'),'US7ASCII')
                            THEN
                            -- DBMS_OUTPUT.PUT_LINE('Problem with '||my_indx_var);
                            -- DBMS_OUTPUT.PUT_LINE(my_output_chunk);
                            my_problem_flag:=1;
                    END IF;
            END LOOP;
            IF my_problem_flag=1
                    THEN my_total_problems := my_total_problems +1;
                    ins_sql:='insert into cnv_us7(mytablename,myindx,mycolumnname) values ('''||table_name||''','||my_indx_var||','''||fix_col||''')';
                    execute immediate ins_sql;
                    END IF;
       END LOOP;
       DBMS_OUTPUT.PUT_LINE('We found '||my_total_problems||' problem rows in table '||table_name||', column '||fix_col||'.');
    END;
    And we can use SQL-generating SQL to find out which CLOBs have issues, out of all the ones in the database:
    RESTOREDB> select 'exec find_us7_clob('''||table_name||''','''||column_name||''');' from user_tab_columns where data_type='CLOB';
    exec find_us7_clob('MY_CONTENTS','DATA');
    After completion, the CNV_US7 table looked like this:
    RESTOREDB> set linesize 120 pagesize 100;
    RESTOREDB>  select count(1),mytablename,mycolumnname from cnv_us7
       where mytablename||' '||mycolumnname in (select table_name||' '||column_name from user_tab_columns
             where data_type='CLOB' )
          group by mytablename,mycolumnname;
      COUNT(1) MYTABLENAME                                        MYCOLUMNNAME
         69703 MY_CONTENTS                                  DATA
    On RESTOREDB, our 9i version, we will use this procedure (found many years ago on the internet):
    create or replace procedure CLOB2BLOB (p_clob in out nocopy clob, p_blob in out nocopy blob) is
    -- transforms a CLOB into a BLOB by copying it in 4096-character chunks
      l_off      number default 1;
      l_amt      number default 4096;
      l_offWrite number default 1;
      l_amtWrite number;
      l_str      varchar2(4096 char);
    begin
      loop
        -- read the next chunk; raises NO_DATA_FOUND once we run past the end
        dbms_lob.read( p_clob, l_amt, l_off, l_str );
        l_amtWrite := utl_raw.length( utl_raw.cast_to_raw( l_str ) );
        dbms_lob.write( p_blob, l_amtWrite, l_offWrite, utl_raw.cast_to_raw( l_str ) );
        l_offWrite := l_offWrite + l_amtWrite;
        l_off := l_off + l_amt;
        l_amt := 4096;
      end loop;
    exception
      when no_data_found then
        NULL;  -- end of the CLOB: the copy is complete
    end;
    We can test out the transformation of CLOBs to BLOBs with a single row like this:
    drop table my_contents_lob;
    Create table my_contents_lob (pk1 number,data blob);
    DECLARE
          v_clob CLOB;
          v_blob BLOB;
        BEGIN
          SELECT data INTO v_clob FROM my_contents WHERE pk1 = 16 ;
          INSERT INTO my_contents_lob (pk1,data) VALUES (16,empty_blob() );
          SELECT data INTO v_blob FROM my_contents_lob WHERE pk1=16 FOR UPDATE;
          clob2blob (v_clob, v_blob);
        END;
    select dbms_lob.getlength(data) from my_contents_lob;
    DBMS_LOB.GETLENGTH(DATA)
                                 329
    SQL> select utl_raw.cast_to_varchar2(data) from my_contents_lob;
    UTL_RAW.CAST_TO_VARCHAR2(DATA)
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam...
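    Before the bulk run, clear out the single-row test; otherwise pk1 16 is already present and the create statement below would fail against the existing table (housekeeping sketch):
    -- the table is recreated in the next step, so simply drop the test copy
    drop table my_contents_lob;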
    Now we need to push it through a loop. Unfortunately, I had trouble making the "SELECT INTO" dynamic (see the comments inside the procedure below), so I used a separate version of the procedure for each table. It's aesthetically displeasing, but at least it worked.
    create table my_contents_lob(pk1 number,data blob);
    create index my_contents_lob_pk1 on my_contents_lob(pk1) tablespace my_user_indx;
    create or replace procedure blob_conversion_my_contents
    (table_name varchar2,
    fix_col varchar2)
    authid current_user
    as
      orig_sql varchar2(1000);
      type cv_type is REF CURSOR;
      orig_table_cur cv_type;
      my_chars_read NUMBER;
      my_offset NUMBER;
      my_problem NUMBER;
      my_lob_size NUMBER;
      my_indx_var NUMBER;
      my_total_chars_read NUMBER;
      my_output_chunk VARCHAR2(4000);
      my_problem_flag NUMBER;
      my_clob CLOB;
      my_blob BLOB;
      my_total_problems NUMBER;
      new_sql VARCHAR2(4000);
    BEGIN
      DBMS_OUTPUT.ENABLE(1000000);
       orig_sql:='select pk1,dbms_lob.getlength('||FIX_COL||') as cloblength,'||fix_col||' from '||table_name||' where pk1 in (select myindx from cnv_us7 where mytablename='''||TABLE_NAME||''' and mycolumnname='''||FIX_COL||''') order by pk1';
       open orig_table_cur for orig_sql;
       LOOP
            FETCH orig_table_cur INTO my_indx_var,my_lob_size,my_clob;
                    EXIT WHEN orig_table_cur%NOTFOUND;
            new_sql:='INSERT INTO '||table_name||'_lob(pk1,'||fix_col||') values ('||my_indx_var||',empty_blob() )';
            dbms_output.put_line(new_sql);
          execute immediate new_sql;
    -- Here's the bit that I had trouble making dynamic. Feel free to let me know what I am doing wrong.
    -- new_sql:='SELECT '||fix_col||' INTO my_blob from '||table_name||'_lob where pk1='||my_indx_var||' FOR UPDATE';
    --        dbms_output.put_line(new_sql);
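    -- A plausible fix (untested here): keep FOR UPDATE inside the dynamic string,
    -- bind the key, and move the INTO clause out of the dynamic SQL, e.g.:
    --   execute immediate 'SELECT '||fix_col||' FROM '||table_name||
    --       '_lob WHERE pk1 = :1 FOR UPDATE' INTO my_blob USING my_indx_var;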
            select data into my_blob from my_contents_lob where pk1=my_indx_var FOR UPDATE;
          clob2blob(my_clob,my_blob);
       END LOOP;
       CLOSE orig_table_cur;
      DBMS_OUTPUT.PUT_LINE('Completed program');
    END;
    exec blob_conversion_my_contents('MY_CONTENTS','DATA');
    Verify that things work properly:
    select dump( utl_raw.cast_to_varchar2(data))  from my_contents_lob where pk1=xxxx;
    This lets you see characters with codes above 150, i.e. the high-bit data survived intact. Thus, the method works.
    We can now take this data and export it from RESTORECLONE:
    exp file=a.dmp buffer=4000000 userid=system/XXXXXX tables=my_user.my_contents_lob rows=y
    and import it on PRODCLONE:
    imp file=a.dmp fromuser=my_user touser=my_user userid=system/XXXXXX buffer=4000000;
    For paranoia's sake, double check that it worked properly:
    select dump( utl_raw.cast_to_varchar2(data))  from my_contents_lob;
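    As a cheap completeness check, the count here should match the 69703 problem rows found earlier:
    select count(*) from my_contents_lob;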
    On our 10g PRODCLONE, we'll use these stored procedures:
    CREATE OR REPLACE FUNCTION CLOB2BLOB(L_CLOB CLOB) RETURN BLOB IS
      -- converts a CLOB to a BLOB holding the same bytes
      L_BLOB         BLOB;
      L_SRC_OFFSET   NUMBER;
      L_DEST_OFFSET  NUMBER;
      L_BLOB_CSID    NUMBER := DBMS_LOB.DEFAULT_CSID;
      V_LANG_CONTEXT NUMBER := DBMS_LOB.DEFAULT_LANG_CTX;
      L_WARNING      NUMBER;
      L_AMOUNT       NUMBER;
    BEGIN
      DBMS_LOB.CREATETEMPORARY(L_BLOB, TRUE);
      L_SRC_OFFSET  := 1;
      L_DEST_OFFSET := 1;
      L_AMOUNT      := DBMS_LOB.GETLENGTH(L_CLOB);
      -- note: positionally, CONVERTTOBLOB takes dest_offset before src_offset;
      -- both are 1 here, so the variable naming mix-up below is harmless
      DBMS_LOB.CONVERTTOBLOB(L_BLOB,
                             L_CLOB,
                             L_AMOUNT,
                             L_SRC_OFFSET,
                             L_DEST_OFFSET,
                             1,
                             V_LANG_CONTEXT,
                             L_WARNING);
      RETURN L_BLOB;
    END;
    CREATE OR REPLACE FUNCTION BLOB2CLOB(L_BLOB BLOB) RETURN CLOB IS
      -- converts a BLOB back to a CLOB, assuming database-charset bytes
      L_CLOB         CLOB;
      L_SRC_OFFSET   NUMBER;
      L_DEST_OFFSET  NUMBER;
      L_BLOB_CSID    NUMBER := DBMS_LOB.DEFAULT_CSID;
      V_LANG_CONTEXT NUMBER := DBMS_LOB.DEFAULT_LANG_CTX;
      L_WARNING      NUMBER;
      L_AMOUNT       NUMBER;
    BEGIN
      DBMS_LOB.CREATETEMPORARY(L_CLOB, TRUE);
      L_SRC_OFFSET  := 1;
      L_DEST_OFFSET := 1;
      L_AMOUNT      := DBMS_LOB.GETLENGTH(L_BLOB);
      DBMS_LOB.CONVERTTOCLOB(L_CLOB,
                             L_BLOB,
                             L_AMOUNT,
                             L_SRC_OFFSET,
                             L_DEST_OFFSET,
                             1,
                             V_LANG_CONTEXT,
                             L_WARNING);
      RETURN L_CLOB;
    END;
    And now, for the pièce de résistance, we need a BLOB-to-CLOB conversion that treats the BLOB data as being stored in WE8ISO8859P1.
    To find the correct charset ID (CSID) for WE8ISO8859P1, we can use this query:
    select nls_charset_id('WE8ISO8859P1') from dual;
    which returns 31.
    create or replace FUNCTION BLOB2CLOBASC(L_BLOB BLOB) RETURN CLOB IS
      L_CLOB         CLOB;
      L_SRC_OFFSET   NUMBER;
      L_DEST_OFFSET  NUMBER;
      L_BLOB_CSID    NUMBER := 31;   -- treat the BLOB bytes as WE8ISO8859P1
      V_LANG_CONTEXT NUMBER := 31;   -- language context seeded for WE8ISO8859P1
      L_WARNING      NUMBER;
      L_AMOUNT       NUMBER;
    BEGIN
      DBMS_LOB.CREATETEMPORARY(L_CLOB, TRUE);
      L_SRC_OFFSET  := 1;
      L_DEST_OFFSET := 1;
      L_AMOUNT      := DBMS_LOB.GETLENGTH(L_BLOB);
      DBMS_LOB.CONVERTTOCLOB(L_CLOB,
                             L_BLOB,
                             L_AMOUNT,
                             L_SRC_OFFSET,
                             L_DEST_OFFSET,
                             L_BLOB_CSID,  -- here the source charset actually matters
                             V_LANG_CONTEXT,
                             L_WARNING);
      RETURN L_CLOB;
    END;
    select dump(dbms_lob.substr(blob2clobasc(data),4000,1)) from my_contents_lob;
    Now, we can compare these:
    select dbms_lob.compare(blob2clob(old.data),new.data) from  my_contents new,my_contents_lob old where new.pk1=old.pk1;
    DBMS_LOB.COMPARE(BLOB2CLOB(OLD.DATA),NEW.DATA)
                                                                 0
                                                                 0
                                                                 0
    versus
    select dbms_lob.compare(blob2clobasc(old.data),new.data) from  my_contents new,my_contents_lob old where new.pk1=old.pk1;
    DBMS_LOB.COMPARE(BLOB2CLOBASC(OLD.DATA),NEW.DATA)
                                                                   -1
                                                                   -1
                                                                   -1
    update my_contents a set data=(select blob2clobasc(data) from my_contents_lob b where a.pk1= b.pk1)
        where pk1 in (select al.pk1 from my_contents_lob al where dbms_lob.compare(blob2clob(al.data),a.data) =0 );
    SQL> select dump(dbms_lob.substr(data,4000,1)) from my_contents where pk1 in (select pk1 from my_contents_lob);
    This confirms that the data is now stored correctly.
    To run across all the _LOB tables we've created:
    [oracle@RESTORECLONE ~]$ exp file=all_fixed_lobs.dmp buffer=4000000 userid=my_user/mypass tables=MY_CONTENTS_LOB,MY_FORUM_LOB...
    [oracle@RESTORECLONE ~]$ scp all_fixed_lobs.dmp jboulier@PRODCLONE:/tmp
    And then on PRODCLONE we can import:
    imp file=all_fixed_lobs.dmp buffer=4000000 userid=system/XXXXXXX fromuser=my_user touser=my_user
    Instead of running the above update statement for all the affected tables, we can use a simple stored procedure:
    create or replace procedure fix_us7_CLOBS
      (TABLE_NAME varchar2,
       FIX_COL    varchar2)
      authid current_user
    as
      orig_sql varchar2(1000);
      bak_sql  varchar2(1000);
    begin
      dbms_output.put_line('Creating '||TABLE_NAME||'_PRECONV to preserve the original data in the table');
      -- keep a copy of the affected rows before touching them
      bak_sql := 'create table '||TABLE_NAME||'_preconv as select pk1,'||FIX_COL||' from '||TABLE_NAME||
                 ' where pk1 in (select pk1 from '||TABLE_NAME||'_LOB)';
      execute immediate bak_sql;
      -- overwrite only rows whose current CLOB still matches the captured BLOB,
      -- so anything edited since the capture is left alone
      orig_sql := 'update '||TABLE_NAME||' tabnew set '||FIX_COL||'= (select blob2clobasc ('||FIX_COL||') from '||TABLE_NAME||'_LOB taborig where tabnew.pk1=taborig.pk1)
       where pk1 in (
       select a.pk1 from '||TABLE_NAME||'_LOB a,'||TABLE_NAME||' b
          where a.pk1=b.pk1
                 and dbms_lob.compare(blob2clob(a.'||FIX_COL||'),b.'||FIX_COL||') = 0 )';
      -- dbms_output.put_line(orig_sql);
      execute immediate orig_sql;
    end;
    Now we can run the procedure and it fixes everything for our previously-broken tables, keeping the changed rows -- just in case -- in a table called table_name_PRECONV.
    set serveroutput on time on timing on;
    exec fix_us7_clobs('MY_CONTENTS','DATA');
    commit;
    After confirming with the client that the changes work -- and haven't noticeably broken anything else -- the same routines can be carefully run against the actual production database.

    We converted the database using scripts I developed. I'm not sure the details of how we converted are relevant, beyond saying that we did not use the Oracle conversion utility (not csscan, but the GUI Java tool).
    A summary:
    1) We replaced the lossy characters by parsing a csscan output file
    2) After re-scanning with csscan and coming up clean, our DBA converted the database to AL32UTF8 (changed the parameter file, changed the character set, switched the semantics to CHAR, etc.)
    3) The final step was changing existing tables to use character semantics, by altering the schema of their VARCHAR2 columns (sketched below)
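    For step 3, the DDL is typically along these lines (a sketch; my_table/my_col are hypothetical, and one ALTER is generated per affected column):
    -- switch a column from byte-length to character-length semantics
    ALTER TABLE my_table MODIFY (my_col VARCHAR2(100 CHAR));
    -- candidate columns can be listed from the dictionary:
    select table_name, column_name, char_length
      from user_tab_columns
     where data_type = 'VARCHAR2' and char_used = 'B';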
    I can't easily answer questions about specific steps; I worked with a DBA at our company to do this work. I handled the character replacement / DDL changes, and the DBA ran csscan and performed the database config changes.
    Our actual error message:
    ORA-31011: XML parsing failed
    ORA-19202: Error occurred in XML processing
    LPX-00210: expected '<' instead of '�Error at line 1
    31011. 00000 - "XML parsing failed"
    *Cause:    XML parser returned an error while trying to parse the document.
    *Action:   Check if the document to be parsed is valid.
    Error at Line: 24 Column: 15
    This seems to match the document ID referenced below. I will ask our DBA to pull it up and review it.
    Please advise if more information is needed from my end.

  • German character set issues on Solaris

    Hi,
    I am facing an issue with German character settings with my Java application on a Solaris box.
    When I run my application on the box and pass it an input file with German special characters, they get converted to '?'. Other, plain English characters come through properly.
    When I run the same application on another Solaris box with a different JRE, the German characters are formed properly.
    I understand that there are differences between the two environments, i.e.:
    the architecture: 64-bit SPARC machine vs. 32-bit x86 machine
    the JRE: 1.4.2_03 (64-bit) vs. 1.4.1_01
    I am trying to evaluate further differences between the two environments to pinpoint the issue and get this resolved on the first box.
    Can anyone provide me any inputs?
    Lavin

    When you read the file, make sure to say which character set you are using. For example:
    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.InputStreamReader;
    import java.nio.charset.Charset;

    // Decode the file explicitly as ISO-8859-1 instead of the JVM default
    FileInputStream fstream = new FileInputStream(url.getFile());
    BufferedReader br = new BufferedReader(
            new InputStreamReader(fstream, Charset.forName("ISO-8859-1")));
    String line = br.readLine();
    This link may also help you:
    http://www.velocityreviews.com/forums/t126128-jdk-14-character-set-change.html
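    Also worth checking: the two JREs may simply have different default encodings. A minimal sketch (class name made up) to print what each JVM assumes when no charset is specified:
    import java.io.InputStreamReader;

    public class ShowDefaultEncoding {
        public static void main(String[] args) {
            // the encoding that unqualified readers/writers fall back to on this JVM
            System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
            // what an InputStreamReader actually uses when no charset is given
            System.out.println("reader default = " + new InputStreamReader(System.in).getEncoding());
        }
    }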

  • Character set migration error to UTF8 urgent

    Hi
    When we migrated from the AR8ISO8859P6 character set to UTF8, we hit an error: when I try to compile one package through Forms, I get the error "program unit PU not found".
    When I run the source code of that procedure directly from the database using SQL*Plus, it runs without any problem. How can I migrate these forms from AR8ISO8859P6 to UTF8? We migrated from an Oracle 8.1.7 database with AR8ISO8859P6 to an Oracle 9.2 database with the UTF8 character set (Windows 2000); export and import completed without any error.
    I am using Oracle 11i, which internally calls Forms 6i and Reports 6i.
    with regards
    ramya
    1) This is a server-side program. When connecting through Forms I get the error; when I run the program directly via SQL it works, but when I compile it I get this error.
    3) Yes, I am using 11i (11.5.10), which internally calls Forms 6i and Reports. Why does this give a problem through Forms? Is there any setting to change in the Forms NLS_LANG?
    with regards

    Hi Ramya
    What I understand from your question is that you are trying to compile a procedure from a Forms interface on the client side?
    If yes, you should check the code in the form that calls the compilation package.
    Does it contain strings that might be affected by the character set change?
    Tony G.

  • Character set Problem (From WE8ISO8859P1 to EL8MSWIN1253)

    Hi there,
    I would like to describe a problem that I face with import, export and Greek characters.
    I have two databases with the following characteristics
    Source database:
    O/S: Windows
    Database version: 10gR2
    NLS_CHARACTERSET: WE8ISO8859P1
    NLS_NCHAR_CHARACTERSET: AL16UTF16
    NLS_LANGUAGE: GREEK
    NLS_TERRITORY: GREECE
    Target database:
    O/S: Windows
    Database version: 10gR2
    NLS_CHARACTERSET: EL8MSWIN1253
    NLS_NCHAR_CHARACTERSET: AL16UTF16
    NLS_LANGUAGE: GREEK
    NLS_TERRITORY: GREECE
    From the source database, using the export tool, I am exporting a table (TABLE_A) which contains Greek records. At this point I want to mention that from the source database I am able to read the Greek characters in TABLE_A using SQL*Plus, i.e. select * from table_a;
    On the target database by using import I load the table TABLE_A into the database.
    When I select the newly imported TABLE_A on the target database I am not able to read the Greek characters.
    I have tested various scenarios using different values of the NLS_LANG variable for the export and import clients, but I did not manage to read the Greek characters on the target database.
    If anyone has faced the same or a similar problem, I would appreciate any assistance.
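    For reference, the settings on each side can be confirmed directly (a minimal check; note the client character set comes from NLS_LANG in the client environment and is not visible in these views):
    -- character sets of the database itself
    select parameter, value from nls_database_parameters
     where parameter in ('NLS_CHARACTERSET','NLS_NCHAR_CHARACTERSET');
    -- language/territory the current session negotiated
    select parameter, value from nls_session_parameters;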
    Thank you in advance

    Please, review this thread:
    Re: Arabic Character set conversion-help needed
    The thread describes a similar issue for Arabic data. Therefore, when reading the thread, substitute 'WE8ISO8859P1' for 'AR8ISO8859P6', 'EL8MSWIN1253' for 'AR8MSWIN1256', and 'Greek' for 'Arabic'.
    -- Sergiusz
