Multi-byte character set

Hi,
I am going to create an Oracle 8i database on Linux, and I want it to support English, Italian, and Chinese.
My questions are:
1. What parameters need to be set at the OS level for this multi-byte character set?
2. How do I set the character set at the database level at creation time?
3. I am also going to migrate an existing database into the new one. The old database contains only English and Italian, so what parameters do we have to set in the new database for the migration?
Kindly provide some solutions.
Regards

1) I'm not sure what you're asking here.
2) While creating the database, you would want to set the NLS_CHARACTERSET to UTF8 (I don't believe AL32UTF8 was available in 8i).
3) How are you migrating the database? Via export & import? If so, you'd need to ensure that the NLS_LANG on the client machine(s) that do the actual export and import are set appropriately.
Justin
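For reference, a minimal sketch of what that looks like at creation time (a hedged example; the database name, file paths, and sizes are placeholders, not from the original post):

CREATE DATABASE orcl
  CHARACTER SET UTF8
  DATAFILE '/u01/oradata/orcl/system01.dbf' SIZE 200M
  LOGFILE GROUP 1 ('/u01/oradata/orcl/redo01.log') SIZE 10M,
          GROUP 2 ('/u01/oradata/orcl/redo02.log') SIZE 10M;

-- Verify the character set after creation:
SELECT value FROM nls_database_parameters WHERE parameter = 'NLS_CHARACTERSET';

-- On a Unix client that runs exp/imp, NLS_LANG is an environment
-- variable, e.g.:  export NLS_LANG=AMERICAN_AMERICA.UTF8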

Similar Messages

  • How to set Multi Byte Character Set ( MBCS ) to Particular String In MFC VC++

I use the Unicode character set in my MFC application (VC++).
Currently I get output like ठ桔湡潹⁵潦⁲獵 and I want to convert these characters to English (meaning MBCS), but I still need Unicode in my application. When I switch the project to the multi-byte character set, the English output is correct, but other objects (TreeCtrl selection) behave wrongly. So I need to convert just that particular string to MBCS.
How can I do that in MFC?

I assume the string read from your hardware device is a plain "C" string (ANSI string). This type of string has one byte per character; Unicode has two bytes per character.
From the situation you explained, I'd convert the string returned by the hardware to a Unicode string using e.g. MultiByteToWideChar with CP_ACP. You may also use mbstowcs or a similar function to convert your string to a Unicode string.
Best regards
Bordon
Note: Posted code pieces may not have a good programming style and may not be perfect. It is also possible that they do not work in all situations. Code pieces are only intended to explain something in particular.

  • Converting from Single Byte to Multi Byte character set

    Hello,
    I'm trying to migrate one schema, including data, from a 10g (10.1.0.2.0) DB with IW8ISO8859P8 character set, to a 10g (10.2.0.1.0) DB with AL32UTF8 character set.
    The original tables are using VARCHAR2 columns, including some VARCHAR2(1) columns.
    I'm trying to use exp and imp for the task, but during import I'm receiving errors like:
    IMP-00019: row rejected due to ORACLE error 12899
    IMP-00003: ORACLE error 12899 encountered
ORA-12899: value too large for column "SHAMAUT"."TIKIM"."GAR_SET" (actual: 2, maximum: 1)
These errors are not limited to the one-character columns only.
    Is there a way to export/import the data with AL32UTF8 in mind, so the system will automatically convert the data properly?
    Thanks for the help,
    Arie.

It's not a true conversion problem that you have, but rather a space problem. Table columns are created by default with the length semantics given by the initialization parameter NLS_LENGTH_SEMANTICS:
If NLS_LENGTH_SEMANTICS = BYTE
then 1 character = 1 byte, whatever the db character set.
If NLS_LENGTH_SEMANTICS = CHAR
then 1 character = 1 character, however many bytes it needs in the db character set.
If this parameter is changed, it is only taken into account for newly created tables or columns: existing columns are not changed.
    See http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96529/ch2.htm#104327
    The only solution I see is to enlarge your VARCHAR2 columns before running the import...
    Message was edited by:
    Pierre Forstmann
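To illustrate that suggestion, a hedged sketch (the table and column come from the error message above; the sizes are examples): widen the columns, or switch them to character semantics, in the target database before re-running the import against the pre-created tables:

-- Switch the failing column to character semantics:
ALTER TABLE shamaut.tikim MODIFY (gar_set VARCHAR2(1 CHAR));

-- Or make CHAR semantics the default for tables created in this session:
ALTER SESSION SET NLS_LENGTH_SEMANTICS = CHAR;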

  • Crystal XI R2 exporting issues with double-byte character sets

    NOTE: I have also posted this in the Business Objects General section with no resolution, so I figured I would try this forum as well.
    We are using Crystal Reports XI Release 2 (version 11.5.0.313).
    We have an application that can be run using multiple cultures/languages, chosen at login time. We have discovered an issue when exporting a Crystal report from our application while using a double-byte character set (Korean, Japanese).
    The original text when viewed through our application in the Crystal preview window looks correct:
    性能 著概要
    When exported to Microsoft Word, it also looks correct. However, when we export to PDF or even RPT, the characters are not being converted. The double-byte characters are rendered as boxes instead. It seems that the PDF and RPT exports are somehow not making use of the linked fonts Windows provides for double-byte character sets. This same behavior is exhibited when exporting a PDF from the Crystal report designer environment. We are using Tahoma, a TrueType font, in our report.
    I did discover some new behavior that may or may not have any bearing on this issue. When a text field containing double-byte characters is just sitting on the report in the report designer, the box characters are displayed where the Korean characters should be. However, when I double click on the text field to edit the text, the Korean characters suddenly appear, replacing the boxes. And when I exit edit mode of the text field, the boxes are back. And they remain this way when exported, whether from inside the design environment or outside it.
    Has anyone seen this behavior? Is SAP/Business Objects/Crystal aware of this? Is there a fix available? Any insights would be welcomed.
    Thanks,
    Jeff

Hi Jeff
I searched on the forums and got the following information:
1) If font linking is enabled on your device, you can examine the registry by enumerating the subkeys of the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink to determine the mappings of linked fonts to base fonts. You can add links by using Regedit to create additional subkeys. Once you have located that registry key, highlight the face name of the font you want to link to and then, from the Edit menu, click Modify. On a new line in the "Value data" field of the Edit Multi-String dialog box, enter "path and file to link to," "face name of the font to link".
2) Fonts in general, especially TrueType and OpenType, are "Unicode".
Since you are using a TrueType font, it may be a Unicode type already. However, if Bud's suggestion works then nothing better than that.
Also, could you please check the output from the Crystal designer with a different PDF version than the current one?
    Meanwhile, I will look out for any additional/suitable information on this issue.

  • Multi-byte character

If the DATABASE CHARACTER SET is UTF-8, can I use VARCHAR2 to store multi-byte characters, or do I still have to use NVARCHAR2?
Also, how many bytes (max) can VARCHAR2(1) and NVARCHAR2(1) store in the case of a UTF-8 character set?

If you create VARCHAR2(1) then you possibly cannot store anything, as your first character might be multi-byte.
My recommendation would be to consider defining by character rather than by byte.
CREATE TABLE tbyte (
  testcol VARCHAR2(20));
CREATE TABLE tchar (
  testcol VARCHAR2(20 CHAR));
The second will always hold 20 characters without regard to the byte count.
    Demos here:
    http://www.morganslibrary.org/library.html
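A quick hedged demo of the difference, assuming an AL32UTF8 database where 'é' occupies two bytes (the tables are the ones above):

-- 15 two-byte characters = 30 bytes: too large for tbyte's 20-byte
-- limit, but well within tchar's 20-character limit.
INSERT INTO tbyte (testcol) VALUES ('ééééééééééééééé');  -- fails with ORA-12899
INSERT INTO tchar (testcol) VALUES ('ééééééééééééééé');  -- succeeds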

  • Multi-byte character encoding issue in HTTP adapter

    Hi Guys,
I am facing a problem with multi-byte character conversion.
Problem:
I am posting data from SAP CRM to a third-party system using XI as middleware, with the HTTP adapter connecting XI to the third-party system.
I have set the XML encoding to UTF-8 in the XI payload manipulation block.
When I try to post Chinese characters from SAP CRM, junk characters arrive at the third-party system; my assumption is that they are being double-encoded.
Can you please guide me on how to proceed?
Please let me know if you need more info.
    Regards,
    Srini

    Srinivas,
    Can you go through the url:
    UTF-8 encoding problem in HTTP adapter
    ---Satish

  • Multi Language Character sets

Does anyone know if the Oracle ODBC drivers support multi-language character sets?
I am trying to retrieve Chinese (PRC) characters from the database (they are stored correctly and I have the Microsoft multilanguage service pack installed). ODBC won't retrieve them correctly (it actually stops after 1 row).
If I use the OLE DB driver, it does retrieve them. Is there a converter inside the OLE DB driver that ODBC doesn't have, or is there a setting I'm missing? (The tool I want to use this with does not recognize OLE DB; is there a way to make it use OLE DB by defining an ODBC connection?)
    Cheers
    Chris

    The version number you're providing doesn't seem to make any sense to me. Oracle's ODBC drivers are versioned to match the version of the Oracle client they work with, i.e. 8.1.7.8 is the latest Oracle ODBC driver for the 8.1.7 Oracle client. In the Oracle 7 days, there was a 2.5x series of Oracle ODBC drivers. So far as I'm aware, there's never been a 4.x series of Oracle ODBC drivers.
AMERICAN_AMERICA.UTF8 would be the option I'd tend to prefer on the client, particularly if you'll be working with more than just Chinese data (i.e. English & Chinese). I'm not sure what AMERICAN_AMERICA.<some Chinese character set> would end up doing. There's a lot of info out there about NLS settings (including an NLS discussion forum) that might be helpful to you.
    What OLE DB provider are you using that works?
    Justin

  • Where is the Multi-Byte Character.

Hello All
While reading data from the DB, our middleware interface gave the following error:
java.sql.SQLException: Fail to convert between UTF8 and UCS2: failUTF8Conv
I understand that this failure is because of a multi-byte character, and that the 10g driver fixes this bug.
I suggested that the integration admin team replace the current 9i driver with the 10g one, and they are on it.
In addition, I wanted to point out to the data input team where exactly the failure occurred.
I asked them for, and got, a download of the .dat file, and my intention was to find out exactly where the multi-byte character that caused this failure is located.
I wrote the following code to check this:
import java.io.*;
public class X {
    public static void main(String[] args) {
        int linenumber = 1, columnnumber = 1;
        long totalcharacters = 0;
        try {
            File file = new File("inputfile.dat");
            FileInputStream fin = new FileInputStream(file);
            byte[] fileContent = new byte[(int) file.length()];
            fin.read(fileContent);
            for (int i = 0; i < fileContent.length; i++) {
                columnnumber++; totalcharacters++;
                // Note: a Java byte ranges from -128 to 127, so the
                // "> 300" test below can never be true, and this check
                // never fires.
                if (fileContent[i] < 0 && fileContent[i] != 10 && fileContent[i] != 13 && fileContent[i] > 300) { // if invalid
                    System.out.println("failure at position: " + i);
                    break;
                }
                if (fileContent[i] == 10 || fileContent[i] == 13) { // if new line
                    linenumber++; columnnumber = 1;
                }
            }
            fin.close();
            System.out.println("Finished successfully, total lines: " + linenumber + " total file size: " + totalcharacters);
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("Exception at line: " + linenumber + " column: " + columnnumber);
        }
    }
}
But this shows that the file is good, with no issue, whereas the middleware interface fails with the above exception while reading exactly the same input file.
Am I doing anything wrong in locating that multi-byte character?
Greatly appreciate any help everyone!
Thanks.

My challenge is to spot the multi-byte character hidden in this big .dat file.
This is because the data entry team asked me to spot the record and column that have the issue, out of the lakhs of records they sent in this file.
Let's have the validation code like this:
    if ((fileContent[i] < 0 && fileContent[i] != 10 && fileContent[i] != 13) || fileContent[i] > 300) { // if invalid
        System.out.println("failure at position: " + i);
        break;
    }
less than 0 - I saw some negative values when I was testing with other files.
greater than 300 - was a try to find out if any characters exceed the actual character range.
10 and 13 are for line feed and carriage return.
With this, I randomly placed Chinese and Korean characters and the program found them.
Any alternative (better code, of course) way to catch this black sheep?
Edited by: Sanath_K on Oct 23, 2009 8:06 PM

  • Problem to display japanese/multi-byte character on weblogic server 9.1

    Hi experts
We are running WebLogic 9.1 on a Linux box [RHEL v4] and trying to display Japanese characters embedded in some HTML files, but the Japanese characters are converted into question marks [?]. The HTML files that contain the Japanese characters are stored properly in the file system and retain the Japanese characters as they should.
I changed the character setting in the HTML header to shift_jis, but no luck. Then I added the encoding scheme for shift_jis in the jsp-descriptor and charset-params sections of weblogic.xml, but also no luck.
I am wondering how I can properly display multi-byte/Japanese characters on WebLogic Server without setting up internationalization tools.
    I will appreciate for your advice.
    Thanks,
    yasushi

This was fixed by removing everything except the following files from the original (8.1) domain directory:
    1. config.xml
    2. SerializedSystemIni.dat
    3. *.ldift
    4. applications directory
    Is this a bug in the upgrade tool ? Or did I miss a part of the documentation ?
    Thanks
--sony

  • CUSTOM Service - multi Byte character issue

    Hi Experts,
I wrote a custom service. What this service does is read some data from the database and then generate a CSV report. The code is working fine, but if we have multi-byte characters in the data, those characters are not shown properly in the report. Given below is my service code:
byte[] bytes = CustomServiceHelper.getReport(this.m_binder, providerName);
DataStreamWrapper wrapper = new DataStreamWrapper();
wrapper.m_dataEncoding = "UTF-8";                              // declared encoding of the stream
wrapper.m_dataType = "application/vnd.ms-excel;charset=UTF-8"; // MIME type for the CSV download
wrapper.m_clientFileName = "Report.csv";
wrapper.initWithInputStream(new ByteArrayInputStream(bytes), bytes.length);
this.m_service.getHttpImplementor().sendStreamResponse(m_binder, wrapper);
NOTE - This code works fine on my local UCM (Windows) for multi-byte characters, but when I install this service on our DEV and staging servers (Solaris), the multi-byte character issue occurs.
    Thanks in Advance..!!
    Edited by: user4884609 on May 17, 2011 4:12 PM

    Please Help

  • Character sets - UTF8 or Chinese

    Hi,
I am looking into enhancing the application I have built in Oracle to save/display data in Chinese & English. I have been looking into how to change the character set of a database to accept different languages, i.e. different characters.
From what I understand, I can create a database that uses a Chinese character set (apparently English ASCII characters are also a part of any Chinese character set), or I can set the database to use a Unicode multi-byte character set (UTF8), which seems to be okay for all languages.
Has anyone had any experience of a) changing an existing 7-bit ASCII database into a database which can handle Chinese, and/or b) the differences/implications between using Chinese and Unicode character sets?
    I am using Oracle RDBMS 8.1.7 on SuSE Linux 7.2
    Thanks in advance.
    Dan

If the data is segmented so that character set 1 data is in one table and character set 2 data is in another table, then you may have a chance to salvage the data with help from support. The idea would be:
1. First export and import only your CL8MSWIN1251 data to UTF8. Be careful that your NLS_LANG is set to CL8MSWIN1251 for the export so that no conversion takes place. Confirm the import is successful and remove the CL8MSWIN1251 data from the database.
2. Oracle Support can now help you override the character set via ALTER DATABASE to, say, MSWIN1252. Now selectively export/import this data; again make sure NLS_LANG is set to MSWIN1252 for the export so that no conversion takes place. Confirm the import is successful and remove the MSWIN1252 data from the database.
3. Then do the same steps for the 1250 data.
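A hedged sketch of one round of that procedure (the schema, password, and table names are placeholders, not from the post):

-- On the client, before exporting the CL8MSWIN1251 tables
-- (shell commands shown as comments):
--   export NLS_LANG=AMERICAN_AMERICA.CL8MSWIN1251
--   exp scott/tiger TABLES=t1,t2 FILE=cl8_data.dmp

-- After that data is imported into the UTF8 database and removed from
-- the source, the declared set can be overridden as described above:
ALTER DATABASE CHARACTER SET MSWIN1252;  -- only under Oracle Support guidance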

  • NLS character set

    Hi,
We have a data warehouse environment.
Currently the NLS character set in our database is WE8MSWIN1252, which is not a multi-byte character set. But since this is a data warehouse environment, data will be coming from sources that use multi-byte character sets. Could you please let us know whether this will be supported in the WE8MSWIN1252 character set or not?
    Thanks,
    Nab

user10124609 wrote:
For changing the NLS character set by export and import, do we need to install the database again?
Yes, you can create a new database with a Unicode character set and import there. But for the full steps please refer to:
    http://download.oracle.com/docs/cd/E11882_01/server.112/e10729/ch11charsetmig.htm#CEGDHJFF
    Changing the Database Character Set ( NLS_CHARACTERSET ) [ID 225912.1]
    Changing the Database Character Set - Frequently Asked Questions [ID 227337.1]
    Edited by: Chinar on Nov 29, 2010 3:21 AM

  • Single and multi byte settings

    Hello,
We are trying to implement multi-byte character loading and I have a few questions:
1) Our current character encoding is UTF-8. What encoding should we use for multi-byte loading?
2) In DDL, a column can be declared in BYTE or CHAR units, such as VARCHAR2(20 CHAR). For multi-byte data, we can either change the size of the column or change from BYTE to CHAR in the column definition. Which is the better implementation?
3) Are there any other setting changes we need to be aware of when going from single-byte to multi-byte?
    Regards

First off, I'm a bit confused. If your database's character set is UTF-8, you already have a multi-byte character set. I'm not sure what it is that you're converting in this case.
    As to changing the table definition-- that depends primarily on your application(s). Generally, I find it easier to declare a field with character length semantics, which gives users in every language certainty about the number of characters a field can support. There are probably people that think the other way because they're allocating memory in a client application based on bytes and want to ensure that the definitions on the client and the server match.
    Since I don't quite understand what it is that you're converting, I'm hard pressed to come up with what "other setting changes" might be appropriate.
    Justin
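If it helps, a hedged way to see which semantics your existing columns actually use (CHAR_USED is 'B' for byte and 'C' for character semantics; the table name is a placeholder):

SELECT column_name, data_length, char_length, char_used
  FROM user_tab_columns
 WHERE table_name = 'MY_TABLE';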

  • Risk involved converting Oracle character set to Unicode (AL32UTF8 or UTF8)

Hi All -
I am a PL/SQL developer and quite new to database administration, with very little knowledge base on this.
Currently I am working on a project where we have a requirement to store data in multiple languages in the database.
After my findings via Google, I am clear that our database character set needs to be changed to Unicode (AL32UTF8 or UTF8). Before moving forward I would like to know what risks are involved in doing this.
A few questions:
Would this change take a long time and involve lots of effort?
Can we revert once this change is done, with no data loss?
Will there be any changes required while writing SQL on tables having multi-language data?
As of now the requirement to store multi-language data is specific to some tables only, not the whole DB; are there any other options for storing data in different languages (Spanish, Japanese, Chinese, Italian, German, and French) in just one specific table?
Thanks...
Edited by: user633761 on Jun 7, 2009 9:15 PM

> Will there be any changes required while writing SQL on tables having multi-language data?
If you move from a single-byte character set to a multi-byte character set, you should take into account that one character may use 1, 2, 3 or 4 bytes of storage: http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch2charset.htm#i1006683
This may impact SQL or PL/SQL code that works on character string lengths.
Note also that using exp/imp to change the database character set is not so simple; see the following message:
Re: charset conversion from WE8ISO8859P1 (8.1.7.0) to AL32UTF8(9.0.1.1)
> As of now the requirement to store multi-language data is specific to some tables only; are there any other options for storing data in different languages in just one specific table?
Using NCHAR character types is another possibility:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch11charsetmig.htm#sthref1493
    Edited by: P. Forstmann on Jun 8, 2009 9:10 AM
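To make the length point concrete, a hedged example (assumes an AL32UTF8 database; LENGTH counts characters while LENGTHB counts bytes, so code that does byte math on strings can break after the migration):

SELECT LENGTH('müller')  AS char_count,   -- 6 characters
       LENGTHB('müller') AS byte_count    -- 7 bytes in AL32UTF8 (ü = 2 bytes)
  FROM dual;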

  • US7ASCII character set for 10G?

We are planning to migrate our 8i data to 10g. Will 10g support the US7ASCII character set? If I convert the character set from US7ASCII to AL32UTF8, are there any issues? I have a doubt about extra space, since AL32UTF8 is a multi-byte character set.

If there is ever a chance in the future of having multi-byte data in your DB, now is your chance to do it easily.
