Unicode datatypes vs. Unicode databases

We have a legacy system implemented in PowerBuilder and C++. We are pretty sure about which columns we need to convert to support Unicode. Besides, some of our clients have a corporate standard (AMERICAN_AMERICA.WE8MSWIN1252) for the NLS_LANG in their Oracle client setup.
Therefore, we decided to use the Unicode datatype approach and update only the identified columns to NVARCHAR2 and NCLOB, with AL16UTF16 as the national character set. Our understanding is that this is the safe and easy way for our situation, since both C++ and PowerBuilder use the UTF-16 standard by default. This will not require any change to the NLS_LANG setup.
However, one of our clients seems to have strong opinions against the Unicode datatype option and would rather migrate the entire database to Unicode. The client mentioned that "AL16UTF16 has to be used in a Unicode database with UTF8 or AL32UTF8 as the database character set in order to display characters correctly". To our knowledge and understanding, there is no such requirement; I did not see anything like this in the official Oracle documentation.
Could anyone advise whether a Unicode database is really better than the Unicode datatype option?
Thanks!

"Besides, some of our clients have a corporate standard (AMERICAN_AMERICA.WE8MSWIN1252) for the NLS_LANG in their Oracle client setup."
This might even be a necessary requirement, since they are using the Windows-1252 code page.
"AL16UTF16 has to be used in a Unicode database with UTF8 or AL32UTF8 as the database character set in order to display characters correctly."
Hard to say without knowing what they refer to specifically. They might have been thinking of the requirement to use AL32UTF8, depending on how binds are done. If you insert string literals, which are interpreted in the database character set, into NCHAR columns, you obviously need a database character set that supports all the characters you are going to insert (i.e., AL32UTF8 in the Unicode case).
This is described very clearly by Sergiusz Wolicki, in Re: store/retrieve data in lang other than eng when CHARACTERSET is not UTF8.
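To illustrate the literal pitfall, here is a minimal SQL sketch (the table and data are hypothetical; the lossy behavior assumes a pre-10gR2 server with WE8MSWIN1252 as the database character set):

-- NVARCHAR2 stores its data in the national character set (AL16UTF16),
-- independent of the database character set.
CREATE TABLE translations (id NUMBER, txt NVARCHAR2(100));

-- A plain literal is first interpreted in the DATABASE character set.
-- Under WE8MSWIN1252, characters outside Windows-1252 are mangled before
-- they ever reach the NVARCHAR2 column.
INSERT INTO translations VALUES (1, N'人参');

-- Binding the value from the client (e.g., as UTF-16 via OCI or JDBC)
-- bypasses the database character set, so nothing is lost.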

Similar Messages

  • Moving to unicode datatype for an entire database - SQL Server 2012

    Hi,
    I have a SQL Server 2012 database with many tables that have char and varchar columns.
    I'd like to quickly change the char columns into nchar columns and the varchar columns into nvarchar columns.
    Is it possible to solve this, please? Many thanks.

    Hello,
    Creating a script could do it quickly as shown in the following article:
    http://blog.sqlauthority.com/2010/10/18/sql-server-change-column-datatypes/
    But creating the scripts may take you some time.
    You will find more options here:
    https://social.technet.microsoft.com/Forums/sqlserver/en-US/e7b70add-f390-45ee-8e3e-8ed6c6fa0f77/changing-data-type-to-the-fields-of-my-tables?forum=transactsql
    Hope this helps.
    Regards,
    Alberto Morillo
    SQLCoffee.com
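    For reference, the per-column statement such a script would generate looks like this (a sketch with hypothetical table and column names; the existing length and nullability must be carried over explicitly):

    -- SQL Server: change the column type in place from varchar to nvarchar.
    ALTER TABLE dbo.Customers
    ALTER COLUMN Notes NVARCHAR(400) NULL;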

  • Unicode datatype

    Hi,
    · Unicode database (changing the database character set to AL32UTF8) works fine; we tested with an ASP.NET application and were able to see English, Japanese, Arabic, and Urdu.
    · Unicode datatype (database character set left at the default WE8MSWIN1252), with the column datatype as NVARCHAR2: we are able to enter any language, but when querying from the database the values are displayed as question marks ("???????"). We tried the above as per the Oracle documentation (Globalization Support Guide, Chapter 5 - Supporting Multilingual Databases with Unicode - a96529.pdf), but it still displays junk characters only.
    Is there any client setting I am missing here?
    Thanks in advance.

    There is no character set that supports both Arabic and Japanese data other than a Unicode character set. The restriction you are encountering should apply only to string literals that you are trying to load into Unicode datatypes. For literals in this scenario, where the database character set does not support the characters in the literal string, the only workaround is to use UNISTR. This problem with Unicode datatypes and literals was addressed in 10gR2.
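    As a concrete illustration of the UNISTR workaround (a sketch; the table and column are hypothetical, and \4EBA\53C2 are the Unicode code points of two Japanese characters):

    -- UNISTR builds the string directly in the national character set,
    -- so the characters never pass through the database character set.
    INSERT INTO translations (id, txt) VALUES (1, UNISTR('\4EBA\53C2'));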

  • CMP Bean's Field Mapping with Oracle Unicode Datatypes

    Hi,
    I have a CMP bean which maps to an RDBMS table, and the table has some Unicode datatypes such as NVARCHAR2 and NCHAR.
    I was wondering how the OC4J/Oracle EJB container handles queries with Unicode datatypes.
    What do I have to do in order to properly develop and deploy a CMP bean whose fields map onto Unicode database fields?
    Regards
    atif

    Based on the sun-cmp-mapping file descriptor
    <schema>Rol</schema>
    a file called Rol.schema is expected to be packaged with the ejb.jar. Did you run capture-schema after you created your table?

  • To Determine Unicode Datatype encoding

    Hi,
    Going through the Oracle documentation, I found that the Oracle Unicode datatypes (NCHAR and NVARCHAR2) support the AL16UTF16 and UTF8 Unicode encodings.
    Is there a way to determine which encoding is being used by the Oracle Unicode datatypes through the OCI interface?
    Thanks,
    Sachin

    That's a rather hard problem. You would, realistically, either have to make a bunch of simplifying assumptions based on the data or you would want to buy a commercial tool that does character set detection.
    There are a number of different ways to encode Unicode (UTF-8, UTF-16, UTF-32, UCS-2, etc.) and a number of different versions of the Unicode standard. UTF-8 is one of the more common ways to encode Unicode. But it is popular precisely because the first 128 characters (which are the majority of what you'd find in English text) are encoded identically to 7-bit ASCII. Depending on the size and contents of the document, it may not be possible to determine whether the data is encoded in 7-bit ASCII, UTF-8, or one of the various single-byte character sets that are built off of 7-bit ASCII (ISO 8859-15, Windows-1252, ISO 8859-1, etc.).
    Depending on how many different character sets you are trying to distinguish between, you'd have to look for binary values that are valid in one character set and not in another.
    Justin
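    If the goal is simply to learn which of the two encodings the national character set uses, the data dictionary already records it; a sketch (the value returned is either AL16UTF16 or UTF8):

    SELECT value
      FROM nls_database_parameters
     WHERE parameter = 'NLS_NCHAR_CHARACTERSET';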

  • See Unicode and Volume in my database

    Hi
    How can I see the Unicode character set and the volume (size) of my database?
    Thanks

    For example:
    select * from nls_database_parameters where parameter like '%SET%';
    gives the character sets used by your database.
    select sum(bytes)/(1024*1024*1024) from v$datafile;
    gives the size (in GB) of all datafiles of your database (excluding tempfiles, control files, and redo log files).
    select sum(bytes)/(1024*1024*1024) from dba_segments;
    gives the size (in GB) of all database objects in your database.
    Pierre Forstmann

  • Unicode(String) to actual Unicode!

    Hi, I have Unicode data retrieved from a database, but it comes back in escaped string form ("\u4eba\u53c2"). How can I turn it into the actual Unicode characters? The data from the database gives me the literal escape string rather than the characters themselves. Please comment on it. Thanks.
    _calv

    Hi Calv,
    I'm pretty sure that the conversion from the ASCII escape string to Unicode is not available within the API (if someone knows otherwise, please jump in). However, it should be fairly easy for you to program this conversion: for example, you could parse your string into the six-character substrings that represent characters, strip off the \u, and then cast the resulting sixteen-bit integer to a char.
    In case it's helpful, I am pasting a couple of methods I wrote to go in the opposite direction:
    Returns an ASCII string that represents the specified character by a Unicode escape sequence:

    static public String toUnicodeString(char character) {
        short unicode = (short) character;
        char[] hexDigit = {
            '0', '1', '2', '3', '4', '5', '6', '7',
            '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
        };
        char[] array = {
            hexDigit[(unicode >> 12) & 0x0f], hexDigit[(unicode >> 8) & 0x0f],
            hexDigit[(unicode >> 4) & 0x0f], hexDigit[unicode & 0x0f]
        };
        return "\\u" + new String(array);
    }

    Returns an ASCII string representing the Java string argument, e.g. -> "\u1234\u5678":

    static public String toUnicodeString(String string) {
        String result = "\"";
        for (int index = 0; index < string.length(); index++) {
            result = result + toUnicodeString(string.charAt(index));
        }
        return result + "\"";
    }

    Regards,
    Joe

  • URGENT: building Unicode ASO application from Unicode BSO application?

    Hello Experts,
    Are there other options available for building a Unicode ASO application from a Unicode BSO application?
    As per the readme documentation of version 11.1.1.3.00:
    http://download.oracle.com/docs/cd/E12825_01/epm.111/esb_1111300_readme.pdf
    Block storage non-Unicode outlines can be converted to aggregate storage non-Unicode outlines; however, block storage Unicode outlines cannot be converted to aggregate storage Unicode outlines. [8208584]
    Can anyone suggest how to extract the Unicode Block Storage Outline and build as Unicode Aggregate Storage Outline? Thanks in advance.
    Regards,
    Sudhir

    Hi,
    You could use the Outline Extractor (http://www.appliedolap.com/free-tools/outline-extractor) or ODI to extract the metadata from an outline and then use it to load into another database.
    Or use the same methods with which you built the metadata in the original BSO database.
    Cheers
    John
    http://john-goodwin.blogspot.com/

  • Create unicode file and read unicode file

    Hi,
    How can I create a Unicode file and open a Unicode file in LabVIEW?
    Regards
    Madhu

    gmadhu wrote:
    How can I create a Unicode file and open a Unicode file in LabVIEW?
    In principle, you can't. LabVIEW does not support Unicode (yet)! When it will officially support it is a question I can't answer, since I don't know, and as far as I know NI doesn't want to answer it.
    So the real question you have to ask first is where and why you want to read and write Unicode files. And what type of Unicode? Unicode is definitely not just Unicode, as Windows has a different notion of Unicode (16-bit characters) than Unix has (32-bit characters). The 16-bit Unicode from Windows is able to cover most languages on this globe, but definitely not all without code-expansion techniques.
    If you want to do this on Windows, and have decided that there is no other way to do what you want, you will probably have to access the WideCharToMultiByte() and MultiByteToWideChar() Windows APIs using the Call Library Node, in order to convert between the 8-bit multibyte strings used in LabVIEW and the Unicode format necessary in your file.
    Rolf Kalbermatter
    CIT Engineering Netherlands
    a division of Test & Measurement Solutions

  • How to create a non-unicode transport on a unicode system?

    Folks,
    Occasionally, I have to create transports for some of the functions from our Unicode-based SAP system. The created transports are Unicode by default and thus cannot be installed on a non-Unicode SAP client. Is there a way to create a non-Unicode transport from a Unicode SAP system?
    Note that the transport contains only code (function modules); there is no data.
    Thank you in advance for your help.
    Regards,
    Peter

    Hi Peter,
    Note 638357 - Transport between Unicode systems and non-Unicode systems
    Regards
    Ashok Dalai

  • What is the difference between a Unicode program and a non-Unicode program?

    Hi guru,
    What is the difference between a Unicode program and a non-Unicode program?
    Regards
    Subash

    A brief overview of Unicode:
    In the past, SAP developers used various codes to encode characters of different alphabets, for example, ASCII, EBCDIC, or double-byte code pages.
    ASCII (American Standard Code for Information Interchange) encodes each character using 1 byte = 8 bits. This makes it possible to represent a maximum of 2^8 = 256 characters, to which the combinations 00000000 through 11111111 are assigned. Common code pages are, for example, ISO 8859-1 for West European or ISO 8859-5 for Cyrillic fonts.
    EBCDIC (Extended Binary Coded Decimal Interchange Code) also uses 1 byte to encode each character, which again makes it possible to represent 256 characters. EBCDIC 0697/0500 is an old IBM format that is used on AS/400 machines for West European fonts, for example.
    Double-byte code pages require 1 or 2 bytes for each character. This allows you to form 2^16 = 65536 combinations, of which usually only 10,000 - 15,000 characters are used. Double-byte code pages are, for example, SJIS for Japanese and BIG5 for traditional Chinese.
    Using these character sets, you can account for each language relevant to the SAP System. However, problems occur if you want to merge texts from different, incompatible character sets in a central system. Equally, exchanging data between systems with incompatible character sets can result in unpredictable situations.
    One solution to this problem is to use a code comprising all characters used on earth. This code is called Unicode (ISO/IEC 10646) and consists of at least 16 bits = 2 bytes, alternatively 32 bits = 4 bytes, per character. Although the conversion effort for the R/3 kernel and applications is considerable, the migration to Unicode provides great benefits in the long run:
    The Internet and consequently also mySAP.com are entirely based on Unicode, which thus is a basic requirement for international competitiveness.
    Unicode allows all R/3 users to install a central R/3 System that covers all business processes worldwide.
    Companies using different distributed systems frequently want to aggregate their worldwide corporate data. Without Unicode, they would be able to do this only to a limited degree.
    With Unicode, you can use multiple languages simultaneously at a single frontend computer.
    Unicode is required for cross-application data exchange without loss of data due to incompatible character sets. One way to present documents in the World Wide Web (www) is XML, for example.
    ABAP programs must be modified wherever an explicit or implicit assumption is made with regard to the internal length of a character. As a result, a new level of abstraction is reached which makes it possible to run one and the same program both in conventional and in Unicode systems. In addition, if new characters are added to the Unicode character set, SAP can decide whether to represent these characters internally using 2 or 4 bytes.
    A Unicode-enabled ABAP program (UP) is a program in which all Unicode checks are effective. Such a program returns the same results in a non-Unicode system (NUS) as in a Unicode system (US). In order to perform the relevant syntax checks, you must activate the Unicode flag in the screens of the program and class attributes.
    In a US, you can only execute programs for which the Unicode flag is set. In future, the Unicode flag must be set for all SAP programs to enable them to run on a US. If the Unicode flag is set for a program, the syntax is checked and the program executed according to the rules described in this document, regardless of whether the system is a US or an NUS. From now on, the Unicode flag must be set for all new programs and classes that are created.
    If the Unicode flag is not set, a program can only be executed in an NUS. The syntactical and semantic changes described below do not apply to such programs. However, you can use all language extensions that have been introduced in the process of the conversion to Unicode.
    As a result of the modifications and restrictions associated with the Unicode flag, programs are executed in both Unicode and non-Unicode systems with the same semantics to a large degree. In rare cases, however, differences may occur. Programs that are designed to run on both systems therefore need to be tested on both platforms.
    You can also check out these official SAP locations on the SAP Service Marketplace:
    http://service.sap.com/unicode
    http://service.sap.com/unicode@SAP
    http://service.sap.com/i18n
    Regards,
    Santosh

  • Handling Multi-byte/Unicode (Japanese) characters in Oracle Database

    Hello,
    How do I handle Japanese characters with an Oracle database?
    I have a Java application which retrieves some values from the database, makes some changes to these (e.g., changes the value of a status column, adds comments to a VARCHAR2 column, etc.), and then performs an UPDATE back to the database.
    Everything works fine for English, but NOT for Japanese, which uses multi-byte/Unicode characters. The Japanese characters are garbled after performing the database UPDATE.
    I verified that Java by default uses UTF-16 encoding, so there shouldn't be any problem with Java/JDBC.
    What do I need to change at #1 - the Oracle (database) side, or #2 - the OS (Linux) side?
    (I tried changing the NLS_LANG value in the OS and the NLS_SESSION_PARAMETERS settings in the database, and tried a test insert from SQL*Plus. But SQL*Plus converts all Japanese characters to a question mark (?), so I could not test it via SQL*Plus on my XP (English) edition.)
    Any help will be really appreciated.
    Thanks

    Hello Sergiusz,
    Here are the values before & after Update:
    --BEFORE update:
    select tar_sid, DUMP(col_name, 1016) from table_name where tar_sid in ('6997593.880');
    /* Output copied from SQL-Developer: */
    6997593.88 Typ=1 Len=144 CharacterSet=UTF8: 54,45,53,54,5f,41,42,53,54,52,41,43,54,e3,81,ab,e3,81,a6,4f,52,41,2d,30,31,34,32,32,e7,99,ba,e7,94,9f,29,a,4d,65,74,61,6c,69,6e,6b,20,e3,81,a7,e7,a2,ba,e8,aa,8d,e3,81,84,e3,81,9f,e3,81,97,e3,81,be,e3,81,97,e3,81,9f,e3,81,8c,e3,80,81,52,31,30,2e,32,2e,30,2e,34,20,a,e3,81,a7,e3,81,af,e4,bf,ae,e6,ad,a3,e6,b8,88,e3,81,bf,e3,81,ae,e4,ba,8b,e4,be,8b,e3,81,97,e3,81,8b,e7,a2,ba,e8,aa,8d,e3,81,a7,e3,81,8d,e3,81,be,e3,81,9b,e3,82,93,2a
    --AFTER Update:
    select tar_sid, DUMP(col_name, 1016) from table_name where tar_sid in ('6997593.880');
    /* Output copied from SQL-Developer: */
    6997593.88 Typ=1 Len=144 CharacterSet=UTF8: 54,45,53,54,5f,41,42,53,54,52,41,43,54,e3,81,ab,e3,81,a6,4f,52,41,2d,30,31,34,32,32,e7,99,ba,e7,94,9f,29,a,4d,45,54,41,4c,49,4e,4b,20,e3,81,a7,e7,a2,ba,e8,aa,8d,e3,81,84,e3,81,9f,e3,81,97,e3,81,be,e3,81,97,e3,81,9f,e3,81,8c,e3,80,81,52,31,30,2e,32,2e,30,2e,34,20,a,e3,81,a7,e3,81,af,e4,bf,ae,e6,ad,a3,e6,b8,88,e3,81,bf,e3,81,ae,e4,ba,8b,e4,be,8b,e3,81,97,e3,81,8b,e7,a2,ba,e8,aa,8d,e3,81,a7,e3,81,8d,e3,81,be,e3,81,9b,e3,82,93,2a
    So the values BEFORE & AFTER Update are the same!
    The problem is that sometimes, the Japanese data in VARCHAR2 (abstract) column gets corrupted. What could be the problem here? Any clues?
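    One way to narrow this down is to dump, from the suspect client session, a literal containing the same characters, and compare the bytes with the expected UTF-8 sequence (a sketch; the placeholder stands for Japanese text typed at that client):

    -- If NLS_LANG mislabels the real client code page, the dumped bytes
    -- will not form a valid UTF-8 sequence for the characters you typed.
    SELECT DUMP('<Japanese text typed at this client>', 1016) FROM dual;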

  • Can a Unicode system have a non-Unicode database?

    I have installed NW2004 Unicode.
    But if I install NW2004 Unicode, is the database also Unicode or not?

    Hi,
    Whether it is Unicode or non-Unicode depends on how many bytes are reserved per character in the database:
    if 1 byte is reserved, it is non-Unicode and supports only languages such as English and German;
    with 2 bytes, it is Unicode.
    So the DB is created with a Unicode character set once you install SAP as a Unicode system.
    Samrat

  • Unicode - DataType Currency error

    Hi experts.
    Please can you help me?
    I used the method below instead of a MOVE clause.
    I can transfer <wa_table> to buffer.
    But I found ##―ఀ###ఀ ###ఀ contents in the buffer.
    This field of the buffer has the CURR(15.2) datatype.
    Can you please advise how I can solve this problem?
    Thanks.
    DATA: buffer(30000) OCCURS 10 WITH HEADER LINE.
    DATA: st_table TYPE REF TO data,
          tb_table TYPE REF TO data.
    FIELD-SYMBOLS: <wa_table>  TYPE any,
                   <it_table>  TYPE STANDARD TABLE,
                   <wa_table2> TYPE any.

    CREATE DATA: tb_table TYPE TABLE OF (query_table), "object create
                 st_table TYPE (query_table).

    ASSIGN: tb_table->* TO <it_table>, "internal table
            st_table->* TO <wa_table>. "work area

    SELECT * FROM (query_table)
      INTO CORRESPONDING FIELDS OF TABLE <it_table>
      WHERE (options).

    LOOP AT <it_table> INTO <wa_table>.
      CLEAR buffer.
      CALL METHOD cl_abap_container_utilities=>fill_container_c
        EXPORTING
          im_value               = <wa_table>
        IMPORTING
          ex_container           = buffer
        EXCEPTIONS
          illegal_parameter_type = 1
          OTHERS                 = 2.
      APPEND buffer.
    ENDLOOP.

    Hello Kalidas
    Here is a simple "smoke test": try to see if the system accepts the following statement.
    " NOTE: Try to write the packed field only
    WRITE: / i_z008-packed_field.
    If you receive an error you cannot WRITE packed values directly.
    Alternative solution: write your structure to a string.
    DATA:
      ls_z008  LIKE LINE OF i_z008,
      ld_string  TYPE string.
    LOOP AT i_z008 INTO ls_z008.
      CALL METHOD cl_abap_container_utilities=>fill_container_c
        EXPORTING
          im_value = ls_z008
        IMPORTING
          ex_container = ld_string.
      WRITE: / ld_string.
    ENDLOOP.
    Regards
      Uwe

  • How to upload data to a "CLOB" datatype field in a database table

    Using sqlldr, what is the correct way to upload character data (far greater than the 4000 bytes allowed in VARCHAR2)?
    Setting the table field's datatype to CLOB and then using the field name against a large dataset in the control file, I get the error that the input record (field) is too large. Adding "clob" after the table field name in the sqlldr control file just gives me a syntax error.
    Running Oracle 9.2.0.6 on Solaris 9.

    user511518,
    I think you're in the wrong forum. Perhaps you should start with this Web page:
    http://www.oracle.com/technology/products/database/utilities/index.html
    Look for the SQL*Loader links.
    Good Luck,
    Avi.
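    For what it's worth, SQL*Loader can populate a CLOB column from an inline field when the field is declared as CHAR with an explicit large length; the default maximum of 255 bytes is what produces the "field exceeds maximum length" error. A minimal control-file sketch (table and column names are hypothetical):

    LOAD DATA
    INFILE 'data.dat'
    INTO TABLE articles
    FIELDS TERMINATED BY '|'
    ( id        CHAR(10),
      body_text CHAR(1000000)  -- oversized CHAR lets sqlldr fill the CLOB
    )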

Maybe you are looking for

  • Router could not start

    Hello Experts, I reconfigured router and try to start but i got en error....... I paste dev_rout Ststus here trc file: "dev_rout", trc level: 1, release: "700" Wed Jan 05 02:20:50 2011 SAP Network Interface Router, Version 38.10 command line arg 0:  

  • Purchase charged to credit card while i have an account balance

    i have a balance in my apple id of $104.00. when i purchased the new os, my credit card was charged instead of using the account balance. why? shelly

  • How do I find out the Version of iPhoto I am using and how do I update it?

    I am unable to open iPhoto on my Mac using OSX Version 10.6.8.  I do not know what version of iPhoto I have because I don't know where to locate that information.  However, I suspect that the iPhoto will no longer open because it needs to be updated.

  • Re installing system-how?

    I want to re install Mac OS X to see if I can sort out the icon issue and also other issue, like preventing me form installing software. I have two sets of cd's, one came with the machine, and include install and restore for Mac OS 10.2, the other 3

  • Firefox process is left running in memory after exit

    The firefox *32 process is left running in memory after exiting Firefox. You cannot restart firefox without either killing the process (ctrl-alt-del ...) or restarting the computer. I am running Windows 7 pro 64bit and Firefox 6.0.2 swedish version.