ASCII characters to Character (Unicode Conversion)

In a non-Unicode system, I see logic that replaces any value from '00' to '1F' with a space in a variable of type X.
In a Unicode system, how can this be done, since type X is no longer valid?
I am planning to convert the variable to type C, but if I convert to type C, how do I replace those characters from 00 to 1F in a Unicode system?

CALL METHOD cl_abap_conv_in_ce=>uccp
        EXPORTING
                uccp = '000D'
        RECEIVING
                char = cr.
UCCP converts a Unicode code point (hex representation) into a character: http://wiki.sdn.sap.com/wiki/display/ABAP/CL_ABAP_CONV_IN_CE

Similar Messages

  • Convert number to character  - unicode conversion

    I have a variable defined as
    v(4) TYPE n
    I need to put this into a CHAR field of length 4 as part of our Unicode conversion.
    Is there an FM that can help?

    Hi,
    Check this
    report  ztest.
    class cl_abap_char_utilities definition load.
    data: v_char(4) type c value cl_abap_char_utilities=>cr_lf.
    data: c_numc(4) type n value 1234.
    v_char = c_numc.  " a plain assignment converts the NUMC value into the CHAR field
    break-point.

  • Username with ascii characters

    Hello, I have an HTML form and I would like the user in
    the username field to type ONLY ASCII characters.
    For example, in the other fields of the form I
    would like the user to type in his mother language, but
    as far as the username and password fields are concerned
    the characters have to be ASCII.
    How am I supposed to check that the username is accepted/correct (consists of ASCII characters)?
    And which characters are desirable in a username (e.g. is ? a desirable character in a username? What about : ?)
    Thanks in advance!

    g_p_java wrote:
    How am I supposed to check that the username is accepted/correct (consists of ASCII characters)?
    ASCII characters are the Unicode characters whose code points are between 0 and 127.
    And which characters are desirable in a username (e.g. is ? a desirable character in a username? What about : ?)
    I don't understand this. You have already said they must be ASCII. You have other requirements? Fine, go ahead and program them and ask questions if you have problems with that. Personally I don't think that requiring somebody to have a question mark in their user name is a good idea -- but probably you didn't mean it when you suggested that.
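    A minimal Java sketch of such a check, assuming the username arrives as a String and that "ASCII" means code points 0-127 (tighten the range to the visible characters 32-126 if control characters should also be rejected):
    public final class UsernameCheck {
        // ASCII characters are the Unicode code points 0..127.
        static boolean isAscii(String s) {
            for (int i = 0; i < s.length(); i++) {
                if (s.charAt(i) > 127) {
                    return false;
                }
            }
            return true;
        }
        public static void main(String[] args) {
            System.out.println(isAscii("john_doe"));      // true
            System.out.println(isAscii("j\u00f6hn"));     // false (contains a non-ASCII letter)
        }
    }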

  • Some characters display incorrectly after Unicode conversion

    Dear Experts,
    We have upgraded our SAP system from 4.6B non-Unicode to ECC 6.0 Unicode.
    In the 4.6B system we had implemented the Simplified Chinese and French language packages. After the upgrade to the ECC 6.0 non-Unicode system we performed an MDMP Unicode conversion; the system is currently running with the ECC 6.0 Unicode interface with languages 1EFD. The SUMG-related tasks have been finished, and the Chinese and French languages check out OK in the ECC 6.0 Unicode system.
    Our issue: in the past, some local Spanish end users typed Spanish characters into some master data fields while logged on to the 4.6B GUI with English as the logon language. These Spanish characters do not display correctly in the current Unicode environment.
    Could you please give us some suggestions on how to fix this issue?
    I also posted this message in the NetWeaver administrator forums under the link below.
    Spanish display incorrect after unicode conversion

    Hello,
    I doubt that the data shown by you in the link was caused by Spanish users logged on in EN.
    I think this was rather caused by utf-8 data uploaded into the Non-Unicode system.
    In this case, you need to repair the affected texts manually; I am not aware of any automatic repair.
    Best regards,
    Nils Buerckel
    SAP AG

  • "Given filename or path contains Unicode or double-byte characters. Retry using ASCII characters for filename and path" - what does this mean? It happens when I publish an OAM

    Given file name or path contains Unicode or double-byte characters. Retry using ASCII characters for filename and path
    What does this mean? It is happening when I try to publish an OAM for Dreamweaver.
    Also: How can I specify the browser in Edge Animate? It is just going wherever. Are there no Preferences for Edge Animate?
    BTW. Just call it Edge. Seriously. Do you call it Illustrator Draw? Photoshop Retouching?

    No, my file name is mainContent.oam
    My project name is mainContent.an
    This error happens when I try to import into Dreamweaver. Sorry, I wasn't clear on that earlier.
    I thought maybe it was because I had saved my image as a PNG, so I re-saved it as an SVG, but I still get the error.
    Do I have a setting in Dreamweaver CC that is wrong? Should I try this in Dreamweaver CS6? I might try that next.
    Why is this program so difficult? I know Flash. I know After Effects. I can work the timeline part just great. It's always in the export that I have problems.
    On a MacPro, 10.7.
    Are you an Adobe person or just a nice helper?

  • Unreadable characters in SM21 after Unicode Conversion

    Hi all,
    After upgrading from BW 3.1 Content to BI 7.0, I did a Unicode conversion.
    Now in transaction SM21, all log messages are in unreadable characters.
    I tried applying SAP Notes 688089 and 45580, but it was no use.
    Please give me some input in this regard.
    Thanks
    Karthikeyan

    The issue got resolved.
    I had to delete a log file generated at OS level in the global folder and then restart the SAP instance.

  • NSString with Unicode + ASCII characters

    Hi All,
    I am developing a Mac application on 10.5.
    I need to deal with normal ASCII characters and Unicode characters.
    I want to calculate the number of bytes taken to store the string.
    I am using [[unicodeStr dataUsingEncoding:NSUTF8StringEncoding] bytes].
    This is working fine if the "unicodeStr" string has only Unicode characters.
    If "unicodeStr" is a combination of Unicode characters and ASCII characters, "NSUTF8StringEncoding" is not working.
    Can anybody tell me how to proceed??
    Thanks in advance.

    There is no such thing as a combination of Unicode and ASCII. ASCII is part of Unicode. Any ASCII string is a valid UTF8 string.
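    The same point can be sketched outside Cocoa; here is a minimal Java example (chosen only because the principle is platform-independent, not because it maps onto the NSString API): a string mixing ASCII and non-ASCII characters always encodes to UTF-8, and the byte count is simply the length of the encoded array.
    import java.nio.charset.StandardCharsets;
    public class Utf8ByteCount {
        public static void main(String[] args) {
            String mixed = "abc \u00e4\u00f6\u00fc \u6f22\u5b57";  // ASCII plus non-ASCII in one string
            byte[] encoded = mixed.getBytes(StandardCharsets.UTF_8);
            System.out.println(encoded.length);  // bytes needed to store the string as UTF-8
        }
    }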

  • Time comparison of ASCII conversion vs. Unicode conversion

    In general, how does the runtime of a Unicode conversion compare to that of an ASCII conversion? For example, you perform the Unicode conversion on the same hardware on which you completed the ASCII conversion, and the ASCII conversion takes 20 hours. Does the Unicode conversion take a similar amount of time?
    thanx
    Mark

    Hi Mark,
    first, an answer that comes only now because my user was somehow locked or whatever ...
    How long does it take compared to the ASCII CPC?
    As I guess you are a Latin-1 customer, you could make use of the in-place version in both attempts.
    The in-place Unicode CPC export will take 6-7h, or the time of the ASCII export if that was longer. The reload will take only about 3h.
    The major problem lies elsewhere:
    The combined upgrade only works fine if you run all the scans, and this will take several days ... even when they are "pretty useless" in your case. Therefore, I would recommend an ECC 6 upgrade without Unicode and then the Unicode conversion directly afterwards - using a short special technique that I have used several times already.
    If it is "just" from 4.6C ASCII to ECC 6.0 Unicode, a week will be fully sufficient including all backups - which can sometimes run in parallel (at least with my ideas).
    I did something like that from 4.6B ASCII to ECC 5 Unicode in one weekend, from Friday evening until Sunday afternoon (depending on the system size, of course).
    Regards
    Volker Gueldenpfennig, consolut.gmbh
    http://www.consolut.de - http://www.4soi.de - http://www.easymarketplace.de

  • Convert smart quotes and other high ascii characters to HTML

    I'd like to set up Dreamweaver CS4 Mac to automatically convert smart quotes and other high-ASCII characters (em dashes, accent marks, etc.) pasted from MS Word into HTML code. Dreamweaver 8 used to do this by default, but I can't find a way to set up a similar auto-conversion in CS4. Is this possible? If not, it really should be a preference option. I code a lot of HTML emails and it is very time-consuming to convert every curly quote and dash.
    Thanks,
    Robert
    Digital Arts

    I too am having a related problem with Dreamweaver CS5 (running under Windows XP), having just upgraded from CS4 (which works fine for me) this week.
    In my case, I like to convert to typographic quotes etc. in my text editor, where I can use macros I've written to speed the conversion process. So my preferred method is to key in typographic letters & symbols by hand (using ALT + ASCII key codes typed in on the numeric keypad) in my text editor, and then I copy and paste my *plain* ASCII text (no formatting other than line feeds & carriage returns) into DW's DESIGN view. DW displays my high-ASCII characters just fine in DESIGN view, and writes the proper HTML code for the character into the source code (which is where I mostly work in DW).
    I've been doing it this way for years (first with GoLive, and then with DW CS4) and never encountered any problems until this week, when I upgraded to DW CS5.
    But the problem I'm having may be somewhat different than what others have complained of here.
    In my case, some high-ASCII (above 128) characters convert to HTML just fine, while others do not.
    E.g., en and em dashes in my cut-and-paste text show as such in DESIGN mode, and the right entries
        &ndash;
        &mdash;
    turn up in the source code. Same is true for the ampersand
        &amp;
    and the copyright symbol
        &copy;
    and for such foreign letters as the e with acute accent (ALT+0233)
        &eacute;
    What does NOT display or code correctly are the typographic quotes. E.g., when I paste in (or special paste; it doesn't seem to make any difference which I use for this) text with typographic double quotes (ALT+0147 for open quote mark and ALT+0148 for close quote mark), which should appear in source code as
        &ldquo;[...]&rdquo;
    DW strips out the ASCII encoding, displaying the inch marks in DESIGN mode, and putting this
        "[...]"
    in my source code.
    The typographic apostrophe (ALT+0146) is treated differently still. The text I copy & paste into DW should appear as
        [...]&rsquo;[...]
    in the source code, but instead I get the foot mark (both in DESIGN and CODE views):
    I've tried adjusting the various DW settings for "encoding"
        MODIFY > PAGE PROPERTIES > TITLE/ENCODING > Encoding:
    and for fonts
        EDIT > PREFERENCES > FONTS
    but switching from "Unicode (UTF-8)" to "Western European" hasn't solved the problem (probably because in my case many of the higher ASCII characters convert just fine). So I don't think it's the encoding scheme I use that's the problem.
    Whatever the problem is, it's caused me enough headaches and time lost troubleshooting that I'm planning to revert to CS4 as soon as I post this.
    Deborah

  • Replacing non-ASCII characters with HTML character references

    Hi All,
    In Oracle 10g or greater is there a built-in function that will convert a string with non-ASCII characters like this
    a b č 뮼
    into an ASCII string with HTML character references like this?
    a b & # x 0 1 0 D ; & # x B B B C ;
    (note I had to include spaces between each character in the sample code for message to prevent the forum software from converting my text)
    I tried using
    utl_i18n.escape_reference( val, 'us7ascii' )
    but for some reason it returns
    a b c & # x B B B C ;
    Note how it converted the character "č" to its unaccented counterpart "c", not "& # x 0 1 0 D ;" (is this a bug?).
    I also tried a custom solution using regexp_replace and asciistr (which I can't include here because the forum software chokes on it) but it only returns the correct result for values <=4000 characters long. Unfortunately asciistr doesn't appear to accept CLOB values larger than 4000 characters. It returns an error message like
    (ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 30251, maximum: 4000) ).
    I'm looking for a solution that works on CLOB data of any size.
    Thanks in advance for any insight you can provide.
    Joe Fuda

    So with that (UTF8) in mind, let's take another look.....
    As shown below, I used a AL32UTF8 database.
    Note: I did not use a unicode capable tool for querying. So I set console mode code page to 1250 just to have č displayed properly (instead of posing as an è).
    Also, as a result of using windows-1250 for client character set, in the val column and in the second select's ncr column (iso8859-1), è (00e8) has been replaced with e through character set conversion going from server back to client.
    Running the same code on a database with a db character set such as we8mswin1252, that doesn't define the č (latin small c with caron) character, would yield results with a c in the ncr column.
    C:\>chcp 1250
    Aktuell teckentabell: 1250
    C:\>set nls_lang=.ee8mswin1250
    C:\>sqlplus test/test
    SQL*Plus: Release 11.1.0.6.0 - Production on Fri May 23 21:25:29 2008
    Copyright (c) 1982, 2007, Oracle.  All rights reserved.
    Connected to:
    Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
    With the OLAP option
    SQL> select * from nls_database_parameters where parameter like '%CHARACTERSET';
    PARAMETER              VALUE
    NLS_CHARACTERSET       AL32UTF8
    NLS_NCHAR_CHARACTERSET AL16UTF16
    SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'us7ascii') NCR from dual;
    VAL  NCR
    č e  c e
    SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'we8iso8859p1') NCR from dual;
    VAL  NCR
    č e  &# x10d; e     <- "è"
    SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'ee8iso8859p2') NCR from dual;
    VAL  NCR
    č e  č &# xe8;
    SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'cl8iso8859p5') NCR from dual;
    VAL  NCR
    č e  &# x10d; &# xe8;
    In the US7ASCII case, where it should be possible for all non-ASCII characters to be escaped, it seems as if the actual escape step is skipped over.
    Hope this helps to understand whether utl_i18n is usable or not in your case.
    Message was edited by:
    orafad
    Fixed replaced character references :)

  • Fixing a US7ASCII - WE8ISO8859P1 Character Set Conversion Disaster

    In hopes that it might be helpful in the future, here's the procedure I followed to fix  a disastrous unintentional US7ASCII on 9i to WE8ISO8859P1 on 10g migration.
    BACKGROUND
    Oracle has multiple character sets, ranging from US7ASCII to AL32UTF8.
    US7ASCII, of course, is a cheerful 7 bit character set, holding the basic ASCII characters sufficient for the English language.
    However, it also has a handy feature: character fields under US7ASCII will accept characters with values > 128. If you have a web application, users can type (or paste) Us with umlauts, As with macrons, and quite a few other funny-looking characters.
    These will be inserted into the database, and then -- if appropriately supported -- can be selected and displayed by your app.
    The problem is that while these characters can be present in a VARCHAR2 or CLOB column, they are not actually legal. If you try within Oracle to convert from US7ASCII to WE8ISO8859P1 or any other character set, Oracle recognizes that these characters with values greater than 127 are not valid, and will replace them with a default "unknown" character. In the case of a change from US7ASCII to WE8ISO8859P1, it will change them to 191, the upside down question mark.
    Oracle has a native utility, introduced in 8i, called csscan, which assists in migrating to different character sets. This has been replaced in newer versions with the Database Migration Assistant for Unicode (DMU), which is the new recommended tool for 11.2.0.3+.
    These tools, however, do no good unless they are run. For my particular client, the operations team took a database running 9i and upgraded it to 10g, and as part of that process the character set was changed from US7ASCII to WE8ISO8859P1. The database had a large number of special characters inserted into it, and all of these abruptly turned into upside-down question marks. The users of the application didn't realize there was a problem until several weeks later, by which time they had put a lot of new data into the system. Rollback was not possible.
    FIXING THE PROBLEM
    How fixable this problem is and the acceptable methods which can be used depend on the application running on top of the database. Fortunately, the client app was amenable.
    (As an aside note: this approach does not use csscan -- I had done something similar previously on a very old system and decided it would take less time in this situation to revamp my old procedures and not bring a new utility into the mix.)
    We will need two separate approaches -- one to fix the VARCHAR2 & CHAR fields, and a second for CLOBs.
    In order to set things up, we created two environments. The first was a clone of production as it is now, and the second a clone from before the upgrade & character set change. We will call these environments PRODCLONE and RESTORECLONE.
    Next, we created a database link, OLD6. This allows PRODCLONE to directly access RESTORECLONE. Since they were cloned with the same SID, establishing the link needed the global_names parameter set to false.
    alter system set global_names=false scope=memory;
    CREATE PUBLIC DATABASE LINK OLD6
    CONNECT TO DBUSERNAME
    IDENTIFIED BY dbuserpass
    USING 'restoreclone:1521/MYSID';
    Testing the link...
    SQL> select count(1) from users@old6;
      COUNT(1)
           454
    Here is a row in a table which contains illegal characters. We are accessing RESTORECLONE from PRODCLONE via our link.
    PRODCLONE> select dump(title) from my_contents@old6 where pk1=117286;
    DUMP(TITLE)
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,174,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    By comparison, a dump of that row on PRODCLONE's my_contents gives:
    PRODCLONE> select dump(title) from my_contents where pk1=117286;
    DUMP(TITLE)
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,191,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    Note that the "174" on RESTORECLONE was changed to "191" on PRODCLONE.
    We can manually insert CHR(174) into our PRODCLONE and have it display successfully in the application.
    However, I tried a number of methods to copy the data from RESTORECLONE to PRODCLONE through the link, but entirely without success. Oracle would recognize the character as invalid and silently transform it.
    Eventually, I located a clever workaround at this link:
    https://kr.forums.oracle.com/forums/thread.jspa?threadID=231927
    It works like this:
    On RESTORECLONE you create a view, vv, with UTL_RAW:
    RESTORECLONE> create or replace view vv as select pk1,utl_raw.cast_to_raw(title) as title from my_contents;
    View created.
    This turns the title to raw on the RESTORECLONE.
    You can now convert from RAW to VARCHAR2 on the PRODCLONE database:
    PRODCLONE> select dump(utl_raw.cast_to_varchar2 (title)) from vv@old6 where pk1=117286;
    DUMP(UTL_RAW.CAST_TO_VARCHAR2(TITLE))
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,174,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    The above works because Oracle on PRODCLONE never knew that our TITLE string on RESTORECLONE was originally in US7ASCII, so it was unable to do its transparent character set conversion.
    PRODCLONE> update my_contents set title=( select utl_raw.cast_to_varchar2 (title) from vv@old6 where pk1=117286) where pk1=117286;
    PRODCLONE> select dump(title) from my_contents where pk1=117286;
    DUMP(UTL_RAW.CAST_TO_VARCHAR2(TITLE))
    Typ=1 Len=49: 78,67,76,69,88,45,80,78,174,32,69,120,97,109,32,83,116,121,108,101
    ,32,73,110,116,101,114,97,99,116,105,118,101,32,82,101,118,105,101,119,32,81,117
    ,101,115,116,105,111,110,115
    Excellent! The "174" character has survived the transfer and is now in place on PRODCLONE.
    Now that we have a method to move the data over, we have to identify which columns /tables have character data that was damaged by the conversion. We decided we could ignore anything with a length smaller than 10 -- such fields in our application would be unlikely to have data with invalid characters.
    RESTORECLONE> select count(1) from user_tab_columns where data_type in ('CHAR','VARCHAR2') and data_length > 10;
       COUNT(1)
        533
    By converting a field to WE8ISO8859P1, and then comparing it with the original, we can see if the characters change:
    RESTORECLONE> select count(1) from my_contents where title != convert (title,'WE8ISO8859P1','US7ASCII') ;
      COUNT(1)
         10568
    So 10568 rows have characters which were transformed  into 191s as part of the original conversion.
    [ As an aside, we can't use CONVERT() on LOBs -- for them we will need another approach, outlined further below.
    RESTOREDB> select count(1) from my_contents where main_data != convert (convert(main_DATA,'WE8ISO8859P1','US7ASCII'),'US7ASCII','WE8ISO8859P1') ;
    select count(1) from my_contents where main_data != convert (convert(main_DATA,'WE8ISO8859P1','US7ASCII'),'US7ASCII','WE8ISO8859P1')
    ERROR at line 1:
    ORA-00932: inconsistent datatypes: expected - got CLOB
    Anyway, now that we can identify VARCHAR2 fields which need to be checked, we can put together a PL/SQL stored procedure to do it for us:
    create or replace procedure find_us7_strings
    (table_name varchar2,
    fix_col varchar2 )
    authid current_user
    as
    orig_sql varchar2(1000);
    begin
    orig_sql:='insert into cnv_us7(mytablename,myindx,mycolumnname)  select '''||table_name||''',pk1,'''||fix_col||''' from '||table_name||' where '||fix_col||' !=  CONVERT(CONVERT('||fix_col||',''WE8ISO8859P1''),''US7ASCII'') and '||fix_col||' is not null';
    -- Uncomment if debugging:
    -- dbms_output.put_line(orig_sql);
      execute immediate orig_sql;
    end;
    And create a table to store the information as to which tables, columns, and rows have the bad characters:
    drop table cnv_us7;
    create table cnv_us7 (mytablename varchar2(50), myindx number,      mycolumnname varchar2(50) ) tablespace myuser_data;
    create index list_tablename_idx on cnv_us7(mytablename) tablespace myuser_indx;
    With a SQL-generating SQL script, we can iterate through all the tables/columns we want to check:
    --example of using the data: select title from my_contents where pk1 in (select myindx from cnv_us7)
    set head off pagesize 1000 linesize 120
    spool runme.sql
    select 'exec find_us7_strings ('''||table_name||''','''||column_name||'''); ' from user_tab_columns
          where
              data_type in ('CHAR','VARCHAR2')
              and table_name in (select table_name from user_tab_columns where column_name='PK1' and  table_name not  in ('HUGETABLEIWANTTOEXCLUDE','ANOTHERTABLE'))
              and char_length > 10
              order by table_name,column_name;
    spool off;
    set echo on time on timing on feedb on serveroutput on;
    spool output_of_runme
    @./runme.sql
    spool off;
    Which eventually gives us the following inserted into CNV_US7:
    20:48:21 SQL> select count(1),mycolumnname,mytablename from cnv_us7 group by mytablename,mycolumnname;
             4 DESCRIPTION                                        MY_FORUMS
         21136 TITLE                                              MY_CONTENTS
    Out of 533 VARCHAR2s and CHARs, we only had five or six columns that needed fixing.
    We create our views on  RESTOREDB:
    create or replace view my_forums_vv as select pk1,utl_raw.cast_to_raw(description) as description from forum_main;
    create or replace view my_contents_vv as select pk1,utl_raw.cast_to_raw(title) as title from my_contents;
    And then we can fix it directly via sql:
    update my_contents taborig1 set TITLE= (select utl_raw.cast_to_varchar2 (TITLE) from my_contents_vv@old6 where pk1=taborig1.pk1)
    where pk1 in (
    select tabnew.pk1 from my_contents@old6 taborig,my_contents tabnew,cnv_us7@old6
          where taborig.pk1=tabnew.pk1
              and myindx=tabnew.pk1
              and mycolumnname='TITLE'
              and mytablename='MY_CONTENTS'
              and convert(taborig.TITLE,'US7ASCII','WE8ISO8859P1') = tabnew.TITLE );
    Note this part:
          "and convert(taborig.TITLE,'US7ASCII','WE8ISO8859P1') = tabnew.TITLE "
    This checks to verify that the TITLE field on the PRODCLONE and RESTORECLONE are the same (barring character set issues). This is there  because if the users have changed TITLE  -- or any other field -- on their own between the time of the upgrade and now, we do not want to overwrite their changes. We make the assumption that as part of the process, they may have changed the bad character on their own.
    We can also create a stored procedure which will execute the SQL for us:
    create or replace procedure fix_us7_strings
    (TABLE_NAME varchar2,
    FIX_COL varchar2 )
    authid current_user
    as
    orig_sql varchar2(1000);
    TYPE cv_type IS REF CURSOR;
    orig_cur cv_type;
    begin
    orig_sql:='update '||TABLE_NAME||' taborig1 set '||FIX_COL||'= (select utl_raw.cast_to_varchar2 ('||FIX_COL||') from '||TABLE_NAME||'_vv@old6 where pk1=taborig1.pk1)
    where pk1 in (
    select tabnew.pk1 from '||TABLE_NAME||'@old6 taborig,'||TABLE_NAME||' tabnew,cnv_us7@old6
          where taborig.pk1=tabnew.pk1
              and myindx=tabnew.pk1
              and mycolumnname='''||FIX_COL||'''
              and mytablename='''||TABLE_NAME||'''
              and convert(taborig.'||FIX_COL||',''US7ASCII'',''WE8ISO8859P1'') = tabnew.'||FIX_COL||')';
    dbms_output.put_line(orig_sql);
    execute immediate orig_sql;
    end;
    exec fix_us7_strings('MY_FORUMS','DESCRIPTION');
    exec fix_us7_strings('MY_CONTENTS','TITLE');
    commit;
    To validate this before and after, we can run something like:
    select dump(description) from my_forums where pk1 in (select myindx from cnv_us7@old6 where mytablename='MY_FORUMS');
    The above process fixes all the VARCHAR2s and CHARs. Now what about the CLOB columns?
    Note that we're going to have some extra difficulty here, not just because we are dealing with CLOBs, but because we are working with CLOBs in 9i, whose functions have less CLOB-related functionality.
    This procedure finds invalid US7ASCII strings inside a CLOB in 9i:
    create or replace procedure find_us7_clob
    (table_name varchar2,
    fix_col varchar2)
    authid current_user
    as
      orig_sql varchar2(1000);
      type cv_type is REF CURSOR;
      orig_table_cur cv_type;
      my_chars_read NUMBER;
      my_offset NUMBER;
      my_problem NUMBER;
      my_lob_size NUMBER;
      my_indx_var NUMBER;
      my_total_chars_read NUMBER;
      my_output_chunk VARCHAR2(4000);
      my_problem_flag NUMBER;
      my_clob CLOB;
      my_total_problems NUMBER;
      ins_sql VARCHAR2(4000);
    BEGIN
       DBMS_OUTPUT.ENABLE(1000000);
       orig_sql:='select pk1,dbms_lob.getlength('||FIX_COL||') as cloblength,'||fix_col||' from '||table_name||' where dbms_lob.getlength('||fix_col||') >0 and '||fix_col||' is not null order by pk1';
       open orig_table_cur for orig_sql;
       my_total_problems := 0;
       LOOP
            FETCH orig_table_cur INTO my_indx_var,my_lob_size,my_clob;
                    EXIT WHEN orig_table_cur%NOTFOUND;
            my_offset :=1;
            my_chars_read := 512;
            my_problem_flag :=0;
            WHILE my_offset < my_lob_size and my_problem_flag =0
                    LOOP
                    DBMS_LOB.READ(my_clob,my_chars_read,my_offset,my_output_chunk);
                    my_offset := my_offset + my_chars_read;
                    IF my_output_chunk != CONVERT(CONVERT(my_output_chunk,'WE8ISO8859P1'),'US7ASCII')
                            THEN
                            -- DBMS_OUTPUT.PUT_LINE('Problem with '||my_indx_var);
                            -- DBMS_OUTPUT.PUT_LINE(my_output_chunk);
                            my_problem_flag:=1;
                    END IF;
            END LOOP;
            IF my_problem_flag=1
                    THEN my_total_problems := my_total_problems +1;
                    ins_sql:='insert into cnv_us7(mytablename,myindx,mycolumnname) values ('''||table_name||''','||my_indx_var||','''||fix_col||''')';
                    execute immediate ins_sql;
                    END IF;
       END LOOP;
       DBMS_OUTPUT.PUT_LINE('We found '||my_total_problems||' problem rows in table '||table_name||', column '||fix_col||'.');
    END;
    And we can use SQL-generating SQL to find out which CLOBs have issues, out of all the ones in the database:
    RESTOREDB> select 'exec find_us7_clob('''||table_name||''','''||column_name||''');' from user_tab_columns where data_type='CLOB';
    exec find_us7_clob('MY_CONTENTS','DATA');
    After completion, the CNV_US7 table looked like this:
    RESTOREDB> set linesize 120 pagesize 100;
    RESTOREDB>  select count(1),mytablename,mycolumnname from cnv_us7
       where mytablename||' '||mycolumnname in (select table_name||' '||column_name from user_tab_columns
             where data_type='CLOB' )
          group by mytablename,mycolumnname;
      COUNT(1) MYTABLENAME                                        MYCOLUMNNAME
         69703 MY_CONTENTS                                  DATA
    On RESTOREDB, our 9i version, we will use this procedure (found many years ago on the internet):
    create or replace procedure CLOB2BLOB (p_clob in out nocopy clob, p_blob in out nocopy blob) is
    -- transforming CLOB to BLOB
    l_off number default 1;
    l_amt number default 4096;
    l_offWrite number default 1;
    l_amtWrite number;
    l_str varchar2(4096 char);
    begin
    loop
    dbms_lob.read ( p_clob, l_amt, l_off, l_str );
    l_amtWrite := utl_raw.length ( utl_raw.cast_to_raw( l_str) );
    dbms_lob.write( p_blob, l_amtWrite, l_offWrite,
    utl_raw.cast_to_raw( l_str ) );
    l_offWrite := l_offWrite + l_amtWrite;
    l_off := l_off + l_amt;
    l_amt := 4096;
    end loop;
    exception
    when no_data_found then
    NULL;
    end;
    We can test out the transformation of CLOBs to BLOBs with a single row like this:
    drop table my_contents_lob;
    Create table my_contents_lob (pk1 number,data blob);
    DECLARE
          v_clob CLOB;
          v_blob BLOB;
        BEGIN
          SELECT data INTO v_clob FROM my_contents WHERE pk1 = 16 ;
          INSERT INTO my_contents_lob (pk1,data) VALUES (16,empty_blob() );
          SELECT data INTO v_blob FROM my_contents_lob WHERE pk1=16 FOR UPDATE;
          clob2blob (v_clob, v_blob);
        END;
    select dbms_lob.getlength(data) from my_contents_lob;
    DBMS_LOB.GETLENGTH(DATA)
                                 329
    SQL> select utl_raw.cast_to_varchar2(data) from my_contents_lob;
    UTL_RAW.CAST_TO_VARCHAR2(DATA)
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam...
    Now we need to push it through a loop. Unfortunately, I had trouble making the "SELECT INTO" dynamic. Thus I used a version of the procedure for each table. It's aesthetically displeasing, but at least it worked.
    create table my_contents_lob(pk1 number,data blob);
    create index my_contents_lob_pk1 on my_contents_lob(pk1) tablespace my_user_indx;
    create or replace procedure blob_conversion_my_contents
    (table_name varchar2,
    fix_col varchar2)
    authid current_user
    as
      orig_sql varchar2(1000);
      type cv_type is REF CURSOR;
      orig_table_cur cv_type;
      my_chars_read NUMBER;
      my_offset NUMBER;
      my_problem NUMBER;
      my_lob_size NUMBER;
      my_indx_var NUMBER;
      my_total_chars_read NUMBER;
      my_output_chunk VARCHAR2(4000);
      my_problem_flag NUMBER;
      my_clob CLOB;
      my_blob BLOB;
      my_total_problems NUMBER;
      new_sql VARCHAR2(4000);
    BEGIN
      DBMS_OUTPUT.ENABLE(1000000);
       orig_sql:='select pk1,dbms_lob.getlength('||FIX_COL||') as cloblength,'||fix_col||' from '||table_name||' where pk1 in (select myindx from cnv_us7 where mytablename='''||TABLE_NAME||''' and mycolumnname='''||FIX_COL||''') order by pk1';
       open orig_table_cur for orig_sql;
       LOOP
            FETCH orig_table_cur INTO my_indx_var,my_lob_size,my_clob;
                    EXIT WHEN orig_table_cur%NOTFOUND;
            new_sql:='INSERT INTO '||table_name||'_lob(pk1,'||fix_col||') values ('||my_indx_var||',empty_blob() )';
            dbms_output.put_line(new_sql);
          execute immediate new_sql;
    -- Here's the bit that I had trouble making dynamic. Feel free to let me know what I am doing wrong.
    -- new_sql:='SELECT '||fix_col||' INTO my_blob from '||table_name||'_lob where pk1='||my_indx_var||' FOR UPDATE';
    --        dbms_output.put_line(new_sql);
            select data into my_blob from my_contents_lob where pk1=my_indx_var FOR UPDATE;
          clob2blob(my_clob,my_blob);
       END LOOP;
       CLOSE orig_table_cur;
      DBMS_OUTPUT.PUT_LINE('Completed program');
    END;
    exec blob_conversion_my_contents('MY_CONTENTS','DATA');
    Verify that things work properly:
    select dump( utl_raw.cast_to_varchar2(data))  from my_contents_lob where pk1=xxxx;
    This should let you see characters > 150. Thus, the method works.
    We can now take this data, export it from RESTORECLONE
    exp file=a.dmp buffer=4000000 userid=system/XXXXXX tables=my_user.my_contents rows=y
    and import the data on prodclone
    imp file=a.dmp fromuser=my_user touser=my_user userid=system/XXXXXX buffer=4000000;
    For paranoia's sake, double check that it worked properly:
    select dump( utl_raw.cast_to_varchar2(data))  from my_contents_lob;
    On our 10g PRODCLONE, we'll use these stored procedures:
    CREATE OR REPLACE FUNCTION CLOB2BLOB(L_CLOB CLOB) RETURN BLOB IS
    L_BLOB BLOB;
    L_SRC_OFFSET NUMBER;
    L_DEST_OFFSET NUMBER;
    L_BLOB_CSID NUMBER := DBMS_LOB.DEFAULT_CSID;
    V_LANG_CONTEXT NUMBER := DBMS_LOB.DEFAULT_LANG_CTX;
    L_WARNING NUMBER;
    L_AMOUNT NUMBER;
    BEGIN
    DBMS_LOB.CREATETEMPORARY(L_BLOB, TRUE);
    L_SRC_OFFSET := 1;
    L_DEST_OFFSET := 1;
    L_AMOUNT := DBMS_LOB.GETLENGTH(L_CLOB);
    DBMS_LOB.CONVERTTOBLOB(L_BLOB,
    L_CLOB,
    L_AMOUNT,
    L_SRC_OFFSET,
    L_DEST_OFFSET,
    1,
    V_LANG_CONTEXT,
    L_WARNING);
    RETURN L_BLOB;
    END;
    CREATE OR REPLACE FUNCTION BLOB2CLOB(L_BLOB BLOB) RETURN CLOB IS
    L_CLOB CLOB;
    L_SRC_OFFSET NUMBER;
    L_DEST_OFFSET NUMBER;
    L_BLOB_CSID NUMBER := DBMS_LOB.DEFAULT_CSID;
    V_LANG_CONTEXT NUMBER := DBMS_LOB.DEFAULT_LANG_CTX;
    L_WARNING NUMBER;
    L_AMOUNT NUMBER;
    BEGIN
    DBMS_LOB.CREATETEMPORARY(L_CLOB, TRUE);
    L_SRC_OFFSET := 1;
    L_DEST_OFFSET := 1;
    L_AMOUNT := DBMS_LOB.GETLENGTH(L_BLOB);
    DBMS_LOB.CONVERTTOCLOB(L_CLOB,
    L_BLOB,
    L_AMOUNT,
    L_SRC_OFFSET,
    L_DEST_OFFSET,
    1,
    V_LANG_CONTEXT,
    L_WARNING);
    RETURN L_CLOB;
    END;
    And now, for the pièce de résistance, we need a BLOB to CLOB conversion that assumes the BLOB data is initially stored in WE8ISO8859P1.
    To find correct CSID for WE8ISO8859P1, we can use this query:
    select nls_charset_id('WE8ISO8859P1') from dual;
    Gives "31"
    create or replace FUNCTION BLOB2CLOBASC(L_BLOB BLOB) RETURN CLOB IS
    L_CLOB CLOB;
    L_SRC_OFFSET NUMBER;
    L_DEST_OFFSET NUMBER;
    L_BLOB_CSID NUMBER := 31;      -- treat blob as  WE8ISO8859P1
    V_LANG_CONTEXT NUMBER := 31;   -- treat resulting clob as  WE8ISO8859P1
    L_WARNING NUMBER;
    L_AMOUNT NUMBER;
    BEGIN
    DBMS_LOB.CREATETEMPORARY(L_CLOB, TRUE);
    L_SRC_OFFSET := 1;
    L_DEST_OFFSET := 1;
    L_AMOUNT := DBMS_LOB.GETLENGTH(L_BLOB);
    DBMS_LOB.CONVERTTOCLOB(L_CLOB,
    L_BLOB,
    L_AMOUNT,
    L_SRC_OFFSET,
    L_DEST_OFFSET,
    L_BLOB_CSID,
    V_LANG_CONTEXT,
    L_WARNING);
    RETURN L_CLOB;
    END;
    select dump(dbms_lob.substr(blob2clobasc(data),4000,1)) from my_contents_lob;
    Now, we can compare these:
    select dbms_lob.compare(blob2clob(old.data),new.data) from  my_contents new,my_contents_lob old where new.pk1=old.pk1;
    DBMS_LOB.COMPARE(BLOB2CLOB(OLD.DATA),NEW.DATA)
                                                                 0
                                                                 0
                                                                 0
    Vs
    select dbms_lob.compare(blob2clobasc(old.data),new.data) from  my_contents new,my_contents_lob old where new.pk1=old.pk1;
    DBMS_LOB.COMPARE(BLOB2CLOBASC(OLD.DATA),NEW.DATA)
                                                                   -1
                                                                   -1
                                                                   -1
    update my_contents a set data=(select blob2clobasc(data) from my_contents_lob b where a.pk1= b.pk1)
        where pk1 in (select al.pk1 from my_contents_lob al where dbms_lob.compare(blob2clob(al.data),a.data) =0 );
    SQL> select dump(dbms_lob.substr(data,4000,1)) from my_contents where pk1 in (select pk1 from my_contents_lob);
    Confirms that we're now working properly.
    To run across all the _LOB tables we've created:
    [oracle@RESTORECLONE ~]$ exp file=all_fixed_lobs.dmp buffer=4000000 userid=my_user/mypass tables=MY_CONTENTS_LOB,MY_FORUM_LOB...
    [oracle@RESTORECLONE ~]$ scp all_fixed_lobs.dmp jboulier@PRODCLONE:/tmp
    And then on PRODCLONE we can import:
    imp file=all_fixed_lobs.dmp buffer=4000000 userid=system/XXXXXXX fromuser=my_user touser=my_user
    Instead of running the above update statement for all the affected tables, we can use a simple stored procedure:
    create or replace procedure fix_us7_CLOBS
      (TABLE_NAME varchar2,
         FIX_COL varchar2 )
        authid current_user
        as
         orig_sql varchar2(1000);
         bak_sql  varchar2(1000);
        begin
        dbms_output.put_line('Creating '||TABLE_NAME||'_PRECONV to preserve the original data in the table');
        bak_sql:='create table '||TABLE_NAME||'_preconv as select pk1,'||FIX_COL||' from '||TABLE_NAME||' where pk1 in (select pk1 from '||TABLE_NAME||'_LOB) ';
        execute immediate bak_sql;
        orig_sql:='update '||TABLE_NAME||' tabnew set '||FIX_COL||'= (select blob2clobasc ('||FIX_COL||') from '||TABLE_NAME||'_LOB taborig where tabnew.pk1=taborig.pk1)
       where pk1 in (
       select a.pk1 from '||TABLE_NAME||'_LOB a,'||TABLE_NAME||' b
          where a.pk1=b.pk1
                 and dbms_lob.compare(blob2clob(a.'||FIX_COL||'),b.'||FIX_COL||') = 0 )';
        -- dbms_output.put_line(orig_sql);
        execute immediate orig_sql;
       end;
    Now we can run the procedure and it fixes everything for our previously-broken tables, keeping the changed rows -- just in case -- in a table called table_name_PRECONV.
    set serveroutput on time on timing on;
    exec fix_us7_clobs('MY_CONTENTS','DATA');
    commit;
    After confirming with the client that the changes work -- and haven't noticeably broken anything else -- the same routines can be carefully run against the actual production database.

    We converted the database using scripts I developed. I'm not sure the details of how we converted are relevant, other than to say that we did not use the Oracle conversion utility (not csscan, but the GUI Java tool).
    A summary:
    1) We replaced the lossy characters by parsing a csscan output file
    2) After re-scanning with csscan and coming up clean, our DBA converted the database to AL32UTF8 (changed the parameter file, changed the character set, switched the semantics to char, etc.).
    3) The final step was changing existing tables to use char semantics by changing the table schema for VARCHAR2 columns.
    I cannot easily answer any specific steps; I worked with a DBA at our company to do this work. I handled the character replacement / DDL changes and the DBA ran csscan & performed the database config changes.
    Our actual error message:
    ORA-31011: XML parsing failed
    ORA-19202: Error occurred in XML processing
    LPX-00210: expected '<' instead of '�'
    Error at line 1
    31011. 00000 - "XML parsing failed"
    *Cause:    XML parser returned an error while trying to parse the document.
    *Action:   Check if the document to be parsed is valid.
    Error at Line: 24 Column: 15
    This seems to match the document ID referenced below. I will ask our DBA to pull it up and review it.
    Please advise if more information is needed from my end.

  • How can I convert ASCII characters to ISO8859?

    Hi All,
    I have written a little application that renames a TV episode by scraping a TV listing site for the episode name. It is written in SWT and works great apart from one small problem. When getting the HTML back from the site, it sometimes contains special characters that are not in the ISO8859 (Windows filesystem) character set.
    For example, this is the line that I have to parse:
    <td style='padding-left: 6px;' class='b2'><a href='/Prison_Break/episodes/569183/03x01'>Orientaci��n</a></td>
    When viewing it in a browser, it is:
    <td style="padding-left: 6px;" class="b2"><a href="/Prison_Break/episodes/569183/03x01">Orientación</a></td>
    Notice that the o in the title has an accent on it. While researching this problem I stumbled across 'HTML Entities to ISO 8859-1 Converter' at http://www.inweb.de/chetan/English/Resources/Java/HTML%202%20ISO.html. This open source project takes in an HTML entity like &amp; and returns '&'.
    So that is not quite what I want, as my BufferedReader is already converting the HTML entity into the ASCII representation. I need a way of detecting a non-ISO8859 character within an ASCII string, and hopefully replacing it with its natural 'equivalent' (which would be o in this case).
    Does anyone know how I could do this without having to check for and replace every special character (not really an option unless someone has done it before!!)?
    If not that then, perhaps another way to attack the problem?
    Any help greatly appreciated ;)
    Dave
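    For the "natural equivalent" part, one hedged approach (sketched in Java, assuming Java 6+ for java.text.Normalizer; it only strips accents and does not transliterate every possible character) is to decompose the string and drop the combining marks. The reply below points out that the real fix is to decode the bytes as UTF-8 first.
    import java.text.Normalizer;
    public class Asciify {
        // Decompose accented letters (e.g. ó -> o + combining acute accent) and remove the marks.
        static String stripAccents(String s) {
            String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
            return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
        }
        public static void main(String[] args) {
            System.out.println(stripAccents("Orientaci\u00f3n"));  // prints "Orientacion"
        }
    }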

    Hi,
    NZ_Dave wrote:
    For example, this is the line that I have to parse:
    <td style='padding-left: 6px;' class='b2'><a href='/Prison_Break/episodes/569183/03x01'>Orientaci��n</a></td>
    This is coded in UTF-8. If you convert the bytes to a String using the UTF-8 encoding, then you will have the correct characters "Orientación" in the string.
    Check your parser where it converts the bytes (coming from e.g. an InputStream) to characters. Use UTF-8 as the charset when doing that conversion.
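    A minimal sketch of that decoding step, assuming the HTML is read from an InputStream (the URL is just a placeholder):
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    public class FetchUtf8 {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://example.com/episodes.html");  // placeholder URL
            // Decode the response bytes as UTF-8 instead of the platform default charset.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);  // "Orientación" now arrives intact
            }
            in.close();
        }
    }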

  • Error while database export in package SAPDODS_33 (Unicode conversion)

    Hello,
    I am performing Unicode conversion on an upgraded BI 7.0 system. This is running on AIX/DB2.
    When I take the database export for the conversion, one package (SAPDODS_33) fails with the error "The file system is full". In fact, df -k doesn't show that any file system is full.
    I am taking the export to the following location: /db2/BWS/sapdata/EXPORT_DIR
    There are 3 system-managed temporary tablespaces: PSAPTEMP, PSATEMP16 & SYSTOOLSTMPSPACE, which were residing in /db2/BWS/saptemp1/NODE0000. This file system was getting full during the export. Now these temporary tablespaces have been relocated to a new file system of size 15 GB.
    But still the package SAPDODS_33 doesn't get exported. Could you please let me know how to solve this issue?
    Thanks a lot!
    Sundar.
    =====================================================================================
    SAPDODS_33.log
    (DB) INFO: connected to DB
    (rscpsumg) Please look also into "SAPDODS_33012.xml".
    (rscpMC) Warn: env I18N_NAMETAB_TIMESTAMPS = IGNORE
    (rscpMC) Warn: UMGCONTAINER has 1 problems.
    (rscpMC) Warn: Global fallback code page = 1160
    (rscpMC) Warn: Common character set is  not  7-bit-ASCII
    (rscpMC) Warn: Collision resolution method is 'fine'
    (rscpMC) Warn: R3trans code pages = Normal
    (rscpMC) Warn: EXPORT TO ... code pages = Normal
    (rscpMC) Warn: Source is 'MDMP'
    (rscpMC) Warn: I18N_NAMETAB_NORM_ALLOW = 0
    (rscpMC) Warn: I18N_NAMETAB_NORM_LOG   = 0
    (rscpMC) Warn: I18N_NAMETAB_ALT_ALLOW  = 0
    (rscpMC) Warn: I18N_NAMETAB_ALT_LOG    = 0
    (rscpMC) Warn: I18N_NAMETAB_OLD_ALLOW  = 0
    (rscpMC) Warn: I18N_NAMETAB_OLD_LOG    = 0
    (GSI) INFO: dbname   = "BWS
    (GSI) INFO: vname    = "DB6                             "
    (GSI) INFO: hostname = "UNKNOWN
    (GSI) INFO: sysname  = "AIX"
    (GSI) INFO: nodename = "itmlartdev01x"
    (GSI) INFO: release  = "3"
    (GSI) INFO: version  = "5"
    (GSI) INFO: machine  = "002417CA4C00"
    (GSI) INFO: instno   = "0020169337"
    (EXP) ERROR: DbSlExeRead failed
      rc = 99, table "/BIC/B0000786000"
      (SQL error -968)
      error message returned by DbSl:
    SQL0968C  The file system is full.  SQLSTATE=57011
    (DB) INFO: disconnected from DB
    /usr/sap/BWS/SYS/exe/run/R3load: job finished with 1 error(s)
    /usr/sap/BWS/SYS/exe/run/R3load: END OF LOG: 20071211053119
    p/sapinst/sapinst_instdir/NW04S/LM/COPY/DB6/EXP/CENTRAL/AS-ABAP/EXP #

    Hi Sundar,
    if the problem is due to high temp space usage, the root cause is typically the
    sort during the export. If the table is large, the sort spills to temp.
    You can avoid the sort by doing an unsorted export, or tweak the DB2 optimizer to use the primary key to do the sort.
    Before DB2 9, you can reduce the OVERHEAD tablespace parameter to zero. Return this value to its previous value after the successful export. This is an indirect hint to the optimizer that random I/Os are cheap. This might not help.
    With DB2 9 this should not happen, because the DB2_OPT_MAX_TEMP_SIZE=10240 registry variable is set under DB2_WORKLOAD=SAP, which limits the temp space usage to 10 GB if possible. You can reduce this parameter further if necessary. Another problem could be that the optimizer has chosen the wrong temp space. The optimizer chooses the temp space with the largest number of pages (not size).
    Regards, Jens

  • Non US-ASCII characters in download file names

    I am trying to implement a simple file download in a JSP, and trying to get IE, Firefox and Opera to all display and handle non US-ASCII characters in the suggested download file name. Only concerned with Windows platform for now. Here's the code I am currently using:
    String agent = request.getHeader("USER-AGENT");
    if (null != agent && -1 != agent.indexOf("MSIE")) {
        String codedfilename = URLEncoder.encode(cfrfilename, "UTF8");
        response.setContentType("application/x-download");
        response.setHeader("Content-Disposition", "attachment;filename=" + codedfilename);
    } else if (null != agent && -1 != agent.indexOf("Mozilla")) {
        String codedfilename = MimeUtility.encodeText(cfrfilename, "UTF8", "B");
        response.setContentType("application/x-download");
        response.setHeader("Content-Disposition", "attachment;filename=" + codedfilename);
    } else {
        response.setContentType("application/x-download");
        response.setHeader("Content-Disposition", "attachment;filename=" + cfrfilename);
    }
    This URL-encodes the file name if the browser is IE, MIME-encodes it if the browser is Mozilla, and sends plain UTF-8 (the encoding of the JSP) for all other browsers. I get "cfrfilename" from translated properties files, and the string can contain characters from any character set - Chinese, Thai, Korean, etc.
    This code works correctly for IE - the file name is displayed correctly in the file Save as dialog, and it is saved correctly on disk, no matter which character set is used.
    For Firefox, the file name is displayed correctly in the file Save as dialog, but it is only saved correctly to disk if the file name is in a character set supported by the system locale. This seems to be a known Firefox bug (not fully using the Windows Unicode APIs), so nothing I can do about that.
    Nothing seems to work for Opera, however - I cannot get the file name to display correctly in the file Save as dialog, no matter which method I use (I have tried URL encoding and MIME encoding in addition to the plain UTF-8).
    Has anybody implemented something similar that works for at least these 3 browsers?

    I tested your code today:
                         dialog           save           open
    Firefox 1.5          OK               OK             OK
    IE 6.0               OK               OK             NG
    dialog: filename shown in download popup dialog
    save: save to disk from dialog
    open: open directly from dialog
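    For what it's worth, later browser versions converged on the RFC 5987/6266 "filename*" parameter, which carries the file name as percent-encoded UTF-8. A hedged sketch (the helper name is made up, and URLEncoder's '+' for spaces has to be corrected to %20 for this syntax):
    import java.net.URLEncoder;
    public class ContentDispositionUtil {
        // Hypothetical helper: builds an RFC 5987-style Content-Disposition value.
        static String buildHeader(String filename) throws Exception {
            String encoded = URLEncoder.encode(filename, "UTF-8").replace("+", "%20");
            return "attachment; filename*=UTF-8''" + encoded;
        }
        public static void main(String[] args) throws Exception {
            System.out.println(buildHeader("\u6f22\u5b57.txt"));
            // prints: attachment; filename*=UTF-8''%E6%BC%A2%E5%AD%97.txt
        }
    }
    Whether the browser versions discussed in this thread honor it would still need testing.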

  • How can I get the ASCII code instead of Unicode?

    If I use getNumericValue to return the Unicode value of a character, I find that 'a' and 'A' give the same value. Is there any way I can get different codes for capital letters and small letters? I'm thinking about ASCII codes; any suggestion for which method returns the ASCII code?

    That is not what getNumericValue does. (Try getNumericValue('*'), for example, and you will see it returns -1.) If you want to find the Unicode value of a character, simply cast the character to an int:
    char letter = 'A';
    int unicodeValue = (int)letter;
    If the character you chose was in the ASCII character set, you will automatically have its ASCII code, because ASCII is simply the first 128 characters in the Unicode character set.
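    A minimal runnable check of that, showing that the cast gives different codes for upper and lower case (and that for characters below 128 the value is the ASCII code):
    public class CharCodes {
        public static void main(String[] args) {
            System.out.println((int) 'A');       // 65
            System.out.println((int) 'a');       // 97
            System.out.println((int) '\u00e9');  // 233: still a Unicode code point, but outside 7-bit ASCII
        }
    }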
