Inserting strings of printable and non-printable characters

I would very much appreciate some help with the following.
To handle an interface with a legacy system I need to create strings containing both printable and non-printable ASCII characters. By non-printable characters I mean in particular those in the range of ASCII 128 to 159.
It seems it is not possible to insert a string containing both printable and non-printable characters from the aforementioned range into a VARCHAR2 table column, as the following demonstrates:
insert into test values(chr(156)); -- this inserts the 'œ' symbol.
SQL> select test, ascii(test), length(test), substr(test,1,1), ascii(substr(test,1,1)) from test;

TEST       ASCII(TEST) LENGTH(TEST) SUBSTR(TEST,1,1) ASCII(SUBSTR(TEST,1,1))
┐                  156            1

That the character is shown as '┐' and not 'œ' is not really an issue for my application; what is important is that the ASCII value is shown as 156, which is the ASCII code of the character I inserted.
What is however strange (actually probably not strange, but rather down to my lack of understanding of the issue at hand) is that substr returns an empty string...
Now I try to insert a concatenated string: first the "non-printable" character, then a printable character:
insert into test values(chr(156)||chr(65));
SQL> select test, ascii(test), length(test), substr(test,1,1), ascii(substr(test,1,1)) from test;

TEST       ASCII(TEST) LENGTH(TEST) SUBSTR(TEST,1,1) ASCII(SUBSTR(TEST,1,1))
A                   65            1 A                                     65

For some reason the non-printable character (chr(156)) is now not inserted, or at least does not appear when I select the data from the table. This effect seems to apply to all characters in the range of ASCII 128 to 159 (I tried some but not all). However, a character such as CHR(13) can, for instance, be inserted as part of a string, as shown above.
For our application I really don't care much which character is shown or not shown; what is important is that I can retrieve the ASCII value and that this value matches the one I inserted, which for some reason does not work.
This seems to be, at least to some extent, a character set issue. I have also tested this on a database with character sets set as follows:
NLS_CHARACTERSET
WE8MSWIN1252
NLS_NCHAR_CHARACTERSET
AL16UTF16
With WE8MSWIN1252 the described issue does NOT occur; unfortunately, however, I must use NLS_CHARACTERSET AL32UTF8, which produces the results described above!
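For illustration, the effect can be reproduced outside the database with a small standalone Java test (the class name is just for the example): a lone byte 156 is a complete character in Windows-1252, but it is not a valid code point on its own in UTF-8, so a UTF-8 decoder substitutes the replacement character.
import java.util.Arrays;

public class Chr156Demo {
    public static void main(String[] args) throws Exception {
        byte[] raw = { (byte) 156 };   // the single byte produced by chr(156)

        // Decoded as Windows-1252 the byte is one real character: 'œ' (U+0153).
        String cp1252 = new String(raw, "windows-1252");
        System.out.println("U+" + Integer.toHexString(cp1252.charAt(0)));  // U+153

        // Decoded as UTF-8 the lone byte is invalid, so the decoder yields U+FFFD.
        String utf8 = new String(raw, "UTF-8");
        System.out.println("U+" + Integer.toHexString(utf8.charAt(0)));    // U+fffd

        // Encoding 'œ' back to bytes: 0x9C (156) in Windows-1252, but 0xC5 0x93 in UTF-8
        // (0xC5 0x93 read as one number is decimal 50579).
        String oe = "\u0153";
        System.out.println(Arrays.toString(oe.getBytes("windows-1252")));  // [-100], i.e. 0x9C
        System.out.println(Arrays.toString(oe.getBytes("UTF-8")));         // [-59, -109]
    }
}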
As I said, any insights would be much appreciated, as I am slowly but surely starting to despair.
For completeness' sake, the character sets are set as follows (changing them is NOT an option):
NLS_CHARACTERSET
AL32UTF8
NLS_NCHAR_CHARACTERSET
AL16UTF16
The test table is created as follows:
CREATE TABLE TEST
(
  TEST VARCHAR2(1000 BYTE)
);
Database Version 11.2.0.3.0
Edited by: helios.taraba on Dec 2, 2012 10:18 AM --Added database version
Edited by: helios.taraba on Dec 2, 2012 10:24 AM Added description of test results using NLS_CHARACTERSET WE8MSWIN1252

Hello Orafad,
Thanks for your reply; at least now I understand the effects I'm seeing, i.e.
"For multibyte character sets, n must resolve to one entire code point. Invalid code points are not validated, and the result of specifying invalid code points is indeterminate."
http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions026.htm
You are absolutely right, I could use chr(50579) to get the ligature symbol. However, as what we are trying to achieve is to implement a legacy interface to a 20+ year old subsystem, we are actually not so much interested in the symbol itself but rather in the ASCII value of that symbol (156, as you rightly point out, in the Win-1252 character set). This particular field represents the length of the message being sent to the subsystem, can vary from decimal 68 to 164, and is also considered in a checksum calculation which is part of the message.
As changing the NLS_CHARACTERSET of the database is not an option, I guess I only have one reasonable avenue to resolve this, namely to push the functionality that adds the "encoded" length of the message (and the calculation of the checksum) into the Java driver which is responsible for sending the message (TCP/IP) to the subsystem. There we should not have any issues adding a byte with the value 156 (or any other, for that matter) to the data stream.
Thankfully all other fields have characters with ascii values below 128 and above 31.
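Something along these lines is what I have in mind for the Java driver. This is only a rough sketch: the frame layout, the simple modulo-256 checksum and the host/port are illustrative assumptions, not the real protocol.
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.Socket;

public class LegacyMessageSender {

    // Build the outgoing frame: one length byte, the payload, one checksum byte.
    // The layout and the modulo-256 checksum are placeholders for the real protocol.
    static byte[] buildFrame(byte[] payload) {
        ByteArrayOutputStream frame = new ByteArrayOutputStream();
        int length = payload.length;              // e.g. 156, independent of any character set
        frame.write(length & 0xFF);               // a single raw byte, value 0..255
        frame.write(payload, 0, payload.length);
        int checksum = length;
        for (byte b : payload) {
            checksum += b & 0xFF;                 // treat payload bytes as unsigned
        }
        frame.write(checksum & 0xFF);
        return frame.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = "HELLO LEGACY SYSTEM".getBytes("US-ASCII"); // other fields are plain ASCII
        try (Socket socket = new Socket("legacy-host", 9000)) {      // host and port are placeholders
            OutputStream out = socket.getOutputStream();
            out.write(buildFrame(payload));
            out.flush();
        }
    }
}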
I'm going to leave my question unanswered for a bit longer in the hope of someone coming up with a golden bullet, although I'm not getting my hopes up.
Thanks, Helios

Similar Messages

  • Validation for Unique Name and Non-AlphaNumeric Characters

    Hi All,
    How do I validate for a unique name and non-alphanumeric characters?
    The name should be unique and only allow alphanumeric characters.
    Where must all validations be done? In the EOImpl or some other file?
    Please help.
    Thanks,
    Sk

    SK
    Here are the steps you need to perform to ensure that duplicate Employee Names are not entered by the user.
    First create a VVO in your schema.server package (that of the EO) with the following query. Generate both the VVOImpl and VVORowImpl files.
    select full_name
    from fwk_tbx_employees
    where full_name = :1
    Now add this VVO to your VAM, i.e. give it an instance there.
    Now open the VVOImpl and write the code below to execute the query based on the new name entered by the user:
        public void initQuery(String name)
        {
          setWhereClauseParams(null); // Always reset
          setWhereClauseParam(0, name);
          executeQuery();
        }
    Now open your Entity Expert class and add the method below to it:
       public boolean isEmployeeNameExists(String name)
       {
         boolean isExists = false;
         // Note that we want to use a cached, declaratively defined VO instead of creating
         // one from a SQL statement, which is far less performant.
         EmployeeNameVVOImpl employeeNameVVO =
           (EmployeeNameVVOImpl)findValidationViewObject("EmployeeNameVVO1");
         employeeNameVVO.initQuery(name);
         // We're just doing a simple existence check.  If we don't find a match, return false.
         if (employeeNameVVO.hasNext())
           isExists = true;
         return isExists;
       }
    Now you need to call this Entity Expert method from the setter method for the name attribute in your EOImpl, so open your EOImpl file and go to the setter method for name:
          if ((value != null) && (!"".equals(value.trim())))
          {
            EmployeeEntityExpert expert = getEmployeeEntityExpert(getOADBTransaction());
            if (expert.isEmployeeNameExists(value))
            {
                throw new OAException("Duplicate Employee Name", OAException.ERROR);
            }
          }
    Remember to put this code before the call to setAttributeInternal.
    Hope it helps!!!!
    Let me know if you have queries in it.
    Thanks
    AJ

  • Flex, xml, and non-English characters

    Hello! I have a Flex web app with an AdvancedDataGrid, and I use the HTTPService component to load some data into the grid. The .xml file contains non-English characters in attributes (Russian in my case), like this:
    <?xml version="1.0" encoding="utf-8" ?>
       <Autoparts>
        <autopart DESCRIPTION="Барабан" />
    </Autoparts>
    And when I run the app, the AdvancedDataGrid displays it like "Ñ&#129;ПÐ". How can I fix it? I tried replacing encoding="utf-8" with some other charsets, but unsuccessfully. Thank you.

    Try changing the xml structure by using CDATA instead of having the russian part as an attribute and see if that makes any difference.
    What I meant is use something like this:
    <?xml version="1.0" encoding="utf-8" ?>
       <Autoparts>
        <autopart>
           <description><![CDATA[Барабан]]></description>
      </autopart>
    </Autoparts>
    instead of the current xml.

  • Search for users and non-ASCII characters

    I am having a little issue with the "Accounts - Find Users" functionality. The search breaks on what I assume is non-ASCII characters (we use the following three up here in Denmark: æ, ø, å). To be precise, I have a user with the first name "Jørgen". Searching for first names starting with "J" works just fine but "Jø" returns zero matches.
    My setup is with two machines, one (A) holding the MySQL database and one (B) serving Identity Manager on top of tomcat.
    Both A and B are RHEL boxes, and both have da_DK.UTF-8 as default locale.
    MySQL's /etc/my.cnf file has the following entry (as recommended in create_waveset_tables.mysql):
    [mysqld]
    default-character-set=utf8
    default-collation=bin
    For clarity, some functionality works just fine in Identity Manager with these non-ASCII characters, such as adding a user whose name contains non-ASCII characters (not only æ, ø and å but also � for example). At the moment, it appears to be the search functionality which is not working correctly as I would expect it to. I'm still on the fence concerning whether I've missed something in terms of configuration, or whether this is a limitation.
    Does anyone know whether this problem is on my side or the software's side?


  • Ignoring spaces and non-alphanumeric characters

    I need some help with this program I'm working on. It's testing to see whether words are palindromes or not. The trouble is, I need to make the program ignore spaces and punctuation. Does anyone know of a command to do this, or a command where I can list all the common types of punctuation and get the program to ignore them?
    Thanks

    You can use regular expressions (regex) for this; a short example is at the end of this reply. I'm not very experienced with regex (though I admit I should become familiar with it ;)), so for more information check out Sun's regex tutorial and regular-expressions.info.
    If you're not interested in regex (or if you don't have access to a 1.4+ SDK), you can just use ASCII filtering to achieve this (see an ASCII table for the basic ASCII values). Basically, you're only going to accept A-Z, a-z and 0-9 characters (if I understand your question correctly). This means the only ASCII values you want to accept are 48-57, 65-90, and 97-122. You can use a simple boolean method to check whether a character is valid. Something like this should work:
    public boolean valid(char c) {
         int x = (int)c; // ascii value of c
         return ((x >= 48 && x <= 57) || (x >= 65 && x <= 90) || (x >= 97 && x <= 122));
    }
    Use this method on every character of the String you want to process. So to validate an entire String you can use a method like this:
    public boolean valid(String s) {
         for (int j = 0; j < s.length(); j++) {
              if (!valid(s.charAt(j))) {
                   return false;
              }
         }
         return true;
    }
    Beware though that this is considerably slower than using regular expressions. Also I didn't compile this example, so there might be a small mistake you'll have to fix.
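    For reference, the regex route mentioned at the start of this reply could look roughly like this (the class and method names are just for the example); it strips everything that isn't a letter or digit before the palindrome check:
    public class PalindromeHelper {
         // Remove everything that is not a letter or digit, then lower-case the rest.
         static String stripForPalindrome(String s) {
              return s.replaceAll("[^A-Za-z0-9]", "").toLowerCase();
         }
         public static void main(String[] args) {
              String cleaned = stripForPalindrome("A man, a plan, a canal: Panama!");
              // Compare the cleaned string with its reverse to test for a palindrome.
              String reversed = new StringBuffer(cleaned).reverse().toString();
              System.out.println(cleaned.equals(reversed)); // prints true
         }
    }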

  • Java.io.File and non-unicode characters in file name

    Unix filesystem object names are byte sequences. These byte sequences are not required to correspond to any character sequence in the current or any locale. How do I open a file if it has characters that do not correspond to a valid Unicode encoding for some current locale? Unless I am missing something, if I do a list on a parent directory that has some file names like this, those file names do not get added to the list. Hmmm....
    R.

    OK, create.c is a program that will create a file whose name is not a character in the 'ja' locale.
    Lister.java defines a class that lists files in the current directory. For each file, it spits out the 'toString()' version of the file, the char array of the name as hex, and the 'getBytes' byte array of the name.
    So, what you can do is compile and run create.c, which will create a file whose name is a single byte whose hex value is 99. Then compile and run Lister.java, which will give you the following output (shown for two different locales):
    $ export LANG=
    $ java Lister
    name:?; chars:99,; bytes:99,
    $ export LANG=ja
    $ java Lister
    name:?; chars:fffd,; bytes:3f,
    ---------------------------------------------
    Note that when running in the JA locale, there is no character corresponding to byte value 0x99. So, Java uses the replacement character 0xFFFD, and the '?' character 0x3F, as a replacement.
    The point is that there are files which Java cannot uniquely represent as a straight String. I suppose we could get the filename via JNI, do the conversion ourselves, and then use the private-use area of Unicode to encode all our strings, but ugh.
    //create.c
    #include <stdio.h>
    int main()
    {
       const char* name = "\x99";
       FILE* file = fopen( name, "w" );
       if( file == NULL )
       {
          printf( "could not open file %s\n", name );
          return 1;
       }
       fclose( file );
       return 0;
    }
    // Lister.java
    import java.io.*;
    public class Lister
    {
        public static void main( String[] args )
        {
            new Lister().run();
        }
        public void run()
        {
            try
            {
                doRun();
            }
            catch( Exception e )
            {
                System.out.println( "Encountered exception: " + e );
            }
        }
        private void doRun() throws Exception
        {
            File cwd = new File( "." );
            String[] children = cwd.list();
            for( int i = 0; i < children.length; ++i )
                printName( children[ i ] );
        }
        private void printName( String s )
        {
            System.out.print( "name:" );
            System.out.print( s );
            System.out.print( "; chars:" );
            printCharsAsHex( s );
            System.out.print( "; bytes:" );
            printBytesAsHex( s );
            System.out.println();
        }
        private void printCharsAsHex( String s )
        {
            for( int i = 0; i < s.length(); ++i )
            {
                char ch = s.charAt( i );
                System.out.print( Integer.toHexString( ch ) + "," );
            }
        }
        private void printBytesAsHex( String s )
        {
            byte[] bytes = s.getBytes();
            for( int i = 0; i < bytes.length; ++i )
            {
                byte b = bytes[ i ];
                System.out.print( Integer.toHexString( unsignedExtension( b ) ) + "," );
            }
        }
        private int unsignedExtension( byte b )
        {
            return (int)b & 0xFF;
        }
    }

  • Cfimage and non-english characters

    I've been googling for hours and just can not find anything related to this, very strange. I am trying to use ImageDrawText to draw Chinese onto an image but could never get it to work. I have tried so many things: setting the page encoding to UTF-8, cfcontent, setEncoding etc. It displays the Chinese fine on the page, but when passing it to the ImageDrawText method, WriteToBrowser gives me an image with all square boxes instead of the Chinese characters.
    Is anyone else having the same problem with other non-English languages?
    thanks

    edwardch wrote:
    > thanks, tried Arial Unicode MS, but unfortunately my hosting server doesn't
    > have the font :(, CF gives an error: Unable to find font: Serif.
    Serif? Not sure where that's coming from. Can you try "@Arial Unicode MS"?
    Unfortunately this is image work, so you *have* to have a Unicode-capable font physically available (unlike PDF/FlashPaper, where you might "poke & hope" that the font's on the client and simply not embed the font).
    > Have also tried Lucida Sans Unicode, doesn't work for Chinese.
    No, it doesn't.
    > and can't find the Arial Unicode MS.ttf file, even if I have the file, how can
    > I install it using ColdFusion code?
    You can't; you either have to add it via the Windows control panel (or whatever the equivalent is for Linux) or via the cfadmin font management. See these for potential fonts, etc.:
    http://en.wikipedia.org/wiki/Arial_Unicode_MS
    http://en.wikipedia.org/wiki/Free_software_Unicode_typefaces
    http://en.wikipedia.org/wiki/Unicode_typefaces
    http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&cat_id=FontDownloads
    Or you might simply ask your host what fonts are on the server that can support Chinese.

  • INSO_FILTER and non-English characters

    Hello!
    A CLOB column contains variously formatted documents (doc, pdf, plain text), mostly in the WIN1251 charset.
    After successful indexing, I get "no result rows" when using CONTAINS with a Russian-word query, but everything's fine with English words. Why did interMedia Text index the CLOB incorrectly, and what should I do?
    Also, when I try to retrieve a document with highlighting (ctx_doc.markup), all Russian characters are replaced by '*'.
    NLS_LANG = AMERICAN_AMERICA.CL8MSWIN1251
    Oracle 8.1.7
    Linux Red Hat 6.2
    create index idx_files on files(content)
    indextype is ctxsys.context parameters
    ('filter ctxsys.inso_filter');

    Did you try your NLS_LANG set to Russian?

  • Why doesn't the quality meter approve of any of my choices of master passwords, no matter how I mix both letters, digits and non-alphanumeric characters?

    I tried to create a master password, but it couldn't be done 'cause the quality meter never accepted any of my passwords, even though I used a very complicated mix of letters, digits and special characters as well.

    There are no requirements when setting Firefox's master password. The password quality meter is purely informative.
    If you're trying to fill the bar all the way, that's unlikely unless you create a lengthy password, or a very complex mix of special characters, digits and lowercase and uppercase letters. Keep in mind the master password should be easily remembered and typed in, time after time. A highly secure password that is easily mistyped or forgotten is of no use.

  • Non printable characters

    Hello
    I have seen some posts about this topic, but none helped me.
    I need to send to a device, via serial port (RS-232), four codes composed of printable and non-printable characters.
    For instance, I need to send a string with ASCII character 224, ASCII character 87, ASCII character 10, ASCII character 0 and ASCII character 191, together in the same string.
    Can someone help me to do this?
    Thanks

    What part are you having trouble doing?
    To create those ASCII characters you can use a string constant or control set to '\' codes or hex display.
    Lynn 
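    (Outside LabVIEW, purely to illustrate the byte values involved, the same five codes could be assembled into a buffer like this in Java; how the buffer gets written to the serial port depends on whichever serial library you use and is not shown.)
    public class SerialCodes {
        public static void main(String[] args) {
            // The five ASCII codes from the question, kept as raw byte values 0..255.
            int[] codes = { 224, 87, 10, 0, 191 };
            byte[] frame = new byte[codes.length];
            for (int i = 0; i < codes.length; i++) {
                frame[i] = (byte) codes[i];   // values above 127 wrap to negative Java bytes
            }
            // frame now holds exactly the bytes E0 57 0A 00 BF, ready to hand to a serial API.
            for (byte b : frame) {
                System.out.printf("%02X ", b & 0xFF);
            }
            System.out.println();
        }
    }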

  • Non printable characters in a text file..

    hi,
    How do I find blank lines and non-printable characters,
    and remove those characters from a text file being uploaded from the application server?
    thanks,
    Anil.

    Take a look at the constants in cl_abap_char_utilities. A simpler solution would be to ask for a file without such characters...

  • Strings, byte[]s, and encoding ....

    I'm realising I really don't know anything about the Java encoding functionality...
    For example: I've written a method String encode(String) which transforms one byte encoding (I hope the parameter's encoding) into another byte encoding (I hope the result's encoding). This transformation is done with a special Charset that I've built, and I use Charset.encode(String).
    I've tried to call this method with two String parameters. These two Strings give the same result when calling System.out.println(...) and String.getBytes(). But the two results of calling my method with these two parameters are different! How is that possible?
    There are a lot of things I don't know. For example:
    - How many bytes are needed to encode an ISO-8859-1 (default Java encoding) or UTF-8 character?
    - How can I get the real bytes of a String, I mean the bytes of all its characters? (For example, if a String contains 4 characters, its real byte representation should contain 8 bytes.)
    Thanks for any help.

    > I've tried to call this method with two String parameters. These two Strings
    > give the same result when calling System.out.println(...) and String.getBytes().
    > But the two results of calling my method with these two parameters are
    > different! How is that possible?
    You used getBytes(String charsetName) with a set of characters that included both overlapping and non-overlapping characters?
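    As a small illustration (the string literal and class name here are just examples), the same String produces different byte arrays depending on which Charset you hand to getBytes, which is usually where this kind of surprise comes from:
    import java.util.Arrays;
    public class EncodingDemo {
        public static void main(String[] args) throws Exception {
            String text = "\u00e9";  // 'é', a character both charsets can represent
            // One byte in ISO-8859-1, two bytes in UTF-8: same String, different encodings.
            System.out.println(Arrays.toString(text.getBytes("ISO-8859-1"))); // [-23]
            System.out.println(Arrays.toString(text.getBytes("UTF-8")));      // [-61, -87]
            // getBytes() with no argument uses the platform default charset,
            // which is why results can differ from machine to machine.
            System.out.println(Arrays.toString(text.getBytes()));
        }
    }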

  • How to load Registered TradeMark and the Copyright Characters ?

    Hello,
    How can I insert the registered trademark and copyright characters into the database?
    Using SQL*Loader I am trying to load a flat file into a table that has a CLOB column, but in place of these characters a '?' gets into the database.
    Regards,
    Swati.

    Hi
    Maybe you have different character sets on the client and server and one of them doesn't support these characters. If that's true for the server, you have no chance to insert these symbols into character columns. If it's the client, you can try using a different character set for SQL*Loader.
    Regards
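    As a small Java illustration of the effect (just the mechanism, not the SQL*Loader path itself), encoding the registered trademark and copyright characters into a character set that cannot represent them silently turns them into '?', which matches the symptom described:
    import java.util.Arrays;
    public class UnmappableDemo {
        public static void main(String[] args) throws Exception {
            String text = "\u00ae\u00a9";   // the ® and © characters
            // Both characters exist in ISO-8859-1, so they survive the round trip.
            System.out.println(new String(text.getBytes("ISO-8859-1"), "ISO-8859-1"));
            // US-ASCII has no mapping for them, so getBytes substitutes '?' (0x3F).
            byte[] ascii = text.getBytes("US-ASCII");
            System.out.println(Arrays.toString(ascii));        // [63, 63]
            System.out.println(new String(ascii, "US-ASCII")); // ??
        }
    }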

  • Robohelp 9 .properties file inserting non-printable characters @ export

    I have a mapped help file that I am generating for integration into an online application. When we export the .properties file from the Project Set-up pod, the mapped files appear to be fine when viewed in Notepad.
    However, when the file is viewed in a different text editor, you can see that RoboHelp has added additional non-printable characters to the .properties file.
    We've tried generating this from different computers, exporting it to different locations, retyping the initial entry, and haven't found a solution to this issue.
    Does anyone know if there is a fix available? Are we doing something wrong?
    Thanks!!
    Kelly

    Ask your developers if they think these characters could be what is known as a BOM (byte order mark).
    That is something that can be seen in some files using the default encoding; it can be changed by changing the encoding in the SSL dialog.
    Maybe that explains it, and if that is the cause, I don't know how you would prevent it here in RH. I think you will have to live with your own solution.
    See www.grainge.org for RoboHelp and Authoring tips
    @petergrainge
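    (For what it's worth, a UTF-8 BOM is the three-byte sequence EF BB BF at the very start of the file; a quick, purely hypothetical Java check for it could look like this:)
    import java.io.FileInputStream;
    import java.io.InputStream;
    public class BomCheck {
        public static void main(String[] args) throws Exception {
            // Pass the exported .properties file as the first argument.
            try (InputStream in = new FileInputStream(args[0])) {
                int b1 = in.read();
                int b2 = in.read();
                int b3 = in.read();
                // A UTF-8 BOM is the byte sequence EF BB BF at offset 0.
                boolean hasBom = (b1 == 0xEF && b2 == 0xBB && b3 == 0xBF);
                System.out.println(hasBom ? "File starts with a UTF-8 BOM" : "No UTF-8 BOM found");
            }
        }
    }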

  • Remove of non-printable characters from string

    Hi Gurus,
    How can I achieve that? I have a string in which the "end of line" character occurs. How can I delete it?
    BR
    Marcin Cholewczuk

    Hi Marcin,
    Just use REPLACE (http://help.sap.com/abapdocu_70/en/ABAPREPLACE.htm) with a regular expression (assuming variable STRING holds your data):
    replace all occurrences of regex '[\n\r]+' in STRING with ''.
    Note that I replaced newline ('\n') - also called end of line - and carriage return ('\r') with nothing (matching your delete request). This might not be wanted if you have a true multiline string. If you're more paranoid and want to cover more cases, you might want to match any non-displayable character using '[^[:print:]]+' as your search pattern. So if you want to replace non-displayable characters with a space, and just remove them at the end of the string, you could use something like this:
    replace all occurrences of regex '[^[:print:]]+(?!$)' in STRING with ` `.
    replace all occurrences of regex '[^[:print:]]+$' in STRING with ''.
    Cheers, harald
