Detecting non-printable characters in a text file

Hi,
I need to remove some non-printable characters such as tabs, carriage returns, line feeds, and so on.
I want to do something like
aString.replaceAll(<the non-printable char>, "");

str = str.replaceAll("\\P{Print}+", "");
From http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html: \p{Print} matches printable characters, and \P is the negation of \p.
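
For reference, here is a minimal, self-contained sketch of that one-liner. The class and method names (PrintableFilter, stripNonPrintable) are made up for illustration; only String/Pattern calls from the standard library are used.

```java
import java.util.regex.Pattern;

class PrintableFilter {
    // \P{Print} is the complement of the POSIX class \p{Print}
    // (visible US-ASCII characters plus the space character).
    private static final Pattern NON_PRINTABLE = Pattern.compile("\\P{Print}+");

    /** Returns the input with every run of non-printable characters removed. */
    static String stripNonPrintable(String input) {
        return NON_PRINTABLE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String raw = "col1\tcol2\r\nnext line";
        // Tab, carriage return and line feed are dropped; the space is kept.
        System.out.println(stripNonPrintable(raw)); // prints "col1col2next line"
    }
}
```

Note that without the UNICODE_CHARACTER_CLASS flag the POSIX classes cover US-ASCII only, so this pattern also removes accented letters and other non-ASCII characters. If those must survive, a narrower pattern such as "\\p{Cntrl}" may be a better fit.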

Similar Messages

  • Non printable characters in a text file..

    hi,
    How can I find blank lines and non-printable characters
    and remove them from a text file being uploaded from the application server?
    thanks,
    Anil.

    Take a look at the constants in cl_abap_char_utilities. A simpler solution would be to ask for a file without such characters...

  • Removing non printable characters from an excel file using powershell

    Hello,
    Does anyone know how to remove non-printable characters from an Excel file using PowerShell?
    thanks,
    jose.

    To add: an Excel file is binary. It cannot easily be managed by external methods. You can write a macro that can do this. Post in the Excel forum, explain what you are seeing, and get the MVPs there to show you how to use the macro facility
    to edit cells. Outside of cell text, "unprintable" characters are a normal part of an Excel file.
    ¯\_(ツ)_/¯

  • Robohelp 9 .properties file inserting non-printable characters @ export

    I have a mapped help file that I am generating for integration to an online application. When we export the .properties file from the Project Set-up pod, the mapped files appear to be fine, if viewed in Notepad (see below).
    However, when this is viewed in a different text editor, you can see that RoboHelp added additional non-printable characters to the .properties file (see below).
    We've tried generating this from different computers, exporting it to different locations, retyping the initial entry, and haven't found a solution to this issue.
    Does anyone know if there is a fix available? Are we doing something wrong?
    Thanks!!
    Kelly

    Ask your developers if they think these characters could be what are known as BOM (byte order marks).
    That is something that can be seen in some files using the default encoding. There it can be changed by changing the encoding in the SSL dialog.
    Maybe that explains it and if that is the cause, I don't know how you would prevent it here in Rh. I think you will have to live with your own solution.
    See www.grainge.org for RoboHelp and Authoring tips
    @petergrainge

  • Need help in removing non printable characters

    hi
    I am having an issue with non-printable characters in a web service. This web service serves XML in B2B communication to my clients' programs. Due to data corruption in Oracle (I don't know who is creating the bad data), there are non-printable characters in the XML file that is generated from the database and sent to our customers. Since the data is updated every day, it is impossible to fix the data every time. I need to write a very efficient method to strip non-printable characters from the strings in the XML. Can someone please help with this? I want to make sure the method is very efficient because it could potentially be called many times. I am using JDK 1.3.1 and Oracle 8i.
    Any help will be appreciated
    Thanks
    Ashok Pappu

    At some point your existing program is probably converting from String data to the XML bytes through a CharsetEncoder, probably inside a java.io.Writer.
    Perhaps your best approach might be to write your own java.nio.charset.CharsetEncoder which deals with the bad characters as you see fit.
    You can register a new java.nio.charset.Charset as a private character set type. Because this should result in simply replacing a standard CharsetEncoder with a non-standard one, the overhead should hopefully be low.
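
A custom CharsetEncoder is one route. A simpler alternative (not what the reply above describes) is to filter each string before it is written. The sketch below keeps only code points allowed by the Char production of XML 1.0 (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The names XmlCharFilter and stripInvalidXmlChars are hypothetical, and the code assumes Java 5 or later (StringBuilder, codePointAt), so it would not run on the JDK 1.3.1 mentioned in the question.

```java
class XmlCharFilter {
    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Copies the input in a single pass, skipping code points XML 1.0 does not allow. */
    static String stripInvalidXmlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            // Advance by one or two chars, depending on whether cp is a surrogate pair.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "Acme" + (char) 0 + " Corp" + (char) 0x1B + "\n";
        System.out.println(stripInvalidXmlChars(dirty)); // prints "Acme Corp" and a newline
    }
}
```

A single pass over the characters with one pre-sized buffer avoids regex overhead, which may matter if the method is called very often. Unpaired surrogates are dropped as well, since they fall outside the allowed ranges.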

  • Servlet Displaying Quotation Marks as Non-Printable Characters

    I have a servlet which is reading an HTML file and displaying its contents. My problem is that, in the output, quotation marks in the source HTML (" and ') are being reproduced as non-printable characters (). Furthermore, the same servlet prints the quotation marks fine under the Linux OS and Apache Web Server, but does not under the Windows (2000) OS and IIS Web Server (running j2sdk-1_3_0_02-win). Any suggestions would be appreciated. The code in question is below; "str" is the line from the file:
         FileReader freader = new FileReader (filePath);
         BufferedReader breader = new BufferedReader(freader);
         String str = null;
         while ((str = breader.readLine()) != null) {
             document = document + str + "\n";
         }
         freader.close();

    Technically, you don't need to add the "\n" in there anyway. Newlines mean nothing to an HTML file if all you're doing is displaying that file. The lack of a carriage return, when the HTML is parsed, is completely irrelevant.
    Also, when handling large String concatenations, it is generally more efficient to use StringBuffer.
    StringBuffer sbDocument = new StringBuffer();
    while((str = breader.readLine()) != null)
       sbDocument.append(str);
    String document = sbDocument.toString();

  • Reading characters from a text file into a multidimensional array?

    I have an array, maze[][], that is to be filled with characters from a text file. I've got most of the program worked out (I think) but can't test it because I am reading my file incorrectly, and I'm running into major headaches with this part of the program.
    The text file looks like this (it is meant to be a maze: 19 is the size of the maze, which is assumed to be square; . is free space, # is a block, s is the start, x is the finish):
    This didn't paste evenly, but that's not a big deal. Just giving an idea.
    19
    5..................
    And my constructor looks as follows. I've tried zillions of things with input.hasNext() and hasNextLine(), to no avail.
    Code:
    //Scanner to read file
    Scanner input = null;
    try {
        input = new Scanner(fileName);
    } catch(RuntimeException e) {
        System.err.println("Couldn't find the file");
        System.exit(0);
    }
    //Set the size of the maze
    while(input.hasNextInt())
        size = input.nextInt();
    //Set Limits on coordinates
    Coordinates.setLimits(size);
    //Set the maze[][] array equal to this size
    maze = new char[size][size];
    //Fill the Array with maze values
    for(int i = 0; i < maze.length; i++) {
        for(int x = 0; x < maze.length; x++) {
            if(input.hasNextLine()) {
                String insert = input.nextLine();
                maze[i][x] = insert.charAt(x);
            }
        }
    }
    Any advice would be loved =D

    Code tags sometimes work wonders. I replaced # with *, as the code tags interpret # as a comment, which looks odd:
    ******...*.........
    To your code: did you test it step by step to find out what is actually read? You could use either a debugger (e.g., if you have an IDE) or System.out calls to get a clue. The first thing to check is whether the maze size is read correctly. Further, the following loops look odd:
    for(int i = 0; i < maze.length; i++) {
        for(int x = 0; x < maze.length; x++) {
            if (input.hasNextLine()) {
                String insert = input.nextLine();
                maze[x] = insert.charAt(x);
            }
        }
    }
    Shouldn't the nextLine test and assignment be in the outer loop, and the assignment go to each row's inner array? Like so:
    for(int i = 0; i < maze.length; i++) {
        if (input.hasNextLine()) {
            String insert = input.nextLine();
            for(int x = 0; x < insert.length(); x++) {
                maze[i][x] = insert.charAt(x);
            }
        }
    }
    Otherwise, only one character per line is read, and storing a character should actually fail.
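
The row-per-line idea from the reply can be put into a small runnable sketch. MazeReader and readMaze are illustrative names, and the file layout (a size line followed by one text line per maze row) is assumed from the question. One more thing worth checking in the original constructor: if fileName is a String, new Scanner(fileName) scans the characters of the name itself rather than the file; new Scanner(new File(fileName)) opens the file. The sketch deliberately scans a string literal so that it runs without a file.

```java
import java.util.Scanner;

class MazeReader {
    /** Reads a size line followed by up to that many rows of maze characters. */
    static char[][] readMaze(Scanner input) {
        // Consume the whole first line so the next nextLine() starts at row 0.
        int size = Integer.parseInt(input.nextLine().trim());
        char[][] maze = new char[size][size];
        for (int row = 0; row < size && input.hasNextLine(); row++) {
            String line = input.nextLine();
            // Copy at most `size` characters so an overlong line cannot overflow the row.
            int width = Math.min(size, line.length());
            for (int col = 0; col < width; col++) {
                maze[row][col] = line.charAt(col);
            }
        }
        return maze;
    }

    public static void main(String[] args) {
        // Stand-in for the file contents: a 3x3 maze.
        Scanner input = new Scanner("3\ns.#\n.#.\n..x\n");
        char[][] maze = readMaze(input);
        for (char[] row : maze) {
            System.out.println(new String(row));
        }
    }
}
```

Reading the size with nextLine() rather than nextInt() avoids the leftover end-of-line that would otherwise be returned as an empty first row.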

  • Non Printable Characters in varchar or varchar2 field

    How can I know if a field has non-printable characters?

    An example :
    TEST@db102 SQL> insert into test values('aaa'||chr(13)||'bbb'||chr(10)||'ccc');
    1 row created.
    TEST@db102 SQL> select * from test;
    A
    bbb
    ccc
    TEST@db102 SQL> select dump(a) from test;
    DUMP(A)
    Typ=1 Len=11: 97,97,97,13,98,98,98,10,99,99,99
    TEST@db102 SQL>                                                                      

  • Inserting strings of printable and non printable characters

    I would very much appreciate some help with the following.
    To handle an interface with a legacy system, I need to create strings containing both printable and non-printable ASCII characters. By non-printable characters I mean in particular those in the range of ASCII 128 to 159.
    It seems it is not possible to insert a string containing both printable and non-printable characters from the aforementioned range into a VARCHAR2 table column, as the following demonstrates:
    insert into test values(chr(156)); -- this inserts the 'œ' symbol.
    SQL> select test, ascii(test), length(test), substr(test,1,1), ascii(substr(test,1,1))from test;
    TEST       ASCII(TEST) LENGTH(TEST) SUBSTR(TEST,1,1) ASCII(SUBSTR(TEST,1,1))
    ┐                  156            1
    That the character is shown as '┐' and not 'œ' is not really an issue for my application; what is important is that the ASCII value is shown as 156, which is the ASCII code of the character I inserted.
    What is strange, however (actually probably not strange, but due to my lack of understanding of the issue at hand), is that substr returns an empty string...
    Now I try to insert a concatenated string, first the "non-printable" character, then a printable character:
    insert into test values(chr(156)||chr(65));
    SQL> select test, ascii(test), length(test), substr(test,1,1), ascii(substr(test,1,1))from test;
    TEST       ASCII(TEST) LENGTH(TEST) SUBSTR(TEST,1,1) ASCII(SUBSTR(TEST,1,1))
    A                   65            1 A                                     65
    For some reason the non-printable character (chr(156)) is now not inserted, or at least does not appear when I select the data from the table. This effect seems to apply to all characters in the range of ASCII 128 to 159 (I tried some but not all). However, CHR(13), for instance, can be inserted as part of a string as shown above.
    For our application I really don't care much what character is shown or not shown; what is important is that I can retrieve the ASCII value and that this value matches the one I inserted, which for some reason does not seem to work.
    This seems to be, at least to some extent, a character set issue. I have also tested this on a database with character sets set as follows:
    NLS_CHARACTERSET
    WE8MSWIN1252
    NLS_NCHAR_CHARACTERSET
    AL16UTF16
    With WE8MSWIN1252 the described issue does NOT occur; unfortunately, I must use NLS_CHARACTERSET AL32UTF8, which produces the results described above!
    As said, any insights would be much appreciated, as I am slowly but surely starting to despair.
    For completeness' sake, the character sets are set as follows (changing them is NOT an option):
    NLS_CHARACTERSET
    AL32UTF8
    NLS_NCHAR_CHARACTERSET
    AL16UTF16
    The test table is created as follows
    CREATE TABLE TEST
    (TEST VARCHAR2(1000 BYTE));
    Database Version 11.2.0.3.0
    Edited by: helios.taraba on Dec 2, 2012 10:18 AM --Added database version
    Edited by: helios.taraba on Dec 2, 2012 10:24 AM Added description of test results using NLS_CHARACTERSET WE8MSWIN1252

    Hello Orafad,
    Thanks for your reply, at least I understand the effects I'm seeing, i.e.
    "For multibyte character sets, n must resolve to one entire code point. Invalid code points are not validated, and the result of specifying invalid code points is indeterminate."
    http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions026.htm
    You are absolutely right, I could use chr(50579) to get the ligature symbol. However, as what we are trying to achieve is to implement a legacy interface to a 20+ year old subsystem, we are actually not so much interested in the symbol itself but rather in the ASCII value of that symbol (156, as you so rightly point out, in the win-1252 character set). This particular field represents the length of the message being sent to the subsystem, can vary from decimal 68 to 164, and is also considered in a checksum calculation which is part of the message.
    As changing the nls_characterset of the database is not an option, I guess I only have one reasonable avenue to resolve this, namely to push the functionality to add the "encoded" length of the message (and the calculation of the checksum) to the Java driver which is responsible for sending the message (TCP/IP) to the subsystem. Here we should not have any issues adding a byte with the value 156 (or any other, for that matter) to the data stream.
    Thankfully all other fields have characters with ascii values below 128 and above 31.
    I'm going to leave my question as un-answered for a bit longer in the hopes of someone coming up with a golden bullet, although not getting my hopes up.
    Thanks, Helios
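
As a rough sketch of the Java-side approach described above: a raw byte with value 156 can be written directly to a byte stream, independent of any database character set. The frame layout here (one length byte followed by an ASCII payload) and the names LegacyFrame and frame are assumptions for illustration only; the thread does not describe the real message format, and the checksum is left out because its algorithm is not given.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class LegacyFrame {
    /** Prefixes an ASCII payload with a single raw byte holding the given length value. */
    static byte[] frame(int lengthValue, String asciiPayload) {
        if (lengthValue < 0 || lengthValue > 255) {
            throw new IllegalArgumentException("length must fit in one byte: " + lengthValue);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // write(int) stores only the low-order 8 bits, so 156 becomes the byte 0x9C.
        out.write(lengthValue);
        byte[] payload = asciiPayload.getBytes(StandardCharsets.US_ASCII);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = frame(156, "A");
        // Java bytes are signed, so mask with 0xFF to read the value back as 0..255.
        System.out.println((message[0] & 0xFF) + " " + message[1]); // prints "156 65"
    }
}
```

The resulting array can be handed to the OutputStream of the socket as-is; no character encoding step touches the length byte.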

  • Removing the Control Characters from a text file

    Hi,
    I am using the java.util.regex.* package to remove the control characters from a text file. I got the program below from the java.sun site.
    I am able to compile the file successfully, but when I try to run it I get this error:
    ------------------------------------------------------------------------
    D:\Debi\datamigration>java Control
    Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal repetition
    {cntrl}
    at java.util.regex.Pattern.error(Pattern.java:1472)
    at java.util.regex.Pattern.closure(Pattern.java:2473)
    at java.util.regex.Pattern.sequence(Pattern.java:1597)
    at java.util.regex.Pattern.expr(Pattern.java:1489)
    at java.util.regex.Pattern.compile(Pattern.java:1257)
    at java.util.regex.Pattern.<init>(Pattern.java:1013)
    at java.util.regex.Pattern.compile(Pattern.java:760)
    at Control.main(Control.java:24)
    Please help me on this issue.
    Thanks&Regards
    Debi
    import java.util.regex.*;
    import java.io.*;
    public class Control {
        public static void main(String[] args)
                throws Exception {
            //Create a file object with the file name
            //in the argument:
            File fin = new File("fileName1");
            File fout = new File("fileName2");
            //Open an input and output stream
            FileInputStream fis =
                new FileInputStream(fin);
            FileOutputStream fos =
                new FileOutputStream(fout);
            BufferedReader in = new BufferedReader(
                new InputStreamReader(fis));
            BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(fos));
            // The pattern matches control characters
            Pattern p = Pattern.compile("{cntrl}");
            Matcher m = p.matcher("");
            String aLine = null;
            while((aLine = in.readLine()) != null) {
                m.reset(aLine);
                //Replaces control characters with an empty
                //string.
                String result = m.replaceAll("");
                out.write(result);
                out.newLine();
            }
            in.close();
            out.close();
        }
    }

    Hi,
    I used the code below with the \p, but I wasn't able to compile the file. It gave me an error:
    D:\Debi\datamigration>javac Control.java
    Control.java:24: illegal escape character
    Pattern p = Pattern.compile("\p{cntrl}");
    ^
    1 error
    Please help me on this issue.
    Thanks&Regards
    Debi
    // The pattern matches control characters
    Pattern p = Pattern.compile("\p{cntrl}");
    Matcher m = p.matcher("");
    String aLine = null;

  • Removing non-printable characters

    Hi All,
    I was supposed to remove all non-printable characters, hence I created the function below. But I am in trouble with some rows in the table.
    function ar1(i_value in varchar2)
    return varchar2
    as
           pattern varchar2(1000) := '][';
           l_strVal varchar2(1000);
    begin
           for i in 32 .. 126 loop
              pattern := pattern || case when chr(i) = '''' then ''''''
                                         when chr(i) in ('[', ']') then null
                                         else chr(i)
                                    end;
           end loop;
              l_strVal := regexp_replace(i_value, '([' || pattern || '])|.', '\1', 1, 0, 'n');
    return l_strVal;
    end;
    /
    Could anyone please help me: is this the right way to do it?
    The problem occurs for one row, as shown below.
    SQL> select num, ar1(author) author1, author from doc where doc_num =37;
       NUM
    AUTHOR1
    AUTHOR
      15098137
    OM LESRAVI{|~
    OM LES
    RAVI
    It's removing non-printable characters, but I'm not sure how these {|~ got included in the result.
    It would be great if anyone could help me.
    Edited by: YasserRACDBA on Nov 9, 2010 3:53 PM

    Thanks... but even your method is giving the wrong result, as shown below.
    SQL> select regexp_replace(author,'[[:cntrl:]]')
      2  from doc where doc_num =15098137;
    REGEXP_REPLACE(AUTHOR,'[[:CNTRL:]]')
    OM LESRAVI{|~
    Is there any clue, please? How come those {|~ are there?

  • Non printable characters

    Hello
    I have seen some posts about this topic, but none helped me.
    I need to send to a device, via serial port (rs-232), four codes composed by printable and non-printable characters
    For instance, I need to send a string with ASCII character 224, ASCII character 87, ASCII character 10, ASCII character 0 and ASCII character 191, together in the same string
    Can someone help me to do this?
    Thanks

    What part are you having trouble doing?
    To create those ASCII characters you can use a  string constant or control set to '\' codes or hex display.
    Lynn 

  • Excluding all non-printable characters using regexp_replace

    Hi All,
    I need help excluding all non-printable characters using regexp_replace, but I was not able to exclude one special character, as shown below.
    select regexp_replace('¥Tachyon-QX\_4 !? H*'' $(~!#$%^&*()?@#), BA' , '[^[:alnum:] \!\@\#\$\%\^\&\*\(\)\_\+\=\:\""\<\>\?\[\]\;\''\,\.\/\\\`\~\?\¥\] [:cntrl:]')
    from dual;
    REGEXP_REPLACE('%TACHYON-QX\_4!?H*''$(~!#$
    %Tachyon-QX\_4 !? H*' $(~!#$%^&*()?@#), BA
    But the problem is with a special character which I am not able to print here, having char value - 13 and ascii value - 19. I want this value to be excluded as a non-printable character.
    Also, I need to exclude newlines and print the output on a single line...
    Could anyone help me with these two requirements?
    1. Exclude all non-printable characters (including the special character having char value - 13 and ascii value - 19)
    2. Exclude all newline characters.
    Thanks,
    Yasser

    How about this?
    SQL>select regexp_replace('Tachyon-QX\_4 !? H*'' $(~!#$%^&*()?@#), BA
      2  some junk on line 2
      3  xyz line 3' , '[^[:print:]]') printable
      4  from dual;
    PRINTABLE
    Tachyon-QX\_4 !? H*' $(~!#$%^&*()?@#), BAsome junk on line 2xyz line 3

  • TextNode containing non-printable characters

    Hi,
    I have a problem with TextNodes that contain non-printable characters, i.e. CR and LF.
    The problem occurs when I serialize my DOM object to an XMLString for storing into our database.
    The result of the serialization process is that CR and LF are replaced with whitespace.
    Does anyone have a 100% safe code example?
    TIA.
    Borre Nordbakken


  • How to read characters from a text file in java program ?

    Sir,
    I have to read the characters m to z listed in a text file .
    I must compare the character read from the file.
    And if any of the characters between m to z is matched i have to replace it with a hexadecimal value.
    Any help or suggestions in this regard would be very useful.
    Thanking you,
    khurram

    Hai,
    The requirement is like this
    There is an input file, the contents of the file are as follows, you can assume any name for the file.
    #Character mappings for Japanese Shift-JIS character set
    #ASCII character Mapped Shift-JIS character
    m 227,128,133 #Half width katakana letter small m
    n 227,128,134 #Half width katakana letter small n
    o 227,129,129
    p 227,129,130
    q 227,129,131
    r 227,129,132
    s 227,129,133
    t 227,129,134
    u 227,129,135
    v 227,129,136
    w 227,129,137
    x 227,129,138
    y 227,129,139
    z 227,129,142
    The contents of the above file are to be read as input.
    On encountering any character between m to z, i have to do a replacement with a hexadecimal code point value for the multibyte representation in the second column in the input file.
    I have the code to get the unicode codepoint value from the multibyte representation, but not from a file.
    So if you could please tell me how to get the characters in the second column, it would be very useful for me.
    The character # is used to represent the beginning of a comment in the input file.
    And comment lines are to be ignored while reading the file.
    Say i have a string str="message";
    then i should replace the m with the unicode code point value.
    Thanking you,
    khurram
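
One possible way to get at the second column is sketched below. It parses lines in the format shown above (a character, whitespace, a comma-separated list of byte values, and an optional # comment) into a map. MappingFileReader and parseMappings are illustrative names, and the conversion from byte values to a code point is left to the code the poster already has.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MappingFileReader {
    /** Parses lines such as "m 227,128,133 #comment" into a map from character to byte values. */
    static Map<Character, int[]> parseMappings(Iterable<String> lines) {
        Map<Character, int[]> mappings = new LinkedHashMap<Character, int[]>();
        for (String rawLine : lines) {
            // Everything from '#' onward is a comment, whether it fills the line or trails the data.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] columns = line.split("\\s+", 2);
            if (columns.length < 2 || columns[0].length() != 1) {
                continue; // skip lines that do not look like "<char> <bytes>"
            }
            String[] parts = columns[1].trim().split("\\s*,\\s*");
            int[] byteValues = new int[parts.length];
            for (int i = 0; i < parts.length; i++) {
                byteValues[i] = Integer.parseInt(parts[i]);
            }
            mappings.put(columns[0].charAt(0), byteValues);
        }
        return mappings;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "#Character mappings for Japanese Shift-JIS character set",
                "m 227,128,133 #Half width katakana letter small m",
                "o 227,129,129");
        Map<Character, int[]> mappings = parseMappings(lines);
        System.out.println(Arrays.toString(mappings.get('m'))); // prints "[227, 128, 133]"
    }
}
```

The lines could come from a BufferedReader loop or, on Java 7 and later, from java.nio.file.Files.readAllLines. Replacing characters in a string such as "message" is then a lookup per character in the returned map.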
