Removing Non-Ascii Characters from a String

Hi Everyone,
I would like to remove all NON-ASCII characters from a large string. For example, I am taking text from websites and would like to remove all the strange arabic and asian characters. How can I accomplish this?
Thank you in advance.

I would like to remove all NON-ASCII characters from a large string. I don't know if its a good method but try this:
str="\u6789gj";
output="";
for(char c:str.toCharArray()){
     if((c&(char)0xff00)==0){
          output=output+c;
System.out.println(output);
all the strange arabic and asian characters.Don't call them so.... I am an Indian Muslim ;-) ....
Thanks!

Similar Messages

  • Removing non english characters from my string input source

    Guys,
    I have problem where I need to remove all non english (Latin) characters from a string, what should be the right API to do this?
    One I'm using right now is:
    s.replaceAll("[^\\x00-\\x7F]", "");//s is a string having chinese characters.
    I'm looking for a standard Solution for such problems, where we deal with multiple lingual characters.
    TIA
    Nitin

    Nitin_tiwari wrote:
    I have a string which has Chinese as well as Japanese characters, and I only want to remove only Chinese characters.
    What's the best way to go about it?Oh, I see!
    Well, the problem here is that Strings don't have any information on the language. What you can get out of a String (provided you have the necessary data from the Unicode standard) is the script that is used.
    A script can be used for multiple languages (for example English and German use mostly the same script, even if there are a few characters that are only used in German).
    A language can use multiple scripts (for example Japanese uses Kanji, Hiragana and Katakana).
    And if I remember correctly, then Japanese and Chinese texts share some characters on the Unicode plane (I might be wrong, 'though, since I speak/write neither of those languages).
    These two facts make these kinds of detections hard to do. In some cases they are easy (separating latin-script texts from anything else) in others it may be much tougher or even impossible (Chinese/Japanese).

  • Removing non-numeric characters from string

    Hi there,
    I need to have the ability to remove non-numeric characters from a string and I do not know how to do this.
    Does any one know a way?
    Example:
    Present String: (02)-2345-4607
    Required String: 0223454607
    Thanks in advance

    Dear NickM
    Try this this will work...........
    create or replace function char2num(mstring in varchar2) return integer
    is
    -- Function to remove Special characters and alphebets from phone no. string field
    -- Author - Valid Bharde.(India-Mumbai)
    -- Date :- 20 Sept 2006.
    -- This Function will return numeric representation.
    -- The Folowing program is gifted to NickM with respect to his post on oracle site regarding Removing non-numeric characters from string on the said date
    mstatus number :=0;
    mnum number:=0;
    mrefstring varchar2(50);
    begin
    mnum := length(mstring);
    for x in 1..mnum loop
    if (ASCII(substr(upper(mstring),x,1)) >= 48 and ASCII(substr(upper(mstring),x,1)) <= 57) then
    mrefstring := mrefstring || substr(mstring,x,1);
    end if;
    end loop;
    return mrefstring;
    end;
    copy the above program and use it at function for example
    SQL> select char2num('(022)-453452781') from dual;
    CHAR2NUM('(022)-453452781')
    22453452781
    Chao!!!

  • Removing Non-numeric characters from Alpha-numeric string

    Hi,
    I have one column in which i have Alpha-numeric data like
    COLUMN X
    +91 (876) 098 6789
    1-567-987-7655
    so on.
    I want to remove Non-numeric characters from above (space,'(',')',+,........)
    i want to write something generic (suppose some function to which i pass the column)
    thanks in advance,
    Mandip

    This variation uses the like operators pattern recognition to remove non alphanumeric characters. It also
    keeps decimals.
    Code Snippet
    CREATE FUNCTION dbo.RemoveChars(@Str varchar(1000))
    RETURNS VARCHAR(1000)
    BEGIN
    declare @NewStr varchar(1000),
    @i int
    set @i = 1
    set @NewStr = ''
    while @i <= len(@str)
    begin
    --grab digits or (| in regex) decimal
    if substring(@str,@i,1) like '%[0-9|.]%'
    begin
    set @NewStr = @NewStr + substring(@str,@i,1)
    end
    else
    begin
    set @NewStr = @NewStr
    end
    set @i = @i + 1
    end
    RETURN Rtrim(Ltrim(@NewStr))
    END
    GO
    Code to validate:
    Code Snippet
    declare @t table(
    TestStr varchar(100)
    insert into @t values ('+91 (8.76) \098 6789');
    insert into @t values ('1-567-987-7655');
    select dbo.RemoveChars(TestStr)
    from @t

  • Removing Non Ascii Characters.

    Dear Friends,
    In our application, User copying some data from a document and pasting in a field "Comments".
    If that data consists anything like bullets,arrows of word document. It is inserting some Non keyboard characters into database like below.
    • Analysys
    • Do
    • Now
    • When
    • As
    • We
    don’t know how much he love sthe testing’I am not crazyh’
    I AM ‘USER’ 
    
    
     Uu
     Yy
     tt
    Now user asking to remove all those Non-ASCII characters from Comments Column. Please help!

    Hi Santosh,
    I can remember that I have given you the REGEXP_REPLACE query earlier which you have specified and told you to read some document about it to modify according to your need. It is not very wise thing to depend on others every time.
    Re: Removing Junk Characters.
    Anyway, REGEXP_REPLACE(str,'[^[a-z,A-Z,0-9,chr(0)-chr(127)[:space:]]]*','') can give you some pointer (not tested).

  • Removing non printable characters from an excel file using powershell

    Hello,
    anyone know how to remove non printable characters from an excel file using powershell?
    thanks,
    jose.

    To add - Excel is a binary file.  It cannot be managed via external methods easily.  You can write a macro that can do this.  Post in the Excel forum and explain what you are seeing and get the MVPs there to show you how to use the macro facility
    to edit cells.  Outside of cell text "unprintable" characters are a normal part of Excel.
    ¯\_(ツ)_/¯

  • Regex patern to remove non-ascii characters

    Hi,
    How to remove non-ascii character from input for a country france using RegEx.
    Could you please help us to contruct regex pattern from above?
    Thanks

    This isn't a complete answer, but is a good starting point:
    Regex any ascii character - Stack Overflow

  • Remove all non-number characters from a string

    hi
    How i can remove all non-number characters from a column ? for example , i have a column that contains data like
    'sd3456'
    'gfg87s989'
    '45/45fgfg'
    '4354-df4456'
    and i want to convert it to
    '3456'
    '87989'
    '4545'
    '43544456'
    thx in adv

    Or in 9i,
    Something like this ->
    satyaki>
    satyaki>with vat
      2  as
      3    (
      4      select 'sd3456' cola from dual
      5      union all
      6      select 'gfg87s989' from dual
      7      union all
      8      select '45/45fgfg' from dual
      9      union all
    10      select '4354-df4456' from dual
    11    )
    12  select translate(cola,'abcdefghijklmnopqrstuvwxyz-/*#$%^&@()/?,<>;:{}[]|\`"',' ') res
    13  from vat;
    RES
    3456
    87989
    4545
    43544456
    Elapsed: 00:00:00.00
    satyaki>
    {code}
    I checked this with minimum test cases. It will be better if you checked it with other cases.
    Regards.
    Satyaki De.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   

  • Removing non-English characters from data.

    Ours is global system with some data with non-English characters. We want to download file by removing this non-English characters.
    Any suggestions how we can remove these non-English characters from file..?

    The FM u said
         Replace non-standard characters with standard characters
       Functionality
         SCP_REPLACE_STRANGE_CHARS processes a text so that it only contains
         simple characters. Special characters and national characters are
         replaced in such a way that the text remains reasonably legible.
         The character set 1146 is used by default. In this case the following
         replacements are made, for example:
          Æ ==> AE        (AE)
          Â ==> A         (Acircumflex)
          Ä ==> Ae        (Adieresis)
          £ ==> L         (sterling)
         Note that the new text can be longer than the old.
    So i dont think it ll be useful for eliminating the sp. chars.
    U have to check each and every alphabet with std 26 alphabets
    Thanks & Regards
    vinsee

  • How to remove non-ASCII charcters from an XML generated using Simple Transf

    Hi,
    I am currently facing a problem where I invoke a ST like
    CALL TRANSFORMATION ZTEST
      source root = str
      result xml rawstr.
    to prepare an XML using the contents of the ABAP variable str.
    In my case sometime the variable str can contain non-ASCII characters. What I find is that ST do not remove these characters and the final XML that get generated thus contains non-parsable xml charcaters.
    Is there an efficient way to remove/replace such non-ascii characters within the ST such that my final XML is consumable by any xml parser. I do not want to do a second level of processing by running through the output xml and removing the charcaters individually, because in our system the number of xml messages generated is very high and any such lookup-replace algorithm terms out to be too coslty.
    Regards,
    Vikas Lamba

    Hi
    may be you know this syntax :)
    <?xdofx:substr(SHIP_TO_LOCATION_NAME,11,44)?>
    Rahul

  • How can I remove non-numeric characters from a cell?

    I have a file an rtf file that I can open in Numbers. It puts each line in a separate cell. Each cell contains non-numeric and numeric characters. I'd like to delete the non-numeric characters so that I can add the numbers together. Is there a way to do this easily in Numbers that doesn't require doing it manually?
    Thanks,
    David

    Ok, David,
    This solution will work for vlaues up to 99,000 and if there is a space in front of your amount. There are two parts for clarity but you could wrap them up into one formula if you wanted to.
    B2 =FIND(" ",A2,LEN(A2)−9)
    C2 =MID(A2,B2,10)
    If there is a return before your amount (certain cells in your screenshot got me wondering) then the formula in column B
    =FIND("
    ",A2,1)
    It looks funny because it is finding the return.
    Let me know if this works for you.
    quinn

  • Removing non-english characters

    Hi,
    I'm trying to define a regular expression that helps me to replace non-english characters from a string.
    For example:
    BESANÇON
    and I need to get something like: BESANCON, or BESAN*ON.
    Could any one give me some hints?
    Max A.

    You can use the convert function:
    SELECT CONVERT('BESANÇON','US7ASCII')
    FROM dual;
    CONVERT(
    BESANCON
    1 row selected.

  • Cant Insert non-ascii characters?

    I need to insert and retrieve non-ascii characters from Oracle8i using PHP. How is this done? Do I need to change the NLS settings? How is this done?
    I also using PHP.
    Thanks
    Pete

    Ok, I solved the problem.
    I had to put at the top request.setCharacterEncoding("utf-8");
    Spencer

  • Removing non-alpha-numeric characters from a string

    How can I remove all non-alpha-numeric characters from a string? (i.e. only alpha-numerics should remain in the string).

    Or even without a loop ?
    Extract from the help for the Search and Replace String function :
    Right-click the Search and Replace String function and select Regular Expression from the shortcut menu to configure the function for advanced regular expression searches and partial match substitution in the replacement string.
    Extract from the for the advanced search options :
    [a-zA-Z0-9] matches any lowercase or uppercase letter or any digit. You also can use a character class to match any character not in a given set by adding a caret (^) to the beginning of the class. For example [^a-zA-Z0-9] matches any character that is not a lowercase or uppercase letter and also not a digit.
    Message Edité par JB le 05-06-2008 01:49 PM
    Attachments:
    Example_VI_BD4.png ‏2 KB

  • Non ascii characters being sent from a parameter in a form

    Hi!
    I have seen many topics posted on passing non ascii characters through parameters from one servlet to another and converting them into whatever format is necessary.
    However, I have not seen anyone answer the following question. I have a jsp page (html) with the character encoding set to utf-8. The user inputs some data in to a text field which is inside a form. The data could be in non ascii characters such as hebrew or arabic. This form is then sent to another jsp where i try to retreive the data from teh text field. No matter what i do, i cannot get the data presented correctly. It is either question marks or other wierd symbols.
    I have tried every permetation of encoding of the actual html page, the ecoding of the string from request.getParameter etc but it still is not presented on the new html page correctly.
    Can anyone help??
    Spencer

    Ok, I solved the problem.
    I had to put at the top request.setCharacterEncoding("utf-8");
    Spencer

Maybe you are looking for