RegEx in TSQL - replace non-alphanumeric characters etc

Hi guys, I have this function in VB that I used in Access to replace all non-alphanumeric characters, including spaces and anything in brackets.
Public Function charactersonly(inputString As String) As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
RE.Pattern = "\([^)]+\)|[^\w]|_"
RE.Global = True
charactersonly = RE.Replace(inputString, "")
Set RE = Nothing
End Function
Now I have moved to SQL Server and I'm writing scripts to do the same thing.
How can I use RegEx in T-SQL?
That function is the only thing I need it for.

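As an alternative, assuming a numbers (tally) table named numbers with an integer column n starting at 1:
declare @string varchar(200)
set @string = 'gg$%^^&is%^& s2342jjk23&&({}e c76l232e+_+a#n/ c][#o''y#e'
-- keep only the characters that match the LIKE class, one position at a time
select cast(cast((select substring(@string,n,1)
from numbers
where n <= len(@string)
and substring(@string,n,1) like '[a-zA-Z0-9]' -- letters and digits only
order by n
for xml path('')) as xml) as varchar(max))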
Best Regards, Uri Dimant, SQL Server MVP
http://sqlblog.com/blogs/uri_dimant/
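
For readers comparing the two approaches in this thread, here is a minimal Python sketch (not T-SQL). The first function mirrors the VB regex, assuming its garbled first alternative was meant to match text in parentheses; the second mirrors the numbers-table query by filtering one character position at a time. Both function names are made up for illustration.

```python
import re

# Assumed reading of the VB pattern: text in parentheses, any non-word
# character, or an underscore. re.ASCII limits \w to [A-Za-z0-9_].
_STRIP = re.compile(r"\([^)]+\)|[^\w]|_", re.ASCII)


def characters_only(text: str) -> str:
    """Remove parenthesised text and every non-alphanumeric character."""
    return _STRIP.sub("", text)


def keep_alnum_by_position(text: str) -> str:
    """Walk the string one character at a time and keep ASCII letters and digits."""
    kept = [ch for ch in text if ch.isascii() and ch.isalnum()]
    return "".join(kept)


print(characters_only("ab (cd) e_f-1"))         # abef1
print(keep_alnum_by_position("ab (cd) e_f-1"))  # abcdef1
```

The second function keeps the letters that were inside the parentheses, which is also what a per-character filter in SQL would do; dropping bracketed text needs a separate step.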

Similar Messages

How to replace non-alphanumeric characters with " " in a String?

Hi,
Can anyone help with this?
I guess I should use the replaceAll method?

I need to keep characters that are generally OK to use in sentences, like ".", ",", "!", and "-", and also all digits and letters (numbers and alphabetic characters).

Add those characters to the pattern in that case. Add them just before the closing ], placing the '-' char last in the list.
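
The advice above can be sketched in Python (the thread itself is about Java's replaceAll, and the character-class syntax is the same for this case). The function name and the exact set of kept characters are assumptions for illustration. Putting '-' last in the class makes it a literal hyphen instead of a range operator.

```python
import re

# Negated class: everything that is NOT a letter, digit, '.', ',', '!' or '-'.
# The '-' sits directly before ']' so it is read literally.
_NOT_ALLOWED = re.compile(r"[^A-Za-z0-9.,!-]")


def keep_sentence_chars(text: str) -> str:
    """Replace every character outside the allowed set with a space."""
    return _NOT_ALLOWED.sub(" ", text)


print(keep_sentence_chars("Hi, there! a-b #1?"))
```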

Non-alphanumeric characters in textarea causing 404 errors

I'm only just becoming acquainted with ColdFusion, as I've been asked to fix a problem in an existing system. The system consists of a simple HTML form containing a text area, among other input fields. The form is submitted to a cfm script, which displays a confirmation page and sends a couple of emails. The problem is that whenever the user enters any (as far as I can tell) non-alphanumeric characters, e.g. quotes, commas or brackets, in the textarea field, they get a 404 response from the server.
I tried a number of things to identify the problem, including escaping the text in JavaScript before submission, but didn't get anywhere. The last thing I tried was installing a local version of CF on my workstation, hoping to reproduce the problem and thus debug it more easily. However, on my local CF setup the problem does not occur.
Can anyone help me debug this please? Thanks,
Marcin

Thanks for your response. The form method is "post". The only thing that the CFM handler does with this field (gametestexperience) is include it at the bottom of an email message:
<CFMAIL TO="[email protected]"
FROM="#email#"
SUBJECT="Tester Application"
type="html"
server = "Cluster9.us.messagelabs.com">
Automatically logged from the internet form:<p><p>
<table>
<tr><td>EmailFrom</td><td>#email#</td></tr>
<tr><td>From</td><td>#firstName# #lastName#</td></tr>
<tr><td>Sent</td><td>#DateFormat(Now())#</td></tr>
<tr><td>Referrer</td><td>#referer#</td></tr>
<tr><td>Confirmation</td><td>#agree1#, #agree2#, #agree3#, #agree4#, #agree5#</td></tr>
<tr><td>FamilyCompetitor</td><td>#familycompetitor#</td></tr>
<tr><td>FamilyWork</td><td>#familywork#</td></tr>
<tr><td>ReferredBy</td><td>#referredBy#</td></tr>
<tr><td>RefererDetails</td><td>#refererdetails#</td></tr>
<tr><td>FirstName</td><td>#firstName#</td></tr>
<tr><td>LastName</td><td>#lastName#</td></tr>
<tr><td>Email</td><td>#email#</td></tr>
<tr><td>TelephoneHome</td><td>#telephoneHome#</td></tr>
<tr><td>TelephoneCell</td><td>#telephoneCell#</td></tr>
<tr><td>TelephoneDay</td><td>#telephoneDay#</td></tr>
<tr><td>City</td><td>#homeTown#</td></tr>
<tr><td>Employment</td><td>#employment#</td></tr>
<tr><td>GamerType</td><td>#gamerType#</td></tr>
<tr><td>GamerStyle</td><td>#gamerStyle#</td></tr>
<tr><td>HoursPerWeek</td><td>#hoursPerWeek#</td></tr>
<tr><td>PreferredGenres</td><td>#preferredGenres#</td></tr>
<tr><td>Consoles</td><td>#console#</td></tr>
<tr><td>GamesPlayed</td><td>#gameplayed#</td></tr>
<tr><td>Available</td><td>#available#</td></tr>
<tr><td>TestedBefore</td><td>#testedBefore#</td></tr>
<tr><td>TestingExperience</td><td>#gametestexperience#</td></tr>
</table>
Thanks,<P>
Focus Test Team!<P>
</CFMAIL>

[solved] Non-alphanumeric characters broken in TinyChat

Hello, fellow archers.
I've been having a bit of trouble with the TinyChat site. Most non-alphanumeric characters are being displayed as crossed-out boxes. This problem is exclusive to TinyChat and has not occurred in any other Flash applications.
An example of the broken characters:
Note how the / and > characters are not broken.
This is the second installation of Arch Linux on my computer; the problem occurred on the current install only. I suspect the problem is caused by a missing package, though the only hint I was able to find was installing the ttf-ms-fonts package (https://wiki.archlinux.org/index.php/br … ash_Player). This did not resolve the issue.
Package versions:
firefox 34.0.5-1
flashplugin 11.2.202.425-1
I'll gladly provide any other information that could help resolve this issue.
Last edited by skoftoby (2015-01-21 18:31:26)

Head_on_a_Stick wrote:
skoftoby wrote:
I'd like to request this topic be marked as solved either way.
As with most things Arch, you have to do this yourself -- edit the title of your first post & put "[SOLVED]" at the beginning.
Whoops. The subject length was exactly at the maximum length, and since I couldn't edit the title any further, I thought a moderator had to edit it. Apologies!

How to support non alphanumeric characters when using WORLD_LEXER?

BASIC_LEXER has a printjoins attribute, which lets us specify non-alphanumeric characters that should be treated as normal alphanumeric characters in a query and included in the token. However, WORLD_LEXER doesn't have this attribute. So, in order to have some non-alphanumeric characters, such as ><$, treated as alphanumeric characters in an Oracle Text index, what should I do?
Thanks in advance for any help.

I use WORLD_LEXER to create Oracle Text Index to support UTF-8.
Below is the script to create table and index:
REM Create table
CREATE TABLE my_test
( id VARCHAR2(32 BYTE) NOT NULL,
code VARCHAR2(100 BYTE) NOT NULL,
CONSTRAINT my_test_pk PRIMARY KEY (id));
REM create index
exec ctx_ddl.create_preference('stars_lexer','WORLD_LEXER');
exec ctx_ddl.create_preference('stars_wordlist', 'BASIC_WORDLIST');
exec ctx_ddl.set_attribute('stars_wordlist','substring_index','TRUE');
exec ctx_ddl.set_attribute('stars_wordlist','PREFIX_INDEX','TRUE');
-- create index for Table corrosion level
CREATE INDEX my_test_index
ON my_test(code)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS ('LEXER stars_lexer STOPLIST stars_stop WORDLIST stars_wordlist SYNC(EVERY "SYSDATE+5/1440") TRANSACTIONAL');
INSERT INTO my_test VALUES ('1', 'English word1');
INSERT INTO my_test VALUES ('2', '违违');
INSERT INTO my_test VALUES ('3', '违违&^|违违');
When I query:
select * from my_test r where contains(r.code, '{%违${|违%}') > 0
I get:
ID  CODE
3   违违$(|违违
2   违违
Actually, the only result I want is: 3 违违$(|违违
So the requirement is that all non-alphanumeric characters should be treated as normal alphanumeric characters. Please tell me how to implement this.

SQL*Loader-350: Illegal combination of non-alphanumeric characters

Hi all,
How can I skip a column in a control file?
For example, I have a table with 10 columns and the flat file contains 11 columns.
How do I skip the last column?
Also, I added an extra column to the table and changed the control file too, but I am getting this error:
SQL*Loader-350: Syntax error at line 115.
Illegal combination of non-alphanumeric characters
Thanks in advance.

Tom Kyte has an elaborate solution on his page:
http://osi.oracle.com/~tkyte/SkipCols/index.html
Or post the table description and the control file, so we can have a look at it.

How to set the Non-alphanumeric characters attribute?

Hello,
I'm developing an ASP.NET application using the Oracle membership provider. I have installed the database objects in Oracle 9i, and when I try to create a new user, it always asks for at least 1 non-alphanumeric character in the password, even if I put the minRequiredNonalphanumericCharacters="0" attribute in the web.config file. Is there another way to set the number of required non-alphanumeric characters to 0?
Thanks.

Workaround: set minRequiredNonalphanumericCharacters to 0 for OracleMembershipProvider in your machine.config. The default value is 1. machine.config is typically located in %windir%\Microsoft.NET\Framework\v2.0.50727\CONFIG.

Non-alphanumeric characters part of a word?

Hi,
When searching for "23", users have complained that it brings back terms like 23:00 and www.bob.....html?q=23. It's a bit of a long shot, but is there any way to disregard those items?
I expect I'll probably end up explaining to the users that non-alphanumeric characters are treated in the same way as white space, but is this a general standard, and is there a link to more information anywhere?
Cheers

You could make your colon and equal sign printjoins or skipjoins.
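
To see why a search for "23" hits "23:00", and what a printjoins-style setting changes, here is a toy tokenizer in Python. It is only a model of the idea, not Oracle Text's lexer: the function name and the printjoins parameter are made up for illustration. By default it splits on every non-alphanumeric character; characters passed as printjoins stay inside the token.

```python
import re


def tokenize(text: str, printjoins: str = "") -> list[str]:
    """Split text into tokens of letters, digits and any printjoins characters."""
    extra = re.escape(printjoins)  # make the extra characters safe inside [...]
    return re.findall(rf"[A-Za-z0-9{extra}]+", text)


# Default: ':' acts as a separator, so "23" becomes a token of its own.
print(tokenize("call at 23:00 or 23"))
# With ':' joined into tokens, "23:00" is indexed as a single term.
print(tokenize("call at 23:00 or 23", printjoins=":"))
```

In the second case a whole-token search for "23" would only match the standalone 23.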

Account Codes with non Alphanumeric characters

I have a customer who has Business Partner Codes which contain non Alphanumeric characters :
space
full stop
apostrophe
forward slash
hyphen
Will this cause issues with the operation of any aspect of Webtools - what if anything should I watch out for since these cannot be changed?

Replace non-english characters function

Hi folks,
I have a text which includes non-English characters. Is there any trick to replace those characters with the "closest" English character?
Examples:
"Hytölä" to become "Hytola"
"Säynatsälo" to become "Saynatsalo"
etc ...
I was thinking about using REGEXP
select regexp_replace('Hytölä Säynatsälo ', '[^0-9A-Za-z]', '') from dual
but the pattern is not correct.
Any suggestions?

Here is something that smells like a hack to me (source: replace characters with accent with their base letter).
However:
with data as (
select 'Hytölä' str from dual
union all
select 'Säynatsälo' from dual
)
select
str
,utl_raw.cast_to_varchar2(nlssort(str, 'NLS_SORT=BINARY_AI')) nstr
,length(utl_raw.cast_to_varchar2(nlssort(str, 'NLS_SORT=BINARY_AI'))) l
from data
STR         NSTR        L
Hytölä      hytola      7
Säynatsälo  saynatsalo  11
Notice the change in length caused by an extra null byte at the end of the strings, and the loss of the uppercase.
For this kind of question it's helpful to know the requirements. Why should there be a base-letter conversion? For search purposes, for example?
Not to forget the db character set.
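
For comparison, the same base-letter idea can be sketched outside the database in Python. This is a different technique from the nlssort trick above: it decomposes each character with Unicode normalization (NFD) and drops the combining marks, so case is preserved and no trailing byte is added. The function name is hypothetical, and letters without a decomposition (such as "ø") pass through unchanged.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining accent marks removed."""
    decomposed = unicodedata.normalize("NFD", text)
    # combining() is non-zero for accent marks attached to a base letter
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Hytölä"))      # Hytola
print(strip_accents("Säynatsälo"))  # Saynatsalo
```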

Replace Non-Numeric Characters with a Numeric Character in a String

Hi Guys,
I need to replace all the non-numeric characters (including embedded blanks & hyphen) in a string to a numeric character '1'.
The trailing blanks should not be replaced.
e.g. "P22233344455566" should be changed to "122233344455566"
& "49-1234567 " should be changed to "4911234567 "
Please help.

Use REPLACE (http://help.sap.com/abapdocu_70/en/ABAPREPLACE_IN_PATTERN.htm) with a regular expression to translate any non-numeric character (i.e. any character not between 0 and 9) to 1:
REPLACE ALL OCCURRENCES OF REGEX '[^0-9]' IN value WITH '1'.
Cheers, harald
p.s.: In older releases TRANSLATE (http://help.sap.com/abapdocu_70/en/ABAPTRANSLATE.htm) would also do the trick, but is more lengthy, because one would need to specify each individual character that should be replaced, e.g.:
TRANSLATE value TO UPPER CASE.
TRANSLATE value USING
' 1_1-1A1B1C1D1E1F1G1H1I1J1K1L1M1N1O1P1Q1R1S1T1U1V1W1X1Y1Z1'.
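
The requirement from the question, including the rule that trailing blanks stay untouched, can be sketched in Python as follows. This is an illustration of the logic only, not ABAP; the function name is made up. The trailing blanks are split off first, the regex runs on the rest, and the blanks are put back.

```python
import re


def replace_non_digits(value: str) -> str:
    """Replace every non-digit with '1', leaving trailing blanks as they are."""
    body = value.rstrip(" ")
    padding = value[len(body):]  # the trailing blanks, possibly empty
    return re.sub(r"[^0-9]", "1", body) + padding


print(repr(replace_non_digits("P22233344455566")))  # '122233344455566'
print(repr(replace_non_digits("49-1234567 ")))      # '4911234567 '
```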

Replacing non-ASCII characters with HTML charcter references

Hi All,
In Oracle 10g or greater is there a built-in function that will convert a string with non-ASCII characters like this
a b č 뮼
into an ASCII string with HTML character references like this?
a b & # x 0 1 0 D ; & # x B B B C ;
(note I had to include spaces between each character in the sample code for message to prevent the forum software from converting my text)
I tried using
utl_i18n.escape_reference( val, 'us7ascii' )
but for some reason it returns
a b c & # x B B B C ;
Note how it converted the Western European character "č" to its unaccented counterpart "c", not "& # x 0 1 0 D ;" (is this a bug?).
I also tried a custom solution using regexp_replace and asciistr (which I can't include here because the forum software chokes on it) but it only returns the correct result for values <=4000 characters long. Unfortunately asciistr doesn't appear to accept CLOB values larger than 4000 characters. It returns an error message like
(ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 30251, maximum: 4000) ).
I'm looking for a solution that works on CLOB data of any size.
Thanks in advance for any insight you can provide.
Joe Fuda

So with that (UTF8) in mind, let's take another look.....
As shown below, I used a AL32UTF8 database.
Note: I did not use a unicode capable tool for querying. So I set console mode code page to 1250 just to have č displayed properly (instead of posing as an è).
Also, as a result of using windows-1250 for client character set, in the val column and in the second select's ncr column (iso8859-1), è (00e8) has been replaced with e through character set conversion going from server back to client.
Running the same code on a database with a db character set such as we8mswin1252, that doesn't define the č (latin small c with caron) character, would yield results with a c in the ncr column.
C:\>chcp 1250
Aktuell teckentabell: 1250
C:\>set nls_lang=.ee8mswin1250
C:\>sqlplus test/test
SQL*Plus: Release 11.1.0.6.0 - Production on Fri May 23 21:25:29 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the OLAP option
SQL> select * from nls_database_parameters where parameter like '%CHARACTERSET';
PARAMETER              VALUE
NLS_CHARACTERSET       AL32UTF8
NLS_NCHAR_CHARACTERSET AL16UTF16
SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'us7ascii') NCR from dual;
VAL NCR
č e c e
SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'we8iso8859p1') NCR from dual;
VAL NCR
č e &# x10d; e     <- "è"
SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'ee8iso8859p2') NCR from dual;
VAL NCR
č e č &# xe8;
SQL> select unistr('\010d \00e8') val, utl_i18n.escape_reference(unistr('\010d \00e8'),'cl8iso8859p5') NCR from dual;
VAL NCR
č e &# x10d; &# xe8;
In the US7ASCII case, where it should be possible for all non-ASCII characters to be escaped, it seems as if the actual escape step is skipped over.
Hope this helps to understand whether utl_i18n is usable or not in your case.
Message was edited by:
orafad
Fixed replaced character references :)
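
As a reference for the output the question is after, here is a small Python sketch of the conversion itself, done outside the database. It is not an Oracle solution and says nothing about CLOB limits; the function name is hypothetical. Every character above code point 127 becomes a hexadecimal numeric character reference, and ASCII passes through unchanged.

```python
def to_html_refs(text: str) -> str:
    """Replace each non-ASCII character with an &#xHHHH; reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in text
    )


# U+010D is the c with caron, U+BBBC is the Hangul syllable from the question.
print(to_html_refs("a b \u010d \ubbbc"))
```

Python's built-in `text.encode("ascii", "xmlcharrefreplace")` does the same job with decimal references instead of hexadecimal ones.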

Replacing non-ascii characters in String

I have a site where the user enters data in a rich text
editor (ktml4) that gets stored into a database (mysql). There are
non ascii characters getting into the data, I'm assuming that they
are copying and pasting from Word. Unfortunately in this situation,
changing that process isn't an option.
Currently, this is the only character that is causing me
problems:
http://www.zvon.org/other/charSearch/PHP/search.php?request=ffa0&searchType=3
I would just like to replace the non-ascii characters with a
space when I read them from the database. Something like:
#Replace(result.column, '\xffa0', ' ')#
However, I believe that code looks for the string "\xffa0",
not the character \xffa0.
Is there any way to do this?

Dan Bracuk wrote:
rereplace might work.
BuckLemke wrote:
Can you give an example of how to pass a non-ascii character to REReplace?
Regular expressions are not my strength, but the approach I was considering was, "if it's not an ascii character, make it a space". Then you pass the entire string at once.
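
The "if it's not an ASCII character, make it a space" approach looks like this as a Python sketch (the thread is about ColdFusion, so treat this as an illustration of the pattern only; the function name is made up). The negated range matches any character outside code points 0 to 127, so no individual character has to be spelled out.

```python
import re

# Any character that is not in the ASCII range 0x00-0x7F
_NON_ASCII = re.compile(r"[^\x00-\x7F]")


def replace_non_ascii(text: str, replacement: str = " ") -> str:
    """Replace every non-ASCII character in one pass over the whole string."""
    return _NON_ASCII.sub(replacement, text)


# \uffa0 is the character from the question
print(repr(replace_non_ascii("before\uffa0after")))  # 'before after'
```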

Replacing non latin characters

Hi experts,
I have to check some fields for non-Latin characters.
When the fields include any non-Latin characters, I have to replace them with a "Y".
Does someone have a code example for this case?
Thanks for your help!
Alex

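This should give you an idea:
" CA is true while p_faxno contains any character from sy-abcde (A..Z);
" sy-fdpos then holds the offset of the first such character.
" Caution: 'Y' is itself in sy-abcde, so adapt the condition to your own
" character set (e.g. with CN), otherwise this loop does not terminate.
WHILE p_faxno CA sy-abcde.
p_faxno+sy-fdpos(1) = 'Y'.
ENDWHILE.
CONDENSE p_faxno NO-GAPS.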
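
The requirement itself can be sketched in Python (not ABAP). The sketch assumes that "non-Latin" means any letter whose Unicode name does not start with LATIN, so accented Latin letters, digits, spaces and punctuation are kept. That reading and the function name are assumptions for illustration.

```python
import unicodedata


def replace_non_latin(text: str, replacement: str = "Y") -> str:
    """Replace every letter from a non-Latin script with the replacement."""
    out = []
    for ch in text:
        # name() falls back to "" for characters that have no Unicode name
        is_latin = unicodedata.name(ch, "").startswith("LATIN")
        out.append(replacement if ch.isalpha() and not is_latin else ch)
    return "".join(out)


print(replace_non_latin("Fax Москва 123"))  # Fax YYYYYY 123
```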

Deleting/replacing non-alphabetic characters

I'd like to delete any non-alphabetic characters in a given string, so I figured the following:
for (int i = 0; i < aString.length(); i++) {
    if (!isAlpha(aString.charAt(i))) {
        char c = aString.charAt(i);
        aString = aString.replace(c, ' ');
    }
}
Which doesn't work, and I don't understand why. (Ideally I'd like to delete this char c, but replacing it with whitespace will, I think, also do.)
Any suggestions, anyone?

How about using regular expressions?
Here is an example using POSIX notation:
public class Repl {
    public static void main(String[] args) {
        String a = "Aghewuz2nknl7kj%\"dsk";
        // here any non-alpha char is replaced by a space
        System.out.println(a.replaceAll("[^\\p{Alpha}]", " "));
    }
}
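
Since the question would ideally delete the characters, not swap them for spaces, here is the same idea as a short Python sketch with an empty replacement. The function name is made up, and the class is limited to ASCII letters, which is what \p{Alpha} matches in Java by default.

```python
import re


def letters_only(text: str) -> str:
    """Delete every character that is not an ASCII letter."""
    return re.sub(r"[^A-Za-z]", "", text)


print(letters_only('Aghewuz2nknl7kj%"dsk'))  # Aghewuznknlkjdsk
```

In the Java example above, passing "" as the second argument to replaceAll has the same effect.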
