Range & Regular Expression issue.

I'm having a bit of trouble and I'm close to head-butting a wall. It's a logic problem. I am trying to create a class that, when given a range such as 52-234, outputs the regular expression
[5][2-9] | [6-9][0-9] | [1][0-9][0-9] | [2][0-2][0-9] | [2][3][0-4]
Another example:
12-23
[1][2-9] | [2][0-3]
It's giving me a logic headache. I can't help but walk around thinking in for loops after trying to get this to work. I noticed that someone on the Internet has made a Java tool that does exactly this, but all their links are dead :o(
Anyone got any ideas? Seen this before? Done this yourself? Help? I've been at this for days now, and I'm fed up!
Thanks :o)
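
One possible way to attack this, sketched under the assumption that both bounds are non-negative whole numbers well below Long.MAX_VALUE: split the range at the points where the tail of the lower bound fills up with 9s and the tail of the upper bound fills up with 0s, then emit one bracket per digit for each piece. The names RangeRegex, rangeToRegex and subRange are made up for this sketch. The pieces are joined with a bare | rather than " | ", because the spaces would otherwise become part of the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class RangeRegex {

    /** Builds an alternation such as [5][2-9]|[6-9][0-9]|... covering min..max inclusive. */
    static String rangeToRegex(long min, long max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("need 0 <= min <= max");
        }
        // Upper bounds ("stops") of the sub-ranges, kept sorted and de-duplicated.
        TreeSet<Long> stops = new TreeSet<>();
        stops.add(max);

        // Fill the tail of min with 9s: 52 -> 59 -> 99 ...
        for (long pow = 10; ; pow *= 10) {
            long stop = min - min % pow + pow - 1;
            if (stop >= max) break;
            stops.add(stop);
        }
        // Fill the tail of max + 1 with 0s, then step back by one: 235 -> 229 -> 199 ...
        for (long pow = 10; ; pow *= 10) {
            long stop = (max + 1) - (max + 1) % pow - 1;
            if (stop <= min) break;
            stops.add(stop);
        }

        List<String> parts = new ArrayList<>();
        long start = min;
        for (long stop : stops) {
            parts.add(subRange(start, stop));
            start = stop + 1;
        }
        return String.join("|", parts);
    }

    /** One bracket per digit; start and stop always have the same number of digits here. */
    static String subRange(long start, long stop) {
        String a = Long.toString(start);
        String b = Long.toString(stop);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < a.length(); i++) {
            char lo = a.charAt(i);
            char hi = b.charAt(i);
            sb.append('[').append(lo);
            if (lo != hi) {
                sb.append('-').append(hi);
            }
            sb.append(']');
        }
        return sb.toString();
    }
}
```

When the result is used to validate a whole string, Pattern.matches(RangeRegex.rangeToRegex(52, 234), input) avoids the need for explicit anchors; inside a larger pattern it would need to be wrapped in a group.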

Two things:
1. To use a quote inside a quoted string, you must put two quotes in a row.
2. Certain characters have special meaning in regular expressions. You must escape them with \ if you do not want regexp to interpret such characters.
Select regexp_replace('kathu&+','[/.#''& "\\-\+]')from dual;
REGEX
kathu
SQL>
SY.

Similar Messages

Regular Expression Issue

Dear Gurus,
I have a requirement to read a file and display the words between two words.
For example:
I need a good friend to share my sarrow things. I need a good friend to share my Loveable things.
Here I need to extract the words between good and Loveable.
For this I wrote a regular expression:
Pattern p = Pattern.compile("(?<=\\bgood\\b).*?(?=\\bloveable\\b)");
This is working fine.
Now I need to extract the words between good and loveable or sarrow
I read from regex tutorial to use | symbol for OR operations.
Can any one help to solve this issue.
Thanks and Regards,
Durai S E
Edited by: user10734545 on Dec 12, 2011 12:30 AM
Edited by: EJP on 12/12/2011 19:30: code tags

user10734545 wrote:
for this i wrote an regular expression
Pattern p = Pattern.compile("(?<=\\bgood\\b).*?(?=\\bloveable\\b)");
Sometimes a simple solution is best:
@Test
public void testGoodLovable() throws Exception {
    Pattern pattern = Pattern.compile("good\\s+(\\w+)\\s+lovable");
    Matcher matcher = pattern.matcher("good test lovable");
    if (matcher.find()) {
        Assert.assertEquals("test", matcher.group(1));
    } else {
        Assert.fail("pattern not matched");
    }
}
bye
TPD
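
For the "loveable or sarrow" part of the question, a minimal sketch along the lines of the original lookaround pattern: the | sits inside a non-capturing group so that it alternates only the two end words, not the whole expression. The class and method names (BetweenWords, extract) are invented for illustration, and the CASE_INSENSITIVE flag is an assumption made because the sample text spells "Loveable" with a capital L.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BetweenWords {
    // Lookbehind for "good", lazy body, lookahead for either end word.
    private static final Pattern BETWEEN = Pattern.compile(
            "(?<=\\bgood\\b).*?(?=\\b(?:loveable|sarrow)\\b)",
            Pattern.CASE_INSENSITIVE);

    /** Returns every stretch of text found between "good" and the next end word. */
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = BETWEEN.matcher(text);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }
}
```

Because the body is lazy (.*?), each match stops at the nearest end word, whichever of the two comes first.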

Pattern regular expression issue

I am trying to validate file names. The file names can be alphanumeric and can contain only the symbols '_', '-' and '.'.
This is my code. I don't know why, but my pattern does not work for the file name test#.
Pattern _namePattern = Pattern.compile("^\\w\\s-//.]+$");
private boolean validateFilename(String name) {
     boolean valid = true;
     if (this._namePattern != null) {
          // Inverse logic: the name is valid when the pattern does NOT match.
          valid = !(this._namePattern.matcher(name).matches());
     }
     if (valid) {
          // Also reject an encoded slash and ':' (character code 58).
          valid = (name.indexOf("%2f") < 0) && (name.indexOf(58) < 0);
     }
     return valid;
}

private boolean validateFilenames(File files) {
     if (this._namePattern == null) {
          return true;
     }
     boolean valid = true;
     if (!(validateFilename(files.getName()))) {
          System.out.println("ERROR.INVALID_FILENAME"
                    + "The following file contains invalid characters: "
                    + files.getName());
          valid = false;
     } else {
          System.out.println("Valid file name " + files.getName());
     }
     return valid;
}

I have gone through this link; it is the same as the Pattern API. The problem is that I have tried so many expressions by now that I have lost track of them all :(
Ram asked:
What was the input string you used to test the above expression?
I tried with test, and it fails for the expression
[\\w._-]+$
so I used the negation sign, as I am doing inverse logic in the code, and it worked for test:
[^\\w._-]+$
Now if I test with tes#t, it still shows as a valid string, which is not correct.
With the space literal it works for strings having spaces, but when I have the input test #test it fails.
I might be sounding too stupid, but the matching logic is already present in third-party jar code. I don't want to change the logic which I posted in my first post; I just want to change the regular expression so that it does not allow file names with any of the special characters.
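
A likely explanation, going only by the code posted above: matches() has to consume the whole name, so [^\\w._-]+$ flags a name only when every character in it is forbidden, which is why tes#t slips through. A minimal sketch of a pattern that fits the inverted logic, assuming the allowed set is letters, digits, '_', '-' and '.': wrap the negated class in .* on both sides so that a single forbidden character anywhere makes the whole name match. FileNameCheck, INVALID and isValid are hypothetical names, not part of the third-party jar.

```java
import java.util.regex.Pattern;

final class FileNameCheck {
    // matches() succeeds when the name contains at least one character
    // outside letters, digits, '_', '.' and '-' (\w already covers '_').
    static final Pattern INVALID = Pattern.compile(".*[^\\w.-].*", Pattern.DOTALL);

    /** Same inverse logic as the posted code: valid means the pattern does NOT match. */
    static boolean isValid(String name) {
        return !INVALID.matcher(name).matches();
    }
}
```

Note that an empty name contains no forbidden character and therefore passes this check; a separate length test would be needed if that matters.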

Javascript regular expression issue

Hi,
I am trying to use a javascript regular expression in my code
var reg_exp = new RegExp("^([0-9]{2})/([0-9]{2})/([0-9]{4})\s([0-9]{2})$");
It works from outside of APEX but not when called from inside APEX. It is failing on the
\s
Is there some override or some reason why I cannot do this from inside APEX?
Thanks in advance!

Well I just changed it to
var reg_exp = new RegExp("^([0-9]{2})/([0-9]{2})/([0-9]{4}) ([0-9]{2}):([0-9]{2})$");
Basically I just replaced \s with an actual space, and it works fine...

Oracle Regular expression issue

Hi All,
I have a regular expression problem.
create table url ( Url varchar2(1024));
insert into URL values ('http://abc.jambo.com/ababs/sffef/dsf/sdfdsf/jk.htm');
insert into URL values ('.*amazon.com.*');
insert into URL values ('Abc.com');
insert into URL values ('xyz.Abc.com');
insert into URL values ('^http://bhido.jambo.com/ababs/kd.htm     ');
commit;
SELECT url,REGEXP_SUBSTR(url,'http://([[:alnum:]]+\.?){3,4}/?') "REGEXP_SUBSTR"
FROM url;
But it returns the following result:
URL                                                  REGEXP_SUBSTR
http://abc.jambo.com/ababs/sffef/dsf/sdfdsf/jk.htm   http://abc.jambo.com/
.*abc.amazon.com.*                                   NULL
Abc.com                                              NULL
xyz.Abc.com                                          NULL
.*amazon.com.*                                       NULL
^http://bhido.jambo.com/ababs/kd.htm                 http://bhido.jambo.com/
What changes would be required in the regex to get the following output?
URL                                                  REGEXP_SUBSTR
http://abc.jambo.com/ababs/sffef/dsf/sdfdsf/jk.htm   abc.jambo.com
.*abc.amazon.com.*                                   abc.amazon.com
Abc.com                                              Abc.com
xyz.Abc.com                                          xyz.Abc.com
.*amazon.com.*                                       amazon.com
^http://bhido.jambo.com/ababs/kd.htm                 bhido.jambo.com
Thanks in advance
-Kuldeep
Edited by: Kuldeep2 on Apr 28, 2009 3:56 AM
Edited by: Kuldeep2 on Apr 29, 2009 3:28 AM

SQL> select url, regexp_substr(url,'[[:alnum:]|\.]*com')new_url from url
2 /
URL                                                NEW_URL
http://abc.jambo.com/ababs/sffef/dsf/sdfdsf/jk.htm abc.jambo.com
.*amazon.com.*                                     amazon.com
Abc.com                                            Abc.com
xyz.Abc.com                                        xyz.Abc.com
^http://bhido.jambo.com/ababs/kd.htm               bhido.jambo.com

[solved]Issue concerning regular expressions

Perhaps the problem in its entirety is a bit more involved than just regular expressions, but for a start, this small problem is what I need help with.
In short, I have some data which can be in the form of some text, an IP address or a URL.
In case it's a URL, it will usually, if not always, be in the form of a base URL, with or without one or more subdomains.
I need to strip away these subdomains so that only the base URL remains, thus:
abc.def.ghi.com
becomes
ghi.com
Forgetting for the moment that some top-level domains have two parts, it seems so easy, yet I cannot for the life of me figure out how to do it using only the basic command line tools available, such as sed, awk, and so on, and thus I hope that someone here can and will help me. It is of course also possible that the process is simply too involved for a simple command line, in which case I suppose a bash or python script will have to do, but I really hope not. In any case I will appreciate any help and/or advice you can give me.
Best regards.
Last edited by zacariaz (2015-05-07 13:38:47)

WorMzy wrote:
Is this a homework exercise?
What have you tried so far? It should be easy enough to accomplish using a combination of rev and cut, although I'm sure there are more elegant solutions using awk.
Mod note: Moving to Newbie Corner
I wish it were homework, but no.
The rev idea is interesting, and I'll look into it, but as it is, an awk solution would be preferable, as this is only part of a larger issue, and I'd like to keep it as simple as possible. Also, as for top-level domains with two parts, it's still somewhat involved.
As for what I've tried: a lot, as I only stopped and went to bed when I realized the sun was rising. The main problem is that I don't really know how to attack the problem, and of course I'm not exactly a master of regular expressions.
Anyway thanks.

PrintWriter issue with Directory Traversal and Regular Expression

This is a follow-up to my previous question on the forum. I am developing a program that traverses the hard drive for information. If it finds the said information in any of the files (based on the regular expression written), it must print the output into a file. Currently I am able to traverse the hard drive perfectly, the regular expression and the search work, and when I print the output to the local console I get the expected results. But when I use the PrintWriter to write the output into the file, it writes NOTHING into the file. I have been scouring the Internet for an answer but haven't been able to find one. I would highly appreciate it if someone could tell me what I am doing wrong and provide some guidance on how to get it right.
public class myClass {
    BufferedReader br;
    String pcv;
    Pattern scPattern = Pattern.compile("Some regular expression");
    Matcher match = null;
    Pattern newPattern = Pattern.compile("Some regular expression");
    Matcher newMatch = null;
    String mvCheckVal;
    Matcher mvMatch;
    PrintWriter pw;

    void recursiveMethod(File dir) throws Exception {
        pw = new PrintWriter(new FileWriter("outputFile.txt"));
        pw.println("Opening pw stream.....");
        File[] files = dir.listFiles();
        String[] fileList = dir.list();
        for (int i = 0; i < files.length; i++) {
            if (files[i].isDirectory()) {
                continue;
            } else if (files[i].isFile()) {
                br = new BufferedReader(new FileReader(files[i]));
                pw.println("BR is opening....");
                String line;
                while ((line = br.readLine()) != null) {
                    match = scPattern.matcher(line);
                    if (match.find()) {
                        pcv = line.substring(match.start(), match.end());
                        System.out.println("Match: " + pcv + " Context: " + match.replaceFirst(pcv)); //This is working perfectly
                        pw.println("Match: " + pcv + " Context: " + match.replaceFirst(pcv)); //This does not print anything at all
                        System.out.println("Files: " + files[i]);
                        System.out.println("");
                        pw.println(" Files: " + files[i]);
                        pw.println("");
                    }
                }
                System.out.println("Closing I/O....");
                br.close();
            }
        }
        pw.close();
    }

    public static void main(String[] args) throws Exception {
        File dir = new File("C:/");
        myClass acf = new myClass();
        acf.recursiveMethod(dir);
    }
}

@ejp
I am afraid that it is not working. Can you please tell me what I am doing wrong?
void myMethod(File dir) throws Exception {
    bw = new BufferedWriter(new FileWriter("outputFile.txt"));
    File[] files = dir.listFiles();
    String[] fileList = dir.list();
    for (int i = 0; i < files.length; i++) {
        if (files[i].isDirectory()) {
            myMethod(files[i]);
        } else if (files[i].isFile()) {
            br = new BufferedReader(new FileReader(files[i]));
            String line;
            while ((line = br.readLine()) != null) {
                match = scPattern.matcher(line);
                if (match.find()) {
                    pcv = line.substring(match.start(), match.end());
                    System.out.println("Match: " + pcv + " Context: " + match.replaceFirst(pcv));
                    bw.write("Match: " + pcv + " Context: " + match.replaceFirst(pcv));
                }
            }
            br.close();
            bw.write("Files: " + files[i]);
            bw.write("");
        }
    }
    bw.close();
}

public static void main(String[] args) throws Exception {
    File dir = new File("C:/");
    myClass acf = new myClass();
    acf.myMethod(dir);
}
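
One likely cause, judging only from the code as posted: the writer is created inside the recursive method, so every nested call re-opens outputFile.txt (truncating it, since FileWriter is not in append mode) and closes the shared field before the outer call has finished writing. PrintWriter does not throw on write errors, which would explain the silence. A minimal sketch of the usual fix is to open the writer once outside the recursion and pass it down. MatchScanner, scan, walk and scanFile are hypothetical names, and the output file is assumed to live outside the directory being scanned.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchScanner {
    private final Pattern pattern;

    MatchScanner(Pattern pattern) {
        this.pattern = pattern;
    }

    /** Opens the output file exactly once, then walks the tree below root. */
    void scan(File root, File output) throws IOException {
        try (PrintWriter out = new PrintWriter(new FileWriter(output))) {
            walk(root, out);
        } // leaving the try block closes the writer, which flushes buffered output
    }

    private void walk(File dir, PrintWriter out) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            return; // not a directory, or not readable
        }
        for (File f : files) {
            if (f.isDirectory()) {
                walk(f, out); // the same writer is reused, never re-opened
            } else if (f.isFile()) {
                scanFile(f, out);
            }
        }
    }

    private void scanFile(File f, PrintWriter out) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(f))) {
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = pattern.matcher(line);
                if (m.find()) {
                    out.println("Match: " + m.group() + " File: " + f);
                }
            }
        }
    }
}
```

The try-with-resources blocks also make sure the reader and writer are closed when an exception cuts the scan short.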

Regular Expressions in num-exp

Hello All,
I had a problem on my SRST gateway with num-exp inserting a repeating pattern into my 7-digit dialing when in fallback mode.
For a brief example, the 7-digit internal dialing is 21621.. or 21622..
The num-exp statement of 'num-exp 2... 2162...' was not allowing me to 7-digit dial directly from one IP phone to another while in fallback mode.
When I dialed 2162154, the 2162 would hit the num-exp and be expanded to 2162162.
I have a work around that uses a voice translation-rule, applied to the call-manager-fallback config that will translate a 7-digit dialed string to the 4 digit dialed string which then hits the 4-digit to 7-digit num-exp and it is working fine.
However, I was wondering if there is a way to use regular expressions in num-exp so that perhaps I can skip the intermediate step of using the translation-rule. Based off my existing translation-rules that are working properly, I figured something like this might work for num-exp:
'num-exp /^2$[12]..$$/ /2162\1/'
But when I try to issue a num-exp with a regular expression I get the following message.
Incorrect format for Number macro pattern
regular expression must be of the form ^((\+)?([0-9#*A-F.]|(\\\*))+(\$)?)$
I have tried a number of different combinations with no success. I always get the same message. The regular expression that I tried first was:
'num-exp ^2... 2162...'
This is when I first saw the "Incorrect format..." message and figured that is must be possible. Is this just a generic warning similar to when you try to use complex regular expressions with the 'translation-rule' command vs. the 'voice translation-rule' command and in reality you cannot use regular expressions in the num-exp command?
Thank you,
Leo

Hi Chris,
Thank you for taking the time to answer my question. It looks like the answer is no, num-exp does not support regular expressions.
I don't insist on using num-exp for this I was just hoping to kill two birds with one stone and possibly skip the intermediate step of translating the 7-digits dial to 4-digits using a translation-rule just to expand from 4-digit to 7 again. This is only an issue while in SRST if a user tries to dial using 7-digits. We have a 7-digit internal dialing scheme and normally my num-exp is just to expand the 4 digits sent from the telco to our 7-digit internal dialing. The problem is that both our prefix and part of our DID range start with 21 so while in SRST if a user tried to dial a 7-digit DN, say 2162154, after they dialed the 4th digit (2162) that pattern would hit the num-exp and get expanded to 2162162. I was hoping to create a num-exp using a regular expression that would only expand a four digit string that begins with a 2 to seven digits and not any string that begins with a 2. This would 1) expand the four digits sent from the telco and 2) not match a seven digit string that begins with a 2 such as 2162154 which may be dialed by a user.
Again, this is only an issue while in SRST and I have a pretty good work around so I'm fine with not being able to use a regular expression as part of my num-exp config. I just thought it would be a cool application of a regular expression if it was possible.
Thanks again for answering my question.
Leo

Regular Expressions in CS5.5 - something is wrong

Hello Everybody,
Please correct me, but I think, I found a serious problem with regular Expressions in Indesign CS5.5 (and possibly in other apps from CS5.5).
Let's start with simple example:
var range = "a-a,a,a-a,a";
var regEx = /(a+-a+|a+)(,(a+-a+|a+))*/;
alert( "Match:" +regEx.test(range)+"\nLeftContext: "+RegExp.leftContext+"\nRightContext: "+RegExp.rightContext );
What I expected was a true match, with both the left and the right context empty. In InDesign CS3 that is correct, BUT NOT in CS5.5.
In CS5.5 it seems that only the first "a-a" is matched and the rest is returned as the rightContext, which looks like a big change (if not a parsing error in the RegExp engine).
Please correct me if I am wrong.
The second example - how to freeze ID CS5.5:
var range = "a-a,a,a-a,a";
var regEx = /(a+-a+|a+)(,(a+-a+|a+)){8,}/;
alert( "Match:" +regEx.test(range)+"\nLeftContext: "+RegExp.leftContext+"\nRightContext: "+RegExp.rightContext );
As you can see it differs only with the {8,} part instead of *
Run it in CS5.5 and you will see that ID hangs (in CS3, of course, it runs flawlessly).
The third example - how to freeze ID CS5.5 in one line (I posted it earlier in the Photoshop forum because a similar problem was reported there):
alert((/(n|s)? /gmi).test('s') );
As you can guess - it freezes the CS5.5 (CS3 passes the test).
Please correct me if I am doing something wrong or it's the problem of Adobe.
Best regards,
Daniel Brylak

Hi Daniel,
Thanks for sharing. Really annoying indeed.
Just to complete your diagnosis, what you describe about CS5.5 is the same in CS5, while CS4 behaves as CS3.
var range = "a-a,a,a-a,a";
var regEx = /(a+-a+|a+)(,(a+-a+|a+))*/;
alert([
    "Match:" +regEx.test(range),
    "LeftContext: "+RegExp.leftContext.toSource(),        // => CS3/4: EMPTY -- CS5+: EMPTY
    "RightContext: "+RegExp.rightContext.toSource()        // => CS3/4: EMPTY -- CS5+: ",a,a-a,a"
    ].join('\r'));
So there is a serious implementation problem of the RegExp object from ExtendScript CS5.
I don't think it's related to the greedy modes. By default, JS RegExp quantifiers are greedy, and /a*/ still entirely captures "aaaaaa" in CS5+.
By the way, you can make any quantifier non-greedy by adding ? after the quantifier, e.g.: /a*?/, /a+?/, etc.
I guess that Adobe ExtendScript has a generic issue in updating the RegExp.lastIndex property in certain contexts—see http://forums.adobe.com/message/3719879#3719879 —which could explain several bugs such as the Negative Class bug —see http://forums.adobe.com/message/3510078#3510078 — or the problems you are mentioning today.
@+
Marc

Faulty Regular Expression

Hi all,
I have a question regarding regular expressions. I am refactoring a method named isPasswordValid() and removing a bunch of ugly Java code that enforces the following rules:
Does it begin with an upper or lowercase letter?
Does it contain at least one lowercase letter?
Does it contain at least one uppercase letter?
Does it contain at least one number?
Does it contain at least one special character from the following: !@#$%^&*-_+=
Is it at least 15 characters?
I have extracted my regular expression into a dummy class for easier testing:
public class RegExTest {
     public static void main(String[] args) {
          String password = "qwerty34#$QWERTY";
          if (password.matches("^[a-zA-Z]{1}.{14,}")
                    && password.matches(".*[a-z]+.*")
                    && password.matches(".*[A-Z]+.*")
                    && password.matches(".*[0-9]+.*")
                    && password.matches(".*[!@#$%^&*-_+=]+.*")) {
               System.out.println("This is a match");
          }
     }
}
This regular expression works as intended except for an issue with special characters. It accepts every special character specified as well as others like the tilde (~) and parentheses. I even tried escaping the ones that needed escaping, but nothing seems to work. What am I missing here? Also, do you see anything else I should be concerned with? Of course, any help would be greatly appreciated!
Thanks!

When not placed at the start or end of a character class, the hyphen is a range operator:
[a-d]    // matches 'a', 'b', 'c' or 'd'
[ad-]    // matches 'a', 'd' or '-'
[-ad]    // matches '-', 'a' or 'd'
Or escape it:
[a\-d]    // matches 'a', '-' or 'd'
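
Applied to the password check above, a minimal sketch with the hyphen moved to the end of the class so that it is a literal. In the original class, *-_ is a range from '*' to '_', which covers the digits and the uppercase letters among others, so any password containing an uppercase letter passed the special-character test. PasswordRule and SPECIAL are names made up for this sketch; the rules themselves are the ones listed in the question.

```java
final class PasswordRule {
    // Hyphen placed last, so it is a literal '-' and not a range operator.
    private static final String SPECIAL = "[!@#$%^&*_+=-]";

    static boolean isPasswordValid(String password) {
        return password.matches("[a-zA-Z].{14,}")       // starts with a letter, 15+ characters
                && password.matches(".*[a-z].*")        // at least one lowercase letter
                && password.matches(".*[A-Z].*")        // at least one uppercase letter
                && password.matches(".*[0-9].*")        // at least one digit
                && password.matches(".*" + SPECIAL + ".*"); // at least one listed special character
    }
}
```

String.matches() already requires the whole input to match, so the leading ^ and the + after each class in the original are not needed.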

Safari May Not Interpret Regular Expression in Compliance with W3C Standard

We are troubleshooting why some websites are no longer working when rendered with the Safari 2.0.4 browser. The failure begins when client entered data is validated using regular expressions.
We have localized the issue to Safari not interpreting regular expressions consistently.
For example:
Does the regex \u00e9 match the literal character é? (Validates the regular expression engine understands Unicode escape sequences for extended characters.)? - NO, but it does on IE and FireFox
Does the regex \u0041 match the literal character A? (Validates the regular expression engine understands Unicode escape sequences for ASCII characters.)? - NO, but it does on IE and FireFox
Does the regex é match the literal character é? (Validates the regular expression engine understands literal characters outside the ASCII range – this is against ECMAScript spec.)? - Sometimes, but always on IE and FireFox
Write a Unicode escape sequence to the screen on the client side. (Validates the string parsing and display in the JS engine works.) - Works on all 3
Is escape sequence \u00e9 equivalent to literal character é? (Validates the string functionality in the JS engine works with extended characters.)? Yes on all 3.
Is escape sequence \u0041 equivalent to literal character A? (Validates the string functionality in the JS engine works with ASCII characters.)? Yes on all 3
Does the regex A match the literal character A? (Validates the regular expression engine understands literal characters in the ASCII range – this is ECMAScript spec.)? Yes on all 3
Please help. It's hard for me to believe that the regular expression / javascript interpreter(s) for Safari aren't working as they have in the past - but all roads are pointed that way....
Thank you for your review.
Mac OS X (10.4.7)

Regular expression alphabets

Hi
I want to retrieve the data through a select query if it contains a letter, a space or '-'.
Please help me write a regular expression that combines the three.
Thanks!!

VT wrote:
Hi,
Try this
SELECT *
FROM <TABLE> WHERE REGEXP_LIKE(<COLUMN>, '[a-z -][A-Z -]');
cheers
VT
That won't work, as it's expecting at least two characters, with the first having to be a-z (lower case) or space or "-", followed by A-Z (upper case) or space or "-".
The correct way is either:
[a-zA-Z -]
or
[[:alpha:] -]
Using the alpha set is often preferable, as it can work differently with different character sets/languages rather than restricting to just the a-zA-Z ranges.
Generating a reference for your own database characterset/language can be useful...
SQL> select level-1 as asc_code,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), CHR(level-1)) as chr,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:graph:]]'), 1) is_graph,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:blank:]]'), 1) is_blank,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:alnum:]]'), 1) is_alnum,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:alpha:]]'), 1) is_alpha,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:digit:]]'), 1) is_digit,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:cntrl:]]'), 1) is_cntrl,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:lower:]]'), 1) is_lower,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:upper:]]'), 1) is_upper,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), 1) is_print,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:punct:]]'), 1) is_punct,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:space:]]'), 1) is_space,
            decode(chr(level-1), regexp_substr(chr(level-1), '[[:xdigit:]]'), 1) is_xdigit
       from dual
    connect by level <= 256
/
ASC_CODE C   IS_GRAPH   IS_BLANK   IS_ALNUM   IS_ALPHA   IS_DIGIT   IS_CNTRL   IS_LOWER   IS_UPPER   IS_PRINT   IS_PUNCT   IS_SPACE IS_XDIGIT
         0                                                                   1
         1                                                                   1
         2                                                                   1
         3                                                                   1
         4                                                                   1
         5                                                                   1
         6                                                                   1
         7                                                                   1
         8                                                                   1
         9                                                                   1                                              1
        10                                                                   1                                              1
        11                                                                   1                                              1
        12                                                                   1                                              1
        13                                                                   1                                              1
        14                                                                   1
        15                                                                   1
        16                                                                   1
        17                                                                   1
        18                                                                   1
        19                                                                   1
        20                                                                   1
        21                                                                   1
        22                                                                   1
        23                                                                   1
        24                                                                   1
        25                                                                   1
        26                                                                   1
        27                                                                   1
        28                                                                   1
        29                                                                   1
        30                                                                   1
        31                                                                   1
        32                       1                                                                            1                     1
        33 !          1                                                                                       1          1
        34 "          1                                                                                       1          1
        35 #          1                                                                                       1          1
        36 $          1                                                                                       1          1
        37 %          1                                                                                       1          1
        38 &          1                                                                                       1          1
        39 '          1                                                                                       1          1
        40 (          1                                                                                       1          1
        41 )          1                                                                                       1          1
        42 *          1                                                                                       1          1
        43 +          1                                                                                       1          1
        44 ,          1                                                                                       1          1
        45 -          1                                                                                       1          1
        46 .          1                                                                                       1          1
        47 /          1                                                                                       1          1
        48 0          1                     1                     1                                           1                                1
        49 1          1                     1                     1                                           1                                1
        50 2          1                     1                     1                                           1                                1
        51 3          1                     1                     1                                           1                                1
        52 4          1                     1                     1                                           1                                1
        53 5          1                     1                     1                                           1                                1
        54 6          1                     1                     1                                           1                                1
        55 7          1                     1                     1                                           1                                1
        56 8          1                     1                     1                                           1                                1
        57 9          1                     1                     1                                           1                                1
        58 :          1                                                                                       1          1
        59 ;          1                                                                                       1          1
        60 <          1                                                                                       1          1
        61 =          1                                                                                       1          1
        62 >          1                                                                                       1          1
        63 ?          1                                                                                       1          1
        64 @          1                                                                                       1          1
        65 A          1                     1          1                                           1          1                                1
        66 B          1                     1          1                                           1          1                                1
        67 C          1                     1          1                                           1          1                                1
        68 D          1                     1          1                                           1          1                                1
        69 E          1                     1          1                                           1          1                                1
        70 F          1                     1          1                                           1          1                                1
        71 G          1                     1          1                                           1          1
        72 H          1                     1          1                                           1          1
        73 I          1                     1          1                                           1          1
        74 J          1                     1          1                                           1          1
        75 K          1                     1          1                                           1          1
        76 L          1                     1          1                                           1          1
        77 M          1                     1          1                                           1          1
        78 N          1                     1          1                                           1          1
        79 O          1                     1          1                                           1          1
        80 P          1                     1          1                                           1          1
        81 Q          1                     1          1                                           1          1
        82 R          1                     1          1                                           1          1
        83 S          1                     1          1                                           1          1
        84 T          1                     1          1                                           1          1
        85 U          1                     1          1                                           1          1
        86 V          1                     1          1                                           1          1
        87 W          1                     1          1                                           1          1
        88 X          1                     1          1                                           1          1
        89 Y          1                     1          1                                           1          1
        90 Z          1                     1          1                                           1          1
        91 [          1                                                                                       1          1
        92 \          1                                                                                       1          1
        93 ]          1                                                                                       1          1
        94 ^          1                                                                                       1          1
        95 _          1                                                                                       1          1
        96 `          1                                                                                       1          1
        97 a          1                     1          1                                1                     1                                1
        98 b          1                     1          1                                1                     1                                1
        99 c          1                     1          1                                1                     1                                1
       100 d          1                     1          1                                1                  1                           1
       101 e          1                     1          1                                1                  1                           1
       102 f          1                     1          1                                1                  1                           1
       103 g          1                     1          1                                1                  1
       104 h          1                     1          1                                1                  1
       105 i          1                     1          1                                1                  1
       106 j          1                     1          1                                1                  1
       107 k          1                     1          1                                1                  1
       108 l          1                     1          1                                1                  1
       109 m          1                     1          1                                1                  1
       110 n          1                     1          1                                1                  1
       111 o          1                     1          1                                1                  1
       112 p          1                     1          1                                1                  1
       113 q          1                     1          1                                1                  1
       114 r          1                     1          1                                1                  1
       115 s          1                     1          1                                1                  1
       116 t          1                     1          1                                1                  1
       117 u          1                     1          1                                1                  1
       118 v          1                     1          1                                1                  1
       119 w          1                     1          1                                1                  1
       120 x          1                     1          1                                1                  1
       121 y          1                     1          1                                1                  1
       122 z          1                     1          1                                1                  1
       123 {          1                                                                                    1     1
       124 |          1                                                                                    1     1
       125 }          1                                                                                    1     1
       126 ~          1                                                                                    1     1
       127                                                                   1
       128 Ç          1                                                                                    1     1
etc.
{code}

[SOLVED]ZSH and regular expressions

Hi
I am getting into regular expressions and I have noticed a problem with my .zshrc file. In bash this expression works:
\^\[^#]
but not in zsh. I have also noticed that the regular expression works fine with other zshrc configurations found in the ArchWiki (like grml), but I want to keep my own configuration, and I really can't find which command makes the difference.
My .zshrc file is pulled from this site https://github.com/slashbeast/things/bl … s/DOTzshrc.
# .zshrc
# Author: Piotr Karbowski <[email protected]>
# License: beerware.
# Basic zsh config.
umask 077
ZDOTDIR=${ZDOTDIR:-${HOME}}
ZSHDDIR="${HOME}/.config/zsh.d"
HISTFILE="${ZDOTDIR}/.zsh_history"
HISTSIZE='10000'
SAVEHIST="${HISTSIZE}"
export EDITOR="/usr/bin/vim"
export TMP="$HOME/tmp"
export TEMP="$TMP"
export TMPDIR="$TMP"
export TMPPREFIX="${TMPDIR}/zsh"
if [ ! -d "${TMP}" ]; then mkdir "${TMP}"; fi
if ! [[ "${PATH}" =~ "^${HOME}/bin" ]]; then
export PATH="${HOME}/bin:${PATH}"
fi
# Not all servers have terminfo for rxvt-256color. :<
if [ "${TERM}" = 'rxvt-256color' ] && ! [ -f '/usr/share/terminfo/r/rxvt-256color' ] && ! [ -f '/lib/terminfo/r/rxvt-256color' ] && ! [ -f "${HOME}/.terminfo/r/rxvt-256color" ]; then
export TERM='rxvt-unicode'
fi
# Colors.
red='\e[0;31m'
RED='\e[1;31m'
green='\e[0;32m'
GREEN='\e[1;32m'
yellow='\e[0;33m'
YELLOW='\e[1;33m'
blue='\e[0;34m'
BLUE='\e[1;34m'
purple='\e[0;35m'
PURPLE='\e[1;35m'
cyan='\e[0;36m'
CYAN='\e[1;36m'
NC='\e[0m'
# Functions
if [ -f '/etc/profile.d/prll.sh' ]; then
. "/etc/profile.d/prll.sh"
fi
run_under_tmux() {
# Run $1 under session or attach if such session already exist.
# $2 is optional path, if no specified, will use $1 from $PATH.
# If you need to pass extra variables, use $2 for it as in example below..
# Example usage:
# torrent() { run_under_tmux 'rtorrent' '/usr/local/rtorrent-git/bin/rtorrent'; }
# mutt() { run_under_tmux 'mutt'; }
# irc() { run_under_tmux 'irssi' "TERM='screen' command irssi"; }
# There is a bug in linux's libevent...
# export EVENT_NOEPOLL=1
command -v tmux >/dev/null 2>&1 || return 1
if [ -z "$1" ]; then return 1; fi
local name="$1"
if [ -n "$2" ]; then
local file_path="$2"
else
local file_path="command ${name}"
fi
if tmux has-session -t "${name}" 2>/dev/null; then
tmux attach -d -t "${name}"
else
tmux new-session -s "${name}" "${file_path}" \; set-option status \; set set-titles-string "${name} (tmux@${HOST})"
fi
}
t() { run_under_tmux rtorrent; }
irc() { run_under_tmux irssi "TERM='screen' command irssi"; }
over_ssh() {
if [ -n "${SSH_CLIENT}" ]; then
return 0
else
return 1
fi
}
reload () {
exec "${SHELL}" "$@"
}
confirm() {
local answer
echo -ne "zsh: sure you want to run '${YELLOW}$@${NC}' [yN]? "
read -q answer
echo
if [[ "${answer}" =~ ^[Yy]$ ]]; then
command "${=1}" "${=@:2}"
else
return 1
fi
}
confirm_wrapper() {
if [ "$1" = '--root' ]; then
local as_root='true'
shift
fi
local runcommand="$1"; shift
if [ "${as_root}" = 'true' ] && [ "${USER}" != 'root' ]; then
runcommand="sudo ${runcommand}"
fi
confirm "${runcommand}" "$@"
}
poweroff() { confirm_wrapper --root $0 "$@"; }
reboot() { confirm_wrapper --root $0 "$@"; }
hibernate() { confirm_wrapper --root $0 "$@"; }
detox() {
if [ "$#" -ge 1 ]; then
confirm detox "$@"
else
command detox "$@"
fi
}
has() {
local string="${1}"
shift
local element=''
for element in "$@"; do
if [ "${string}" = "${element}" ]; then
return 0
fi
done
return 1
}
begin_with() {
local string="${1}"
shift
local element=''
for element in "$@"; do
if [[ "${string}" =~ "^${element}" ]]; then
return 0
fi
done
return 1
}
termtitle() {
case "$TERM" in
rxvt*|xterm|nxterm|gnome|screen|screen-*)
local prompt_host="${(%):-%m}"
local prompt_user="${(%):-%n}"
local prompt_char="${(%):-%~}"
case "$1" in
precmd)
printf '\e]0;%s@%s: %s\a' "${prompt_user}" "${prompt_host}" "${prompt_char}"
;;
preexec)
printf '\e]0;%s [%s@%s: %s]\a' "$2" "${prompt_user}" "${prompt_host}" "${prompt_char}"
;;
esac
;;
esac
}
git_check_if_worktree() {
# This function intend to be only executed in chpwd().
# Check if the current path is in git repo.
# We would want stop this function, on some big git repos it can take some time to cd into.
if [ -n "${skip_zsh_git}" ]; then
git_pwd_is_worktree='false'
return 1
fi
# The : separated list of paths where we will run check for git repo.
# If not set, then we will do it only for /root and /home.
if [ "${UID}" = '0' ]; then
# running 'git' in repo changes owner of git's index files to root, skip prompt git magic if CWD=/home/*
git_check_if_workdir_path="${git_check_if_workdir_path:-/root:/etc}"
else
git_check_if_workdir_path="${git_check_if_workdir_path:-/home}"
git_check_if_workdir_path_exclude="${git_check_if_workdir_path_exclude:-${HOME}/_sshfs}"
fi
if begin_with "${PWD}" ${=git_check_if_workdir_path//:/ }; then
if ! begin_with "${PWD}" ${=git_check_if_workdir_path_exclude//:/ }; then
local git_pwd_is_worktree_match='true'
else
local git_pwd_is_worktree_match='false'
fi
fi
if ! [ "${git_pwd_is_worktree_match}" = 'true' ]; then
git_pwd_is_worktree='false'
return 1
fi
# todo: Prevent checking for /.git or /home/.git, if PWD=/home or PWD=/ maybe...
# damn annoying RBAC messages about Access denied there.
if [ -d '.git' ] || [ "$(git rev-parse --is-inside-work-tree 2> /dev/null)" = 'true' ]; then
git_pwd_is_worktree='true'
git_worktree_is_bare="$(git config core.bare)"
else
unset git_branch git_worktree_is_bare
git_pwd_is_worktree='false'
fi
}
git_branch() {
git_branch="$(git symbolic-ref HEAD 2>/dev/null)"
git_branch="${git_branch##*/}"
git_branch="${git_branch:-no branch}"
}
git_dirty() {
if [ "${git_worktree_is_bare}" = 'false' ] && [ -n "$(git status --untracked-files='no' --porcelain)" ]; then
git_dirty='%F{green}*'
else
unset git_dirty
fi
}
precmd() {
# Set terminal title.
termtitle precmd
if [ "${git_pwd_is_worktree}" = 'true' ]; then
git_branch
git_dirty
git_prompt=" %F{blue}[%F{253}${git_branch}${git_dirty}%F{blue}]"
else
unset git_prompt
fi
}
preexec() {
# Set terminal title along with current executed command pass as second argument
termtitle preexec "${(V)1}"
}
chpwd() {
git_check_if_worktree
}
man() {
if command -v vimmanpager >/dev/null 2>&1; then
PAGER="vimmanpager" command man "$@"
else
command man "$@"
fi
}
# Are we running under grsecurity's RBAC?
rbac_auth() {
local auth_to_role='admin'
if [ "${USER}" = 'root' ]; then
if ! grep -qE '^RBAC:' "/proc/self/status" && command -v gradm > /dev/null 2>&1; then
echo -e "\n${BLUE}*${NC} ${GREEN}RBAC${NC} Authorize to '${auth_to_role}' RBAC role."
gradm -a "${auth_to_role}"
fi
fi
}
#rbac_auth
# Check if we started zsh in git worktree, useful with tmux when your new zsh may spawn in source dir.
git_check_if_worktree
if [ "${git_pwd_is_worktree}" = 'true' ]; then
git_branch
git_dirty
git_prompt=" %F{blue}[%F{253}${git_branch}${git_dirty}%F{blue}]"
else
unset git_prompt
fi
# Le features!
# extended globbing, awesome!
setopt extendedGlob
# zmv - a command for renaming files by means of shell patterns.
autoload -U zmv
# zargs, as an alternative to find -exec and xargs.
autoload -U zargs
# Turn on command substitution in the prompt (and parameter expansion and arithmetic expansion).
setopt promptsubst
# Control-x-e to open current line in $EDITOR, awesome when writting functions or editing multiline commands.
autoload -U edit-command-line
zle -N edit-command-line
bindkey '^x^e' edit-command-line
# Include user-specified configs.
if [ ! -d "${ZSHDDIR}" ]; then
mkdir -p "${ZSHDDIR}" && echo "# Put your user-specified config here." > "${ZSHDDIR}/example.zsh"
fi
for zshd in $(ls -A ${HOME}/.config/zsh.d/^*.(z)sh$); do
. "${zshd}"
done
# Completion.
autoload -Uz compinit
compinit
zstyle ':completion:*' matcher-list 'm:{a-z}={A-Z}'
zstyle ':completion:*' completer _expand _complete _ignored _approximate
zstyle ':completion:*' menu select=2
zstyle ':completion:*' select-prompt '%SScrolling active: current selection at %p%s'
zstyle ':completion::complete:*' use-cache 1
zstyle ':completion:*:descriptions' format '%U%F{cyan}%d%f%u'
# If running as root and nice >0, renice to 0.
if [ "$USER" = 'root' ] && [ "$(cut -d ' ' -f 19 /proc/$$/stat)" -gt 0 ]; then
renice -n 0 -p "$$" && echo "# Adjusted nice level for current shell to 0."
fi
# Fancy prompt.
if over_ssh && [ -z "${TMUX}" ]; then
prompt_is_ssh='%F{blue}[%F{red}SSH%F{blue}] '
elif over_ssh; then
prompt_is_ssh='%F{blue}[%F{253}SSH%F{blue}] '
else
unset prompt_is_ssh
fi
case $USER in
root)
PROMPT='%B%F{cyan}%m%k %(?..%F{blue}[%F{253}%?%F{blue}] )${prompt_is_ssh}%B%F{blue}%1~${git_prompt}%F{blue} %# %b%f%k'
;;
*)
PROMPT='%B%F{blue}%n@%m%k %(?..%F{blue}[%F{253}%?%F{blue}] )${prompt_is_ssh}%B%F{cyan}%1~${git_prompt}%F{cyan} %# %b%f%k'
;;
esac
# Ignore lines prefixed with '#'.
setopt interactivecomments
# Ignore duplicate in history.
setopt hist_ignore_dups
# Prevent record in history entry if preceding them with at least one space
setopt hist_ignore_space
# Nobody need flow control anymore. Troublesome feature.
#stty -ixon
setopt noflowcontrol
# Fix for tmux on linux.
case "$(uname -o)" in
'GNU/Linux')
export EVENT_NOEPOLL=1
;;
esac
# Aliases
alias cp='cp -iv'
alias rcp='rsync -v --progress'
alias rmv='rsync -v --progress --remove-source-files'
alias mv='mv -iv'
alias rm='rm -iv'
alias rmdir='rmdir -v'
alias ln='ln -v'
alias chmod="chmod -c"
alias chown="chown -c"
if command -v colordiff > /dev/null 2>&1; then
alias diff="colordiff -Nuar"
else
alias diff="diff -Nuar"
fi
alias grep='grep --colour=auto'
alias egrep='egrep --colour=auto'
alias ls='ls --color=auto --human-readable --group-directories-first --classify'
# Keys.
case $TERM in
rxvt*|xterm*)
bindkey "^[[7~" beginning-of-line #Home key
bindkey "^[[8~" end-of-line #End key
bindkey "^[[3~" delete-char #Del key
bindkey "^[[A" history-beginning-search-backward #Up Arrow
bindkey "^[[B" history-beginning-search-forward #Down Arrow
bindkey "^[Oc" forward-word # control + right arrow
bindkey "^[Od" backward-word # control + left arrow
bindkey "^H" backward-kill-word # control + backspace
bindkey "^[[3^" kill-word # control + delete
;;
linux)
bindkey "^[[1~" beginning-of-line #Home key
bindkey "^[[4~" end-of-line #End key
bindkey "^[[3~" delete-char #Del key
bindkey "^[[A" history-beginning-search-backward
bindkey "^[[B" history-beginning-search-forward
;;
screen|screen-*)
bindkey "^[[1~" beginning-of-line #Home key
bindkey "^[[4~" end-of-line #End key
bindkey "^[[3~" delete-char #Del key
bindkey "^[[A" history-beginning-search-backward #Up Arrow
bindkey "^[[B" history-beginning-search-forward #Down Arrow
bindkey "^[Oc" forward-word # control + right arrow
bindkey "^[Od" backward-word # control + left arrow
bindkey "^H" backward-kill-word # control + backspace
bindkey "^[[3^" kill-word # control + delete
;;
esac
bindkey "^R" history-incremental-pattern-search-backward
bindkey "^S" history-incremental-pattern-search-forward
if [ -f ~/.alert ]; then cat ~/.alert; fi
Thanks for all the help.
Last edited by Shark (2013-05-11 22:32:24)

Raynman wrote:
"This expression doesn't work", "It doesn't work" ...
Could you try being a bit more specific?
Firstly, I am sorry I didn't post the output. I should have known better.
Secondly, chill out.
I used the above regex with the grep command. The output from the terminal is:
zsh: bad pattern: ^[^#]
In bash it works perfectly.
If I issue "setopt re_match_pcre" I get the same output as above.
EDIT: If I issue "unsetopt no_match" it actually works, but I have to change the regex from "\^\[^#]" to "\^[^#]"; otherwise I get the same output as above. In bash both forms work.
Last edited by Shark (2013-05-11 22:07:21)

Problem in creating a Regular Expression with gnu

Hi All,
I am trying to create a regular expression using the GNU regexp package API:
gnu.regexp.RE;
I need to validate the browser's (MSIE) userAgent with my regular expression.
The userAgent looks like this: First one ==> Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
I wrote a regular expression like this:
Mozilla.*(.*)\\s*(.*)compatible;\\s*MSIE(.*)\\s*(.*)([0-9]\\.[0-9])(.*);\\s*(.*)Windows(.*)\\s*NT(.*)\\s*5.0(.*)
Actually this validates my userAgent and returns true. My problem is that it also returns true if the userAgent has more words at the end after Windows NT 5.0, like Second one ==> Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) Testing
I want the regular expression pattern to validate the first one and return true for it, and to return false for the second one.
My code is:
import gnu.regexp.RE;
import gnu.regexp.REException;
public class TestRegexp
{
    // Returns true when the user agent matches the pattern.
    public static boolean getUserAgentDetails(String userAgent) throws REException
    {
        RE regexp = new RE("Mozilla.*(.*)\\s*(.*)compatible;\\s*MSIE(.*)\\s*(.*)([0-9]\\.[0-9])(.*);\\s*(.*)Windows(.*)\\s*NT(.*)\\s*5.0(.*)");
        return regexp.isMatch(userAgent);
    }
    public static void main(String[] a) throws REException
    {
        String userAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)";
        boolean regoutput = getUserAgentDetails(userAgent);
        System.out.println("***** regoutput is ****** " + regoutput);
    }
}
Please help me in solving this.
Thanks in Advance..
thanx,
krishna

Of course, I can do the comparison with simple string matching,
but the problem is that the userAgent I want to support covers all MSIE versions from 5.0 onwards, so the IE version will differ, like MSIE 6.0 or MSIE 5.5.
Anyway, I will try with StringTokenizer once.
It seems that will do the job.
Thanks,
krishna
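
The trailing (.*) in the pattern above accepts any text after "Windows NT 5.0", and the unescaped parentheses are groups rather than literal brackets. A minimal sketch of a stricter check, written in Python's re module rather than gnu.regexp, so it only illustrates the idea; the names UA_PATTERN and is_supported_msie and the exact layout of the user agent string are assumptions, not part of the original code:

```python
import re

# Hypothetical pattern: literal parentheses are escaped and the MSIE
# major/minor version is captured in two groups.
UA_PATTERN = re.compile(
    r"Mozilla/\d+\.\d+ \(compatible; MSIE (\d+)\.(\d+); Windows NT 5\.0\)"
)


def is_supported_msie(user_agent, minimum=(5, 0)):
    """Return True for an MSIE user agent of at least the minimum version."""
    # fullmatch requires the pattern to cover the whole string,
    # so trailing text such as " Testing" makes the check fail.
    match = UA_PATTERN.fullmatch(user_agent)
    if match is None:
        return False
    version = (int(match.group(1)), int(match.group(2)))
    return version >= minimum


print(is_supported_msie("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"))
print(is_supported_msie("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) Testing"))
```

Capturing the version and comparing it as numbers keeps the "5.0 onwards" rule out of the pattern itself.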

Introduction to regular expressions ...

I'm well aware that there are already some articles on this topic, but some people asked me to share some of my knowledge about it. Please take a look at this first part and let me know if you find it useful. If so, I'm going to continue writing more parts using more and more complicated expressions. If you have questions or problems that you think could be solved through regular expressions, please post them.
Introduction
Oracle has always provided some character/string functions in its PL/SQL command set, such as SUBSTR, REPLACE or TRANSLATE. With 10g, Oracle finally gave us, the users, the developers and of course the DBAs, regular expressions. However, regular expressions, due to their sometimes cryptic rules, seem to be overlooked quite often, despite the existence of some very interesting use cases. Being one of the advocates of regular expressions, I thought I'd give the interested audience an introduction to these new functions in several installments.
Having fun with regular expressions - Part 1
Oracle offers the use of regular expressions through several functions: REGEXP_INSTR, REGEXP_SUBSTR, REGEXP_REPLACE and REGEXP_LIKE. The second part of each function name already gives away its purpose: INSTR for finding a position inside a string, SUBSTR for extracting a part of a string, REPLACE for replacing parts of a string. REGEXP_LIKE is a special case since it could be compared to the LIKE operator and is therefore usually used in comparisons like IF statements or WHERE clauses.
Regular expressions excel, in my opinion, at searching for and extracting strings, whether for finding or replacing certain strings or for checking certain formatting criteria. They're not very good at formatting strings themselves, except for some special cases I'm going to demonstrate.
If you're not familiar with regular expressions, you should take a look at the definition in Oracle's user guide Using Regular Expressions With Oracle Database, and please note that there have been some changes and advancements in 10g2. I'll provide examples that should work on both versions.
Some of you probably already encountered this problem: checking a number inside a string, because, for whatever reason, a column was defined as VARCHAR2 and not as NUMBER as one would have expected.
Let's check for all rows where column col1 does NOT include an unsigned integer. I'll use this SELECT for demonstrating different values and search patterns:
WITH t AS (SELECT '456' col1
             FROM dual
            UNION
           SELECT '123x'
             FROM dual
            UNION
           SELECT 'x123'
             FROM dual
            UNION
           SELECT 'y'
             FROM dual
            UNION
           SELECT '+789'
             FROM dual
            UNION
           SELECT '-789'
             FROM dual
            UNION
           SELECT '159-'
             FROM dual
            UNION
           SELECT '-1-'
             FROM dual
          )
SELECT t.col1
FROM t
WHERE NOT REGEXP_LIKE(t.col1, '^[0-9]+$')
;
Let's take a look at the 2nd argument of this REGEXP function: '^[0-9]+$'. Translated it would mean: start at the beginning of the string, check if there's one or more characters in the range between '0' and '9' (also called a matching character list) until the end of this string. "^", "[", "]", "+", "$" are all metacharacters.
To understand regular expressions, you have to "think" in regular expressions. Each regular expression tries to "fit" an available string into its pattern and returns a result, being successful or not, depending on the function. The "art" of using regular expressions is to construct the right search pattern for a certain task. Using functions like TRANSLATE or REPLACE has already taught you to use search patterns; regular expressions are just an extension of this paradigm. Another side note: most of the search patterns are placeholders for single characters, not strings.
I'll take this example a bit further. What would happen if we removed the "$" in our example? "$" means: (until the) end of a string. Without this, the expression would only search for digits from the beginning until it encounters either another character or the end of the string. So this time, '123x' would be removed from the selection since it does fit the pattern.
Another change: we will keep the "$" but remove the "^". This character has several meanings, but in this case it declares: (start from the) beginning of a string. Without it, the function will search for a part of a string that has only digits until the end of the searched string. 'x123' would now be removed from our selection.
Now there's a question: what happens if I remove both, "^" and "$"? Well, just think about it. We now ask to find any string that contains at least one or more digits, so both '123x' and 'x123' will not show up in the result.
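
The effect of dropping "^" and "$" can be replayed outside the database. This is a rough sketch using Python's re module, not Oracle's engine; for these simple patterns the two should agree, but treat that as an assumption. The helper name not_like is made up here and mimics WHERE NOT REGEXP_LIKE:

```python
import re

# Test values from the WITH clause above.
values = ["456", "123x", "x123", "y", "+789", "-789", "159-", "-1-"]


def not_like(pattern):
    """Return the values the pattern does NOT find (like WHERE NOT REGEXP_LIKE)."""
    return [v for v in values if re.search(pattern, v) is None]


# Both anchors, only "^", only "$", no anchors at all.
for pattern in (r"^[0-9]+$", r"^[0-9]+", r"[0-9]+$", r"[0-9]+"):
    print(f"{pattern!r:12} rejects {not_like(pattern)}")
```

With no anchors at all, only 'y' is left in the result, because every other value contains at least one digit somewhere.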
So what if I want to look for signed integers, given that "+" is also used in search expressions? Escaping is the name of the game. We'll just use '^\+[0-9]+$'. Did you notice the "\" before the first "+"? This is now a search pattern for the plus sign.
Should signed integers include negative numbers as well? Of course they should, and I'll once again use a matching character list. In this list, I don't need to do escaping, although it is possible. The result string would now look like this: '^[+-]?[0-9]+$'. Did you notice the "?"? This is another metacharacter that changes the placeholder for plus and minus to an optional placeholder, which means: if there's a "+" or "-", that's ok, if there's none, that's also ok. Only if there's a different character, then again the search pattern will fail.
Addendum: From this point on, I found a mistake in my examples. If you had tested my old examples with test data that included strings with multiple signs, like "--", "-+", "++", they would have been filtered out by the SELECT statement. I mistakenly used the "*" instead of the "?" operator. The reason why this is a bad idea can also be found in the user guide: the "*" metacharacter is defined as 0 to multiple occurrences.
Looking at the values, one could ask the question: what about the integers with a trailing sign? Quite simple, right? Let's just add another '[+-]?' and the search pattern would look like this: '^[+-]?[0-9]+[+-]?$'.
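
The difference between "*" and "?" on the sign can be sketched quickly in Python's re module (an illustration only, not Oracle's engine; the sample strings are made up for this check):

```python
import re

loose = re.compile(r"^[+-]*[0-9]+$")   # "*": zero or more signs
strict = re.compile(r"^[+-]?[0-9]+$")  # "?": zero or one sign

samples = ["7", "-7", "+7", "--7", "-+7", "++7"]

accepted_loose = [s for s in samples if loose.search(s)]
accepted_strict = [s for s in samples if strict.search(s)]

print("loose :", accepted_loose)
print("strict:", accepted_strict)
```

The "*" version lets every sample through, including the multi-sign strings, while the "?" version keeps only the first three.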
Wait a minute, what happened to the row with the column value "-1-"?
You probably already guessed it: the new pattern qualifies this one also as a valid string. I could now split this pattern into several conditions combined through a logical OR, but there's something even better: a logical OR inside the regular expression. Its symbol is "|", the pipe sign.
Changing the search pattern again to something like this '^[+-]?[0-9]+$|^[0-9]+[+-]?$' [1] would now return the "-1-" value. Do I have to duplicate the same elements like "^" and "$"? And what about more complicated, repeating elements in future examples? That's where subexpressions/grouping comes into play. If I want only certain parts of the search pattern to use an OR operator, I can put those inside round brackets. '^([+-]?[0-9]+|[0-9]+[+-]?)$' serves the same purpose and allows for further checks without duplicating the whole pattern.
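
To see why the grouped alternation is needed, the two patterns can be compared on the test values. Again a sketch in Python's re module under the assumption that it treats these patterns like Oracle does; the variable and function names are invented for the illustration:

```python
import re

values = ["456", "123x", "x123", "y", "+789", "-789", "159-", "-1-"]

# Sign allowed on both ends at the same time.
both_optional = r"^[+-]?[0-9]+[+-]?$"
# Sign allowed in front OR at the end, never both.
either_end = r"^([+-]?[0-9]+|[0-9]+[+-]?)$"


def accepted(pattern):
    """Return the values the pattern finds (like WHERE REGEXP_LIKE)."""
    return [v for v in values if re.search(pattern, v)]


print(accepted(both_optional))
print(accepted(either_end))
```

Only the first pattern accepts '-1-'; the grouped version rejects it while still allowing '159-' and '-789'.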
Now looking for integers is nice, but what about decimal numbers? Those may be a bit more complicated, but all I have to do is again to think in (meta) characters. I'll just use an example where the decimal point is represented by ".", which again needs escaping, since it's also the place holder in regular expressions for "any character".
Valid decimals in my example would be ".0", "0.0", "0.", "0" (integer of course) but not ".". If you want, you can test it with the TO_NUMBER function. Finding such an unsigned decimal number could then be formulated like this: from the beginning of a string we will either allow a decimal point plus at least one digit OR at least one digit plus an optional decimal point, optionally followed by any number of digits. Think about it for a minute, how would you formulate such a search pattern?
Compare your solution to this one:
'^(\.[0-9]+|[0-9]+(\.[0-9]*)?)$'
Addendum: Here I have to use both "?" and "*" to make sure that I can have 0 to many digits after the decimal point, but only 0 to 1 occurrences of this substring. Otherwise, strings like "1.9.9.9" would be possible if I wrote it like this:
'^(\.[0-9]+|[0-9]+(\.[0-9]*)*)$'
Some of you might now say: Hey, what about signed decimal numbers? You could of course combine all the ideas so far and you will end up with a very long and almost unreadable search pattern, or you start combining several regular expression functions. Think about it: Why put all the search patterns into one function? Why not split those into several steps like "check for a valid decimal" and "check for sign"?
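
The unsigned decimal pattern and its too-permissive "*" variant can be checked side by side. A small sketch in Python's re module (not Oracle); the sample list is put together from the values named in the text:

```python
import re

# "?" on the group: at most one decimal part.
decimal = re.compile(r"^(\.[0-9]+|[0-9]+(\.[0-9]*)?)$")
# "*" on the group: any number of decimal parts, which is too loose.
too_loose = re.compile(r"^(\.[0-9]+|[0-9]+(\.[0-9]*)*)$")

samples = [".0", "0.0", "0.", "0", ".", "1.9.9.9"]

ok_decimal = [s for s in samples if decimal.search(s)]
ok_too_loose = [s for s in samples if too_loose.search(s)]

print("decimal  :", ok_decimal)
print("too loose:", ok_too_loose)
```

Both versions reject the lone ".", but only the "*" variant lets "1.9.9.9" through.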
I'll just use another SELECT to show what I want to do:
WITH t AS (SELECT '0' col1
             FROM dual
            UNION
           SELECT '0.'
             FROM dual
            UNION
           SELECT '.0'
             FROM dual
            UNION
           SELECT '0.0'
             FROM dual
            UNION
           SELECT '-1.0'
             FROM dual
            UNION
           SELECT '.1-'
             FROM dual
            UNION
           SELECT '.'
             FROM dual
            UNION
           SELECT '-1.1-'
             FROM dual
          )
SELECT t.*
FROM t
;
From this select, the only rows I need to find are those with the column values "." and "-1.1-". I'll start this with a check for valid signs. Since I want to combine this with the check for valid decimals, I'll first try to extract a substring with valid signs through the REGEXP_SUBSTR function:
NVL(REGEXP_SUBSTR(t.col1, '^([+-]?[^+-]+|[^+-]+[+-]?)$'), ' ')
Remember the OR operator and the matching character lists? But several "^"? Some of the metacharacters inside a search pattern can have different meanings, depending on their positions and combination with other metacharacters. In this case, the pattern translates into: from the beginning of the string search for an optional "+" or "-" followed by at least one character that is not "+" or "-", up to the end of the string. The second pattern after the "|" OR operator does the same for a sign at the end of the string.
This only checks for a sign, but not whether there are also only digits and a decimal point inside the string. If the search fails, for example when we have more than one sign like in "-1.1-", the function returns NULL. NULL and LIKE don't go together very well, so we'll just add NVL with a default value that tells the LIKE to ignore this string, in this case a space.
All we have to do now is to combine the check for the sign and the check for a valid decimal number, but don't forget an option for the signs at the beginning or end of the string, otherwise your second check will fail on the signed decimals. Are you ready?
Does your solution look a bit like this?
WHERE NOT REGEXP_LIKE(NVL(REGEXP_SUBSTR(t.col1,
                           '^([+-]?[^+-]+|[^+-]+[+-]?)$'),
                           ' '),
                      '^[+-]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)[+-]?$'
                     )
Now the optional sign checks in the REGEXP_LIKE argument can be added to both ends, since the SUBSTR won't allow any string with signs on both ends. Thinking in regular expressions again.
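
The two-step check (first extract a string with at most one sign, then test it as a decimal) can be sketched like this. It uses Python's re module as a stand-in for REGEXP_SUBSTR and REGEXP_LIKE, and the names SIGN_CHECK, DECIMAL_CHECK and is_signed_decimal are invented for the illustration:

```python
import re

SIGN_CHECK = re.compile(r"^([+-]?[^+-]+|[^+-]+[+-]?)$")
DECIMAL_CHECK = re.compile(r"^[+-]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)[+-]?$")


def is_signed_decimal(text):
    """Two-step check: at most one sign, then a valid decimal."""
    # Step 1: stand-in for REGEXP_SUBSTR; fall back to a single space
    # when nothing matches, like NVL(..., ' ').
    match = SIGN_CHECK.search(text)
    candidate = match.group(0) if match else " "
    # Step 2: stand-in for REGEXP_LIKE on whatever step 1 returned.
    return DECIMAL_CHECK.search(candidate) is not None


values = ["0", "0.", ".0", "0.0", "-1.0", ".1-", ".", "-1.1-"]
rejected = [v for v in values if not is_signed_decimal(v)]
print(rejected)
```

The rejected list corresponds to the rows the WHERE NOT REGEXP_LIKE clause is meant to return: "." fails the decimal check, and "-1.1-" already fails the sign check.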
Continued in Introduction to regular expressions ... continued.
C.
Fixed some embarrassing typos ... and mistakes.
cd

Excellent write up CD. Very nice indeed. Hopefully you'll be completing parts 2 and 3 some time soon. And with any luck, your article will encourage others to do the same....I know there's a few I'd like to see and a few I'd like to have a go at writing too :-)
