Regular expression to add underscore before single capital

I am trying to convert attribute names that use camel-case-style capitalization to distinguish multiple words, e.g. VehicleColor, to use underscores instead, e.g. Vehicle_Color.
I have a regular expression that does this, but it breaks when an abbreviation consisting of multiple upper-case characters is present, e.g. AverageMPG becomes Average_M_P_G. I am trying to come up with a pattern that adds an underscore only before the first capital letter in a series, so that AverageMPG becomes Average_MPG.
SQL> select * from v$version where rownum = 1;
BANNER
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
SQL> with test_data as
     (
     select 'VehicleColor' str from dual union all
     select 'WeightClass' str from dual union all
     select 'AverageMPG' str from dual union all
     select 'HighMPG' str from dual union all
     select 'LowMPG' str from dual union all
     select 'ABS_System' str from dual
     )
select
     str,
     regexp_replace(str, '([A-Z])', '_\1', 2) result
from
     test_data;
STR          RESULT
VehicleColor Vehicle_Color
WeightClass  Weight_Class
AverageMPG   Average_M_P_G
HighMPG      High_M_P_G
LowMPG       Low_M_P_G
ABS_System   A_B_S__System
6 rows selected.
These are the results I would like, but I don't know how to modify the pattern so that the replacement acts only on the first capital letter in a series of capitals, or whether that is even possible.
STR          RESULT
VehicleColor Vehicle_Color
WeightClass Weight_Class
AverageMPG   Average_MPG
HighMPG      High_MPG
LowMPG       Low_MPG
ABS_System   ABS_System

with test_data as
        (
        select 'VehicleColor' str from dual union all
        select 'WeightClass' str from dual union all
        select 'AverageMPG' str from dual union all
        select 'HighMPG' str from dual union all
        select 'LowMPG' str from dual union all
        select 'ABS_System' str from dual
        )
        select str, replace(regexp_replace(replace(str,'_',' '), '([^[:upper:]])([[:upper:]]{1,})([^[:upper:]]|$)', '\1_\2\3' ),' ') result
        from test_data
STR          RESULT
VehicleColor Vehicle_Color
WeightClass  Weight_Class
AverageMPG   Average_MPG
HighMPG      High_MPG
LowMPG       Low_MPG
ABS_System   ABS_System
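
For anyone hitting the same problem outside the database, here is a minimal Python sketch of the same idea. It relies on lookarounds, which, as far as I know, Oracle's REGEXP functions do not offer, so it is not a drop-in replacement for the query above. The function name add_underscores is made up for illustration, and plain ASCII attribute names are assumed.

```python
import re

# Zero-width position with a lowercase letter on the left and an
# uppercase letter on the right, i.e. where a new word or abbreviation starts.
WORD_START = re.compile(r"(?<=[a-z])(?=[A-Z])")

def add_underscores(name: str) -> str:
    """Insert '_' before a capital only when it follows a lowercase letter."""
    return WORD_START.sub("_", name)

for s in ["VehicleColor", "WeightClass", "AverageMPG", "HighMPG", "LowMPG", "ABS_System"]:
    print(s, add_underscores(s))
```

Note that a name such as MPGValue would stay unchanged with this pattern, because no lowercase letter precedes the V.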

Similar Messages

Find text using regular expression and add highlight annotation

Hi Friends
Is it possible to find text using a regular expression and add a highlight annotation using a plugin?

A plugin can use the PDWordFinder to get a list of the words on a page, and their location. That's all that the API offers for searching. Of course, you can use a regular expression library to work with that word list.

Hi Friends, i have to validate the min.length attribute using Regular expression, The min.length accepts single value digit only which accepts =5 & min.length accepts =9 please guide me as soon as possible

dfdsfds

User, we need to know your jdev version!
It would be helpful if your question were not only in the header of the post.
Can you please rephrase your question? Somehow I don't understand your question.
Timo

Insert space before the capital letter in a word

Hi,
I have a column with list of names like
reports
admissionPage
topHighCost
requestedReports
There will be many rows like this. I want to add a space before each capital letter.
This is an RDL query. Is it possible to do this in SSRS, or should we do it in T-SQL? Please help me with this issue.
Thanks in advance..
BALUSUSRIHARSHA

It's going to be pretty complicated, in my opinion, to do this in SSRS.
For t-sql, I developed a function for you.
CREATE FUNCTION fn_text_with_space(@STRING VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @NEWSTRING VARCHAR(MAX) = '';
    DECLARE @COUNT INT = 1;
    -- Walk the string one character at a time
    WHILE @COUNT <= LEN(@STRING)
    BEGIN
        -- Case-sensitive comparison: true when the character equals its upper-case form
        IF SUBSTRING(@STRING, @COUNT, 1) COLLATE Latin1_general_CS_AS = UPPER(SUBSTRING(@STRING, @COUNT, 1)) COLLATE Latin1_general_CS_AS
        BEGIN
            SET @NEWSTRING = @NEWSTRING + ' ' + SUBSTRING(@STRING, @COUNT, 1);
        END
        ELSE
        BEGIN
            SET @NEWSTRING = @NEWSTRING + SUBSTRING(@STRING, @COUNT, 1);
        END
        SET @COUNT = @COUNT + 1;
    END
    RETURN @NEWSTRING;
END
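
If the names can be processed outside the database, the same transformation is a one-line regular expression. This is a minimal Python sketch, not a replacement for the T-SQL function above; the name space_before_capitals is made up for illustration and ASCII letters are assumed.

```python
import re

def space_before_capitals(name: str) -> str:
    """Insert a space before every capital letter that is not at the start."""
    # \B matches only between two word characters, so the start of the
    # string is excluded; (?=[A-Z]) looks ahead for a capital letter.
    return re.sub(r"\B(?=[A-Z])", " ", name)

for s in ["reports", "admissionPage", "topHighCost", "requestedReports"]:
    print(space_before_capitals(s))
```

Unlike the loop above, this leaves digits and a leading capital untouched, since it only looks at letters A to Z that follow another word character.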

Regular expression checker

hi,
i am new to regular expression. i would like to know if there is a tool to check regular expressions? the tool should based on the entered regular expression display the result.
thanks

import java.util.regex.*;
import java.awt.*;
import java.awt.event.*;
import javax.swing.*;

public class JavaRegxTest extends JFrame implements ActionListener{
JTextField regxInput, textInput;
JButton doMatchButton, resetButton, exitButton;
JTextArea resultsArea;

public JavaRegxTest(){
    super("JavaRegxTest");
    setDefaultCloseOperation(EXIT_ON_CLOSE);
    Container cp = getContentPane();
    cp.setLayout(new GridLayout(2, 1));
    JPanel upPanel = new JPanel(new GridLayout(5, 1));
    upPanel.add(new JLabel("regular expression:   "));
    upPanel.add(regxInput = new JTextField(50));
    upPanel.add(new JLabel("sample text:          "));
    upPanel.add(textInput = new JTextField(50));
    JPanel buttonPanel = new JPanel(new GridLayout(1, 3));
    buttonPanel.add(resetButton = new JButton("RESET"));
    buttonPanel.add(doMatchButton = new JButton("MATCH"));
    buttonPanel.add(exitButton = new JButton("EXIT"));
    upPanel.add(buttonPanel);
    resetButton.addActionListener(this);
    doMatchButton.addActionListener(this);
    exitButton.addActionListener(this);
    resultsArea = new JTextArea(20, 60);
    resultsArea.setEditable(false);
    JScrollPane jsp = new JScrollPane(resultsArea);
    cp.add(upPanel);
    cp.add(jsp);
    setSize(550, 700);
    setVisible(true);
}

public void actionPerformed(ActionEvent e){
    JButton btn = (JButton)(e.getSource());
    if (btn == exitButton){
      System.exit(0);
    }
    else if (btn == resetButton){
      reset();
    }
    else if (btn == doMatchButton){
      doMatch();
    }
}

/* clear text */
void reset(){
    regxInput.setText("");
    textInput.setText("");
    resultsArea.setText("");
}

/* display match results */
void doMatch(){
    resultsArea.setText(resultsArea.getText() + "REGX=" + regxInput.getText()
     + "\n" + "TEXT=" + textInput.getText() + "\n");
    try{
      Pattern pat = Pattern.compile(regxInput.getText());
      String sampleText = textInput.getText();
      Matcher mat = pat.matcher(sampleText);
      int gc = mat.groupCount();
      for (int i = 0; i <= gc; ++i){ //for each capture group
        resultsArea.setText(resultsArea.getText()
         + "GROUP" + i + " : \n"); //GROUP0 == whole match
        while (mat.find()){ //display every matched parts
          resultsArea.setText(resultsArea.getText()
           + " " + mat.group(i) + "\n");
        }
        mat.reset(sampleText); //go to next group
      }
    }
    catch(Exception e){
      resultsArea.setText(e.toString());
    }
}

public static void main(String[] args){
    JavaRegxTest jrt = new JavaRegxTest();
}
}

Regular expression for 2nd occurance of a substring in a string

Hi,
1)
i want to find the second occurrence of a substring in a string with regular expression so that i can modify that only.
Ex: i have a string like ---> axe,afn,sdk,jdi,afn,mki,mki
in this i want the second occurance of afn and change that one only...
which regular expression i have to use...
Note that ...i have to use regular expression only....no string manipulation methods...(strictly)
2)
How can i apply the multiple regular expressions multiple times on a single string ..i.e in the above instance i have to apply the same 2nd occurrence logic for
substring mki also. for this i have to use a single regular expression string that contains validations for both the sub strings mki and afn.
Thanks in advance,
Venkat

What do you mean by using a regex to get the index of a second substring? There is no method in Java that uses a regex to get the index of a substring.
There are various indexOf(...) methods for this:
String text = "axe,afn,sdk,jdi,afn,mki,mki";
String target = "afn";
int second = text.indexOf(target, text.indexOf(target)+1);
System.out.println("second="+second);
Of course you can find the index of a group like this:
Matcher m = Pattern.compile(target+".*?("+target+")").matcher(text);
System.out.println(m.find() ? "index="+m.start(1) : "nothing found");
but there is no single method that handles this: you'll have to call find() and then the start(...) method on the Matcher instance, so the indexOf(...) approach is the favourable one, IMO.
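
For the "regex only" part of the question, one way to change just the second occurrence is to capture everything up to it and put that back unchanged. Here is a minimal Python sketch of that idea; the helper name replace_second and the replacement strings XXX and YYY are made up for illustration.

```python
import re

def replace_second(text: str, target: str, replacement: str) -> str:
    """Replace only the second occurrence of target; no change if there is none."""
    # Group 1 captures the first occurrence plus everything up to the second one.
    pattern = re.compile("(" + re.escape(target) + ".*?)" + re.escape(target), re.DOTALL)
    # count=1 stops after the first (and only relevant) match.
    return pattern.sub(lambda m: m.group(1) + replacement, text, count=1)

text = "axe,afn,sdk,jdi,afn,mki,mki"
step1 = replace_second(text, "afn", "XXX")
step2 = replace_second(step1, "mki", "YYY")
print(step2)  # axe,afn,sdk,jdi,XXX,mki,YYY
```

Applying the helper once per substring, as above, covers the second part of the question without cramming both targets into one pattern.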

Regular expression help please. (extracting a string subset between two markers)

I haven't used regular expressions before, and I'm having trouble finding a regular expression to extract a string subset between two markers.
The string;
Header stuff I don't want
Header stuff I don't want
Header stuff I don't want
Header stuff I don't want
Header stuff I don't want
Header stuff I don't want
ERRORS 6
Info I want line 1
Info I want line 2
Info I want line 3
Info I want line 4
Info I want line 5
Info I want line 6
END_ERRORS
From the string above (this is read from a text file) I'm trying to extract the string subset between ERRORS 6 and END_ERRORS. The number of errors (6 in this case) can be any number 1 through 32, and the number of lines I want to extract will correspond with this number. I can supply this number from a calling VI if necessary.
My current solution, which works but is not very elegant;
(1) uses Match Regular Expression to the return the string after matching ERRORS 6
(2) uses Match Regular Expression returning all characters before match END_ERRORS of the string returned by (1)
Is there a way this can be accomplished using 1 Match Regular Expression? If so could anyone suggest how, together with an explanation of how the regular expression given works.
Many thanks
Alan

I used a character class to catch any word or whitespace characters. Putting this inside parentheses makes a submatch that you can get by expanding the Match Regular Expression node. The \d finds digits and the two *s repeat the previous term. So, \d* will find the '6', as well as '123456'.
Jim
You're entirely bonkers. But I'll tell you a secret. All the best people are. ~ Alice
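
The expression itself is not shown in the reply, so the following is only a reconstruction from the description (a character class of word and whitespace characters inside parentheses, plus \d*), written in Python instead of LabVIEW. The pattern and the shortened sample text are assumptions.

```python
import re

# Assumed pattern: "ERRORS", the count, then a captured run of word and
# whitespace characters up to the END_ERRORS marker.
BLOCK = re.compile(r"ERRORS \d*([\w\s]*)END_ERRORS")

def extract_errors(text: str) -> list:
    """Return the lines between 'ERRORS n' and 'END_ERRORS', or [] if absent."""
    m = BLOCK.search(text)
    return m.group(1).strip().splitlines() if m else []

sample = (
    "Header stuff I don't want\n"
    "ERRORS 2\n"
    "Info I want line 1\n"
    "Info I want line 2\n"
    "END_ERRORS\n"
)
print(extract_errors(sample))
```

A character class of this kind stops matching at punctuation, so error lines that contain commas or colons would need a wider class.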

Regular Expression. Select Statement. Carriage Return

Oracle 9i
Using SQLPLUS
I've read about regular expression and need some translation/explanation.
I have a large table containing a varchar2 (free text) column. Users may have inserted carriage returns when they entered the data. I need to locate rows that contain carriage returns, select and display them. Later I'll need to update those rows to replace the carriage returns with a space.
Can you assist with syntax. I believe use of a regular expression is required.
Thanks

"for single characters like <CR> TRANSLATE()" Doh. Never post at the end of a long day.
As the other posters have pointed out, one-for-one single character substitution is normally done with REPLACE(), although TRANSLATE() also works. The more normal role for TRANSLATE() is situations where you want to substitute multiple characters, e.g.
SQL> update <your table> set <your column> = replace (<your column> , chr(13)||chr(10), ' ');
This substitutes a space for a carriage return and line feed combination.
Cheers, APC

Regular expressions in Format Definition add-on

Hello experts,
I have a question about regular expressions. I am a newbie in regular expressions and I could use some help on this one. I tried for some 6 hours, but I can't solve it myself.
Summary of my problem:
In SAP Business One (patch level 42) it is possible to use bank statement processing. A file (full of regular expressions) is to be selected, so it can match certain criteria to the bank statement file. The bank statement file consists of a certain pattern (look at the attached code snippet).
:61:071222D208,00N026
:86:P 12345678BELASTINGDIENST       F8R03782497                $GH
$0000009                         BETALINGSKENM. 123456789123456
0 1234567891234560
:61:071225C758,70N078
:86:0116664495 REGULA B.V. HELPMESTRAAT 243 B 5371 AM HARDCITY HARD
CITY 48772-54314
:61:071225C425,05N078
:86:0329883585 J. MANSSHOT PATTRIOTISLAND 38 1996 PT HELMEN BIJBETA
LING VOOR RELOOP RMP1 SET ORDERNR* 69866 / SPOEDIG LEVEREN
:61:071225C850,00N078
:86:0105327212 POSE TELEFOONSTRAAT 43 6448 SL S-ROTTERDAM MIJN OR
DERNR. 53846 REF. MAIL 21-02
- I am in search of the right type of regular expression that is used by the Format Definition add-on (javascript, .NET, perl, JAVA, python, etc.)
Besides that I need the regular expressions below, so the Format Definition will match the right lines from my bankfile.
- a regular expression that selects lines starting with :61: and line :86: including next lines (if available), so in fact it has to select everything from :86: till :61: again.
- a regular expression that selects the bank account number (position 5-14) from lines starting with :86:
- a regular expression that selects all other info from lines starting with :86: (and following if any), so all positions that follow after the bank account number
I am looking forward to the right solutions, I can give more info if you need any.

Hello Hendri,
Q1: I am in search of the right type of regular expression that is used by the Format Definition add-on (javascript, .NET, perl, JAVA, python, etc.)
Answer: Format Definition uses .Net regular expression.
You may refer to the following examples. If necessary, I can send you a guide on how to use regular expressions in Format Definition. Thanks.
Example 6
Description:
To match a field with an optional field in front. For example, “:61:0711211121C216,08N051NONREF” or “:61:071121C216,08N051NONREF”, which consists of a record identification “:61:”, a date in the form of YYMMDD, another optional date MMDD, one or two characters to signify the direction of money flow, a numeric amount value and some other information. The target to be matched is the numeric amount value.
Regular expression:
(?<=:61:\d(\d)?[a-zA-Z]{1,2})((\d(,\d*)?)|(,\d))
Text:
:61:0711211121C216,08N051NONREF
Matches:
1
Tips:
1.     All the fields in front of the target field are described in the look behind assertion embraced by (?<= and ). Especially, the optional field is embraced by parentheses and then a “?” (question mark). The sub expression for amount is copied from example 1. You can compose your own regular expression for such cases in the form of (?<=REGEX_FOR_FIELDS_IN_FRONT)(REGEX_FOR_TARGET_FIELD), in which REGEX_FOR_FIELDS_IN_FRONT and REGEX_FOR_TARGET_FIELD are respectively the regular expression for the fields in front and the target field. Keep the parentheses therein.
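
Outside Format Definition, the same extraction can be tried without a look behind assertion by capturing the amount in a group. Python's re module only accepts fixed-width look behind, so this minimal sketch uses a capture group instead. The digit counts (6 for YYMMDD, 4 for MMDD) are taken from the description above, not from the expression as printed, so treat the pattern as an assumption.

```python
import re

# Assumed layout: ":61:", YYMMDD, optional MMDD, one or two letters,
# then the amount written with a decimal comma.
AMOUNT = re.compile(r":61:\d{6}(?:\d{4})?[A-Za-z]{1,2}(\d+(?:,\d*)?|,\d+)")

def amount_of(line: str):
    """Return the amount field of a :61: line, or None if the line does not match."""
    m = AMOUNT.search(line)
    return m.group(1) if m else None

print(amount_of(":61:0711211121C216,08N051NONREF"))
print(amount_of(":61:071121C216,08N051NONREF"))
```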
Example 7
Description:
Find all numbers in the free text description, which are possibly document identifications, e.g. for invoices
Regular expression:
(?<=\b)(?<!\.)\d+(?=\b)(?!\.)
Text:
:86:GIRO 6890316
ENERGETICA NATURA BENELU
AFRIKAWEG 14
HULST
3187-A1176
TRANSACTIEDATUM* 03-07-2007
Matches:
6
Tips:
1.     The regular expression given finds all digits between word boundaries except those with a prior dot or following dot; “.” (dot) is escaped as \.
2.     It may find some inaccurate matches, like the date in the text. If you want to exclude “-” (hyphen) as prior or following character, as was done for “.” (dot), the regular expression becomes (?<=\b)(?<!\.)(?<!-)\d+(?=\b)(?!\.)(?!-). The matches will be:
:86:GIRO 6890316
ENERGETICA NATURA BENELU
AFRIKAWEG 14
HULST
3187-A1176
TRANSACTIEDATUM* 03-07-2007
You may lose some real values like “3187” before the “-”.
Example 8
Description:
Find BP account number in 9 digits with a prior “P” or “0” in the first position of free text description
Regular expression:
(?<=^(P|0))\d
Text:
0000006681 FORTIS ASR BETALINGSCENTRUM BV
Matches:
1
Tips:
1.     Use positive look behind assertion (?<=PRIOR_KEYWORD) to express the prior keyword.
2.     “^” means that the match starts from the beginning of the text. If the text includes the record identification, you may include it also in the look behind assertion. For example,
:86:0000006681 FORTIS ASR BETALINGSCENTRUM BV
The regular expression becomes
(?<=:86:(P|0))\d
Example 9
Description:
Following example 8, to find the possible BP name after BP account number, which is composed of letter, dot or space.
Regular expression:
(?<=^(P|0)\d)[a-zA-Z. ]*
Text:
0000006681 FORTIS ASR BETALINGSCENTRUM BV
Matches:
1
Tips:
1.     In this case, put BP account number regular expression into the look behind assertion.
Example 10
Description:
Find the possible document identifications in a sub-record of :86: record. Sub-record is like “?00”, “?10” etc. A possible document identification sub-record is made up of the following parts:
•     keyword “RE”, “RG”, “R”, “INV”, “NR”, “NO”, “RECHN” or “RECHNUNG”, and
•     an optional group made up of following:
     - a separator of either a dot, hyphen or slash, and
     - an optional space, and
     - an optional string starting with keyword “NR” or “NO” followed by a separator of either a dot, hyphen or slash, and
     - an optional space
•     and finally document identification in digits
Regular expression:
(?<=\?\d(RE|RG|R|INV|NR|NO|RECHN|RECHNUNG)((\.|-|/)\s?((NR|NO)(\.|-|/))?\s?)?)\d+
Kind Regards
-Yatsea

Add regular expression to spotlite!

hello!
i don't really know where to put this request, so i just do it here:
please add regular expression search to spotlite!
would be really great and i think not that much of a work..

And even more effective would be to do a bit of digging before making suggestions to Apple.
For instance, a simple googling of "spotlight regex" immediately provides a suggestion or two. Checking Spotlight's query syntax might also lead in interesting directions. And, of course, Spotlight (which is a mighty <censored> excuse for a search engine anyway) is really mdfind, and we all know we can pipe mdfind results to grep, don't we?

Regular Expression - Extract words before the PLUS Sign ?

Dear All,
I had many words with having a symbol plus. I need to extract the words before the plus sign.
I am able to do this using String.indexOf or String.contains, but I would like to know whether there is any way to do this using a regular expression.
sample string
Kathire+san Output Kathire
World+islike Output World
Thanks,
J.Kathir

Here's one way.
import java.util.regex.Pattern;
String input = "abc+def";
Pattern pat = Pattern.compile("\\+");
String beforePlus = pat.split(input)[0];
Sun's Regular Expression Tutorial for Java
Regular-Expressions.info

Regular Expression for allowing user to add multiple email id's

Hi ,
right now i'm working on regex in javascript and which accepts only one single email id
Can any one help me with a Regular Expression which accepts multiple email id's in the "To" field.

Try this:
/^[\w.-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|biz|info|name|aero|jobs|museum)(?:[,; ]+[\w.-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|biz|info|name|aero|jobs|museum))*$/i

This is what I was trying to achieve. Thank you very much, it worked perfectly.

Hi friends,
I wanted to share with you all the regex I have written. It works perfectly fine; valid delimiters are "," and ";" (either a comma or a semicolon):
/^[\w\.-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|biz|info|name|aero|jobs|museum)((,|;)[\w\.-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|biz|info|name|aero|jobs|museum))*$/i
thnx,
madhun
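
An alternative to one long expression is to split the field on the delimiters first and check each address separately, which keeps the single-address pattern short. Below is a minimal Python sketch of that approach. The address pattern is deliberately loose and is an assumption for illustration, not a full e-mail validator, and the name valid_address_list is made up.

```python
import re

# Loose single-address check (assumption): local part, "@", domain, dot, TLD.
ADDRESS = re.compile(r"[\w.-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def valid_address_list(field: str) -> bool:
    """True when every comma- or semicolon-separated entry looks like an address."""
    parts = [p.strip() for p in re.split(r"[,;]", field)]
    return all(ADDRESS.fullmatch(p) is not None for p in parts)

print(valid_address_list("a@example.com, b@example.org;c@example.net"))
print(valid_address_list("a@example.com;not-an-email"))
```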

How can I add Regular Expression verify?

I need to write a Regular Expression for a date whose format is 'yyyy-mm-dd' ('2004-5-3' or '2004-05-03') and a Regular Expression for a phone number whose format is '12345678' or '1234567' or '13809441234'.
Thanks for any help.

The date format is a little tricky as there are many invalid combinations. You'd be better off calling TO_DATE with your user input and format mask. If an error is returned, the date is invalid.
For the phone number, assuming a valid phone number could be 7, 8, or 11 digits (it's a little difficult to tell what format you are suggesting), you could apply something like the following:
^[0-9]{7}([0-9]|[0-9]{4})?$
Regards.
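
Both suggestions can be tried quickly outside the database. This is a minimal Python sketch under that assumption: the phone pattern is the one given above, and datetime.strptime stands in for the TO_DATE check. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Phone pattern from the reply: 7 digits, optionally followed by 1 or 4 more.
PHONE = re.compile(r"^[0-9]{7}([0-9]|[0-9]{4})?$")

def is_valid_phone(s: str) -> bool:
    return PHONE.fullmatch(s) is not None

def is_valid_date(s: str) -> bool:
    """Let the date parser reject impossible dates instead of a regex."""
    try:
        datetime.strptime(s, "%Y-%m-%d")
        return True
    except ValueError:
        return False

print(is_valid_phone("13809441234"), is_valid_date("2004-02-30"))
```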

How to add regular expression in viewobject bind variables

Hi,
Am using java class to set the adf bind variables
vm.setWhereClause("FIRST_NAME = :fname");
vm.defineNamedWhereClauseParam("fname", null, null);
vm.setNamedWhereClauseParam("fname","Alana");
It works fine and it returns Alana
But what i want is to set a regular expression where it returns all the names that are starting with A.
I tried with this.
vm.setNamedWhereClauseParam("fname","A*");
also
vm.setNamedWhereClauseParam("fname","A%");
but neither worked.
Please help.
Thanks,
Hari

Perfect... Thanks Arun.
I tried including last name too. i.e when either of the one matches (first name or last name)
vm.setWhereClause("upper(FIRST_NAME) LIKE upper(:fname||'%')");
vm.defineNamedWhereClauseParam("fname", null, null);
vm.setNamedWhereClauseParam("fname","A");
vm.setWhereClause("upper(LAST_NAME) LIKE upper(:lname||'%')");
vm.defineNamedWhereClauseParam("lname", null, null);
vm.setNamedWhereClauseParam("lname","A");
But i guess we should not give it like this as it doesnt seem to work.
Hari

Introduction to regular expressions ...

I'm well aware that there are already some articles on that topic, some people asked me to share some of my knowledge on this topic. Please take a look at this first part and let me know if you find this useful. If yes, I'm going to continue on writing more parts using more and more complicated expressions - if you have questions or problems that you think could be solved through regular expression, please post them.
Introduction
Oracle has always provided some character/string functions in its PL/SQL command set, such as SUBSTR, REPLACE or TRANSLATE. With 10g, Oracle finally gave us, the users, the developers and of course the DBAs, regular expressions. However, regular expressions, due to their sometimes cryptic rules, seem to be overlooked quite often, despite the existence of some very interesting use cases. Being one of the advocates of regular expressions, I thought I'd give the interested audience an introduction to these new functions in several installments.
Having fun with regular expressions - Part 1
Oracle offers the use of regular expression through several functions: REGEXP_INSTR, REGEXP_SUBSTR, REGEXP_REPLACE and REGEXP_LIKE. The second part of each function already gives away its purpose: INSTR for finding a position inside a string, SUBSTR for extracting a part of a string, REPLACE for replacing parts of a string. REGEXP_LIKE is a special case since it could be compared to the LIKE operator and is therefore usually used in comparisons like IF statements or WHERE clauses.
Regular expressions excel, in my opinion, in searching and extracting strings, whether for finding or replacing certain strings or for checking certain formatting criteria. They're not very good at formatting strings themselves, except for some special cases I'm going to demonstrate.
If you're not familiar with regular expressions, you should take a look at the definition in Oracle's user guide Using Regular Expressions With Oracle Database, and please note that there have been some changes and advancements in 10gR2. I'll provide examples that should work on both versions.
Some of you probably already encountered this problem: checking a number inside a string, because, for whatever reason, a column was defined as VARCHAR2 and not as NUMBER as one would have expected.
Let's check for all rows where column col1 does NOT include an unsigned integer. I'll use this SELECT for demonstrating different values and search patterns:
WITH t AS (SELECT '456' col1
             FROM dual
            UNION
           SELECT '123x'
             FROM dual
            UNION
           SELECT 'x123'
             FROM dual
            UNION
           SELECT 'y'
             FROM dual
            UNION
           SELECT '+789'
             FROM dual
            UNION
           SELECT '-789'
             FROM dual
            UNION
           SELECT '159-'
             FROM dual
            UNION
           SELECT '-1-'
             FROM dual
          )
SELECT t.col1
FROM t
WHERE NOT REGEXP_LIKE(t.col1, '^[0-9]+$')
;
Let's take a look at the 2nd argument of this REGEXP function: '^[0-9]+$'. Translated it would mean: start at the beginning of the string, check if there's one or more characters in the range between '0' and '9' (also called a matching character list) until the end of this string. "^", "[", "]", "+", "$" are all Metacharacters.
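
To experiment with this pattern without a database at hand, the query can be mimicked in a few lines of Python. This is only a sketch: the helper name not_matching is made up, and Python's re module is assumed to behave like Oracle's engine for a pattern this simple.

```python
import re

values = ["456", "123x", "x123", "y", "+789", "-789", "159-", "-1-"]

def not_matching(pattern, rows):
    """Mimic WHERE NOT REGEXP_LIKE(col1, pattern): keep rows the pattern does not match."""
    return [v for v in rows if re.search(pattern, v) is None]

# Only '456' is an unsigned integer, so every other value is returned.
print(not_matching(r"^[0-9]+$", values))
```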
To understand regular expressions, you have to "think" in regular expressions. Each regular expression tries to "fit" an available string into its pattern and returns a result, successful or not, depending on the function. The "art" of using regular expressions is to construct the right search pattern for a certain task. Using functions like TRANSLATE or REPLACE has already taught you to use search patterns; regular expressions are just an extension of this paradigm. Another side note: most of the search patterns are placeholders for single characters, not strings.
I'll take this example a bit further. What would happen if we would remove the "$" in our example? "$" means: (until the) end of a string. Without this, this expression would only search digits from the beginning until it encounters either another character or the end of the string. So this time, '123x' would be removed from the SELECTION since it does fit into the pattern.
Another change: we will keep the "$" but remove the "^". This character has several meanings, but in this case it declares: (start from the) beginning of a string. Without it, the function will search for a part of a string that has only digits until the end of the searched string. 'x123' would now be removed from our selection.
Now there's a question: what happens if I remove both, "^" and "$"? Well, just think about it. We now ask to find any string that contains at least one or more digits, so both '123x' and 'x123' will not show up in the result.
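
The three variations are easy to compare side by side. Here is a minimal Python sketch of that comparison, using the same test values as the SELECT above; the helper name survivors is made up. One detail the text does not spell out: once "^" is dropped, '+789' and '-789' also disappear from the result, because they end in digits.

```python
import re

rows = ["456", "123x", "x123", "y", "+789", "-789", "159-", "-1-"]

def survivors(pattern):
    """Rows that would still be selected by WHERE NOT REGEXP_LIKE(col1, pattern)."""
    return [v for v in rows if re.search(pattern, v) is None]

print(survivors(r"^[0-9]+"))   # no "$": anything starting with a digit is dropped
print(survivors(r"[0-9]+$"))   # no "^": anything ending with a digit is dropped
print(survivors(r"[0-9]+"))    # no anchors: anything containing a digit is dropped
```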
So what if I want to look for signed integer, since "+" is also used for a search expression. Escaping is the name of the game. We'll just use '^\+[0-9]+$' Did you notice the "\" before the first "+"? This is now a search pattern for the plus sign.
Should signed integers include negative numbers as well? Of course they should, and I'll once again use a matching character list. In this list, I don't need to do escaping, although it is possible. The result string would now look like this: '^[+-]?[0-9]+$'. Did you notice the "?"? This is another metacharacter that changes the placeholder for plus and minus to an optional placeholder, which means: if there's a "+" or "-", that's ok, if there's none, that's also ok. Only if there's a different character, then again the search pattern will fail.
Addendum: From this on, I found a mistake in my examples. If you would have tested my old examples with test data that would have included multiple signs strings, like "--", "-+", "++", they would have been filtered by the SELECT statement. I mistakenly used the "*" instead of the "?" operator. The reason why this is a bad idea, can also be found in the user guide: the "*" meta character is defined as 0 to multiple occurrences.
Looking at the values, one could ask the question: what about the integers with a trailing sign? Quite simple, right? Let's just add another '[+-]?' and the search pattern would look like this: '^[+-]?[0-9]+[+-]?$'.
Wait a minute, what happened to the row with the column value "-1-"?
You probably already guessed it: the new pattern qualifies this one also as a valid string. I could now split this pattern into several conditions combined through a logical OR, but there's something even better: a logical OR inside the regular expression. Its symbol is "|", the pipe sign.
Changing the search pattern again to something like this '^[+-]?[0-9]+$|^[0-9]+[+-]?$' [1] would return now the "-1-" value. Do I have to duplicate the same elements like "^" and "$", what about more complicated, repeating elements in future examples? That's where subexpressions/grouping comes into play. If I want only certain parts of the search pattern using an OR operator, we can put those inside round brackets. '^([+-]?[0-9]+|[0-9]+[+-]?)$' serves the same purpose and allows for further checks without duplicating the whole pattern.
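
The effect of each step on the "-1-" row can be checked quickly. A minimal Python sketch with the same test values follows; the helper name rejected is made up, and Python's re module is assumed to agree with Oracle for these patterns.

```python
import re

rows = ["456", "123x", "x123", "y", "+789", "-789", "159-", "-1-"]

def rejected(pattern):
    """Rows selected by WHERE NOT REGEXP_LIKE(col1, pattern)."""
    return [v for v in rows if re.search(pattern, v) is None]

print(rejected(r"^[+-]?[0-9]+$"))                # leading sign only
print(rejected(r"^[+-]?[0-9]+[+-]?$"))           # sign on either end, lets '-1-' through
print(rejected(r"^([+-]?[0-9]+|[0-9]+[+-]?)$"))  # grouped OR, rejects '-1-' again
```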
Now looking for integers is nice, but what about decimal numbers? Those may be a bit more complicated, but all I have to do is again to think in (meta) characters. I'll just use an example where the decimal point is represented by ".", which again needs escaping, since it's also the place holder in regular expressions for "any character".
Valid decimals in my example would be ".0", "0.0", "0.", "0" (integer of course) but not ".". If you want, you can test it with the TO_NUMBER function. Finding such an unsigned decimal number could then be formulated like this: from the beginning of a string we will either allow a decimal point plus at least one digit OR at least one digit plus an optional decimal point followed by any number of digits, also optional. Think about it for a minute, how would you formulate such a search pattern?
Compare your solution to this one:
'^(\.[0-9]+|[0-9]+(\.[0-9]*)?)$'
Addendum: Here I have to use both "?" and "*" to make sure that I can have 0 to many digits after the decimal point, but only 0 to 1 occurrence of this substring. Otherwise, strings like "1.9.9.9" would be possible, if I wrote it like this:
'^(\.[0-9]+|[0-9]+(\.[0-9]*)*)$'
Some of you now might say: Hey, what about signed decimal numbers? You could of course combine all the ideas so far and you will end up with a very long and almost unreadable search pattern, or you start combining several regular expression functions. Think about it: Why put all the search patterns into one function? Why not split those into several steps like "check for a valid decimal" and "check for sign".
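
The difference between the "?" and the "*" version of the unsigned decimal pattern shows up only on inputs like "1.9.9.9". Here is a minimal Python sketch comparing the two; the names STRICT and LOOSE are made up for illustration.

```python
import re

STRICT = r"^(\.[0-9]+|[0-9]+(\.[0-9]*)?)$"  # at most one ".digits" group
LOOSE = r"^(\.[0-9]+|[0-9]+(\.[0-9]*)*)$"   # any number of ".digits" groups

def matches(pattern, s):
    return re.search(pattern, s) is not None

for s in [".0", "0.0", "0.", "0", ".", "1.9.9.9"]:
    print(s, matches(STRICT, s), matches(LOOSE, s))
```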
I'll just use another SELECT to show what I want to do:
WITH t AS (SELECT '0' col1
             FROM dual
            UNION
           SELECT '0.'
             FROM dual
            UNION
           SELECT '.0'
             FROM dual
            UNION
           SELECT '0.0'
             FROM dual
            UNION
           SELECT '-1.0'
             FROM dual
            UNION
           SELECT '.1-'
             FROM dual
            UNION
           SELECT '.'
             FROM dual
            UNION
           SELECT '-1.1-'
             FROM dual
          )
SELECT t.*
FROM t
;
From this select, the only rows I need to find are those with the column values "." and "-1.1-". I'll start this with a check for valid signs. Since I want to combine this with the check for valid decimals, I'll first try to extract a substring with valid signs through the REGEXP_SUBSTR function:
NVL(REGEXP_SUBSTR(t.col1, '^([+-]?[^+-]+|[^+-]+[+-]?)$'), ' ')
Remember the OR operator and the matching character collections? But several "^"? Some of the meta characters inside a search pattern can have different meanings, depending on their positions and combination with other meta characters. In this case, the pattern translates into: from the beginning of the string search for "+" or "-" followed by at least another character that is not "+" or "-". The second pattern after the "|" OR operator does the same for a sign at the end of the string.
This only checks for a sign but not whether there are also only digits and a decimal point inside the string. If the search string fails, for example when we have more than one sign like in the "-1.1-", the function returns NULL. NULL and LIKE don't go together very well, so we'll just add NVL with a default value that tells the LIKE to ignore this string, in this case a space.
All we have to do now is to combine the check for the sign and the check for a valid decimal number, but don't forget an option for the signs at the beginning or end of the string, otherwise your second check will fail on the signed decimals. Are you ready?
Does your solution look a bit like this?
WHERE NOT REGEXP_LIKE(NVL(REGEXP_SUBSTR(t.col1,
                           '^([+-]?[^+-]+|[^+-]+[+-]?)$'),
                          ' '),
                      '^[+-]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)[+-]?$'
                     )
Now the optional sign checks in the REGEXP_LIKE argument can be added to both ends, since the SUBSTR won't allow any string with signs on both ends. Thinking in regular expression again.
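
The two-step check can be mirrored outside the database to confirm which rows it flags. This is a minimal Python sketch under the assumption that Python's re module treats these patterns the same way Oracle does; the names SIGN, DECIMAL and is_invalid are made up, and the fallback to a single space stands in for NVL.

```python
import re

SIGN = r"^([+-]?[^+-]+|[^+-]+[+-]?)$"
DECIMAL = r"^[+-]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)[+-]?$"

def is_invalid(value: str) -> bool:
    """Mirror WHERE NOT REGEXP_LIKE(NVL(REGEXP_SUBSTR(col1, SIGN), ' '), DECIMAL)."""
    m = re.search(SIGN, value)
    candidate = m.group(0) if m else " "  # NVL(..., ' ')
    return re.search(DECIMAL, candidate) is None

values = ["0", "0.", ".0", "0.0", "-1.0", ".1-", ".", "-1.1-"]
print([v for v in values if is_invalid(v)])
```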
Continued in Introduction to regular expressions ... continued.
C.
Fixed some embarrassing typos ... and mistakes.
cd

Excellent write up CD. Very nice indeed. Hopefully you'll be completing parts 2 and 3 some time soon. And with any luck, your article will encourage others to do the same....I know there's a few I'd like to see and a few I'd like to have a go at writing too :-)
