Quantifiers in regular expressions?

What is the difference between Greedy quantifiers, Reluctant quantifiers and Possessive quantifiers? The java.util.regex.Pattern API doesn't really explain this.

If you understand the "independent expression" construct (written "(?>...)"), you already understand possessive quantifiers,
because a possessive quantifier is exactly equivalent to a
greedy one inside an independent expression (i.e. "a++" is the same as "(?>a+)").
If not, I'm not sure I can explain it quickly.
Imagine you have a regex of the form "Regex1Regex2",
where each of Regex1 and Regex2 contains an ordinary greedy quantifier. During a search, after Regex1 succeeds, the matcher tests Regex2 starting from where Regex1 ended. If Regex2 succeeds too, the match is reported. Otherwise,
the next variant for Regex1 (a shorter one, for a greedy quantifier) is picked, and Regex2 is tested from the new position. This process is called backtracking.
An ordinary greedy quantifier can offer as many variants
as there are characters in the initially captured string. A possessive one, by contrast, offers only the first variant and nothing else.
By using a possessive quantifier you skip every variant but the first, which can save the matcher a lot of work.
So the main advantage of a possessive quantifier is better performance. The drawback is that you can
lose some matches.
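
To make the three behaviours concrete, here is a minimal Java sketch. The class name, the helper method and the sample inputs are illustrative choices, not part of the original answer. It runs a greedy, a reluctant and a possessive version of the same pattern against one input, and then compares "a++" with "(?>a+)".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* grabs everything, then backs off until "foo" fits.
        System.out.println(firstMatch(".*foo", input));
        // Reluctant: .*? grabs nothing, then grows until "foo" fits.
        System.out.println(firstMatch(".*?foo", input));
        // Possessive: .*+ grabs everything and never gives anything back.
        System.out.println(firstMatch(".*+foo", input));

        // The possessive form and the independent group behave the same way.
        System.out.println(firstMatch("a+a", "aaa"));
        System.out.println(firstMatch("a++a", "aaa"));
        System.out.println(firstMatch("(?>a+)a", "aaa"));
    }
}
```

The greedy pattern finds the whole input, the reluctant one stops at the first "foo", and the possessive one finds nothing, which is the "lost match" drawback mentioned above.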

Similar Messages

  • Regular expressions and capture groups

    Hi everyone :)
    Is there a way to override the default behaviour of capture groups in regular expressions? More specifically I want to override this:
    "The captured input associated with a group is always the subsequence that the group most recently matched."
    For example, if I have a string that is this:
    * <comment one>
    * <comment two>
    <some text>
    I have a pattern of the form "(.*)(/\\*.*\\*/)(.*)" which will match multi-line comments. I have also specified the flag DOTALL so that the predefined character class '.' matches over line-breaks.
    If I apply this pattern to the above string I get comment two being captured, not comment one. This is because of the stipulation that I cited above.
    I need to be able to capture only the first match, and prevent the capture group from being overwritten by more recent matches.
    Is this possible? Any ideas?
    Thanks in advance.
    Kind regards,
    Ben Deany

    > Is there a way to override the default behaviour of
    > capture groups in regular expressions? More
    > specifically I want to override this:
    No, but you don't need to.
    > I have a pattern of the form "(.*)(/\\*.*\\*/)(.*)"
    > which will match multi-line comments.
    Comment two is captured by the second group because comment one is eaten by the first group. Use the reluctant quantifier "*?" on the . in the first group instead of the greedy quantifier "*" to get what is apparently the behavior you want. Then the first group will contain nothing, the second group will contain comments one and two, and the third group will contain the following text.
    .* is a very powerful thing to use. It will match everything in its path, guzzling text like moonshine at Mardi Gras. The only reason it doesn't match comment two as well is because then the expression as a whole would not match.
    The parentheses surrounding the first and third groups are not needed (unless you want to use group methods on them too).
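
    The effect described above can be checked with a short Java sketch. The sample input, the class name and the helper names are assumptions made for illustration. The last helper, which uses a reluctant quantifier inside the comment itself to pick out only the first comment, is not something the reply proposes; it is included only as one possible way to get "just comment one".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentGroups {
    // Returns groups 1..3 of a full match, or null if the input does not match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    // Assumption: finds only the first comment by making the inner .* reluctant.
    static String firstComment(String input) {
        Matcher m = Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "/* one */\n/* two */\nsome text";
        // Greedy first group: it swallows comment one, so group 2 is comment two.
        System.out.println(groups("(.*)(/\\*.*\\*/)(.*)", input)[1]);
        // Reluctant first group: group 1 is empty, group 2 spans both comments.
        System.out.println(groups("(.*?)(/\\*.*\\*/)(.*)", input)[1]);
        System.out.println(firstComment(input));
    }
}
```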

  • Regular Expressions in CS5.5 - something is wrong

    Hello Everybody,
    Please correct me, but I think, I found a serious problem with regular Expressions in Indesign CS5.5 (and possibly in other apps from CS5.5).
    Let's start with simple example:
    var range = "a-a,a,a-a,a";
    var regEx = /(a+-a+|a+)(,(a+-a+|a+))*/;
    alert( "Match:" +regEx.test(range)+"\nLeftContext: "+RegExp.leftContext+"\nRightContext: "+RegExp.rightContext );
    What I expected was a true match, with both the left and the right context empty. In InDesign CS3 that is what happens, BUT NOT in CS5.5.
    In CS5.5 it seems that only the first "a-a" is matched and the rest is returned as the rightContext. That looks like a big change (if not a parsing error in the RegExp engine).
    Please correct me if I am wrong.
    The second example - how to freeze ID CS5.5:
    var range = "a-a,a,a-a,a";
    var regEx = /(a+-a+|a+)(,(a+-a+|a+)){8,}/;
    alert( "Match:" +regEx.test(range)+"\nLeftContext: "+RegExp.leftContext+"\nRightContext: "+RegExp.rightContext );
    As you can see it differs only with the {8,} part instead of *
    Run it in CS5.5 and you will see that ID hangs (in CS3 of course it runs flawlessly).
    The third example - how to freeze ID CS5.5 in one line (I posted it earlier in the Photoshop forum because a similar problem was reported there earlier):
    alert((/(n|s)? /gmi).test('s') );
    As you can guess - it freezes the CS5.5 (CS3 passes the test).
    Please correct me if I am doing something wrong or it's the problem of Adobe.
    Best regards,
    Daniel Brylak

    Hi Daniel,
    Thanks for sharing. Really annoying indeed.
    Just to complete your diagnosis, what you describe about CS5.5 is the same in CS5, while CS4 behaves as CS3.
    var range = "a-a,a,a-a,a";
    var regEx = /(a+-a+|a+)(,(a+-a+|a+))*/;
    alert([
        "Match:" +regEx.test(range),
        "LeftContext: "+RegExp.leftContext.toSource(),        // => CS3/4: EMPTY -- CS5+: EMPTY
        "RightContext: "+RegExp.rightContext.toSource()        // => CS3/4: EMPTY -- CS5+: ",a,a-a,a"
        ].join('\r'));
    So there is a serious implementation problem of the RegExp object from ExtendScript CS5.
    I don't think it's related to the greedy modes. By default, JS RegExp quantifiers are greedy, and /a*/ still entirely captures "aaaaaa" in CS5+.
    By the way, you can make any quantifier non-greedy by adding ? after the quantifier, e.g.: /a*?/, /a+?/, etc.
    I guess that Adobe ExtendScript has a generic issue in updating the RegExp.lastIndex property in certain contexts—see http://forums.adobe.com/message/3719879#3719879 —which could explain several bugs such as the Negative Class bug —see http://forums.adobe.com/message/3510078#3510078 — or the problems you are mentioning today.
    @+
    Marc

  • How can I remove all content between two tags using Find/Replace regular expressions?

    This one is driving me bonkers...  I'm relatively new to regular expressions, but I'm trying to get Dreamweaver to remove all content between two tags in an XML document.  For example, let's say I have the following XML:
    <custom>
    <![CDATA[<p>Some text</p>
    <p>Some more text</p>]]>
    </custom>
    I'd like to do a Find/Replace that produces:
    <custom>
    </custom>
    In essence, I'd like to strip all of the content between two tags.  Ideally, I'd like to know how to strip the CDATA content as well, to return the following:
    <custom>
    <![CDATA[]]>
    </custom>
    I'd much appreciate any suggestions on accomplishing this.
    Many thanks!

    Thanks much for your response.  I found David's article to be a little thin with respect to examples using quantifiers in coordination with the wildcard metacharacters; however, I was able to cobble together a working expression through trial and error using the information he presented.  For posterity, here’s the solution:
    Find:
    <custom>[\d\D]*?</custom>
    Replace:
    <custom>
    <![CDATA[]]>
    </custom>
    I believe this literally translates to:
    [] = find anything in this range/character class
    \d = find any digit character (i.e. any number)
    \D = find any non-digit character (i.e. anything except numbers)
    *? = match zero or more times, but as few times as possible (i.e. match multiple characters per instance, but only match one instance at a time, or none at all)
    I’m still not sure how to effectively utilize the . wildcard.  For example, the following expression will not find content that ends with a number:
    <custom>.*?[\D]*?</ custom >
    I'm presuming this is because numbers aren't included in the \D metacharacter; however, shouldn't numbers be picked up by the .*? expression?
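
    A likely explanation, offered here as an assumption rather than something confirmed in the thread: in most regex engines the . wildcard does not match line breaks by default, so .*? cannot cross the newlines inside the element, while [\d\D] matches any character at all. The spaces inside "</ custom >" would also prevent a match. The Java sketch below illustrates the difference; Dreamweaver uses its own engine, so treat this as an analogy, and note that the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class StripBetweenTags {
    // Replaces everything between <custom> and </custom>, line breaks included.
    static String strip(String xml) {
        return xml.replaceAll("<custom>[\\d\\D]*?</custom>", "<custom>\n</custom>");
    }

    // Tests the "." version of the pattern with or without the DOTALL flag.
    static boolean dotVersionFinds(String xml, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile("<custom>.*?</custom>", flags).matcher(xml).find();
    }

    public static void main(String[] args) {
        String xml = "<custom>\n<![CDATA[<p>Some text 1</p>\n<p>More text 2</p>]]>\n</custom>";
        System.out.println(strip(xml));
        System.out.println(dotVersionFinds(xml, false)); // "." stops at line breaks
        System.out.println(dotVersionFinds(xml, true));  // DOTALL lets "." cross them
    }
}
```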

  • Regular expressions in eclipse

    I am trying to filter out some search results in eclipse with regular expressions, but am not having any luck. I am trying to find all jsp pages that contain html:text tags that DON'T have a maxlength attribute.
    I have tried this, but it doesn't seem to work:
    html:text .*[^m][^a][^x][^l|L][^e][^n][^g][^t][^h]
    How do you specify to NOT match an entire string (like maxlength, for example) while still matching another string (like html:text) in the same regexp?
    Thanks in advance.

    "[^>m]++" matches one or more of anything that's not '>' or 'm', which is always safe. If that fails (meaning a '>' or 'm' was seen), the second alternative tries to match an 'm' unless it's the beginning of "maxlength" (it could actually be the middle of "foobarmaxlength", but I'm assuming that won't happen). The whole alternation is in a group controlled by an asterisk, so it will keep cycling until either '>' or "maxlength" is seen. If it's "maxlength" rather than '>' that stops the loop, the regex fails.
    I was assuming that '>' wouldn't appear inside a tag. (It actually can, even in HTML, if it's enclosed in quotes, but nobody ever seems to do that. JSP, of course, is another story.) I think the simplest way to fix this regex is to add another alternative for quoted values:
    <html:text\b([^>"m]++|"[^"]*+"|m(?!axlength))*+>
    BTW, it's important to use possessive quantifiers (++, *+, etc.) in a pattern like this, where the general form is (X)*. Otherwise, you could get into a runaway-backtracking scenario where the regex takes forever to decide no match is possible.
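
    As a quick check of the pattern above, here is a small Java sketch. The class name, method name and sample tags are invented for illustration. It reports whether a string contains an html:text tag with no maxlength attribute.

```java
import java.util.regex.Pattern;

class HtmlTextCheck {
    // The pattern from the reply, escaped for a Java string literal.
    static final Pattern NO_MAXLENGTH = Pattern.compile(
        "<html:text\\b([^>\"m]++|\"[^\"]*+\"|m(?!axlength))*+>");

    // True if the input contains an html:text tag without a maxlength attribute.
    static boolean hasTagWithoutMaxlength(String jsp) {
        return NO_MAXLENGTH.matcher(jsp).find();
    }

    public static void main(String[] args) {
        System.out.println(hasTagWithoutMaxlength("<html:text property=\"name\">"));
        System.out.println(hasTagWithoutMaxlength("<html:text property=\"name\" maxlength=\"10\">"));
        // A '>' inside a quoted value is handled by the quoted-value alternative.
        System.out.println(hasTagWithoutMaxlength("<html:text value=\"a>b\" name=\"m\">"));
    }
}
```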

  • Getting "Inner Html" using Regular Expressions. Learning RE in SDK1.4.

    Hello group.
    I am learning Regular Expressions in JAVA SDK 1.4 first. Not PERL or other language.
    Using the utility at the following link I am trying to get all the text between the <TR> and </TR> tags.
    http://jakarta.apache.org/oro/demo.html
    This seems simple but the line returns, breaks etc.. make it more difficult. I have worked on this for hours.
    There will be multiple table rows in my stream.
    My goal is to first get the text between the <TR> Tags...
    Then I was going to use groups to get data0, data1, data2, data3.
    Does this sound like a good plan? Should I use multiple RE or one RE that does 4 group returns.
    I was thinking the applet was causing my problem.
    <TR>.*?</TR> does not work.
    (<tr>\s*([^(</tr>)])+</tr>) does not work.
    I can get data0 to work as well as data1,2,3.
    Would it make more sense to split this multiple row table by </tr>?
    One row of malformed html (actually multiple rows):
    <TR>
    <TD bgColor=#ffffff><A class=fav
    href="http://nicesite.com/data0"
    >nicesite</A><IMG
    src="smile.gif"></TD>
    <TD bgColor=#12ff22><SPAN class=fav>data1</SPAN></TD>
    <TD bgColor=#12ff22><SPAN class=fav>data2</SPAN></TD>
    <TD bgColor=#12ff22><SPAN class=fav>data3</SPAN></TD>
    <TD align=middle bgColor=#ffffff><A
    href="#"><IMG
    src="smile.gif" border=0></A></TD>
    <TD align=middle bgColor=#ffffff>data
              4</TD>
    <TD align=middle bgColor=#ffffff>data5</TD></TR>
    s_____ I have seen some of your posts and tried to apply them. What do you think?
    Regards,
    NupeVic

    > http://jakarta.apache.org/oro/demo.html
    I prefer
    http://jregex.sourceforge.net/demoapp.html
    > This seems simple but the line returns, breaks etc..
    > make it more difficult.
    Yes, they do indeed.
    > There will be multiple table rows in my stream.
    > My goal is to first get the text between the <TR>
    > Tags...
    > Then I was going to use groups to get data0, data1,
    > data2, data3.
    > Does this sound like a good plan? Should I use
    > multiple RE or one RE that does 4 group returns.
    One of the main features of regexes that you must realize
    is that they are mainly suited for non-recursive, linear data structures
    (btw, that's why regexes in general are hardly suited for html).
    So, if the number of TD items is fixed, you could
    1. search using a single pattern for the whole row, something like
    "<PatternForTR>"+
    "<PatternForTD>(<PatternForData>)<PatternFor/TD>"+
    "<PatternForTD>(<PatternForData>)<PatternFor/TD>"+
    "<PatternFor/TR>"
    so that group 1 would contain data1, and so on.
    Otherwise, you should
    2. find each row using
    "<PatternForTR>(.*?)<PatternFor/TR>",
    then search the contents of group 1 using
    "<PatternForTD>(<PatternForData>)<PatternFor/TD>".
    > <TR>.*?</TR> does not work.
    The pattern itself is ok, but in order for it to work one should enable the DOTALL flag (the 's' flag in the jregex demo), as the '.' doesn't accept line breaks by default.
    > (<tr>\s*([^(</tr>)])+</tr>) does not work.
    It seems that [^(</tr>)]+ is actually nonsense in this context.
    It describes a string that consists of any chars but '(', ')', '<', '>', 'r', 't', '/'.
    What you actually meant (a string that doesn't contain "</tr>")
    is achieved simply by using a non-greedy quantifier, as in <TR>.*?</TR>.
    > I can get data0 to work as well as data1,2,3.
    > Would it make more sense to split this multiple row
    > table by </tr>?
    Going the second way above, you could find rows using the
    general pattern for TR:
    <tr.*?>(.+?)</tr>
    and search their contents (i.e. group 1) using the
    general pattern for TD:
    <td.*?>(.+?)</td>
    Finally, this is the specific pattern for TD that doesn't include the leading
    and trailing tags in group 1:
    <td[^>]*>(?:\s*</?[^>]*>)*\s*(.+?)(?:\s*</?[^>]*>)*\s*</td>
    It succeeded in finding
    nicesite
    data1
    data2
    data3
    data
    4
    data5
    in your sample.
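
    The two-step approach (find each row, then search the row's contents for cells) might look like this in java.util.regex. This is a sketch under assumptions: the class name, the shortened sample input and the whitespace normalisation are illustrative additions, and the CASE_INSENSITIVE and DOTALL flags stand in for the 'i' and 's' flags of the jregex demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TableRows {
    static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    // General pattern for a row; group 1 is everything between the TR tags.
    static final Pattern ROW = Pattern.compile("<tr.*?>(.+?)</tr>", FLAGS);
    // Specific pattern for a cell; group 1 excludes leading and trailing tags.
    static final Pattern CELL = Pattern.compile(
        "<td[^>]*>(?:\\s*</?[^>]*>)*\\s*(.+?)(?:\\s*</?[^>]*>)*\\s*</td>", FLAGS);

    // Returns one list of cell texts per table row.
    static List<List<String>> extract(String html) {
        List<List<String>> rows = new ArrayList<>();
        Matcher row = ROW.matcher(html);
        while (row.find()) {
            List<String> cells = new ArrayList<>();
            Matcher cell = CELL.matcher(row.group(1));
            while (cell.find()) {
                // Collapse line breaks and runs of spaces inside the cell text.
                cells.add(cell.group(1).replaceAll("\\s+", " "));
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        String html = "<TR>\n"
            + "<TD bgColor=#12ff22><SPAN class=fav>data1</SPAN></TD>\n"
            + "<TD align=middle>data\n  2</TD></TR>\n"
            + "<tr><td>x</td></tr>";
        System.out.println(extract(html));
    }
}
```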

  • Regular expressions... they are not regular! =)

    So,
    I've been pulling my hair out with regular expressions. I'm sure there is a logical explanation to this, but i've read a bunch of explanations and i THOUGHT i understood this, but i don't. Here goes:
    I have a string "2010PETE". I tried matching it to "\\d{1,}" (this is how i entered it in Java). This returns FALSE. HOWEVER, it seems to me the above should be TRUE because it says that a greedy quantifier with {1,} searches for the preceding character AT LEAST N times, where in this case n=1, so i interpret this as "If a digit (\\d) is found at least once within the string, then this string matches the regular expression". This does NOT seem to be the case.
    Can someone clear this up for me?

    THANK YOU. i think that is what i was missing, the part about
    "would only match if the input consisted of at least one digit, possibly multiple digits, and nothing else."
    I read the documentation and some of it didn't seem to be clear on that point.
    i'll play around with this and see how far i can get. if i still have questions i will post some code for sure, and try to get a nice, rounded set of examples.
    thanks!
    ONE OTHER QUESTION I JUST THOUGHT OF: does the .matches() method match expressions when some substring of the String matches, or does it have to match the entire String? So, if i have the String "123ABC", and i ask to match "1 or more letters" will it fail because there are non-letters in the String, but then pass if i add "1 or more letters AND 1 or more digits"? so, in the latter every character in the String is accounted for in the search, as opposed to the first. Is that correct, or are there ways to JUST match some substring in the String instead of the whole thing? i WILL make some examples too... but does that make sense?
    Edited by: pedron on Jan 12, 2012 3:23 PM
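
    On the follow-up question: String.matches() (like Matcher.matches()) succeeds only if the pattern accounts for the entire string, whereas Matcher.find() looks for a matching substring. A minimal Java sketch, with an illustrative class name, using the strings from the post:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // True if some substring of input matches regex.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // matches() must consume the whole string, so the trailing "PETE" makes it fail.
        System.out.println("2010PETE".matches("\\d{1,}"));
        // find() only needs a substring, so "2010" is enough.
        System.out.println(containsMatch("\\d{1,}", "2010PETE"));

        // "1 or more letters" alone does not cover "123ABC" ...
        System.out.println("123ABC".matches("[A-Za-z]+"));
        // ... but "1 or more digits followed by 1 or more letters" does.
        System.out.println("123ABC".matches("\\d+[A-Za-z]+"));

        // The matched substring is available through group().
        Matcher m = Pattern.compile("\\d{1,}").matcher("2010PETE");
        if (m.find()) {
            System.out.println(m.group());
        }
    }
}
```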

  • ReplaceAll string by regular expression not work for this case.

    I want to delete all tags and keep only the "pure text", but the output deletes everything.
    String content = "<aaa>pure text<fff>";
    content = content.replaceAll("<.*>","");
    The content comes out blank because the regular expression matches from the beginning to the end of the string.
    But when I change it to
    String content = "<aaa>pure text<fff";
    content = content.replaceAll("<.*>","");
    The output is ==> pure text<fff
    How do I make the regex match each tag in sequence?
    Please lead me to a solution.

    peterdog1234 wrote:
    > Thank you very much.
    > I know '?' is a quantifier.
    > I do not understand using ?
    > Please lead me again
    See the paragraph "Laziness Instead of Greediness" from [http://www.regular-expressions.info/repeat.html].
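
    Applied to the code from the question, the lazy quantifier looks like this. It is a minimal sketch; the class and method names are illustrative, and the negated character class is shown as a common alternative rather than something suggested in the thread.

```java
class StripTags {
    // Greedy: <.*> runs from the first '<' to the last '>' in the string.
    static String greedy(String s) {
        return s.replaceAll("<.*>", "");
    }

    // Lazy: <.*?> stops at the first '>' after each '<'.
    static String lazy(String s) {
        return s.replaceAll("<.*?>", "");
    }

    // Alternative (assumption): a negated class can never run past a '>'.
    static String negatedClass(String s) {
        return s.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        String content = "<aaa>pure text<fff>";
        System.out.println("[" + greedy(content) + "]");
        System.out.println("[" + lazy(content) + "]");
        System.out.println("[" + negatedClass(content) + "]");
    }
}
```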

  • Regular expression: how to match "[somestuff]"?

    I have a problem with the following code.
    I meant to catch "[fm1,-]". But I got "[fm1,-] funder [fm2,-] of our country. [sn8,s-]" instead.
    import java.util.*;
    import java.util.regex.*;
    public class regPractice {
        public static void main(String[] args) {
            String s="<TITLE Getting to Know> I hope suitabe [fm1,-] funder [fm2,-] of our country. [sn8,s-]";
            Pattern p=Pattern.compile("\\[(.*)\\]");
            Matcher m=p.matcher(s);
            if (m.find() ){
                System.out.println(m.group(0) ) ;
            }else{
                System.out.println("Nothing");
            }
        }
    }

    Regular expressions are greedy - that is (.*) will grab as much as it
    possibly can before a ]. Hence you see what you see.
    What you want is a reluctant quantifier, in this case (.*?)
    These grab as little as they possibly can. The parentheses are also
    not needed in your example, but you may want group(1) for some
    other reason.
    So we end up with:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class regPractice {
        public static void main(String[] args) {
            String s = "<TITLE Getting to Know> I hope suitabe [fm1,-] funder [fm2,-] of our country. [sn8,s-]";
            // Reluctant .*? stops at the first closing bracket
            Pattern p = Pattern.compile("\\[(.*?)\\]");
            Matcher m = p.matcher(s);
            if (m.find()) {
                System.out.println(m.group(0));
            } else {
                System.out.println("Nothing");
            }
        }
    }
    which gives the desired output.
    The different types of quantifier are described here:
    http://java.sun.com/docs/books/tutorial/extra/regex/quant.html

  • Regular Expression Infinite Loop

    Hi All!
    I have a problem with the regular expression defined by OPERATION_MAPPING_REGEX:
    public static String IDENTIFIER_REGEX = "[$_a-zA-Z][$_a-zA-Z0-9]*";
    public static String QUALIFIED_NAME_REGEX = "" + IDENTIFIER_REGEX +"(?:\\." + IDENTIFIER_REGEX +")*"; 
    public static String OPERATION_REGEX = "(" + QUALIFIED_NAME_REGEX + "){1}(?:\\((" + QUALIFIED_NAME_REGEX + "(?:," + QUALIFIED_NAME_REGEX + ")*)*\\))?";
    public static String OPERATION_MAPPING_REGEX = "^(" + OPERATION_REGEX + ")->(" + OPERATION_REGEX + ")$";
    The purpose of this RegEx is to match the following strings:
    "SomeClassName->AnotherClassName"
    "SomeClassName()->AnotherClassName()"
    "SomeClassName(some.parameter.TypeName)->AnotherClassName(another.parameter.TypeName)"
    "some.package.name.SomeClassName->another.package.name.AnotherClassName"
    "some.package.name.SomeClassName()->another.package.name.AnotherClassName()"
    "some.package.name.SomeClassName(some.parameter.TypeName)->another.package.name.AnotherClassName(another.parameter.TypeName)"
    "some.package.name.SomeClassName(some.parameter.TypeName1,some.parameter.TypeName2)->another.package.name.AnotherClassName(another.parameter.TypeName1,another.parameter.TypeName2)"
    etc..
    This all works well, however, following pattern results in an infinite loop:
    "SomeClassName->AnotherClassName(another.parameter.TypeName,another.parameter.TypeName )"
    Please note the blank before the closing parenthesis.
    And this is the test code:
    public static Pattern operationMappingPattern = Pattern.compile(OPERATION_MAPPING_REGEX);
    Matcher matcher = PlatformUtils.operationMappingPattern.matcher("SomeClassName->AnotherClassName(another.parameter.TypeName,another.parameter.TypeName )");
    if (matcher.matches()) {
        Assert.fail("Regular Expression must not match");
    }
    I am using
    java version "1.5.0_12"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
    Java HotSpot(TM) Client VM (build 1.5.0_12-b04, mixed mode)
    I do not know if this is the right place, but this seems to be a bug in the RegEx implementation of the JDK. Any hints?
    Thanks
    Hannes
    Edited by: sennah on Nov 2, 2007 8:25 AM

    It's not a bug, it's just one of the hazards inherent in using regexes. This can happen when a regex has a lot of quantifiers (asterisks, plus signs, etc.) in it, especially when quantified parts are enclosed in groups that are themselves quantified. It isn't really in an infinite loop, but it could take anywhere from a few minutes to a few million years for the regex to concede that it can't match the text. In this case, the fix is simple: change all your normal, greedy quantifiers to possessive quantifiers. I've made that change below, as well as allowing for whitespace at various points.
      static String ID = "[$_a-zA-Z][$_a-zA-Z0-9]*+";
      static String QNAME = ID + "(?:\\." + ID +")*+";
      static String QNAMES = QNAME + "(?:\\s*+,\\s*+" + QNAME + ")*+";
      static String ARGS = "(?:\\(\\s*+(?:" + QNAMES + ")?+\\s*+\\))?+";
      static String OP = "(?:" + QNAME + ARGS + ")";
      static String OP_MAP = "^(" + OP + ")->(" + OP + ")$";
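
    To try the possessive version, the constants above can be wrapped in a small class like the one below. The class name, the method name and the malformed test input ending in "!" are assumptions added for illustration. Note that because the rewritten pattern allows whitespace, the input with a blank before the closing parenthesis now matches instead of being rejected.

```java
import java.util.regex.Pattern;

class OperationMapping {
    static final String ID = "[$_a-zA-Z][$_a-zA-Z0-9]*+";
    static final String QNAME = ID + "(?:\\." + ID + ")*+";
    static final String QNAMES = QNAME + "(?:\\s*+,\\s*+" + QNAME + ")*+";
    static final String ARGS = "(?:\\(\\s*+(?:" + QNAMES + ")?+\\s*+\\))?+";
    static final String OP = "(?:" + QNAME + ARGS + ")";
    static final Pattern OP_MAP = Pattern.compile("^(" + OP + ")->(" + OP + ")$");

    // True if the whole input is a valid "operation->operation" mapping.
    static boolean isMapping(String input) {
        return OP_MAP.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isMapping("SomeClassName->AnotherClassName"));
        System.out.println(isMapping("some.package.name.SomeClassName()->another.package.name.AnotherClassName()"));
        // Whitespace before ')' is accepted by this version of the pattern.
        System.out.println(isMapping(
            "SomeClassName->AnotherClassName(another.parameter.TypeName,another.parameter.TypeName )"));
        // A malformed argument list is rejected without runaway backtracking.
        System.out.println(isMapping(
            "SomeClassName->AnotherClassName(another.parameter.TypeName,another.parameter.TypeName!)"));
    }
}
```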

  • Logical AND in Java Regular Expressions

    I'm trying to implement logical AND using Java Regular Expressions.
    I couldn't figure out how to do it after reading Java docs and textbooks. I can do something like "abc.*def", which means that I'm looking for strings which have "abc", then anything, then "def", but it is not "pure" logical AND - I will not find "def.*abc" this way.
    Any ideas, how to do it ?
    Baken

    First off, looks like you're really talking about an "OR", not an "AND" - you want it to match abc.*def OR def.*abc right? If you tried to match abc.*def AND def.*abc nothing would ever match that, as no string can begin with both "abc" and "def", just like no numeric value can be both 2 and 5.
    Anyway, maybe regex isn't the right tool for this job. Can you not simply programmatically match it yourself using String methods? You want it to match if the string "starts with" abc and "ends with" def, or vice-versa. Just write some simple code.
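
    If the goal is "the string contains both abc and def, in either order", both routes can be sketched in Java. The plain String version follows the suggestion above, using contains() rather than startsWith()/endsWith() since the question is about containment. The lookahead version is a commonly used regex idiom that is not mentioned in this thread. Class and method names are illustrative.

```java
import java.util.regex.Pattern;

class BothWords {
    // Plain String methods: true if both substrings occur, in any order.
    static boolean containsBoth(String s) {
        return s.contains("abc") && s.contains("def");
    }

    // Regex idiom (assumption): two lookaheads anchored at the start,
    // each scanning the whole input for one of the substrings.
    static final Pattern BOTH = Pattern.compile("^(?=.*abc)(?=.*def)", Pattern.DOTALL);

    static boolean containsBothRegex(String s) {
        return BOTH.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "xxabcxxdefxx", "defabc", "abc only" }) {
            System.out.println(s + " -> " + containsBoth(s) + " / " + containsBothRegex(s));
        }
    }
}
```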

  • Help in Regular expression

    Hello..
    I wanted to write a regular expression to match the following string..
    <!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->
    <p> <b>NEW ORLEANS, Louisiana (CNN) </b>
    -- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi, residents say much of America has forgotten their plight.
    </p> <!--startclickprintexclude-->
    I tried doing..
    Matcher matcher= Pattern.compile("<!--endclickprintexclude--> <p><b>([^<^>]+?)</p><!--startclickprintexclude-->", Pattern.CASE_INSENSITIVE).matcher(story);
    It's not working...
    Is there any other solution?

    There's probably a better way to do this but here's a way that works.
    import java.util.regex.*;
    public class RegexTester{
        public static void main(String[] args){
            String text =
                "<!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->" +
                "<p> <b>NEW ORLEANS, Louisiana (CNN) </b>" +
                "-- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi," +
                "residents say much of America has forgotten their plight." +
                "</p> <!--startclickprintexclude-->";
            // Group 1 captures text between '>' and '<' that contains no angle brackets
            String regex = ">((?:\\s*[\\S&&[^<>]]+\\s*)*?)<";
            Pattern p = Pattern.compile(regex);
            Matcher m = p.matcher(text);
            while(m.find()){
                System.out.println("Match: '" + m.group(1) + "'");
            }
        }
    }

  • Help in regular expression matching

    I have three expressions like
    1) [(y2009)(y2011)]
    2) [(y2008M5)(y2011M3)] or [(y2009M5)(y2010M12)]
    3) [(y2009M1d20)(y2011M12d31)]
    i want regular expression pattern for the above three expressions
    I am using :
    REGEXP_LIKE(timedomainexpression, '???[:digit:]{4}*[:digit:]{1,2}???[:digit:]{4}*[:digit:]{1,2}??', 'i');
    but it's giving results for all of the above expressions, while I want a different expression for each.
    I have used * after [:digit:]{4}; when I use ? or . instead, it gives no results. Please help in this situation ASAP.
    Thanks

    I don't get your question. Can you post your desired output and also give some sample data?
    Please consider the following when you post a question.
    1. New features keep coming in every Oracle version, so please provide your Oracle DB version to get the best possible answer.
    You can use the following query and copy and paste the output.
    select * from v$version
    2. This forum has a very good Search Feature. Please use that before posting your question, because for most of the questions
    that are asked the answer is already there.
    3. We don't know your DB structure or how your data is, so you need to let us know. The best way would be to give some sample data like this.
    I have the following table called sales
    with sales
    as
    (
          select 1 sales_id, 1 prod_id, 1001 inv_num, 120 qty from dual
          union all
          select 2 sales_id, 1 prod_id, 1002 inv_num, 25 qty from dual
    )
    select *
      from sales
    4. Rather than telling what you want in words, it's easier when you give your expected output.
    For example in the above sales table, I want to know the total quantity and number of invoices for each product.
    The output should look like this
    Prod_id   sum_qty   count_inv
    1         145       2
    5. Whenever you get an error message, post the entire error message, with the error number, the message and the line number.
    6. Next thing is a very important thing to remember. Please post only well formatted code. Unformatted code is very hard to read.
    Your code format gets lost when you post it in the Oracle Forum. So in order to preserve it you need to
    use the {noformat}{noformat} tags.
    The usage of the tag is like this.
    <place your code here>
    7. If you are posting a *Performance Related Question*, please read
       {thread:id=501834} and {thread:id=863295}.
       Following those guides will be very helpful.
    8. Please keep in mind that this is a public forum. Here no question is URGENT.
       So use of words like *URGENT* or *ASAP* (As Soon As Possible) is considered to be rude.

  • Regular expression alphabets

    Hi
    I want to retrieve the data if the data contains a character or a space or '-' thru select query .
    Please help me in writing the combination of 3 with regular expression.
    Thanks!!

    VT wrote:
    > Hi,
    > Try this
    > SELECT *
    > FROM <TABLE> WHERE REGEXP_LIKE(<COLUMN>, '[a-z -][A-Z -]');
    > cheers
    > VT
    That won't work, as it expects at least two characters: the first has to be a-z (lower case) or space or "-", followed by A-Z (upper case) or space or "-".
    The correct way is either:
    [a-zA-Z -]
    or
    [[:alpha:] -]
    Using the alpha set is often preferable, as it adapts to different character sets/languages rather than being restricted to just the a-zA-Z ranges.
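
    For readers more at home in Java than in Oracle SQL, the same distinction can be sketched with java.util.regex. This is an analogy, not Oracle syntax: \p{IsAlphabetic} is used here as a rough Java counterpart of [:alpha:], and the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class AlphaCheck {
    // ASCII letters, space or hyphen anywhere in the value.
    static final Pattern ASCII = Pattern.compile("[a-zA-Z -]");
    // Any Unicode alphabetic character, space or hyphen.
    static final Pattern UNICODE = Pattern.compile("[\\p{IsAlphabetic} -]");

    static boolean hasAsciiLetterSpaceOrHyphen(String s) {
        return ASCII.matcher(s).find();
    }

    static boolean hasLetterSpaceOrHyphen(String s) {
        return UNICODE.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasAsciiLetterSpaceOrHyphen("1234"));   // digits only
        System.out.println(hasAsciiLetterSpaceOrHyphen("12-34"));  // contains '-'
        // An accented letter is outside a-zA-Z but is still alphabetic.
        System.out.println(hasAsciiLetterSpaceOrHyphen("\u00e9"));
        System.out.println(hasLetterSpaceOrHyphen("\u00e9"));
    }
}
```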
    Generating a reference for your own database characterset/language can be useful...
    select level-1 as asc_code, decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), CHR(level-1)) as chr,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:graph:]]'), 1) is_graph,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:blank:]]'), 1) is_blank,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:alnum:]]'), 1) is_alnum,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:alpha:]]'), 1) is_alpha,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:digit:]]'), 1) is_digit,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:cntrl:]]'), 1) is_cntrl,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:lower:]]'), 1) is_lower,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:upper:]]'), 1) is_upper,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), 1) is_print,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:punct:]]'), 1) is_punct,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:space:]]'), 1) is_space,
           decode(chr(level-1), regexp_substr(chr(level-1), '[[:xdigit:]]'), 1) is_xdigit
      from dual
    connect by level <= 256
    /
      ASC_CODE C   IS_GRAPH   IS_BLANK   IS_ALNUM   IS_ALPHA   IS_DIGIT   IS_CNTRL   IS_LOWER   IS_UPPER   IS_PRINT   IS_PUNCT   IS_SPACE  IS_XDIGIT
             0                                                                   1
             1                                                                   1
             2                                                                   1
             3                                                                   1
             4                                                                   1
             5                                                                   1
             6                                                                   1
             7                                                                   1
             8                                                                   1
             9                                                                   1                                              1
            10                                                                   1                                              1
            11                                                                   1                                              1
            12                                                                   1                                              1
            13                                                                   1                                              1
            14                                                                   1
            15                                                                   1
            16                                                                   1
            17                                                                   1
            18                                                                   1
            19                                                                   1
            20                                                                   1
            21                                                                   1
            22                                                                   1
            23                                                                   1
            24                                                                   1
            25                                                                   1
            26                                                                   1
            27                                                                   1
            28                                                                   1
            29                                                                   1
            30                                                                   1
            31                                                                   1
            32                       1                                                                            1                     1
            33 !          1                                                                                       1          1
            34 "          1                                                                                       1          1
            35 #          1                                                                                       1          1
            36 $          1                                                                                       1          1
            37 %          1                                                                                       1          1
            38 &          1                                                                                       1          1
            39 '          1                                                                                       1          1
            40 (          1                                                                                       1          1
            41 )          1                                                                                       1          1
            42 *          1                                                                                       1          1
            43 +          1                                                                                       1          1
            44 ,          1                                                                                       1          1
            45 -          1                                                                                       1          1
            46 .          1                                                                                       1          1
            47 /          1                                                                                       1          1
            48 0          1                     1                     1                                           1                                1
            49 1          1                     1                     1                                           1                                1
            50 2          1                     1                     1                                           1                                1
            51 3          1                     1                     1                                           1                                1
            52 4          1                     1                     1                                           1                                1
            53 5          1                     1                     1                                           1                                1
            54 6          1                     1                     1                                           1                                1
            55 7          1                     1                     1                                           1                                1
            56 8          1                     1                     1                                           1                                1
            57 9          1                     1                     1                                           1                                1
            58 :          1                                                                                       1          1
            59 ;          1                                                                                       1          1
            60 <          1                                                                                       1          1
            61 =          1                                                                                       1          1
            62 >          1                                                                                       1          1
            63 ?          1                                                                                       1          1
            64 @          1                                                                                       1          1
            65 A          1                     1          1                                           1          1                                1
            66 B          1                     1          1                                           1          1                                1
            67 C          1                     1          1                                           1          1                                1
            68 D          1                     1          1                                           1          1                                1
            69 E          1                     1          1                                           1          1                                1
            70 F          1                     1          1                                           1          1                                1
            71 G          1                     1          1                                           1          1
            72 H          1                     1          1                                           1          1
            73 I          1                     1          1                                           1          1
            74 J          1                     1          1                                           1          1
            75 K          1                     1          1                                           1          1
            76 L          1                     1          1                                           1          1
            77 M          1                     1          1                                           1          1
            78 N          1                     1          1                                           1          1
            79 O          1                     1          1                                           1          1
            80 P          1                     1          1                                           1          1
            81 Q          1                     1          1                                           1          1
            82 R          1                     1          1                                           1          1
            83 S          1                     1          1                                           1          1
            84 T          1                     1          1                                           1          1
            85 U          1                     1          1                                           1          1
            86 V          1                     1          1                                           1          1
            87 W          1                     1          1                                           1          1
            88 X          1                     1          1                                           1          1
            89 Y          1                     1          1                                           1          1
            90 Z          1                     1          1                                           1          1
            91 [          1                                                                                       1          1
            92 \          1                                                                                       1          1
            93 ]          1                                                                                       1          1
            94 ^          1                                                                                       1          1
            95 _          1                                                                                       1          1
            96 `          1                                                                                       1          1
            97 a          1                     1          1                                1                     1                                1
            98 b          1                     1          1                                1                     1                                1
            99 c          1                     1          1                                1                     1                                1
           100 d          1                     1          1                                1                     1                                1
           101 e          1                     1          1                                1                     1                                1
           102 f          1                     1          1                                1                     1                                1
           103 g          1                     1          1                                1                     1
           104 h          1                     1          1                                1                     1
           105 i          1                     1          1                                1                     1
           106 j          1                     1          1                                1                     1
           107 k          1                     1          1                                1                     1
           108 l          1                     1          1                                1                     1
           109 m          1                     1          1                                1                     1
           110 n          1                     1          1                                1                     1
           111 o          1                     1          1                                1                     1
           112 p          1                     1          1                                1                     1
           113 q          1                     1          1                                1                     1
           114 r          1                     1          1                                1                     1
           115 s          1                     1          1                                1                     1
           116 t          1                     1          1                                1                     1
           117 u          1                     1          1                                1                     1
           118 v          1                     1          1                                1                     1
           119 w          1                     1          1                                1                     1
           120 x          1                     1          1                                1                     1
           121 y          1                     1          1                                1                     1
           122 z          1                     1          1                                1                     1
           123 {          1                                                                                       1          1
           124 |          1                                                                                       1          1
           125 }          1                                                                                       1          1
           126 ~          1                                                                                       1          1
           127                                                                   1
           128 Ç          1                                                                                       1          1
    etc.

  • Help in query using regular expression

    Hi,
    I need help getting the output below from a regular expression query. This is what I tried:
    SELECT REGEXP_SUBSTR ('PWRPKG(P/W+P/L+CC)', '[^+]+', 1, lvl) val, lvl
    FROM DUAL,(SELECT LEVEL lvl FROM DUAL
    CONNECT BY LEVEL <=(SELECT MAX ( LENGTH ('PWRPKG(P/W+P/L+CC)') - LENGTH (REPLACE ('PWRPKG(P/W+P/L+CC)','+',NULL))+ 1) FROM DUAL));
    I need the output to be as follows.
    Correct result:
    ==============
    val lvl
    P/W 1
    P/L 2
    CC 3
    But when I run the query above, I do not get that result. Please help me find where I made a mistake.
    Thanks in advance.

    Frank gave you a solution in your other thread. You could simplify it if you are on 11g:
    SQL> select * from table_x
      2  /
    TXT
    TECHPKG(INTELLI CC+FRT SONAR)
    PWRPKG(P/W+P/L+CC)
    select  txt,
            regexp_substr(
                          txt,
                          '(.*\()*([^+)]+)',
                          1,
                          column_value,
                          null,
                          2
                         ) element,
            column_value element_number
      from  table_x,
            table(
                  cast(
                       multiset(
                                select  level
                                  from  dual
                                  connect by level <= regexp_count(txt,'\+') + 1
                               )
                       as sys.OdciNumberList
                      )
                 )
      order by rowid,
               column_value
    /
    TXT                                      ELEMENT    ELEMENT_NUMBER
    TECHPKG(INTELLI CC+FRT SONAR)            INTELLI CC              1
    TECHPKG(INTELLI CC+FRT SONAR)            FRT SONAR               2
    PWRPKG(P/W+P/L+CC)                       P/W                     1
    PWRPKG(P/W+P/L+CC)                       P/L                     2
    PWRPKG(P/W+P/L+CC)                       CC                      3
    SQL>
    SY.
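
For readers coming from the Java side, the same pattern can be tried with java.util.regex. This is only a minimal sketch and not part of the original answer: the class and method names (`PackageElements`, `elements`, `naiveTokens`) are made up for illustration, and it assumes the input looks like the two sample rows above. It also shows why the pattern in the question falls short: `[^+]+` splits on `+` only, so the first and last tokens keep the text around the parentheses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class PackageElements {
    // Same pattern as in the SQL answer: group 1 optionally consumes everything
    // up to and including the last "(", group 2 captures one element.
    private static final Pattern ELEMENT = Pattern.compile("(.*\\()*([^+)]+)");

    // The pattern from the question: it splits on "+" and nothing else.
    private static final Pattern NAIVE = Pattern.compile("[^+]+");

    // Collects group 2 of every successive match, like asking
    // REGEXP_SUBSTR for occurrence 1, 2, 3, ... with subexpression 2.
    static List<String> elements(String txt) {
        List<String> result = new ArrayList<>();
        Matcher m = ELEMENT.matcher(txt);
        while (m.find()) {
            result.add(m.group(2));
        }
        return result;
    }

    // Collects the whole match of every successive occurrence of the naive pattern.
    static List<String> naiveTokens(String txt) {
        List<String> result = new ArrayList<>();
        Matcher m = NAIVE.matcher(txt);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(naiveTokens("PWRPKG(P/W+P/L+CC)")); // [PWRPKG(P/W, P/L, CC)]
        System.out.println(elements("PWRPKG(P/W+P/L+CC)"));    // [P/W, P/L, CC]
    }
}
```

The first line of output shows the prefix `PWRPKG(` and the trailing `)` leaking into the tokens, which is the problem in the question. The second line matches the requested result.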

Maybe you are looking for