Regex capture groups in IdocScript

Hello,
I'm looking for a way to print out capture groups as part of a regex match using Idoc Script. Here's a basic example of what I'm trying to do.
HTML:
<p>This is a paragraph of text</p>
Idoc Script:
<!--$r = wcmElement("regexText")-->
<!--$if regexMatches(regexText, "<p>(.*)</p>")-->
<!--$firstParagraph = $1-->
<!--$endif-->
Essentially, what I'm trying to do is store the value between the <p></p> tags in a variable. I know there's a syntax error in my above script, but the IdocScript documentation does not touch on capture groups at all, so I'm left to guess on what I know from other scripting languages.
Thanks,
Josh

The <p> tags are escaped, so ssIncludeXML will not do the trick.
You can try something like this:
<!--$myHTMlMarkup = wcmElement("regexText")-->
<!--$startindex = strIndexOf(myHTMlMarkup, "&lt;p&gt;")-->
<!--$loopwhile startindex + 1 > 0-->
     <!--$stopIndex = strIndexOf(myHTMlMarkup, "&lt;/p&gt;")-->
     <!--$para = strSubstring(myHTMlMarkup, startindex + 9, stopIndex)-->
     <!--$myHTMlMarkup = strSubstring(myHTMlMarkup, stopIndex + 10)-->
     <!--$startindex = strIndexOf(myHTMlMarkup, "&lt;p&gt;")-->
<!--$endloop-->
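The index-scanning idea in the reply is easier to check outside the CMS, so here is the same loop sketched in Java. This is only an illustration of the logic, not Idoc Script: the class and method names are made up, and it assumes the element value really contains the escaped forms &lt;p&gt; and &lt;/p&gt;.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapedParagraphs {
    // Collects the text between each escaped <p> ... </p> pair by scanning with indexOf.
    static List<String> extract(String markup) {
        final String open = "&lt;p&gt;";    // escaped <p>, 9 characters
        final String close = "&lt;/p&gt;";  // escaped </p>, 10 characters
        List<String> paragraphs = new ArrayList<>();
        int start = markup.indexOf(open);
        while (start >= 0) {
            int stop = markup.indexOf(close, start + open.length());
            if (stop < 0) {
                break; // unterminated paragraph: stop rather than guess
            }
            paragraphs.add(markup.substring(start + open.length(), stop));
            start = markup.indexOf(open, stop + close.length());
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String markup = "&lt;p&gt;This is a paragraph of text&lt;/p&gt;&lt;p&gt;Second&lt;/p&gt;";
        System.out.println(extract(markup));
    }
}
```

The offsets 9 and 10 used in the Idoc Script version correspond to open.length() and close.length() here.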

Similar Messages

  • RegEx and capturing groups

    hi.
    i'm trying to use the capturing groups to extract substrings.
    this is the data format: MTWTFSS@2005-03-19
    and this my regex: Pattern.compile("((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])");
    i want to extract the substring before the @. i have set the capturing group, but i always get an error:
    java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:353)
    at vdrremotecontrol.VDRTimer.<init>(VDRTimer.java:93)
    at vdrremotecontrol.VDRRemoteControl.getTimersFromVDR(VDRRemoteControl.java:628)
    at vdrremotecontrol.VDRRemoteControl.loadSettings(VDRRemoteControl.java:855)
    at tvbrowser.core.plugin.JavaPluginProxy.doLoadSettings(JavaPluginProxy.java:191)
    ... 5 more
    what's the right way to set the capturing group?
    greetings, henrik

    me again :-)
    // MTWTFSS@2005-03-19
    Pattern repeatingAt = Pattern.compile("((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])");
    if(dayPattern.matcher(day).matches()) {
        repeating = false;
        this.day_of_month = Integer.parseInt(day);
    } else if(datePattern.matcher(day).matches()) {
        repeating = false;
        this.day_of_month = Integer.parseInt(datePattern.matcher(day).group(3));
    } else if(simpleRepeating.matcher(day).matches()) {
        repeating = true;
        repeating_days = determineDays(day);
    } else if(repeatingAtShort.matcher(day).matches()) {
        repeating = true;
        String days = day.substring(0, day.indexOf("@"));
        repeating_days = determineDays(days);
    } else if(repeatingAt.matcher(day).matches()) {
        repeating = true;
        Matcher matcher = repeatingAt.matcher(day);
        System.out.println("#"+day+"#");
        System.out.println(matcher.groupCount());
        System.out.println(matcher.group(1));
        String days = day.substring(0, day.indexOf("@"));
        repeating_days = determineDays(days);
    }
    output is:
    [java] #MDMDFSS@2005-06-10#
    [java] 5
    [java] SCHWERWIEGEND: Die Einstellungen des Plugins "Video Disc Recorder" konnten nicht geladen werden.
    [java] (/home/henni/.tvbrowser/java.vdrremotecontrol.VDRRemoteControl.prop)
    [java] util.exc.TvBrowserException: Die Einstellungen des Plugins "Video Disc Recorder" konnten nicht geladen werden.
    [java] (/home/henni/.tvbrowser/java.vdrremotecontrol.VDRRemoteControl.prop)
    [java] at tvbrowser.core.plugin.JavaPluginProxy.doLoadSettings(JavaPluginProxy.java:197)
    [java] at tvbrowser.core.plugin.AbstractPluginProxy.loadSettings(AbstractPluginProxy.java:114)
    [java] at tvbrowser.core.plugin.PluginProxyManager.activatePlugin(PluginProxyManager.java:505)
    [java] at tvbrowser.core.plugin.PluginProxyManager.activateAllPluginsExcept(PluginProxyManager.java:459)
    [java] at tvbrowser.core.plugin.PluginProxyManager.init(PluginProxyManager.java:220)
    [java] at tvbrowser.TVBrowser.main(TVBrowser.java:307)
    [java] Caused by: java.lang.IllegalStateException: No match found
    [java] at java.util.regex.Matcher.group(Matcher.java:353)
    [java] at vdrremotecontrol.VDRTimer.<init>(VDRTimer.java:94)
    [java] at vdrremotecontrol.VDRRemoteControl.getTimersFromVDR(VDRRemoteControl.java:628)
    [java] at vdrremotecontrol.VDRRemoteControl.loadSettings(VDRRemoteControl.java:855)
    [java] at tvbrowser.core.plugin.JavaPluginProxy.doLoadSettings(JavaPluginProxy.java:191)
    [java] ... 5 more

    This example executes fine.
            String expression = "((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])";
            String data = "MTWTFSS@2005-03-19";
            Pattern p = Pattern.compile(expression);
            Matcher m = p.matcher(data);
            while (m.find()) {
                System.out.println(m.group(1));
            }
    /Kaj
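    A likely reason the earlier code fails while this one works: every call to pattern.matcher(input) returns a fresh Matcher with no match state, so group() only works on the same instance on which matches() or find() has already returned true. A minimal sketch using the pattern from the question (the class and method names are invented for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatingTimer {
    private static final Pattern REPEATING_AT = Pattern.compile(
        "((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])");

    // Returns the day flags before the '@', or null if the input does not match.
    static String daysOf(String day) {
        Matcher m = REPEATING_AT.matcher(day); // create the matcher once
        if (m.matches()) {                     // run the match on that instance
            return m.group(1);                 // read groups from the same instance
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(daysOf("MTWTFSS@2005-03-19"));
        System.out.println(daysOf("not a timer"));
    }
}
```

    In the posted code both datePattern.matcher(day).group(3) and the second repeatingAt.matcher(day) create new matchers, which is consistent with the "No match found" exception.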

  • Regex: named group capturing?

    Hi, I know that in Java regular expressions "capturing groups are numbered by counting their opening parentheses from left to right", but is there any way to assign a name to a capturing group, the way Python does:
    (?P<year>[0-9][0-9][0-9][0-9]) //see regular expression Howto website
    where this group is named "year" and can be retrieved by the group name.
    Can I do this in Java regex? Any hints please, thanks!

    I made some changes to sun's regExps to support named groups.
    http://gorbush.narod.ru/files/regex2.zip
    String
    "TEST 123"
    RegExp
    "(?<login>\\w+) (?<id>\\d+)"
    Access
    matcher.group(1) = TEST
    matcher.group("login") = TEST
    matcher.name(1) = login
    Replace
    matcher.replaceAll("aaaaa_$1_sssss_$2____") = aaaaa_TEST_sssss_123____
    matcher.replaceAll("aaaaa_${login}_sssss_${id}____") = aaaaa_TEST_sssss_123____
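    For readers on Java 7 or later, the standard java.util.regex package supports named groups directly, so a patched library should no longer be needed. The sketch below reuses the "login"/"id" example from the reply; the class and method names are made up. Note that the syntax is (?<name>...) rather than Python's (?P<name>...), and that a matcher.name(int) method like the one shown above is not part of the standard API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroups {
    static final Pattern LOGIN = Pattern.compile("(?<login>\\w+) (?<id>\\d+)");

    // Reads both groups by name; returns null when the input does not match.
    static String describe(String input) {
        Matcher m = LOGIN.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return m.group("login") + "/" + m.group("id");
    }

    // Named groups can also be referenced in replacement strings as ${name}.
    static String rewrite(String input) {
        return LOGIN.matcher(input).replaceAll("${id}:${login}");
    }

    public static void main(String[] args) {
        System.out.println(describe("TEST 123"));
        System.out.println(rewrite("TEST 123"));
    }
}
```

    Named groups are still numbered as well, so m.group(1) and m.group("login") return the same text.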

  • Regex Matching on Capture groups

    I have this regular expression:
    (throw|give)(?: ([1-3][A-C]))+
    given this input:
    throw 1A 2C 1B 3C
    How would I capture each of the items 1A 2C 1B and 3C?
    In the above expression I have 3 capture groups
    group0: whole expression
    group1: (throw|give)
    group2: ((?:1|2|3)(?:A|B|C))
    The problem is that when I execute the find() on my matcher it tries matching the whole expression at once! That means for the group 2 I always get the last match only:
    group0: throw 1A 2C 1B 3C
    group1: throw
    group2: 3C
    How do I get the matcher to only match on ONE capture group at a time?! Is it possible? I thought that was the purpose of the find() method. The documentation says find() matches on a "subsequence", yet I can only get it to match on the whole expression. Plus, I don't see where "subsequence" is defined in the documentation. What am I missing here?

    "Attempts to find the next subsequence of the input sequence that matches the pattern." (my emphasis)
    find() matches the whole regex, not components thereof. What you need to do is use one regex to match the whole expression and return a capture group with the digit-letter pairs, and then use another regex on that capture group to extract the pairs one at a time.
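    The two-step approach described in the answer could look roughly like this. It is a sketch that assumes the letters A to C are valid (as in the sample input); the class, constant and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThrowParser {
    // Step 1: validate the whole command and capture the run of pairs as one group.
    static final Pattern COMMAND = Pattern.compile("(throw|give)((?: [1-3][A-C])+)");
    // Step 2: pull the individual pairs out of that captured run.
    static final Pattern PAIR = Pattern.compile("[1-3][A-C]");

    static List<String> pairs(String input) {
        List<String> result = new ArrayList<>();
        Matcher command = COMMAND.matcher(input);
        if (command.matches()) {
            Matcher pair = PAIR.matcher(command.group(2));
            while (pair.find()) {
                result.add(pair.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairs("throw 1A 2C 1B 3C"));
    }
}
```

    A repeated capturing group keeps only its last iteration, which is why the inner find() loop is needed to see every pair.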

  • Reg Exp and non capturing groups

    Hello,
    I have problems with non-capturing groups in Java.
    I want to extract the 2nd occurrence of a given pattern from a string.
    This is the string: " 12.12.2004 [Logging] [Success] ### More text [Perhaps more brackets here] some text"
    Now I want to extract the 2nd pair of brackets, which contains an unknown text pattern.
    I tried this pattern = "(?:\[\w+\] )\[w+\]"
    I thought that the non-capturing group would help me find the first pair of [] brackets, and that the capturing group would give me what I want.
    But I still get the result: "[Logging] [Success]"
    Even if I use only the non-capturing pattern "(?:\[\w+\])" I still get an output from Matcher.group(), which I didn't expect here!
    What am I doing wrong?
    Thanks for any help

    Sorry, sabre150 I just missed it to answer you.
    I have placed it into eclipse in a scrapbookpage into the following code:
    final String regexp = "[^\\]]*\\[[^\\]]*\\]\\s*(.*)";
    final String zeile = " 12.12.2004 [Logging] [Success] ### More text [Perhaps more brackets here] some text";
    java.util.regex.Pattern pattern = java.util.regex.Pattern.compile(regexp);
    java.util.regex.Matcher matcher = pattern.matcher("");
    matcher.reset(zeile);
    System.out.println("Found: "+matcher.find());
    System.out.println(matcher.group());
    Output was:
    Found: true
    12.12.2004 [Logging] [Success] ### More text [Perhaps more brackets here] some text
    and not what I wanted:
    "[Success] ### More text [Perhaps ...."
    I don't know what I am doing wrong.
    I thought I could solve it with the non-capturing group, but it doesn't seem to work.
    My question about the non-capturing group is:
    I have this pattern: "(?:.*)".
    This pattern will probably match the String "Hello Mary!"
    But it should not capture the string.
    So matcher.find() is true, but matcher.group() is the whole string and not null, like I expected.
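    What seems to be the misunderstanding here: group() with no argument is group(0), which is always the entire match. A non-capturing group (?:...) still takes part in the match and therefore in group(0); it just does not get a group number of its own. To get only the second bracket, capture it and ask for group(1). A small sketch (class and method names are made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SecondBracket {
    // Skip the first [...] block, then capture the second one in group 1.
    static final Pattern SECOND = Pattern.compile("\\[[^\\]]*\\]\\s*(\\[[^\\]]*\\])");

    // Returns { whole match, group 1 }, or null when nothing matches.
    static String[] wholeAndSecond(String line) {
        Matcher m = SECOND.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1) };
    }

    public static void main(String[] args) {
        String line = " 12.12.2004 [Logging] [Success] ### More text [Perhaps more brackets here] some text";
        String[] r = wholeAndSecond(line);
        System.out.println("group():  " + r[0]);
        System.out.println("group(1): " + r[1]);
    }
}
```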

  • Regular expressions and capture groups

    Hi everyone :)
    Is there a way to override the default behaviour of capture groups in regular expressions? More specifically I want to override this:
    "The captured input associated with a group is always the subsequence that the group most recently matched."
    For example, if I have a string that is this:
    /* <comment one> */
    /* <comment two> */
    <some text>
    I have a pattern of the form "(.*)(/\\*.*\\*/)(.*)" which will match multi-line comments. I have also specified the flag DOTALL so that the predefined character class '.' matches over line-breaks.
    If I apply this pattern to the above string I get comment two being captured, not comment one. This is because of the stipulation that I cited above.
    I need to be able to capture only the first match, and prevent the capture group from being overwritten by more recent matches.
    Is this possible? Any ideas?
    Thanks in advance.
    Kind regards,
    Ben Deany

    "Is there a way to override the default behaviour of capture groups in regular expressions? More specifically I want to override this:"
    No, but you don't need to.
    "I have a pattern of the form "(.*)(/\\*.*\\*/)(.*)" which will match multi-line comments."
    Comment two is captured by the second group because comment one is eaten by the first group. Use the reluctant quantifier "*?" on the . in the first group instead of the greedy quantifier "*" to get what is apparently the behavior you want. Then the first group will contain nothing, the second group will contain comments one and two, and the third group will contain the following text.
    .* is a very powerful thing to use. It will match everything in its path, guzzling text like moonshine at Mardi Gras. The only reason it doesn't match comment two as well is because then the expression as a whole would not match.
    The parentheses surrounding the first and third groups are not needed (unless you want to use group methods on them too).
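    To make the greedy versus reluctant difference concrete, here is a small comparison. It goes one step beyond the advice above by also making the quantifier inside the comment group reluctant, on the assumption that only the first comment is wanted; the sample text and all names are invented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstComment {
    // The pattern from the question: both outer quantifiers are greedy.
    static final Pattern GREEDY =
        Pattern.compile("(.*)(/\\*.*\\*/)(.*)", Pattern.DOTALL);
    // Reluctant before and inside the comment, so the match stops at the first "*/".
    static final Pattern RELUCTANT =
        Pattern.compile("(.*?)(/\\*.*?\\*/)(.*)", Pattern.DOTALL);

    // Returns the text captured by the comment group (group 2), or null.
    static String comment(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String text = "/* one */\n/* two */\nsome text";
        System.out.println(comment(GREEDY, text));
        System.out.println(comment(RELUCTANT, text));
    }
}
```

    With the greedy pattern the first group swallows comment one, so group 2 ends up holding comment two; with the reluctant pattern group 2 holds comment one and the rest lands in group 3.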

  • Regular expressions question - making a capture group optional to match.

    Hello java Experts...
    I have a problem with making a group optional to match. I am using a pattern to match a text from an apache/bittorrent log. This text sometimes contains an IP address and sometimes it doesn't. Here is an example:
    GET /announce?info_hash=E%E9%C6%1D%DC%7EA%CD%EB%97%C8%85%DC%26M4%DB%11%18%1D&peer_id=%00%00%00%00%00%00%00%00%00%00%00%005%86%07%1An%F0%8E/&port=6882&ip=130.233.20.169&uploaded=0&downloaded=0&left=1855094951&event=started HTTP/1.0
    I used this pattern to parse it and it works:
    (\\w+\\s\\/\\w+)\\?(info_hash)\\=([\\d\\w\\%\\/]+)\\&(peer_id)\\=([\\d\\w\\%\\/]+)&(port)\\=(\\d+)\\&(ip)*\\=([\\d.]+)*\\&(uploaded)\\=(\\d+)\\&(downloaded)\\=(\\d+)\\&(left)\\=(\\d+)\\&(event)\\=(\\w+)";
    Now if the IP address is not in the log entry, the above pattern won't match. For example, this text doesn't match against the pattern:
    GET /announce?info_hash=E%e9%c6%1d%dc%7eA%cd%eb%97%c8%85%dc%26M4%db%11%18%1d&peer_id=%d6%be%9f%88%28C5%eb%1c%cdI%98j%c5H%80%5d%3bJ%99&uploaded=0&downloaded=0&left=1855094951&event=started HTTP/1.0
    So what I am trying to tell the regex engine is that some entries contain an IP address, so take it then into a group, but if some entries don't, then don't bother with it. How do I go about doing that?
    Any help would be greatly appreciated and thanks for your time :)

    I tried it, but apparently it doesn't work. I guess java somehow numbers groups differently when they're nested. I thought doing
    (ip)?\\=([\\d.]+)? would make it optional to match the IP part of the log entry, but it doesn't.
    Also tried nesting them in one: (\\&(ip)\\=([\\d.]+))? which is just the same as you proposed, but it still gives the No Match Found exception.
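    One thing worth checking: the second log line above has no port parameter either, so making only the ip part optional would still not match it. Below is a cut-down sketch (not the full pattern from the question) in which each optional parameter is wrapped together with its "&name=" prefix in a non-capturing group followed by "?". The class and method names are invented, and the sample inputs are shortened.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnounceParser {
    // group 1 = peer_id, group 2 = port (optional), group 3 = ip (optional), group 4 = uploaded
    static final Pattern QUERY = Pattern.compile(
        "peer_id=([^&]+)(?:&port=(\\d+))?(?:&ip=([\\d.]+))?&uploaded=(\\d+)");

    // Returns the IP address, or null when the entry has no ip parameter.
    static String ipOf(String request) {
        Matcher m = QUERY.matcher(request);
        if (!m.find()) {
            throw new IllegalArgumentException("not an announce request: " + request);
        }
        return m.group(3); // an optional group that did not participate yields null
    }

    public static void main(String[] args) {
        System.out.println(ipOf("peer_id=abc&port=6882&ip=130.233.20.169&uploaded=0"));
        System.out.println(ipOf("peer_id=abc&uploaded=0"));
    }
}
```

    Group numbers do not shift when an optional group is skipped; the skipped group simply returns null, so the calling code has to check for that.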

  • Question on regex Matcher (group number)

    HI, everybody
    I am writing a program on replacement like the one below.
    String regex = "(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)";
    String original = "ABCDEFGHIJKL";
    String replacement = "$12";
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher(original);
    String result = m.replaceFirst(replacement);
    What I actually want is to take the first group, in this case an "A", and append the character "2" after it.
    The result I am expecting is "A2", but the result I get is "L", because the regex engine reads "$12" as a reference to the 12th group.
    What should I do to remove the ambiguity?
    Thanks.

    In such case, use $1\\2.
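    Spelled out: in Java source the literal "$1\\2" is the replacement string $1\2, where the backslash escapes the 2 so it is treated as a plain character instead of a second digit of the group number. A short sketch reusing the regex from the question (class and method names are made up):

```java
import java.util.regex.Pattern;

public class GroupThenDigit {
    static final Pattern TWELVE = Pattern.compile("(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)");

    // "$12" is read as a reference to group 12.
    static String ambiguous(String original) {
        return TWELVE.matcher(original).replaceFirst("$12");
    }

    // "$1\\2" is group 1 followed by an escaped, literal '2'.
    static String escaped(String original) {
        return TWELVE.matcher(original).replaceFirst("$1\\2");
    }

    public static void main(String[] args) {
        System.out.println(ambiguous("ABCDEFGHIJKL"));
        System.out.println(escaped("ABCDEFGHIJKL"));
    }
}
```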

  • Urgent:capture group values in variables

    Hi,
    I am new to BI Publisher and need some help to solve my issue.
    I need to store group values in variables.
    I have code like this:
    <?for-each-group:contact;./contactname?>
    For example, the code above returns 5 contact names. I have to capture the 1st contact name in one variable, the second contact name in a second variable, etc.
    Thanks,
    lax

    Not concatenation,
    i have for each group statement with contact name
    it is displaying data like this
    Name1
    Name2
    Name3
    now i want to get that values in my each variable. for example variable1 should have Name1 value. variable2 have Name2 value. Variable3 have Name3.
    so that in some other pages i have to query some records based on above variable values.
    Thanks,
    lax
    Edited by: user7498756 on Aug 23, 2012 12:23 PM

  • Regex multiple groups matching question

    Hi,
    I have three types of inputs: A,B,C
    I need to match sequences of:
    AAA...ABC
    If I truncate the input to contain only A's, and use a Matcher that searchs for A, I can read all A's instances with matcher's group methods.
    If I apply the whole sequence to the full regex, I can only read the last A (B and C too).
    How can I get the "subgroups" of A's?
    Do I need to split up the input and use two regex's? Is there a way to make the matcher assign a different group for each A, and one group for each B and C? So I can iterate over the groups?
    regex I use:
    (A)+?(B)(C) (reluctant modifier)
    Thanks

    A, B and C are not characters, they are expressions.
    I would like to be able to read all groups of A when I apply my regex to the whole sequence.
    Here is my code, in case it is needed. Some examples of valid inputs I need to read:
    5.0x + 10y = 20
    -2t1 +3.5t3 >= 0.5
    etc
    import java.util.regex.*;
    public class ModelParser {
         private static final String factor = "(?:([+-]?)\\s*(\\d+(?:\\.\\d)?))";
         private static final String variable = "(?:\\s*(\\w+))";
         private static final String expression = "(?:" + factor + variable + "\\s*)+?";
         private static final String operand = "(=|>=|<=)";
         private static final String regex =
                   //expression;
                   expression + operand + factor;
         public ModelParser(String input) {
              System.out.println(input);
              System.out.println(regex);
              Pattern p = Pattern.compile(regex);
              Matcher m = p.matcher(input);
              boolean matches;
              while (matches = m.find()) {
                   System.out.println("Groups: " + m.groupCount());
                   int groups = m.groupCount();
                   for (int i = 0; i <= groups; ++i) {
                        System.out.println("Group " + i + ": " + m.group(i) +
                            " from: " + m.start(i) + " to " + m.end(i));
                   }
              }
         }
         public static void main(String[] args) {
              ModelParser mp = new ModelParser(" + 3.5y - 2.0x = 5");
              //ModelParser mp = new ModelParser(" + 3.5y - 2.0x");
         }
    }
    You can use the commented lines (regex and input in the main method) instead and check that it reads the "A" groups correctly.
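    Since a repeated group only remembers its last iteration, one way to see every "A" is to stop repeating it inside the big regex and instead run find() in a loop with a pattern for a single term. The following is only a sketch of that idea: it assumes a variable name starts with a letter (so the bare number on the right-hand side is not mistaken for a term), and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TermScanner {
    // One term: optional sign, coefficient, variable name starting with a letter.
    static final Pattern TERM =
        Pattern.compile("([+-]?)\\s*(\\d+(?:\\.\\d+)?)\\s*([A-Za-z]\\w*)");

    // Returns each term as sign + coefficient + "*" + variable.
    static List<String> terms(String expression) {
        List<String> result = new ArrayList<>();
        Matcher m = TERM.matcher(expression);
        while (m.find()) { // each successful find() is one occurrence of "A"
            result.add(m.group(1) + m.group(2) + "*" + m.group(3));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(terms(" + 3.5y - 2.0x = 5"));
        System.out.println(terms("-2t1 +3.5t3 >= 0.5"));
    }
}
```

    The operator and the right-hand side can then be read with a second, separate pattern, which keeps both regexes short.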

  • Regexp and group capturing

    Hi,
    I have trouble with capturing groups, as shown in the example below:
    String s = "KLASSE3";
    Pattern p = Pattern.compile("KLASSE(\\d)");
    Matcher m= p.matcher(s);
    System.out.println( m.groupCount());
    System.out.println( m.group(1));
    Running this gives :
    1
    Exception in thread "main" java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:421)
    Tried with 1.5.0_08 and 1.6.0-b105
    Thanks for any hint

    shame on me !!!
    thanks very much dude
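    The reply that settled this thread is not shown here, but the output itself points at the cause: groupCount() only describes the pattern, so it works on a brand new Matcher, whereas group(1) needs a successful matches(), lookingAt() or find() first. A minimal sketch (class and method names are made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KlasseDemo {
    static final Pattern KLASSE = Pattern.compile("KLASSE(\\d)");

    // Mirrors the question: reads group 1 without attempting a match first.
    static boolean groupBeforeMatchThrows(String s) {
        Matcher m = KLASSE.matcher(s);
        try {
            m.group(1);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Attempts the match first, then reads the group from the same matcher.
    static String digitOf(String s) {
        Matcher m = KLASSE.matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(KLASSE.matcher("KLASSE3").groupCount());
        System.out.println(groupBeforeMatchThrows("KLASSE3"));
        System.out.println(digitOf("KLASSE3"));
    }
}
```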

  • RegEx: find() and group() issue...

    Hi,
    In the code below, I use the find() method to search for the first occurrence of a match and then group(1) to retrieve that match if the find() method returns true.
    The problem is that the find() method returns true and group(1) throws an ArrayOutOfBoundException :(
    Here is the RegEx code I use:
      String value = "${local.production}\\makeresults\\repository";
      Matcher matcher = Pattern.compile("\\$\\{.*\\}").matcher(value);
      if (matcher.find()) {
        System.out.println(matcher.group(1));   // throws an ArrayOutOfBoundException
      }
    Got this output:
    Exception in thread "main" java.lang.IndexOutOfBoundsException: No group 1
      at java.util.regex.Matcher.group(Matcher.java:463)
      at Test2.main(Test2.java:28)
    Have I made any mistakes?
    Thanks in advance for your help.
    Best regards,
    Serge.

    You should use group() without a parameter to get the complete matched sequence.
    Only capturing groups (between parentheses) are indexed (from left to right, starting at one).
    Here you don't have a capturing group (that's why it says "No group 1"), and it seems that you need the entire match, so you need the group() method.
    Tim - Actually, you can also use group(0) ...
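    If what is wanted is the name inside the braces rather than the whole placeholder, adding a pair of parentheses creates the missing group 1. A sketch with the value from the question (the class and method names are made up; the reluctant "*?" is my addition so that a second placeholder later in the string would not be swallowed):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderName {
    // The parentheses around .*? create capturing group 1.
    static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(.*?)\\}");

    // Returns { whole match, group 1 }, or null when no placeholder is found.
    static String[] wholeAndName(String value) {
        Matcher m = PLACEHOLDER.matcher(value);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1) };
    }

    public static void main(String[] args) {
        String[] r = wholeAndName("${local.production}\\makeresults\\repository");
        System.out.println("group():  " + r[0]);
        System.out.println("group(1): " + r[1]);
    }
}
```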

  • Get all groups from a regular expression match

    Please help me understand how to use Java regular expressions:
    I have an expression similar to this:
    "([^X]+)(X[^X]*)+"
    This should match stuff like "asaasaXdfdfdfXXsdsfd".
    How does one access all the matches for the second group? (The second group has a Kleene operator added, so it is not really just one group, but match.groupCount() is always 2.)
    Here is roughly the code:
    java.util.regex.Pattern pattern = java.util.regex.Pattern.compile("([^X]+)(X[^X]*)+", java.util.regex.Pattern.MULTILINE);
    java.util.regex.Matcher matcher = pattern.matcher(text);
    matcher.find();
    int groupcount = matcher.groupCount();
    Also, without matcher.find() I get an IllegalStateException, which I also get if I use matcher.matches() instead of matcher.find().
    I am obviously missing something here. There is always at least one "X" in the string, so shouldn't that pattern always match the whole string? Since there are often multiple X, shouldn't I get a group for each occurrence of X, followed by 0 or more other characters?
    But when I try to match everything by using "^([^X]+)(X[^X]*)+$" I get an "IllegalStateException: No match available" again.
    What is the correct way to do this?
    Edited by: johann_p on May 16, 2008 10:39 AM

    I am sorry I messed this up. Here is a SSCCE:
    import java.util.regex.Pattern;
    import java.util.regex.Matcher;
    class RegExp1 {
        public static void main(String[] args) {
          String testString = "first|aaaa | bbbb\n|cccc|ddddd";
          Pattern pattern = Pattern.compile("^([^|]+)(\\|[^|]*)+$");
          Matcher matcher = pattern.matcher(testString);
          matcher.find();
          int groupcount = matcher.groupCount();
          System.out.println("Found "+groupcount+" groups");
          System.out.println("Matcher: "+matcher);
          for (int i = 1; i <= groupcount; i++) {
            System.out.println("Match "+i+": "+testString.substring(matcher.start(i),matcher.end(i)));
          }
        }
    }
    I figured out a small bug in my first code that explains some of the exception oddities, but my principal question remains:
    how do I access all the matches that correspond to the second capturing group?
    In the example I would get "first" for Match 1 and "|ddddd" for Match 2, but how do I access all the matches??
    Thank you for your help!
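    As far as I know, java.util.regex does not keep a history of the iterations of a repeated group: groupCount() is fixed by the number of parentheses in the pattern, and a group inside "(...)+" only reports its last iteration. One workaround is to drop the repetition and let find() walk over the pieces. A sketch using the test string from the SSCCE (class and method names are invented):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PipeSegments {
    // A single '|' followed by everything up to the next '|'.
    static final Pattern SEGMENT = Pattern.compile("\\|([^|]*)");

    // Returns the text after each '|', in order of appearance.
    static List<String> segments(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = SEGMENT.matcher(s);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(segments("first|aaaa | bbbb\n|cccc|ddddd"));
    }
}
```

    The original anchored pattern can still be used first with matches() to validate the overall shape of the line before the segments are collected.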

  • Interesting Regex dillemmas

    I need to do the following:
    1. given a list of regular expressions find out which ones match some string, would it be faster to do this in a for-loop or OR all expressions into one large expression and use capture groups.
    2. given a list of regular expressions is it possible to tell if 2 or more expressions will match on SOME(unknown) string
    3. is it possible to write 10 regular expressions so that "priority" is encoded in the expression, so that if 5 of those expressions match on a given string, we will know which of those 5 will have the highest priority.
    Please kind sirs, advise (;

    "1. given a list of regular expressions find out which ones match some string, would it be faster to do this in a for-loop or OR all expressions into one large expression and use capture groups."
    Profile it. But I'd favour the loop as more transparent.
    "2. given a list of regular expressions is it possible to tell if 2 or more expressions will match on SOME(unknown) string"
    Depends. By "regular expressions" do you mean regular expressions (in which case it is possible, although you'll have to write a moderate amount of code to do it) or expressions accepted by java.util.regex.Pattern, in which case it's at best not so easily proven to be possible?
    "3. is it possible to write 10 regular expressions so that "priority" is encoded in the expression, so that if 5 of those expressions match on a given string, we will know which of those 5 will have the highest priority."
    Probably, but don't.
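    A sketch of the loop approach, with priority kept outside the expressions as the position in an ordered list. The three rules are placeholders invented for the example, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class PriorityMatcher {
    // Hypothetical rule list: earlier entries have higher priority.
    static final List<Pattern> RULES = Arrays.asList(
        Pattern.compile("\\d+"),
        Pattern.compile("[a-z]+\\d*"),
        Pattern.compile(".*"));

    // Indices of all rules that match the whole input, in priority order.
    static List<Integer> matchingRules(String input) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < RULES.size(); i++) {
            if (RULES.get(i).matcher(input).matches()) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Index of the highest-priority matching rule, or -1 if none matches.
    static int highestPriority(String input) {
        List<Integer> hits = matchingRules(input);
        return hits.isEmpty() ? -1 : hits.get(0);
    }

    public static void main(String[] args) {
        System.out.println(matchingRules("42"));
        System.out.println(highestPriority("abc1"));
    }
}
```

    Keeping the Pattern objects precompiled in a list avoids recompiling them per input; whether one big alternation would be faster is something only a profile of the real expressions can tell.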

  • Regex: how can Matcher.matches return true, but Matcher.find return false?

    Consider the class below:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class RegexBugDemo {
         private static final Pattern numberPattern;
         static {
                   // The BigDecimal grammar below was adapted from the BigDecimal(String) constructor.
                   // See also p. 46 of http://www.javaregex.com/RegexRecipesV1.pdf for a regex that matches Java floating point literals; uses similar techniques as below.
              String Sign = "[+-]";
              String Sign_opt = "(?:" + Sign + ")" + "?";     // Note: the "(?:" causes this to be a non-capturing group
              String Digits = "\\p{Digit}+";
              String IntegerPart = Digits;
              String FractionPart = Digits;
              String FractionPart_opt = "(?:" + FractionPart + ")" + "?";
              String Significand = "(?:" + IntegerPart + "\\." + FractionPart_opt + ")|(?:" + "\\." + FractionPart + ")|(?:" + IntegerPart + ")";
              String ExponentIndicator = "[eE]";
              String SignedInteger = Sign_opt + Digits;
              String Exponent = ExponentIndicator + SignedInteger;
              String Exponent_opt = "(?:" +Exponent + ")" + "?";
              numberPattern = Pattern.compile(Sign_opt + Significand + Exponent_opt);
         }
    //     private static final Pattern numberPattern = Pattern.compile("\\p{Digit}+");
         public static void main(String[] args) throws Exception {
              String s = "0";
    //          String s = "01";
              Matcher m1 = numberPattern.matcher(s);
              System.out.println("m1.matches() = " + m1.matches());
              Matcher m2 = numberPattern.matcher(s);
              if (m2.find()) {
                   int i0 = m2.start();
                   int i1 = m2.end();
                   System.out.println("m2 found this substring: \"" + s.substring(i0, i1) + "\"");
              }
              else {
                   System.out.println("m2 NOT find");
              }
              System.exit(0);
         }
    }
    Look at the main method: it constructs Matchers from numberPattern for the String "0" (a single zero). It then reports whether Matcher.matches works and whether Matcher.find works. When I ran this code on my box just now, I got:
    m1.matches() = true
    m2 NOT find
    How the heck can matches work and find NOT work? matches has to match the entire input sequence, whereas find can back off if need be! I am really pulling my hair out over this one. Is it a bug with the JDK regex engine? Did not seem to turn up anything in the bug database...
    There are at least 2 things that you can do to get Matcher.find to work.
    First, you can change s to more than 1 digit, for example, using the (originally commented out) line
              String s = "01";
    yields
    m1.matches() = true
    m2 found this substring: "01"
    Second, I found that this simpler regex for numberPattern
         private static final Pattern numberPattern = Pattern.compile("\\p{Digit}+");
    yields
    m1.matches() = true
    m2 found this substring: "0"
    So, the problem seems to be triggered by a short source String and a complicated regex. But I need the complicated regex for my actual application, and cannot see why it is a problem.
    Here is a version of main which has a lot more diagnostic printouts:
         public static void main(String[] args) throws Exception {
              String s = "0";
              Matcher m1 = numberPattern.matcher(s);
              System.out.println("m1.regionStart() = " + m1.regionStart());
              System.out.println("m1.regionEnd() = " + m1.regionEnd());
              System.out.println("m1.matches() = " + m1.matches());
              System.out.println("m1.hitEnd() = " + m1.hitEnd());
              m1.reset();
              System.out.println("m1.regionStart() = " + m1.regionStart());
              System.out.println("m1.regionEnd() = " + m1.regionEnd());
              System.out.println("m1.lookingAt() = " + m1.lookingAt());
              System.out.println("m1.hitEnd() = " + m1.hitEnd());
              Matcher m2 = numberPattern.matcher(s);
              System.out.println("m2.regionStart() = " + m2.regionStart());
              System.out.println("m2.regionEnd() = " + m2.regionEnd());
              if (m2.find()) {
                   int i0 = m2.start();
                   int i1 = m2.end();
                   System.out.println("m2 found this substring: \"" + s.substring(i0, i1) + "\"");
              }
              else {
                   System.out.println("m2 NOT find");
                   System.out.println("m2.hitEnd() = " + m2.hitEnd());
              }
              System.out.println("m2.regionStart() = " + m2.regionStart());
              System.out.println("m2.regionEnd() = " + m2.regionEnd());
              System.out.println("m1 == m2: " + (m1 == m2));
              System.out.println("m1.equals(m2): " + m1.equals(m2));
              System.exit(0);
         }
    Unfortunately, the output gave me no insights into what is wrong.
    I looked at the source code of Matcher. find ends up calling
    boolean search(int from)
    and it executes with NOANCHOR. In contrast, matches ends up calling
    boolean match(int from, int anchor)
    and executes almost the exact same code but with ENDANCHOR. Unfortunately, this too makes sense to me, and gives me no insight into solving my problem.

    bbatman wrote:
    "I -think- that my originally posted regex is correct, albeit possibly a bit verbose,"
    No, there's a (small) mistake. The optional sign is always part of the first OR-ed part (A) and the exponent is always part of the last part (C). Let me explain.
    This is your regex:
    (?:[+-])?(?:\p{Digit}+\.(?:\p{Digit}+)?)|(?:\.\p{Digit}+)|(?:\p{Digit}+)(?:[eE](?:[+-])?\p{Digit}+)?
    which can be read as:
    (?:[+-])?(?:\p{Digit}+\.(?:\p{Digit}+)?)        # A
    |                                               # or
    (?:\.\p{Digit}+)                                # B
    |                                               # or
    (?:\p{Digit}+)(?:[eE](?:[+-])?\p{Digit}+)?      # C
    Only one of A, B or C is matched of course. So B can never have an exponent or sign (and A cannot have an exponent and C cannot have a sign).
    What you probably meant is this:
    (?:[+-])?                                   # sign
    (?:                                         # group the alternatives
        (?:\p{Digit}+\.(?:\p{Digit}+)?)         #   A
        |                                       #   or
        (?:\.\p{Digit}+)                        #   B
        |                                       #   or
        (?:\p{Digit}+)                          #   C
    )
    (?:[eE](?:[+-])?\p{Digit}+)?                # exponent
    "and that this must be a sun regex engine bug, but would love to be educated otherwise."
    Yes, it looks like a bug to me too.
    A simplified version of this behavior (in case you want to file a bug report) would look like this:
    When `test` is a single digit, m.find() returns _false_ but matches() returns true.
    When `test` is two or more digits, both return true, as expected.
    public class Test {
        public static void main(String[] args) {
            String test = "0";
            String regex = "(?:[+-])?(?:\\p{Digit}+\\.(?:\\p{Digit}+)?)|(?:\\.\\p{Digit}+)|(?:\\p{Digit}+)(?:[eE](?:[+-])?\\p{Digit}+)?";
            java.util.regex.Matcher m = java.util.regex.Pattern.compile(regex).matcher(test);
            System.out.println("matches() -> "+test.matches(regex));
            if(m.find()) {
                System.out.println("find()    -> true");
            } else {
                System.out.println("find()    -> false");
            }
        }
    }
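    Independently of whether find() misbehaves on a given JDK, the grouping problem described above can be checked on its own. The sketch below compares the pattern as posted with a version where the three alternatives are wrapped in one non-capturing group, so that the sign and the exponent apply to all of them. The class, constant and method names are invented, and the grouped pattern is my condensed rewrite, not the poster's code.

```java
import java.util.regex.Pattern;

public class NumberPattern {
    // As posted: A, B and C are top-level alternatives.
    static final Pattern UNGROUPED = Pattern.compile(
        "(?:[+-])?(?:\\p{Digit}+\\.(?:\\p{Digit}+)?)|(?:\\.\\p{Digit}+)|(?:\\p{Digit}+)(?:[eE](?:[+-])?\\p{Digit}+)?");

    // Alternatives grouped: sign, then (A | B | C), then optional exponent.
    static final Pattern GROUPED = Pattern.compile(
        "[+-]?(?:\\p{Digit}+\\.\\p{Digit}*|\\.\\p{Digit}+|\\p{Digit}+)(?:[eE][+-]?\\p{Digit}+)?");

    // True when the whole string is accepted by the given pattern.
    static boolean isNumber(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "0", "-.5", "1.5e10" }) {
            System.out.println(s + " ungrouped=" + isNumber(UNGROUPED, s)
                + " grouped=" + isNumber(GROUPED, s));
        }
    }
}
```

    With the ungrouped pattern, "-.5" and "1.5e10" are rejected by matches() because alternative B cannot take a sign and alternative A cannot take an exponent; the grouped pattern accepts both.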
