Pattern (dis-)matching

Hi,
Why is it that group(0) matches "6<->" and not just "6" in my code below?
String idxPair = "6<->4";
String pattern = "(.*?)<->(.*?)";
Pattern p = Pattern.compile(pattern);
Matcher matcher = p.matcher(idxPair);
boolean matchFound = matcher.find();
if (matchFound) {
    srcIdx = matcher.group(0);
    trgIdx = matcher.group(1);
    System.out.println("PAIR IS : " + idxPair + " SRC: " + srcIdx + " TRG: " + trgIdx);
}
Grazia

It looks like I had to replace my previous code with
String pattern = "(.*?)<->(.*)";
Pattern p = Pattern.compile(pattern);
Matcher matcher = p.matcher(idxPair);
boolean matchFound = matcher.find();
if (matchFound) {
    srcIdx = matcher.group(1);
    trgIdx = matcher.group(2);
    System.out.println("PAIR IS : " + idxPair + " SRC: " + srcIdx + " TRG: " + trgIdx);
}
I can understand the difference between (.*?) and (.*), but I still think that I should have matched group(0) and group(1), not group(1) and group(2).
If you have any explanation, please let me know.
Thank you.
Grazia
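
For what it's worth, here is a small sketch of the group numbering in question. In java.util.regex, group(0) is always the entire match and the capturing groups are numbered from 1, left to right. The class and method names (GroupNumbering, groups) are made up for this illustration and are not from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumbering {
    // Hypothetical helper: returns {group(0), group(1), group(2)} of the first match, or null.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        // group(0) is the whole match; capturing groups start at index 1.
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Lazy second group: it is allowed to match nothing, so the match stops after "<->".
        String[] lazy = groups("(.*?)<->(.*?)", "6<->4");
        System.out.println("[" + lazy[0] + "] [" + lazy[1] + "] [" + lazy[2] + "]");

        // Greedy second group: it consumes the rest of the input.
        String[] greedy = groups("(.*?)<->(.*)", "6<->4");
        System.out.println("[" + greedy[0] + "] [" + greedy[1] + "] [" + greedy[2] + "]");
    }
}
```

With the lazy pattern, group(0) is "6<->", group(1) is "6" and group(2) is empty; with the greedy pattern, group(2) becomes "4". That is why the second version of the code needs group(1) and group(2).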

Similar Messages

  • Pattern and Matcher of Regular Expressions

    Hello All,
    MTMISRVLGLIRDQAISTTFGANAVTDAFWVAFRIPNFLRRLFAEGSFATAFVPVFTEVK
    ETRPHADLRELMARVSGTLGGMLLLITALGLIFTPQLAAVFSDGAATNPEKYGLLVDLLR
    LTFPFLLFVSLTALAGGALNSFQRFAIPALTPVILNLCMIAGALWLAPRLEVPILALGWA
    VLVAGALQLLFQLPALKGIDLLTLPRWGWNHPDVRKVLTLMIPTLFGSSIAQINLMLDTV
    IAARLADGSQSWLSLADRFLELPLGVFGVALGTVILPALARHHVKTDRSAFSGALDWGFR
    TTLLIAMPAMLGLLLLAEPLVATLFQYRQFTAFDTRMTAMSVYGLSFGLPAYAMLKVLLP
    I need some help with regular expressions in Java.
    I have encountered a problem on how to retrieve two strings with Pattern and Matcher.
    I have written this code to match one substring, "MTMISRVLGLIRDQ", but I want to match multiple substrings in a string.
    Pattern findstring = Pattern.compile("MTMISRVLGLIRDQ");
    Matcher m = findstring.matcher(S);
    while (m.find()) {
        outputStream.println("Selected Sequence \"" + m.group() +
            "\" starting at index " + m.start() +
            " and ending at index " + m.end() + ".");
    }
    Any help would be appreciated.

    Double post: http://forum.java.sun.com/thread.jspa?threadID=726158&tstart=0
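
One possible way to look for several substrings in a single pass is to join them with the regex alternation operator. The sketch below is only an illustration: the class and method names are made up, and the second search string ("ETRPHADLRELMAR") is simply picked from the posted sequence as an assumption about what a second target might look like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiSubstringSearch {
    // Hypothetical helper: finds every occurrence of any of the given substrings.
    // Each hit is reported as "text@start-end".
    static List<String> findAll(String sequence, String... needles) {
        StringBuilder alternation = new StringBuilder();
        for (String needle : needles) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            // Pattern.quote keeps each needle literal even if it contains regex metacharacters.
            alternation.append(Pattern.quote(needle));
        }
        Matcher m = Pattern.compile(alternation.toString()).matcher(sequence);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        String s = "MTMISRVLGLIRDQAISTTETRPHADLRELMARVSG";
        for (String hit : findAll(s, "MTMISRVLGLIRDQ", "ETRPHADLRELMAR")) {
            System.out.println(hit);
        }
    }
}
```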

  • Pattern and Matching question

    Hey,
    I'm trying to use Pattern and Matcher to replace all instances of a website
    address in some HTML documents as I process them and post them. I'm
    including a sample of some of the HTML below and the code I'm using to
    process it. For some reason it doesn't replace the sites in the underlying
    images and I can't figure out what I'm doing wrong. Please forgive all the
    unused variables; those are relics of another way I may have to do this if I
    can't get the pattern thing to work.
    Josh
    public static void setParameters(File fileName)
    {
        FileReader theReader = null;
        try
        {
            System.out.println("beginning setparameters guide2)");
            File fileForProcessing=new File(fileName.getAbsolutePath());
            //wrap the file in a filereader and buffered reader for maximum processing
            theReader=new FileReader(fileForProcessing);
            BufferedReader bufferedReader=new BufferedReader(theReader);
            //fill in data into the tempquestion variable to be populated
            //Set the question and answer texts back to default
            questionText="";
            answerText="";
            //Define the question variable as a Stringbuffer so new data can be appended to it
            StringBuffer endQuestion=new StringBuffer();//Stringbuffer to store all the lines
            String tempQuestion="";
            //Define new file with the absolutepath and the filename for use in parsing out question/answer data
            tempQuestion=bufferedReader.readLine();//reads the nextline of the question
            String tempAlteredQuestion="";//for temporary alteration of the nextline
            //while there are more lines append the stringbuffer with the new data to complete the question buffer
            StringTokenizer tokenizer=new StringTokenizer(tempQuestion, " ");//tokenizer for reading individual words
            StringBuffer temporaryLine; //reinstantiate temporary line holder each iterration
            String newToken;   //newToken gets the very next token every iterration?  changed to tokenizer moretokens loop
            String newTokenTemp;   //reset newTokenTemp to null each iterration
            String theEndOfIt;  //string to hold everything after .com
            char[] characters;  //character array to hold the characters that are used to hold the entire link
            char lastCharChecked;
            Pattern thePattern=Pattern.compile("src=\"https:////fakesite.com//ics", Pattern.LITERAL);
            Matcher theMatcher=thePattern.matcher(tempQuestion);
            while(tempQuestion!=null) //every time the tempquestion reads a newline, make sure you aren't at the end
            {
                String theReplacedString=theMatcher.replaceAll("https:////fakesite.com//UserGuide/");
                //          temporaryLine=new StringBuffer();
                //add the temporary line after processed back into the end question.
                endQuestion.append(theReplacedString);                              //temporaryLine.toString());
                //reset the tempquestion to the newline that is going to be read
                tempQuestion=bufferedReader.readLine();
                if(tempQuestion!=null)
                {
                    theMatcher.reset(tempQuestion);
                }
                /*newTokenTemp=null;
                while(tokenizer.hasMoreTokens())
                {
                    newToken=tokenizer.nextToken(); //get the next token from the line for processing
                    System.out.println("uhhhhhh");
                    if(newToken.length()>36)  //if the token is long enough chop it off to compare
                    {
                        newTokenTemp=newToken.substring(0, 36);
                    }
                    if(newTokenTemp.equals("src=\"https://fakesite.com"));//compare against the known image source
                    {
                        theEndOfIt=new String();  //intialize theEndOfIt
                        characters=new char[newToken.length()];  //set the arraylength to the length of the initial token
                        characters=newToken.toCharArray();  //point the character array to the actual characters for newToken
                        lastCharChecked='a';  // the last character that was compared
                        int x=0; //setup the iterration variable and go from the length of the whole token back till you find the first /
                        for(x=newToken.length()-1;x>0&&lastCharChecked!='/';x--)
                        {
                            System.out.println(newToken);
                            //set last char checked to the lsat iterration run
                            lastCharChecked=characters[x];
                            //set the end of it to the last char checked and the rest of the chars checked combined
                            theEndOfIt=Character.toString(lastCharChecked)+theEndOfIt;
                        }
                        //reset the initial newToken value to the cut temporary newToken root + userguide addin, + the end
                        newToken=newTokenTemp+"//Userguide"+theEndOfIt;
                    }
                    //add in the space aftr the token to the temporary line and the new token, this is where it should be parsed back together
                    temporaryLine.append(newToken+" ");
                }
                //add the temporary line after processed back into the end question.
                endQuestion.append(temporaryLine.toString());
                //reset the tempquestion to the newline that is going to be read
                tempQuestion=bufferedReader.readLine();
                //reset tokenizer to the new temporary question
                if(tempQuestion!=null)
                    tokenizer=new StringTokenizer(tempQuestion);
                */
            }
            //Set the answer to the stringbuffer after converting to string
            answerText=endQuestion.toString();
            //code to take the filename and replace _ with a space and put that in the question text
            char theSpace=' ';
            char theUnderline='_';
            questionText=(fileName.getName()).replace(theUnderline, theSpace);
        }
        catch(FileNotFoundException exception)
        {
            if(logger.isLoggable(Level.WARNING))
            {
                logger.log(Level.WARNING,"The File was Not Found\n"+exception.getMessage()+"\n"+exception.getStackTrace(),exception);
            }
        }
        catch(IOException exception)
        {
            if(logger.isLoggable(Level.SEVERE))
            {
                logger.log(Level.SEVERE,exception.getMessage()+"\n"+exception.getStackTrace(),exception);
            }
        }
        finally
        {
            try
            {
                if(theReader!=null)
                {
                    theReader.close();
                }
            }
            catch(Exception e)
            {
            }
        }
    }
    <SCRIPT language=JavaScript1.2 type=text/javascript><!-- if( typeof( kadovInitEffects ) != 'function' ) kadovInitEffects = new Function();if( typeof( kadovInitTrigger ) != 'function' ) kadovInitTrigger = new Function();if( typeof( kadovFilePopupInit ) != 'function' ) kadovFilePopupInit = new Function();if( typeof( kadovTextPopupInit ) != 'function' ) kadovTextPopupInit = new Function(); //--></SCRIPT>
    <H1><IMG class=img_whs1 height=63 src="https://fakesite.com/ics/header4.jpg" width=816 border=0 x-maintain-ratio="TRUE"></H1>
    <H1>Associate Existing Customers</H1>
    <P>blahalbalhblabhlab blabhalha blabahbablablabhlablhalhab.<SPAN style="FONT-WEIGHT: bold"><B><IMG class=img_whs2 height=18 alt="Submit a

    Josh wrote: If you use just / it misinterprets it and it ruins your " " tags for a string.
    I don't think so. '/' is not a special character for Java regex, nor for Java String.
    Josh wrote: The reason I used literal is to try to force it to directly match; originally I thought that was the reason it wasn't working.
    That will be no problem, because it forces '.' to be treated as a dot, not as the regex 'any character'.
    Message was edited by:
    hiwa
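
To make the point about '/' concrete, here is a minimal sketch. With Pattern.LITERAL the posted pattern looks for the text "https:////fakesite.com//ics" exactly as written, doubled slashes included, and that text never occurs in the sample HTML. The sketch assumes the intended result is the same src attribute with "/ics" swapped for "/UserGuide"; the class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplaceDemo {
    // Single slashes: '/' needs no escaping in a Java string or in a regex.
    private static final Pattern OLD_PREFIX =
            Pattern.compile("src=\"https://fakesite.com/ics", Pattern.LITERAL);

    // Assumed target: keep the src=" prefix and only change the directory.
    private static final String NEW_PREFIX = "src=\"https://fakesite.com/UserGuide";

    static String rewrite(String line) {
        // quoteReplacement guards against '$' or '\' being treated specially in the replacement.
        return OLD_PREFIX.matcher(line).replaceAll(Matcher.quoteReplacement(NEW_PREFIX));
    }

    public static void main(String[] args) {
        String html = "<IMG class=img_whs1 src=\"https://fakesite.com/ics/header4.jpg\" width=816>";
        System.out.println(rewrite(html));
    }
}
```

For a purely literal swap like this, String.replace(CharSequence, CharSequence) would probably do the same job without any regex at all.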

  • Lexical search using Pattern and Matcher class

    Hi Folks,
    Need some help with the following Query. I want to find out how I can implement
    the following by using Pattern / Matcher classes.
    I have a query that returns the following set of strings,
    aa
    abc
    def
    ghi
    glk
    gmonalaks
    golskalskdkdkd
    lkaldkdldldkdld
    mladlad
    n33ieler
    What I would like to do is find any string that starts with g and occurs lexically at or after ghi. So this would return
    ghi / glk / gmonalaks / golskalskdkdkd
    How can I accomplish the above, by using the following two classes.
    Pattern datePattern = Pattern.compile();
    Matcher dateMatcher = datePattern.matcher();
    Thanks a bunch...
    _Shoe Maker..

    Nothing in your specification requires a regex. Loop through the strings until you find the string "ghi". Then continue looping through the strings, outputting those where the first character is 'g'.
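
A rough sketch of that idea without any regex follows. Instead of relying on the list being sorted, it compares each string against the lower bound with String.compareTo, which is an assumption about what "lexically after" should mean here. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LexicalFilter {
    // Hypothetical helper: keeps values that start with the prefix and sort at or after lowerBound.
    static List<String> fromPrefix(List<String> values, String prefix, String lowerBound) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            if (value.startsWith(prefix) && value.compareTo(lowerBound) >= 0) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("aa", "abc", "def", "ghi", "glk", "gmonalaks",
                "golskalskdkdkd", "lkaldkdldldkdld", "mladlad", "n33ieler");
        System.out.println(fromPrefix(rows, "g", "ghi"));
    }
}
```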

  • String.matches vs Pattern and Matcher object

    Hi,
    I was trying to match some regex using String.matches but for me it is not working (probably I am not using it the way it should be used).
    Here is a simple example:
    /* This does not work */
    String patternStr = "a";
    String inputStr = "abc";
    if (inputStr.matches("a")) {
        System.out.println("String matched");
    }
    /* This works */
    Pattern p = Pattern.compile("a");
    Matcher m = p.matcher("abc");
    boolean found = false;
    while (m.find()) {
        System.out.println("Matched using Pattern and Matcher");
        found = true;
    }
    if (!found) {
        System.out.println("Not matching with Pattern and Matcher");
    }
    Am I not using the matches method of the String class properly?
    Please shed some light on this.
    Thank you.

    String.matches looks at the whole string.
    bsh % "abc".matches("a");
    <false>
    bsh % "abc".matches("a.*");
    <true>
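
The same comparison as a small Java sketch, in case BeanShell is not at hand. String.matches (like Matcher.matches) requires the whole input to match the regex, while Matcher.find looks for a match anywhere in the input. The class and method names are made up for the illustration.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // True only if the entire input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return input.matches(regex);
    }

    // True if any part of the input matches the regex.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("a", "abc"));    // false: "bc" is left over
        System.out.println(wholeMatch("a.*", "abc"));  // true: ".*" covers the rest
        System.out.println(partialMatch("a", "abc"));  // true: "a" occurs in the input
    }
}
```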

  • Patterns and Matcher.find()

    I would be really grateful for some assistance on this. I have been at this for three hours and can't get it correct.
    I have a string such as this: "Hello this is a great <!-- @@[IMAGINE_SPACIAL_VECTOR]@@ --> little string"
    I want to use RegEx, Pattern and Matcher.find() to extract "IMAGINE_SPACIAL_VECTOR" ie. the text between "<!-- @@[" and "]@@ -->"
    Thanks in advance for your help

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class Example {
        public static void main(String[] args) {
            String data = "Hello this is a great <!-- @@[IMAGINE_SPACIAL_VECTOR]@@ --> little string";
            Matcher matcher = Pattern.compile("<!-- @@\\[(.*?)]@@ -->").matcher(data);
            while (matcher.find()) {
                System.out.println(matcher.group(1));
            }
        }
    }
    Might not be the best way, but it works.
    Kaj

  • How to Use Pattern and Matcher class.

    HI Guys,
    I am just trying to use Pattern and Matcher classes for my requirement.
    My requirement is :- It should allow the numbers from 1-7 followed by a comma(,) again followed by the numbers from
    1-7. For example:- 1,2,3,4,5 or 3,6,1 or 7,1,3 something like that.
    But it should not allow 0,8 and 9. And also it should not allow any Alphabets and special characters except comma(,).
    I have written some thing like..
    Pattern p = Pattern.compile("([1-7])+([\\,])?([1-7])?");
    Is there any problem with this pattern ??
    Please help out..
    I am new to pattern matching concept..
    Thanks and regards
    Sudheer

    OK guys, this is what my code looks like:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    class PatternTest
    {
        public static void main(String[] args)
        {
            System.out.println("Hello World!");
            String input = args[0];
            Pattern p = Pattern.compile("([1-7]{1},?)+");
            Matcher m = p.matcher(input);
            if (m.find()) {
                System.out.println("Pattern Found");
            } else {
                System.out.println("Invalid pattern");
            }
        }
    }
    If I enter 8,1,3 it accepts it and says "Pattern Found".
    Please correct me if I am wrong.
    Actually this is test code I am presenting here. My original requirement is that I will be uploading Excel sheets containing 10 columns and n rows.
    In one of the columns, I need to test whether the data in that column is between 1 and 7 or not. If I get a value containing numbers other than 1-7, then I should
    display a message to the user.
    Thanks and regards
    Sudheer
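
A possible explanation, sketched below: find() succeeds as soon as any part of the input matches, and in "8,1,3" the part "1,3" does. Using matches() forces the whole input to fit the pattern. The pattern "[1-7](,[1-7])*" is one reading of the stated requirement (single digits 1-7 separated by single commas, no trailing comma), and the class and method names are invented.

```java
import java.util.regex.Pattern;

public class DigitListValidator {
    // One digit 1-7, then zero or more ",digit" pairs; nothing else is allowed.
    private static final Pattern LIST = Pattern.compile("[1-7](,[1-7])*");

    static boolean isValid(String input) {
        // matches() must consume the entire input, unlike find().
        return LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String sample : new String[] { "1,2,3,4,5", "3,6,1", "8,1,3", "1,", "a,1" }) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```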

  • What is the Scan from string pattern for "match everything" ?

    Hello,
    Using Scan from string for a while, I know that %s only matches string up to a whitespace. And I also thought %[^] would match everything including whitespaces. But it turned out that it would stop at the closing square brace.
    So, what is the real scan pattern for match everything including whitespaces ?

    What do you want the Scan From String to end on?  Or are you just grabbing the rest of the string?  If that is your scenario, then just use the "remaining string" output.  It might help if you give a full example of a normal input string and EVERYTHING you want as an output.

  • Searching and Matching - Difference between 'Match Pattern' and 'Match Geometric Pattern'?

    I was wondering if someone can explain to me the difference between  'Match Pattern' and 'Match Geometric Pattern' VIs? I'm really not sure which best to use for my application. I'm trying to search/match small spherical particles in a grey video in order to track their speed (I'm doing this after subtracting two subsequent frames to get rid of background motion artifacts).
    Which should I use?
    Thank you!

    Hi TKassis,
    1. These links describe the difference between the two:
    Pattern Match : http://zone.ni.com/reference/en-XX/help/370281P-01/imaqvision/imaq_match_pattern_3/
    Geometric Match : http://zone.ni.com/reference/en-XX/help/370281P-01/imaqvision/imaq_match_geometric_pattern/.
    2. I always prefer Match Pattern because of its execution speed; Match Geometric Pattern took a lot of time to produce a result. The attached figure shows the execution time of the two algorithms on the same image.
    Sasi.
    Certified LabVIEW Associate Developer
    If you can DREAM it, You can DO it - Walt Disney

  • Translation Pattern Wildcard Match

    Our organization uses 5 digit internal extensions throughout. Our CEO would like the ability to dial any 5 digit extension in our organization but wants his caller id to be shown as his name and the extension of his secretary – basically masking his 5 digit extension. I believe the simplest way to achieve this is to create a Translation Pattern, but I’m having an issue trying to match the wildcards in a TP in CUCM7.1.5. At this stage I have set up a new Partition and CSS just for the CEO’s phone and placed a test phone in the new CSS. I then created a TP which is where I run into a problem.
    In the TP I have selected the proper partition and in the Calling Party Transformations section I have listed the Calling Party Transform Mask as the secretary extension (we’ll say 55555 for this example). When I use an exact Translation Pattern match (say 12345) the translation works as I would expect (when I dial 12345 from the test phone, the caller ID shows as 55555). However, when I use any wildcards in the Translation Pattern (i.e. XXXXX) the translation does not occur. Now when I dial 12345 the true caller ID number shows instead of the translated number.
    I’m basically looking for a catch all rule from the CEO’s phone that will translate to 55555. I’m guessing I’m overlooking something simple here – any assistance? Thanks in advance.

    I set up a calling party transformation pattern with the same results. The issue seems to be in matching the dialed pattern or Translation Pattern field. In my testing the pattern is matched only when it's exact and not when wildcards are used. See the first attached screen shot where the pattern is '12345'. When this is applied it works as would be expected and the caller ID on the receiving phone shows 55555. But, on the second attached screenshot using wildcards, when 12345 is dialed the caller ID shows as the number on the phone and not the translated value. For some reason the wildcards don't seem to match.
    I've tried various wildcard patterns such as XXXXX, 1234X, and [0-8]XXXX - none work. The last one is the one I'd really like to use. Other thoughts or suggestions?

  • Regular expressions, using pattern and matcher but not include the pattern

    Hi all
    i have a regular expression but in my matcher it is including the text that is in my regular expression.
    ie
    String str="-------------------stuff===========";
    Pattern mainBody = Pattern.compile("-----(.*?)=====", Pattern.MULTILINE);
    this matches -------------------stuff=====
    now i expect to get some - in there as i match from the start, but i dont want to have the = in the match. how do i do a match that excludes the matching expressions.

    nevermind, figured it out
    when i do myMatch.group(1) it gives me just my match
    sorry for wasting time :)
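
For anyone landing here with the same question, a short sketch of the difference. group() (the same as group(0)) returns everything the pattern matched, delimiters included, while group(1) returns only the first parenthesised part. The second pattern, "-+(.*?)=+", is an assumption about what was wanted (no leading dashes in the result); the class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupOnly {
    // Original pattern: exactly five dashes are consumed, so any extra dashes end up in group(1).
    static String originalGroup(String input) {
        Matcher m = Pattern.compile("-----(.*?)=====").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Assumed variant: "-+" swallows the whole run of dashes before the capture starts.
    static String inner(String input) {
        Matcher m = Pattern.compile("-+(.*?)=+").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String str = "-------------------stuff===========";
        System.out.println(originalGroup(str));
        System.out.println(inner(str));
    }
}
```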

  • Java Regex - Find Last Match Using Pattern and Matcher

    I'd like to write some regex which would allow me to grab the last occurrence of a match based on a specified list of items. So for:
    Pattern languageRegex = Pattern.compile("(len|end)");
    And a string of:
    "00| 0lend|"
    I want it to extract "end". However, I'd grab "len" using the above regex.
    If it was
    "00| 0lenend|"
    I'd grab "end" which is right.
    What regex would allow me to grab "end" rather than "len" from:
    "00| 0lend|"
    Thanks for your help.

    user3940995 wrote:
    I have a list of 3 letter codes that I need to check for in a field. The list is finite but about 100 or so items:
    len, end, ren, onm, enl, etc.
    However, the field I'm checking in has some other data in it which can bleed into the code but the code will always be at the end.
    An example would be "000 0rend"
    From this I'd want to extract "end". If there is a better way to do this than using regex then I'd be happy to use that, but as I have to process millions of items I'm keen to not loop trying to find a match so I was hoping there would be a regex solution.
    Your regex would work for that particular example. I think if I modify it to be
    Pattern.compile("len(?!\\D)|end(?!\\D)|enl(?!\\D)"); (which would then be extended for all the list items)
    Then I seem to pick up the last occurrence as I'd like to.
    Thank you for your help!
    Doesn't sound like you want to use regexp. I would instead build a character graph/tree with my commands in reversed order. I would then search each line backwards and check if it matches something in my tree.
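
If the code really is always the last three characters of the field, as in the "000 0rend" example, a simpler alternative to a reversed tree might be a set lookup on the tail, or a regex anchored with $. The sketch below shows both under that assumption; the code list is just the five codes mentioned in the thread, and the class and method names are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingCode {
    // Sample codes from the thread; the real list is said to have about 100 entries.
    private static final Set<String> CODES =
            new HashSet<>(Arrays.asList("len", "end", "ren", "onm", "enl"));

    // Regex variant: the alternation must be followed directly by the end of the input.
    private static final Pattern AT_END = Pattern.compile("(?:len|end|ren|onm|enl)$");

    // Set variant: look only at the last three characters, no scanning.
    static String bySet(String field) {
        if (field.length() < 3) {
            return null;
        }
        String tail = field.substring(field.length() - 3);
        return CODES.contains(tail) ? tail : null;
    }

    static String byRegex(String field) {
        Matcher m = AT_END.matcher(field);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(bySet("000 0rend"));
        System.out.println(byRegex("000 0rend"));
    }
}
```

Neither variant handles a trailing separator such as the "|" in the first example; that would need to be stripped first or added to the pattern.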

  • Pattern and Matcher problem. Help Please!

    I am trying to make the user enter a valid US dollar amount in an HTML (JSP) form, i.e. 12.34 but not 12.3456. As far as I know, the regular expression below is correct for dollar amounts,
    but there is no error even if the user types 12.34567.
    Can anyone help me please?
    This is the code I wrote:
    public static final int DATA_ENTRY = 1;
    public static final int INVALID_CURRENCY = 2;
    public static final int PROCESS_INPUT = 3;
    int state;
    String a; //data input from user
    Pattern p = Pattern.compile("\\d{1,3}(?:(?:,\\d\\d\\d)*|\\d*)(?:\\.\\d\\d)?");
    Matcher m = p.matcher(a);
    state = DATA_ENTRY; // this is default
    if (m.find()) {
        state = PROCESS_INPUT;
    } else {
        state = INVALID_CURRENCY;
    }

    Here are two pattern strings that both require a two-place fraction for each entry but do not permit more than two places.
    import java.util.regex.*;
    public class snggun {
      public static void main(String[] args) {
        String input = (args.length > 0 ? args[0] : "10.00");
    //    String mask = "\\d{1,3}(?:(?:,\\d\\d\\d)*|\\d*)(?:\\.\\d\\d)?";
        // requires two-place fraction
    //    String mask = "^(\\d{1,3})(,(\\d\\d\\d))*(\\.\\d\\d)\\z";
        // requires two-place fraction
        String mask = "\\d{1,3}(?:(?:,\\d{3})*|\\d*)(\\.\\d{2})\\z";
        Pattern pattern = Pattern.compile(mask);
        Matcher match = pattern.matcher(input);
        // call find() once and reuse the result; a second find() would look for a further match
        boolean found = match.find();
        System.out.println("input = " + input + " qualifies " + found);
        if (found) {
          for (int j = 0; j <= match.groupCount(); j++)
            System.out.println("group " + j + "  " + match.group(j));
        }
      }
    }
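
Another option, sketched here as an assumption rather than as what the poster above intended, is to keep the original pattern from the question and call matches() instead of find(). matches() only succeeds when the whole input fits the pattern, so trailing digits such as the "567" in 12.34567 cause a rejection. The class and method names are invented.

```java
import java.util.regex.Pattern;

public class CurrencyCheck {
    // The pattern from the question, unchanged: optional thousands separators, optional two-digit fraction.
    private static final Pattern AMOUNT =
            Pattern.compile("\\d{1,3}(?:(?:,\\d{3})*|\\d*)(?:\\.\\d{2})?");

    static boolean isValid(String input) {
        // matches() anchors the pattern at both ends of the input.
        return AMOUNT.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String sample : new String[] { "12.34", "1,234.50", "1234", "12.34567", "abc12.34" }) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```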

  • Translation pattern not matching

    Hello All
    I am configuring a CUCM 4.2 (yes, I know it's obsolete) integration with Lync 2010 and am having issues with a translation pattern.
    The Lync server is sending me 86xxxxxxxxxx for calls within China and will send 61xxxxxxxx for Australia (it strips the +).
    I have configured a [^86]! which should match any international numbers (other than China) and be prefixed and sent to the gateway. Here is the weird thing: I can dial +44xxxxxxxxxx using my Lync client, which proves that this is matching (when I delete the translation the call fails).
    But when I dial a number like +61xxxxxxxx it doesn't get through and I get
    Cisco CallManagerDigit analysis: match(pi="1",fqcn="", cn="removedbymyself", plv="5", pss="LYNC:PT_Reception", TodFilteredPss="LYNC:PT_Reception", dd="61xxxxxxxx ",dac="0")
    Cisco CallManagerDigit analysis: potentialMatches=NoPotentialMatchesExist" on the traces.
    The LYNC partition has the translation rules, and the CSS assigned to the SIP trunk has access to it. The CSS configured in the translation rules is also the one assigned to the SIP trunk.
    Has anyone seen this sort of thing before? How can I check if there is another transformation taking place?
    The only way I get round it is to put in a translation pattern for " ! ", and then it works for all international calls.
    Thanks,

    Hi,
    Have you tried testing the call with Dialed Number Analyzer? I find that's a fantastic and often-overlooked tool for this kind of issue. If DNA shows the call will not route, it's probably a CSS issue for the Stafford gateway. If DNA shows the call will route, then it's probably a dial-peer issue on the Stafford gateway.
    -Jameson

  • Pattern regex matching advice needed

    Hi All,
    Many thanks for any/all advice :)
    Here's my problem. I'm trying to scan a text file for...
    \foo(parm1|parm2)
    ...in which I want the sub-string "parm1|parm2"
    So... [\\]foo matches the first section. No problem...
    It's when I try adding the '(' or ')' that I'm getting errors.
    java.util.regex.PatternSyntaxException: Unclosed character class near index
    [\]foo(.*)
    Basically, I'm trying to create a pattern, which can recognize \foo(parms), and extract the parms sections.
    Any ideas?
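
A likely cause of the exception, with a sketch of one way around it: inside a character class the backslash escapes the closing bracket, so the regex [\]foo(.*) never closes its class. Escaping the backslash and the parentheses individually avoids that. In a Java string literal every regex backslash has to be doubled, hence the four backslashes below. The class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FooParams {
    // Regex: \\foo\((.*?)\)  i.e. a literal backslash, "foo", "(", captured text, ")".
    private static final Pattern FOO = Pattern.compile("\\\\foo\\((.*?)\\)");

    // Returns the text between the parentheses of the first \foo(...), or null if there is none.
    static String extract(String text) {
        Matcher m = FOO.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // "\\foo" in Java source is the five characters \foo plus the rest.
        System.out.println(extract("some text \\foo(parm1|parm2) more text"));
    }
}
```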

    Yes, you can do this. It is not allowed in basic Java, but there are always ways around the syntax rules. What you can do is use the AspectJ plugin for Eclipse, define a cutpoint and make it extend from two classes. What it does is parse the byte code and insert the code directly into the byte code. It's pretty neat.
    A simpler approach would be to have two classes A and B. Have A extend BASE and then have B extend A; B then "is a" A and a BASE.
    Hope this helps.
