Pattern (dis-)matching
Hi,
why is it that group(0) matches "6<->" and not just "6" in my code below ?
String idxPair = "6<->4";
String pattern = "(.*?)<->(.*?)";
Pattern p = Pattern.compile(pattern);
Matcher matcher = p.matcher(idxPair);
boolean matchFound = matcher.find();
if (matchFound){
srcIdx = matcher.group(0);
trgIdx = matcher.group(1);
System.out.println("PAIR IS : " + idxPair + " SRC: " + srcIdx + " TRG: " + trgIdx);
}
Grazia
It looks like I had to replace my previous code with
String pattern = "(.*?)<->(.*)";
Pattern p = Pattern.compile(pattern);
Matcher matcher = p.matcher(idxPair);
boolean matchFound = matcher.find();
if (matchFound){
srcIdx = matcher.group(1);
trgIdx = matcher.group(2);
System.out.println("PAIR IS : " + idxPair + " SRC: " + srcIdx + " TRG: " + trgIdx);
}
I can understand the difference between (.*?) and (.*), but I still think that I should have matched group(0) and group(1), not group(1) and group(2).
If you have any explanation, please let me know.
Thank you.
Grazia
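For what it's worth, this follows from how Java numbers groups: group(0) is always the entire match, and the capturing parentheses are numbered from 1. With a trailing (.*?) the lazy group is allowed to match nothing, so the whole match stops right after "<->". A minimal sketch of both patterns from the post (the class name GroupDemo and the helper groups are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns {group(0), group(1), group(2)} for the first match, or null if there is none.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Lazy second group: it matches the empty string, so group(0) is "6<->".
        String[] lazy = groups("(.*?)<->(.*?)", "6<->4");
        System.out.println("[" + lazy[0] + "] [" + lazy[1] + "] [" + lazy[2] + "]");

        // Greedy second group: it runs to the end of the input.
        String[] greedy = groups("(.*?)<->(.*)", "6<->4");
        System.out.println("[" + greedy[0] + "] [" + greedy[1] + "] [" + greedy[2] + "]");
    }
}
```

So the source index is group(1) and the target index is group(2) in both versions; only the greedy (.*) actually captures the "4".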
Similar Messages
-
Pattern and Matcher of Regular Expressions
Hello All,
MTMISRVLGLIRDQAISTTFGANAVTDAFWVAFRIPNFLRRLFAEGSFATAFVPVFTEVK
ETRPHADLRELMARVSGTLGGMLLLITALGLIFTPQLAAVFSDGAATNPEKYGLLVDLLR
LTFPFLLFVSLTALAGGALNSFQRFAIPALTPVILNLCMIAGALWLAPRLEVPILALGWA
VLVAGALQLLFQLPALKGIDLLTLPRWGWNHPDVRKVLTLMIPTLFGSSIAQINLMLDTV
IAARLADGSQSWLSLADRFLELPLGVFGVALGTVILPALARHHVKTDRSAFSGALDWGFR
TTLLIAMPAMLGLLLLAEPLVATLFQYRQFTAFDTRMTAMSVYGLSFGLPAYAMLKVLLP
I need some help with the regular expressions in java.
I have encountered a problem on how to retrieve two strings with Pattern and Matcher.
I have written this code to match one substring "MTMISRVLGLIRDQ", but I want to match multiple substrings in a string.
Pattern findstring = Pattern.compile("MTMISRVLGLIRDQ");
Matcher m = findstring.matcher(S);
while (m.find())
    outputStream.println("Selected Sequence \"" + m.group() +
        "\" starting at index " + m.start() +
        " and ending at index " + m.end() + ".");
Any help would be appreciated.
Double post: http://forum.java.sun.com/thread.jspa?threadID=726158&tstart=0
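One possible way to look for several substrings in a single pass is to join them with | into one pattern. This is only a sketch: the class MultiFind, the helper findAll, and the second search string "ETRPHADLRELMARV" are assumptions picked for illustration, and the text is just the first two sequence lines from the post joined together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiFind {
    // Collects every occurrence of any of the given literal substrings, in order of appearance.
    static List<String> findAll(String text, String... needles) {
        StringBuilder regex = new StringBuilder();
        for (String needle : needles) {
            if (regex.length() > 0) {
                regex.append('|');
            }
            // Pattern.quote keeps each needle literal even if it contains regex characters.
            regex.append(Pattern.quote(needle));
        }
        Matcher m = Pattern.compile(regex.toString()).matcher(text);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String s = "MTMISRVLGLIRDQAISTTFGANAVTDAFWVAFRIPNFLRRLFAEGSFATAFVPVFTEVK"
                 + "ETRPHADLRELMARVSGTLGGMLLLITALGLIFTPQLAAVFSDGAATNPEKYGLLVDLLR";
        for (String hit : findAll(s, "MTMISRVLGLIRDQ", "ETRPHADLRELMARV")) {
            System.out.println("Selected Sequence \"" + hit + "\"");
        }
    }
}
```

If the start and end indexes are needed as well, m.start() and m.end() can be read inside the same while loop, as in the original code.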
-
Hey,
I'm trying to use the pattern and matcher to replace all instances of a website
address in some html documents as I process them and post them. I'm
including a sample of some of the HTML below and the code I"m using to
process it. For some reason it doesn't replace the sites in the underlying
images and i can't figure out what I'm doing wrong. Please forgive all the
unused variables, those are relics of another way i may have to do this if i
can't get the pattern thing to work.
Josh
public static void setParameters(File fileName)
FileReader theReader = null;
try
System.out.println("beginning setparameters guide2)");
File fileForProcessing=new File(fileName.getAbsolutePath());
//wrap the file in a filereader and buffered reader for maximum processing
theReader=new FileReader(fileForProcessing);
BufferedReader bufferedReader=new BufferedReader(theReader);
//fill in data into the tempquestion variable to be populated
//Set the question and answer texts back to default
questionText="";
answerText="";
//Define the question variable as a Stringbuffer so new data can be appended to it
StringBuffer endQuestion=new StringBuffer();//Stringbuffer to store all the lines
String tempQuestion="";
//Define new file with the absolutepath and the filename for use in parsing out question/answer data
tempQuestion=bufferedReader.readLine();//reads the nextline of the question
String tempAlteredQuestion="";//for temporary alteration of the nextline
//while there are more lines append the stringbuffer with the new data to complete the question buffer
StringTokenizer tokenizer=new StringTokenizer(tempQuestion, " ");//tokenizer for reading individual words
StringBuffer temporaryLine; //reinstantiate temporary line holder each iterration
String newToken; //newToken gets the very next token every iterration? changed to tokenizer moretokens loop
String newTokenTemp; //reset newTokenTemp to null each iterration
String theEndOfIt; //string to hold everything after .com
char[] characters; //character array to hold the characters that are used to hold the entire link
char lastCharChecked;
Pattern thePattern=Pattern.compile("src=\"https:////fakesite.com//ics", Pattern.LITERAL);
Matcher theMatcher=thePattern.matcher(tempQuestion);
while(tempQuestion!=null) //every time the tempquestion reads a newline, make sure you aren't at the end
String theReplacedString=theMatcher.replaceAll("https:////fakesite.com//UserGuide/");
// temporaryLine=new StringBuffer();
//add the temporary line after processed back into the end question.
endQuestion.append(theReplacedString); //temporaryLine.toString());
//reset the tempquestion to the newline that is going to be read
tempQuestion=bufferedReader.readLine();
if(tempQuestion!=null)
theMatcher.reset(tempQuestion);
/*newTokenTemp=null;
while(tokenizer.hasMoreTokens())
newToken=tokenizer.nextToken(); //get the next token from the line for processing
System.out.println("uhhhhhh");
if(newToken.length()>36) //if the token is long enough chop it off to compare
newTokenTemp=newToken.substring(0, 36);
if(newTokenTemp.equals("src=\"https://fakesite.com"));//compare against the known image source
theEndOfIt=new String(); //intialize theEndOfIt
characters=new char[newToken.length()]; //set the arraylength to the length of the initial token
characters=newToken.toCharArray(); //point the character array to the actual characters for newToken
lastCharChecked='a'; // the last character that was compared
int x=0; //setup the iterration variable and go from the length of the whole token back till you find the first /
for(x=newToken.length()-1;x>0&&lastCharChecked!='/';x--)
System.out.println(newToken);
//set last char checked to the lsat iterration run
lastCharChecked=characters[x];
//set the end of it to the last char checked and the rest of the chars checked combined
theEndOfIt=Character.toString(lastCharChecked)+theEndOfIt;
//reset the initial newToken value to the cut temporary newToken root + userguide addin, + the end
newToken=newTokenTemp+"//Userguide"+theEndOfIt;
//add in the space aftr the token to the temporary line and the new token, this is where it should be parsed back together
temporaryLine.append(newToken+" ");
//add the temporary line after processed back into the end question.
endQuestion.append(temporaryLine.toString());
//reset the tempquestion to the newline that is going to be read
tempQuestion=bufferedReader.readLine();
//reset tokenizer to the new temporary question
if(tempQuestion!=null)
tokenizer=new StringTokenizer(tempQuestion);
//Set the answer to the stringbuffer after converting to string
answerText=endQuestion.toString();
//code to take the filename and replace _ with a space and put that in the question text
char theSpace=' ';
char theUnderline='_';
questionText=(fileName.getName()).replace(theUnderline, theSpace);
catch(FileNotFoundException exception)
if(logger.isLoggable(Level.WARNING))
logger.log(Level.WARNING,"The File was Not Found\n"+exception.getMessage()+"\n"+exception.getStackTrace(),exception);
catch(IOException exception)
if(logger.isLoggable(Level.SEVERE))
logger.log(Level.SEVERE,exception.getMessage()+"\n"+exception.getStackTrace(),exception);
finally
try
if(theReader!=null)
theReader.close();
catch(Exception e)
<SCRIPT language=JavaScript1.2 type=text/javascript><!-- if( typeof( kadovInitEffects ) != 'function' ) kadovInitEffects = new Function();if( typeof( kadovInitTrigger ) != 'function' ) kadovInitTrigger = new Function();if( typeof( kadovFilePopupInit ) != 'function' ) kadovFilePopupInit = new Function();if( typeof( kadovTextPopupInit ) != 'function' ) kadovTextPopupInit = new Function(); //--></SCRIPT>
<H1><IMG class=img_whs1 height=63 src="https://fakesite.com/ics/header4.jpg" width=816 border=0 x-maintain-ratio="TRUE"></H1>
<H1>Associate Existing Customers</H1>
<P>blahalbalhblabhlab blabhalha blabahbablablabhlablhalhab.<SPAN style="FONT-WEIGHT: bold"><B><IMG class=img_whs2 height=18 alt="Submit a
If you use just / it misinterprets it and it ruins your " " tags for a string.
I don't think so. '/' is not a special character for Java regex, nor for Java String.
The reason i used literal is to try to force it to directly match, originally i thought that was the reason it wasn't working.
That will be no problem because it enforces '.' to be treated as a dot, not as a regex 'any character'.
Message was edited by:
hiwa -
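To illustrate the point made in the replies: since '/' needs no escaping, a pattern written with doubled slashes under Pattern.LITERAL searches for the text "https:////fakesite.com//ics", which never occurs in the HTML. A rough sketch with single slashes follows. The class UrlRewrite and method rewrite are made-up names, and keeping the src=" prefix in the replacement (and the exact target path) is an assumption about what the poster wanted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlRewrite {
    // Compiled once; Pattern.LITERAL makes '.' and every other character match itself.
    private static final Pattern OLD_SRC =
            Pattern.compile("src=\"https://fakesite.com/ics", Pattern.LITERAL);

    // The replacement keeps the src=" prefix; quoteReplacement guards against '$' and '\'.
    private static final String NEW_SRC =
            Matcher.quoteReplacement("src=\"https://fakesite.com/UserGuide");

    static String rewrite(String line) {
        return OLD_SRC.matcher(line).replaceAll(NEW_SRC);
    }

    public static void main(String[] args) {
        String line = "<IMG class=img_whs1 height=63 "
                + "src=\"https://fakesite.com/ics/header4.jpg\" width=816>";
        System.out.println(rewrite(line));
    }
}
```

For a purely literal replacement like this, String.replace(CharSequence, CharSequence) would do the same job without any regex at all.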
Lexical search using Pattern and Matcher class
Hi Folks,
Need some help with the following Query. I want to find out how I can implement
the following by using Pattern / Matcher classes.
I have a query that returns the following set of strings,
aa
abc
def
ghi
glk
gmonalaks
golskalskdkdkd
lkaldkdldldkdld
mladlad
n33ieler
What I would like do is to find out any string that starts with g and Lexical occurs after ghi. So this will return
ghi / glk / gmonalaks / golskalskdkdkd
How can I accomplish the above, by using the following two classes.
Pattern datePattern = Pattern.compile();
Matcher dateMatcher = datePattern.matcher();
Thanks a bunch...
_Shoe Maker..
Nothing in your specification requires a regex. Loop through the strings until you find the string "ghi". Then continue looping through the strings, outputting those whose first character is 'g'.
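A small sketch of the no-regex idea. It is a slight variation on the loop described above: instead of waiting until "ghi" has been seen, it compares each string against "ghi" with compareTo, so it does not rely on the query result being sorted. The class LexicalFilter and method fromBound are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LexicalFilter {
    // Keeps the strings that start with the prefix and sort at or after the lower bound.
    static List<String> fromBound(List<String> values, String prefix, String lowerBound) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            if (value.startsWith(prefix) && value.compareTo(lowerBound) >= 0) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("aa", "abc", "def", "ghi", "glk", "gmonalaks",
                "golskalskdkdkd", "lkaldkdldldkdld", "mladlad", "n33ieler");
        System.out.println(fromBound(rows, "g", "ghi"));
    }
}
```

Note that compareTo orders strings by UTF-16 code unit, which may differ from the database's collation for mixed case or accented text.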
-
String.matches vs Pattern and Matcher object
Hi,
I was trying to match some regex using String.matches but for me it is not working (probably I am not using it the way it should be used).
Here is a simple example:
/* This does not work */
String patternStr = "a";
String inputStr = "abc";
if (inputStr.matches("a")) {
    System.out.println("String matched");
}
/* This works */
Pattern p = Pattern.compile("a");
Matcher m = p.matcher("abc");
boolean found = false;
while (m.find()) {
    System.out.println("Matched using Pattern and Matcher");
    found = true;
}
if (!found) {
    System.out.println("Not matching with Pattern and Matcher");
}
Am I not using the matches method of the String class properly?
Please throw some light on this.
Thank you.
String.matches looks at the whole string.
bsh % "abc".matches("a");
<false>
bsh % "abc".matches("a.*");
<true> -
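The same comparison as a compilable sketch, in case it helps: String.matches (like Matcher.matches) succeeds only if the entire input fits the regex, while Matcher.find looks for the regex anywhere inside the input. The class and method names here are made up.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // True only when the whole input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return input.matches(regex);
    }

    // True when any part of the input matches the regex.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("a", "abc"));    // whole string is not just "a"
        System.out.println(partialMatch("a", "abc"));  // "a" occurs inside "abc"
        System.out.println(wholeMatch("a.*", "abc"));  // "a" followed by anything
    }
}
```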
Patterns and Matcher.find()
I would be really grateful for some assistance on this. I have been at this for three hours and can't get it correct.
I have a string such as this: "Hello this is a great <!-- @@[IMAGINE_SPACIAL_VECTOR]@@ --> little string"
I want to use RegEx, Pattern and Matcher.find() to extract "IMAGINE_SPACIAL_VECTOR" ie. the text between "<!-- @@[" and "]@@ -->"
Thanks in advance for your help
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Example {
    public static void main(String[] args) {
        String data = "Hello this is a great <!-- @@[IMAGINE_SPACIAL_VECTOR]@@ --> little string";
        // group(1) is the text captured between "<!-- @@[" and "]@@ -->"
        Matcher matcher = Pattern.compile("<!-- @@\\[(.*?)]@@ -->").matcher(data);
        while (matcher.find()) {
            System.out.println(matcher.group(1));
        }
    }
}
Might not be the best way, but it works.
Kaj -
How to Use Pattern and Matcher class.
HI Guys,
I am just trying to use Pattern and Matcher classes for my requirement.
My requirement is :- It should allow the numbers from 1-7 followed by a comma(,) again followed by the numbers from
1-7. For example:- 1,2,3,4,5 or 3,6,1 or 7,1,3 something like that.
But it should not allow 0,8 and 9. And also it should not allow any Alphabets and special characters except comma(,).
I have written some thing like..
Pattern p = Pattern.compile("([1-7])+([\\,])?([1-7])?");
Is there any problem with this pattern ??
Please help out..
I am new to pattern matching concept..
Thanks and regards
Sudheer
ok guys, this is how my code looks like..
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class PatternTest {
    public static void main(String[] args) {
        System.out.println("Hello World!");
        String input = args[0];
        Pattern p = Pattern.compile("([1-7]{1},?)+");
        Matcher m = p.matcher(input);
        if (m.find()) {
            System.out.println("Pattern Found");
        } else {
            System.out.println("Invalid pattern");
        }
    }
}
If I enter 8,1,3 it is accepted and prints Pattern Found..
Please correct me if I am wrong.
Actually this is the test code I am presenting here.. My original requirement is.. I will be uploading Excel sheets containing 10 columns and n rows.
In one of my columns, I need to test whether the data in that column is between 1-7 or not.. If I get a value consisting of numbers other than 1-7.. then I should
display a message to the user..
Thanks and regards
Sudheer -
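A likely reason "8,1,3" is accepted: find() is satisfied by any part of the input that fits, and "1,3" fits. One way to validate the whole value is matches() with a pattern that spells out "digit, then zero or more comma-digit pairs". This is only a sketch: the class DigitList and method isValid are made-up names, and it assumes each entry is a single digit, so "12" is rejected.

```java
import java.util.regex.Pattern;

class DigitList {
    // One digit 1-7, followed by any number of ",digit" pairs; nothing else allowed.
    private static final Pattern LIST = Pattern.compile("[1-7](,[1-7])*");

    static boolean isValid(String input) {
        // matches() requires the entire input to fit the pattern.
        return LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "1,2,3,4,5", "3,6,1", "7,1,3", "8,1,3", "1,,2", "1,a" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + (isValid(sample) ? "Pattern Found" : "Invalid pattern"));
        }
    }
}
```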
What is the Scan from string pattern for "match everything" ?
Hello,
Using Scan from string for a while, I know that %s only matches string up to a whitespace. And I also thought %[^] would match everything including whitespaces. But it turned out that it would stop at the closing square brace.
So, what is the real scan pattern for match everything including whitespaces ?
What do you want the Scan From String to end on? Or are you just grabbing the rest of the string? If that is your scenario, then just use the "remaining string" output. It might help if you give a full example of a normal input string and EVERYTHING you want as an output.
-
Searching and Matching - Difference between 'Match Pattern' and 'Match Geometric Pattern'?
I was wondering if someone can explain to me the difference between 'Match Pattern' and 'Match Geometric Pattern' VIs? I'm really not sure which best to use for my application. I'm trying to search/match small spherical particles in a grey video in order to track their speed (I'm doing this after subtracting two subsequent frames to get rid of background motion artifacts).
Which should I use?
Thank you!
Hi TKassis,
1. You can find the difference between the two at these links:
Pattern Match : http://zone.ni.com/reference/en-XX/help/370281P-01/imaqvision/imaq_match_pattern_3/
Geometric Match : http://zone.ni.com/reference/en-XX/help/370281P-01/imaqvision/imaq_match_geometric_pattern/
2. I always prefer Match Pattern because of its execution speed; in my case Match Geometric Pattern took a lot longer to produce a result. The attached figure shows the execution time of the two algorithms on the same image.
Sasi.
Certified LabVIEW Associate Developer
If you can DREAM it, You can DO it - Walt Disney -
Translation Pattern Wildcard Match
Our organization uses 5 digit internal extensions throughout. Our CEO would like the ability to dial any 5 digit extension in our organization but wants his caller id to be shown as his name and the extension of his secretary – basically masking his 5 digit extension. I believe the simplest way to achieve this is to create a Translation Pattern, but I’m having an issue trying to match the wildcards in a TP in CUCM7.1.5. At this stage I have set up a new Partition and CSS just for the CEO’s phone and placed a test phone in the new CSS. I then created a TP which is where I run into a problem.
In the TP I have selected the proper partition and in the Calling Party Transformations section I have listed the Calling Party Transform Mask as the secretary extension (we’ll say 55555 for this example). When I use an exact Translation Pattern match (say 12345) the translation works as I would expect (when I dial 12345 from the test phone, the caller ID shows as 55555). However, when I use any wildcards in the Translation Pattern (i.e. XXXXX) the translation does not occur. Now when I dial 12345 the true caller ID number shows instead of the translated number.
I’m basically looking for a catch all rule from the CEO’s phone that will translate to 55555. I’m guessing I’m overlooking something simple here – any assistance? Thanks in advance.
I set up a calling party transformation pattern with the same results. The issue seems to be in matching the dialed pattern or Translation Pattern field. In my testing the pattern is matched only when it's exact and not when wildcards are used. See the first attached screen shot where the pattern is '12345'. When this is applied it works as would be expected and the caller ID on the receiving phone shows 55555. But, on the second attached screenshot using wildcards, when 12345 is dialed the caller ID shows as the number on the phone and not the translated value. For some reason the wildcards don't seem to match.
I've tried various wildcard patterns such as XXXXX, 1234X, and [0-8]XXXX - none work. The last one is the one I'd really like to use. Other thoughts or suggestions? -
Regular expressions, using pattern and matcher but not include the pattern
Hi all
i have a regular expression but in my matcher it is including the text that is in my regular expression.
ie
String str="-------------------stuff===========";
Pattern mainBody = Pattern.compile("-----(.*?)=====", Pattern.MULTILINE);
this matches -------------------stuff=====
now i expect to get some - in there as i match from the start, but i dont want to have the = in the match. how do i do a match that excludes the matching expressions.
nevermind, figured it out
when i do myMatch.group(1) it gives me just my match
sorry for wasting time :) -
Java Regex - Find Last Match Using Pattern and Matcher
I'd like to write some regex which would allow me to grab the last occurrence of a match based on a specified list of items. So for:
Pattern languageRegex = Pattern.compile("(len|end)");
And a string of:
"00| 0lend|"
I want it to extract "end". However, I'd grab "len" using the above regex.
If it was
"00| 0lenend|"
I'd grab "end" which is right.
What regex would allow me to grab "end" rather than "len" from:
"00| 0lend|"
Thanks for your help.
user3940995 wrote:
I have a list of 3 letter codes that I need to check for in a field. The list is finite but about 100 or so items:
len, end, ren, onm, enl, etc.
However, the field I'm checking in has some other data in it which can bleed into the code but the code will always be at the end.
An example would be "000 0rend"
From this I'd want to extract "end". If there is a better way to do this than using regex then I'd be happy to use that, but as I have to process millions of items I'm keen to not loop trying to find a match so I was hoping there would be a regex solution.
Your regex would work for that particular example. I think if I modify it to be
Pattern.compile("len(?!\\D)|end(?!\\D)|enl(?!\\D)"); (which would then be extended for all the list items)
Then I seem to pick up the last occurrence as I'd like to.
Thank you for your help!
Doesn't sound like you want to use regexp. I would instead build a character graph/tree with my commands in reversed order. I would then search each line backwards and check if it matches something in my tree. -
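If a regex is still preferred over the reversed tree suggested above, one common trick (a different technique from that reply) is to put a greedy .* in front of the alternation. The .* first swallows the whole field and then backs off one character at a time, so the first alternative it finds is the one closest to the end. A sketch with only three of the codes; the class LastCode and method lastCode are made-up names, and the full code list would have to be added to the alternation:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastCode {
    // Greedy prefix: backtracking starts from the end of the field,
    // so group(1) is the last code that occurs in it.
    private static final Pattern LAST = Pattern.compile(".*(len|end|ren)");

    // Returns the last code found in the field, or null if none occurs.
    static String lastCode(String field) {
        Matcher m = LAST.matcher(field);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(lastCode("00| 0lend|"));
        System.out.println(lastCode("00| 0lenend|"));
        System.out.println(lastCode("000 0rend"));
    }
}
```

Whether this is fast enough for millions of fields is something to measure; the reversed-tree approach may well scale better with a hundred or so codes.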
Pattern and Matcher problem. Help Please!
I am trying to make the user enter a correct US$ in HTML(jsp format) ie 12.34 but not 12.3456. As far as i know, the regular expression below is correct for $$.
but the there is no error even if the user types 12.34567
can anyone help me please?
this is the code i wrote...
public static final int DATA_ENTRY = 1;
public static final int INVALID_CURRENCY = 2;
public static final int PROCESS_INPUT = 3;
int state;
String a; //data input from user
Pattern p = Pattern.compile("\\d{1,3}(?:(?:,\\d\\d\\d)*|\\d*)(?:\\.\\d\\d)?");
Matcher m = p.matcher(a);
state = DATA_ENTRY; // this is default
if (m.find()) {
    state = PROCESS_INPUT;
} else {
    state = INVALID_CURRENCY;
}
Here are two pattern strings that both require a two-place fraction for each entry but do not permit more than two places.
import java.util.regex.*;
public class snggun {
    public static void main(String[] args) {
        String input = (args.length > 0 ? args[0] : "10.00");
        // String mask = "\\d{1,3}(?:(?:,\\d\\d\\d)*|\\d*)(?:\\.\\d\\d)?";
        // requires two-place fraction
        // String mask = "^(\\d{1,3})(,(\\d\\d\\d))*(\\.\\d\\d)\\z";
        // requires two-place fraction
        String mask = "\\d{1,3}(?:(?:,\\d{3})*|\\d*)(\\.\\d{2})\\z";
        Pattern pattern = Pattern.compile(mask);
        Matcher match = pattern.matcher(input);
        // call find() only once: a second call would search for a further match and fail
        boolean found = match.find();
        System.out.println("input = " + input + " qualifies " + found);
        if (found) {
            for (int j = 0; j <= match.groupCount(); j++) {
                System.out.println("group " + j + " " + match.group(j));
            }
        }
    }
} -
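Another option, sketched here under the assumption that the fraction should stay optional as in the original question, is to keep the original pattern and call matches() instead of find(). With matches() the entire input has to fit, so "12.3456" is rejected without adding \z. The class Currency and method isValid are made-up names.

```java
import java.util.regex.Pattern;

class Currency {
    // The pattern from the original question, unchanged.
    private static final Pattern AMOUNT =
            Pattern.compile("\\d{1,3}(?:(?:,\\d\\d\\d)*|\\d*)(?:\\.\\d\\d)?");

    static boolean isValid(String input) {
        // matches() succeeds only if the whole input is consumed by the pattern.
        return AMOUNT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "12.34", "1,234.50", "1234", "12.3456", "12.3" };
        for (String sample : samples) {
            System.out.println(sample + " valid: " + isValid(sample));
        }
    }
}
```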
Translation pattern not matching
Hello All
I am configuring a cucm 4.2 (yes i know its obsolete) integration with Lync 2010 and am having issues with a translation pattern.
The Lync server is sending me 86xxxxxxxxxx for calls within china and will send 61xxxxxxxx for australia (strips the +)
I have configured a [^86]! which should match any international numbers (other than China) and be prefixed and sent to the gateway. Here is the weird thing: I can dial +44xxxxxxxxxx using my Lync client, which proves that this is matching (when I delete the translation the call fails).
But when I dial a number like +61xxxxxxxx it doesn't get through and I get
Cisco CallManagerDigit analysis: match(pi="1",fqcn="", cn="removedbymyself", plv="5", pss="LYNC:PT_Reception", TodFilteredPss="LYNC:PT_Reception", dd="61xxxxxxxx ",dac="0")
Cisco CallManagerDigit analysis: potentialMatches=NoPotentialMatchesExist" on the traces.
The LYNC partition has the translation rules. and the CSS assigned to the sip trunk has access to it. the CSS configured in the translation rules is the also the one assigned to the sip trunk.
Anyone see this sort of thing before? how can i check if there is another transformation taking place?
The only way I get round it is to put in a translation pattern for " ! ", and then it works for all international calls.
Thanks,
Hi,
Have you tried testing the call with Dialed Number Analyzer? I find that's a fantastic and often-overlooked tool for this kind of issue. If DNA shows the call will not route, it's probably a CSS issue for the Stafford gateway. If DNA shows the call will route, then it's probably a dial-peer issue on the Stafford gateway.
-Jameson -
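One more thing that may be worth checking, offered as a guess rather than a diagnosis: if the bracket notation in the translation pattern behaves like a regex character class (an assumption about CUCM here), then [^86] means "one digit that is neither 8 nor 6", not "anything that does not start with 86". That would explain why 44... matches while 61... does not. The Java sketch below only illustrates the character-class behaviour in Java regex; the class and method names are made up, and the lookahead variant is Java syntax, not something a CUCM pattern is known to accept.

```java
import java.util.regex.Pattern;

class DigitClassDemo {
    // Java analogue of "[^86]!": first character is neither 8 nor 6, then one or more digits.
    static boolean negatedClass(String digits) {
        return Pattern.matches("[^86]\\d+", digits);
    }

    // What was probably intended: any digit string that does not begin with "86".
    static boolean notPrefix86(String digits) {
        return Pattern.matches("(?!86)\\d+", digits);
    }

    public static void main(String[] args) {
        String[] dialed = { "441234567890", "6112345678", "861234567890" };
        for (String number : dialed) {
            System.out.println(number + " negatedClass=" + negatedClass(number)
                    + " notPrefix86=" + notPrefix86(number));
        }
    }
}
```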
Pattern regex matching advice needed
Hi All,
Many thanks for any/all advice :)
Here's my problem. I'm trying to scan a text file for...
\foo(parm1|parm2)
...in which I want the sub-string "parm1|parm2"
So... [\\]foo matches the first section. No problem...
It's when I try adding the '(' or ')' that I'm getting errors.
java.util.regex.PatternSyntaxException: Unclosed character class near index
[\]foo(.*)
Basically, I'm trying to create a pattern, which can recognize \foo(parms), and extract the parms sections.
Any ideas?
Yes you can do this. It is not allowed in basic java but there are always ways around the syntax rules. What you can do is use the AspectJ plugin for Eclipse and define a cutpoint and make it extend from two classes. What it does is it parses the byte code and inputs the code directly into the byte code. It's pretty neat.
A simpler approach would be to have two classes A and B. Have A extend BASE and then have B extend A and then therefore B "isa" A and a BASE.
Hope this helps.
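Coming back to the original \foo(parm1|parm2) question: the "Unclosed character class" error most likely comes from writing "[\\]" in Java source, which the regex engine sees as [\] where the backslash escapes the closing bracket. A sketch that escapes the backslash and the parentheses instead (the class FooParms and method extract are made-up names):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FooParms {
    // Regex: \\foo\(([^)]*)\)  i.e. a literal backslash, "foo", a literal "(",
    // then everything up to the next ")" captured as group 1.
    private static final Pattern FOO = Pattern.compile("\\\\foo\\(([^)]*)\\)");

    // Returns the text between the parentheses of the first \foo(...), or null if absent.
    static String extract(String text) {
        Matcher m = FOO.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // In Java source a single backslash in the data is written as "\\".
        System.out.println(extract("some text \\foo(parm1|parm2) more text"));
    }
}
```

Calling find() repeatedly on the same Matcher would pick up further \foo(...) occurrences in the file.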