Java Regex Question extract Substring

Hello
I've read the regex course at http://www.regenechsen.de/regex_de/regex_1_de.html, but the regex rules described in the course and their behavior in the "real world" don't make sense to me. For example, given the whole string: <INPUT TYPE="Text" name="Input_Vorname">
the matcher should extract only the field name, i.e. "Input_Vorname". I tried a lot of patterns, such as:
"name="(.*?)\"";
"<.*name=\"(.*)\".?>";
"<.*?name=\"(w+)\".*>";
"name=\".*\"";
and so on. But nothing (NOTHING) works. It either finds everything or nothing. What's wrong?
Can somebody explain what I did wrong and where my train of thought went astray?
Roland

When you use the matches() method, the regex has to match the entire input, but if you use find(), the Matcher will search for a substring that matches the regex. Here's how you would use it:
  String nameRegex = "name=\"(.*?)\"";
  Pattern namePattern = Pattern.compile(nameRegex,Pattern.CASE_INSENSITIVE);
  Matcher nameMatcher = namePattern.matcher(token);
  if (nameMatcher.find()) {
    String fieldName = nameMatcher.group(1);
  }
But the main issue is that you're using the wrong method(s) to retrieve the name. The start() and end() methods return the start and end positions of the entire match, but you're only interested in whatever was matched by the subexpression inside the parentheses (round brackets). That's called a capturing group, and groups are numbered according to the order in which they appear, so you should be using start(1) and end(1) instead of start() and end(). Or you can just use group(1), as I've done here, which returns the same thing as your substring() call.
Knowing that, you could go ahead and use matches(), with an appropriate regex:
  String nameRegex = "<.*?name=\"(\\w+)\".*?>";
  Pattern namePattern = Pattern.compile(nameRegex,Pattern.CASE_INSENSITIVE);
  Matcher nameMatcher = namePattern.matcher(token);
  if (nameMatcher.matches()) {
    String fieldName = nameMatcher.group(1);
  }
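For anyone who wants to try this out, here is a minimal self-contained sketch of the find()/group(1) approach described above. The class and method names (NameAttributeExample, extractName) are made up for illustration; only the pattern and the input string come from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameAttributeExample {
    // Reluctant group: capture everything up to the next double quote.
    private static final Pattern NAME =
            Pattern.compile("name=\"(.*?)\"", Pattern.CASE_INSENSITIVE);

    /** Returns the value of the name attribute, or null if none is found. */
    public static String extractName(String tag) {
        Matcher m = NAME.matcher(tag);
        // find() searches for the pattern anywhere in the input;
        // group(1) is the text captured by the first pair of parentheses.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String tag = "<INPUT TYPE=\"Text\" name=\"Input_Vorname\">";
        System.out.println(extractName(tag));
        // matches() with the same pattern fails, because the pattern
        // does not cover the whole tag.
        System.out.println(NAME.matcher(tag).matches());
    }
}
```

The second println shows why the original attempts seemed to find nothing: the same pattern is fine for find() but cannot satisfy matches().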

Similar Messages

  • Simple Java regex question

    I have a file with set of Name:Value pairs
    e.g
    Action1:fail
    Action2:pass
    Action3:fred
    Using the regex package, I want to get the value of the name "Action1".
    I have tried different things but cannot figure out how to do it. I can find out whether "Action1:" is present, but I don't know how to get the value associated with it.
    I have tried:
    Pattern pattern = Pattern.compile("Action1");
    CharSequence charSequence = CharSequenceFromFile(fileName); // method returning a CharSequence from a file
    Matcher matcher = pattern.matcher(charSequence);
    if (matcher.find()) {
         int start = matcher.end(0);
         System.out.println("matcher.group(0)" + matcher.group(0));
    }
    How can I get the value associated with a specific tag?
    thanks
    anmol

    Read the data from the text file line by line and you can do:
    String line; // get this somehow
    String[] keyPair = line.split(":");
    System.out.println(keyPair[0]); // your name
    System.out.println(keyPair[1]); // your value
    Or, if you've got the text file in one big string:
    String pattern = "(?m)^(\\w+):(\\w+)$"; // {word chars}:{word chars} up to the end of each line (note: \\a would match the bell character, not letters)
    // then
    // do some things with match objects
    // look in the API at java.util.regex
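As a rough sketch of the "one big string" variant, the lookup could be done as below. The helper name valueOf and the class KeyValueLookup are hypothetical; the sketch assumes one Name:Value pair per line, as in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyValueLookup {
    /** Returns the value stored under the given name, or null if absent. */
    public static String valueOf(String text, String name) {
        // MULTILINE lets ^ and $ match at line boundaries;
        // Pattern.quote() keeps the name from being read as regex syntax.
        Pattern p = Pattern.compile(
                "^" + Pattern.quote(name) + ":(.*)$", Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String text = "Action1:fail\nAction2:pass\nAction3:fred";
        System.out.println(valueOf(text, "Action1"));
    }
}
```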

  • Java Regex Question (HTML Tokenizing)

    Hello
    I would like to tokenize a HTML Page into its html tags and could not find any working expression. I tried it with:
    <[.]*>
    and for all input fields:
    <(INPUT.*)>
    But it either finds nothing or it finds everything.
    Can somebody help me?

    </?\S+?[\s\S+]*?>
    "/?" means: "/" can be there but doesn't have to be.
    "\S" means: any character that isn't whitespace.
    "+" means: the previous element must occur at least once.
    the "?" after the "+" means: match as few repetitions of the previous element as are needed to fulfill the regex (a reluctant quantifier).
    That's why <adf>sdf> isn't matched as a whole: <adf> is the shortest string that fulfills the regex.
    "[]" is a character class: it matches any single character listed inside the brackets. Here [\s\S+] accepts whitespace, non-whitespace, or a literal "+", so in effect any character.
    "\s" means: a whitespace character.
    "*" means: the previous element (here the character class) can occur any number of times, even zero.
    "*?" is the reluctant version of "*", just as "+?" is the reluctant version of "+".

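To see the pattern from the reply above in action, a small sketch could look like this. The class name TagTokenizer and the sample input are assumptions; the regex is the one quoted in the reply. Note this is only a quick tokenizer, not a real HTML parser (it will be confused by ">" inside attribute values, comments, and so on).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagTokenizer {
    // The regex from the reply, with backslashes doubled for a Java string.
    private static final Pattern TAG = Pattern.compile("</?\\S+?[\\s\\S+]*?>");

    /** Collects every substring that looks like a tag. */
    public static List<String> tags(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<INPUT TYPE=\"Text\" name=\"Input_Vorname\"><b>hi</b>";
        for (String tag : tags(html)) {
            System.out.println(tag);
        }
    }
}
```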
  • Java Regex Question

    I wanted to do some regex to see if a string has a subdomain.
    I want to pass string then check if there is a xxx.example.com or if it's just example.com. Anyone have a clue?
    Thanks,
    Brian

    I just went around it and used the split method to check. I'm posting my code in case someone else has this problem and is limited to the 1.4 JDK.
    String[] split = domain.split("[.]");
    if (split.length > 2)
        domain = split[split.length - 2] + "." + split[split.length - 1];
    Basically, what I wanted to do was see if it was a subdomain and then strip the preceding part to get to the actual domain.
    Thanks for the replies
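The split approach above can be wrapped up as a runnable sketch like the following. The names SubdomainCheck, hasSubdomain and baseDomain are invented for illustration, and the sketch assumes a two-label base domain such as example.com; it would misjudge hosts under suffixes like co.uk.

```java
public class SubdomainCheck {
    /** True if the host has more than two dot-separated labels. */
    public static boolean hasSubdomain(String host) {
        // "[.]" is a character class holding a literal dot.
        return host.split("[.]").length > 2;
    }

    /** Strips any leading labels and keeps the last two. */
    public static String baseDomain(String host) {
        String[] parts = host.split("[.]");
        if (parts.length <= 2) {
            return host;
        }
        return parts[parts.length - 2] + "." + parts[parts.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(hasSubdomain("xxx.example.com"));
        System.out.println(baseDomain("xxx.example.com"));
    }
}
```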

  • How to extract substring from a string based on the condition ??

    Hi,
    I have a very large string, as below:
    EQD+CN+SAMPLE18767+2200+++5'
    NAD+CA+FIR:172:20'
    DGS+IMD+3.2+2346+55:CEL'
    FTX+AAA+++GOOD'
    FTX+AAA+++ONE'
    EQD+CN+SAMPLE18795+2200+++5'
    NAD+CA+TIR:172:20'
    DGS+IMD+3.2+2346+55:CEL'
    FTX+AAA+++SECOND'
    FTX+AAA+++IS FAIR'
    and so on, with further FTX+AAA segments as above.
    I tokenized the string with ' as the delimiter and am able to read each segment.
    Now I want to concatenate the FTX+AAA segments into a single segment when more than one FTX+AAA segment appears IMMEDIATELY one below the other.
    The output should be as follows:
    EQD+CN+SAMPLE18767+2200+++5'
    NAD+CA+FIR:172:20'
    DGS+IMD+3.2+2346+55:CEL'
    FTX+AAA+++GOOD,ONE'
    EQD+CN+SAMPLE18795+2200+++5'
    NAD+CA+TIR:172:20'
    DGS+IMD+3.2+2346+55:CEL'
    FTX+AAA+++SECOND,IS FAIR'
    Similarly, any FTX+AAA segment should be concatenated if another FTX+AAA segment follows IMMEDIATELY below it.
    Any number of FTX+AAA segments can follow one another.
    Please help me with how to do this. Can anyone help me with a code snippet?
    Thanks,
    Kathir

    Encephalopathic wrote:
    You've posted > 300 times here and you still don't respect the rule regarding notification of all cross-posts? [http://www.java-forums.org/advanced-java/30061-how-extract-substring-string-based-condition.html]
    Do you think this will help convince others to help you?
    See also [http://www.coderanch.com/t/500088/java/java/extract-substring-string-based-condition].
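For the question itself, one possible approach is sketched below; it is not taken from any reply in the thread. It assumes the string has already been tokenized on ' (so segments carry no trailing apostrophe) and that only segments starting with FTX+AAA+++ are to be merged. The class name FtxMerger is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FtxMerger {
    private static final String PREFIX = "FTX+AAA+++";

    /** Joins runs of consecutive FTX+AAA segments into one segment. */
    public static List<String> merge(List<String> segments) {
        List<String> out = new ArrayList<>();
        boolean previousWasFtx = false;
        for (String seg : segments) {
            boolean isFtx = seg.startsWith(PREFIX);
            if (isFtx && previousWasFtx) {
                // Append only the text part to the segment emitted last.
                int last = out.size() - 1;
                out.set(last, out.get(last) + "," + seg.substring(PREFIX.length()));
            } else {
                out.add(seg);
            }
            previousWasFtx = isFtx;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> in = Arrays.asList(
                "DGS+IMD+3.2+2346+55:CEL", "FTX+AAA+++GOOD", "FTX+AAA+++ONE",
                "EQD+CN+SAMPLE18795+2200+++5", "FTX+AAA+++SECOND", "FTX+AAA+++IS FAIR");
        for (String seg : merge(in)) {
            System.out.println(seg + "'");
        }
    }
}
```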

  • Converting sed regex to Java regex

    I am new to regular expressions. I have to write a regex which will do some replacements.
    If the input string is something like test:[email protected];value=abcd
    then when I apply the regex to it, the string should be changed to test:[email protected];value=replacedABC
    I am trying to replace test.com and abcd with the values I have supplied.
    The regex I have come up with, in sed, is
    s/\(^.*@\)\(.*\)$/\1replaceTest.com;value=replacedABC/i
    Now I am trying to get the regex in Java. I would think it will be something like (^.*@\)(.*\)$\1replaceTest.com;value=replacedABC
    But I am not sure how I can test this. Any idea how to make sure my Java regex is valid and does the required replacements?

    rsv-us wrote:
    Yep.Agreed.
    Since that these replacements should be done in a single regex.
    Note that the sed replacement I posted is really made of two replacements! Just like your Java solution would.
    I think once we send this regex to the third party,they will haev to use either sed or perl(will perl do this replacements,not sure though) to get the output.
    Since we are not sure what tool/software the third party is going to use,I was trying to see how i can really test this.Then I read about sed and this regex as is didn't work,so,I had to put all the sed required / and then the regex had become like s/\(^.*@\)\(.*\)$"/1replaceTest.com;value=replacedabcd/i
    Again: AFAIK that does not work. I tried it like this:
    $ echo test:[email protected];value=abcd | sed 's/\(^.*@\)\(.*\)$"/1replaceTest.com;value=replacedabcd/i'
    and the following is returned:
    test:[email protected]
    that we will have to send the java regex to the third party,I was trying to see how i can convert this sed regex to java.If I am right,with jave regex,we won;t be able to all the finds and replacements in a single regex..right?...If this is true,this will leave me a question of whether I need to send the sed regex to the thrid party or If I send java regex,they have to convert that to either sed or perl regex.
    One more question,can we do thse replacement in perrl also,if so,what will the equivalent regex for this in perl?
    I can't understand what you are talking about. The large number of spelling errors also doesn't help to make it clearer.
    Good luck though.
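For reference, the sed expression from the original question can be carried over to Java roughly as sketched below. In Java the pattern and the replacement are separate arguments, parentheses group without backslashes, and back-references in the replacement are written $1 rather than \1. The class name SedToJava and the sample address user@test.com are placeholders, since the real address is obscured in the thread.

```java
public class SedToJava {
    /** Java counterpart of s/\(^.*@\)\(.*\)$/\1replaceTest.com;value=replacedABC/i */
    public static String rewrite(String input) {
        // (?i) mirrors sed's trailing i flag; $1 re-inserts everything up to the @.
        return input.replaceFirst("(?i)^(.*@)(.*)$",
                "$1replaceTest.com;value=replacedABC");
    }

    public static void main(String[] args) {
        // Hypothetical input in the shape described in the question.
        System.out.println(rewrite("test:user@test.com;value=abcd"));
    }
}
```

Running it from a small main method like this is also a simple way to test that the regex is valid and does the required replacement.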

  • Java Regex Pattern

    Hello,
    I have parsed a text file and want to use a Java regex pattern to get the statuses "warning" and "ok" (only an "ok" that follows a "warning" needs to be parsed). Does anyone have an idea how to find the "ok" that follows a warning status? Thanks in advance!
    text example
    121; test test; test0; ok; test test
    121; test test; test0; ok; test test
    123; test test; test1; warning; test test
    124; test test; test1; ok; test test
    125; test test; test2; warning; test test
    126; test test; test3; warning; test test
    127; test test; test4; warning; test test
    128; test test; test2; ok; test test
    129; test test; test3; ok; test test
    java code:
    String flag = "warning";
    while ((line = bs.readLine()) != null) {
        String[] tokens = line.split(";");
        for (int i = 1; i < tokens.length; i++) {
            Pattern pattern = Pattern.compile(flag);
            // matcher() needs a single token; trim() drops the space after each ';'
            Matcher matcher = pattern.matcher(tokens[i].trim());
            if (matcher.matches()) {
                // save into a list
            }
        }
    }

    Sorry, I'll try to explain it in more detail. I want to parse this text file and save the statuses "warning" and "ok" into a list. The catch is that I only need an "ok" that follows a "warning": if there is a "test1 warning", then I am looking for the later "test1 ok".
    121; content; test0; ok; 12444      <-- that i don't want to have
    123; content; test1; warning; 126767
    124; content; test1; ok; 1265        <-- that i need to have
    121; content; test9; ok; 12444      <-- that i don't want to have
    125; content; test2; warning; 2376
    126; content; test3; warning; 78787
    128; content; test2; ok; 877666    <-- that i need to have
    129; content; test3; ok; 877666    <-- that i need to have
    // here maybe a regex pattern could be deal with my problem
    // if "warning|ok" then list all element with the status "warning and ok"
    String flag = "warning";
    while ((line = bs.readLine()) != null) {
        String[] tokens = line.split(";");
        for (int i = 1; i < tokens.length; i++) {
            Pattern pattern = Pattern.compile(flag);
            // matcher() needs a single token; trim() drops the space after each ';'
            Matcher matcher = pattern.matcher(tokens[i].trim());
            if (matcher.matches()) {
                // save into a list
            }
        }
    }
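One way to get only the "ok" rows that follow a "warning" for the same test is to remember which names have warned so far. The sketch below is an assumption about what is wanted: it treats the third column as the test name and the fourth as the status, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WarningFollowUp {
    /** Returns the "ok" lines whose test name had an earlier "warning". */
    public static List<String> okAfterWarning(List<String> lines) {
        Set<String> warned = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            // Split on ';' and swallow the spaces around it.
            String[] tokens = line.trim().split("\\s*;\\s*");
            if (tokens.length < 4) {
                continue; // not a data row
            }
            String name = tokens[2];
            String status = tokens[3];
            if (status.equals("warning")) {
                warned.add(name);
            } else if (status.equals("ok") && warned.remove(name)) {
                // remove() returns true only if the name had warned before.
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "121; content; test0; ok; 12444",
                "123; content; test1; warning; 126767",
                "124; content; test1; ok; 1265");
        System.out.println(okAfterWarning(lines));
    }
}
```

A plain regex is not strictly needed here; the split plus a Set carries the "has this test warned before" state that a single pattern cannot easily express.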

  • What is the escape character for DOT in java regex?

    How to specify a dot character in a java regex?
    . itself represents any character
    \. is an illegal escape character

    The regex engine needs to see \. but if you're putting it into a String literal in a .java file, you need to make it \\., as Rene said. This is because the compiler also uses \ as an escape character, so it will take the first \ as escaping the second one, and remove it, and the string that gets passed onto the regex will be \.
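A short sketch of the difference, with made-up class and method names. Pattern.quote is shown as an alternative to writing the double backslash by hand.

```java
import java.util.regex.Pattern;

public class DotEscape {
    /** Splits on literal dots; "\\." in source becomes \. for the regex engine. */
    public static String[] splitOnDot(String s) {
        return s.split("\\.");
    }

    /** Same result, letting Pattern.quote do the escaping. */
    public static String[] splitOnDotQuoted(String s) {
        return s.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        System.out.println("axb".matches("a.b"));    // unescaped dot: any character
        System.out.println("axb".matches("a\\.b"));  // escaped dot: literal '.' only
        System.out.println(splitOnDot("1.2.3").length);
    }
}
```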

  • Flash/java remoting question

    Hi,
    I'm new to Flash remoting and am finding using it with Java
    quite troublesome. I've also found using the FileReference class
    troublesome if you use Java. My question is: can you somehow tie the
    remoting functionality to the FileReference.upload method
    functionality?
    thanks in advance

    I will use the program with photo detector to test
    the response time of the LCD screen
    Why Java?
    I second that. With a test like that, you want to reduce the experiment down to a single variable, in this case the LCD response time. Using a Java program to feed the monitor input introduces a second variable, the response time of the program. The Java program's timer may not be exact, the components may not be repainted completely quickly enough, etc. If this is just for your own amusement, maybe that doesn't matter, but if you want your results to have any reliability, you'll need a more accurate and controllable input source.

  • Java Regex groups with quantifiers.

    I'm a bit stuck on a regex. I want to do something similar to this:
    (dog){6}
    dogdogdogdogdogdog
    and I want to get back 6 separate groups with 'dog' in each one.
    This works fine with jakarta-regexp, but when I use the {} quantifiers in Java regex I lose the groupings I'm looking for and just get a single group with a value of 'dog'.
    I'm sure I'm doing something stupid here, and help would be great!

    You can't do this with the SunJDK regex engine without using find() with a Matcher.
    I consider this feature of jakarta-regex to be a bug and reported it (and a couple of other features) as such several years ago. The feature makes it incompatible with most regex engines I have used.
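A minimal sketch of the find() approach mentioned above, with invented names. A quantified group keeps only the text of its last iteration, so the six pieces are collected by matching the unquantified pattern repeatedly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroups {
    /** Collects every successive match of the regex in the input. */
    public static List<String> allMatches(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "dogdogdogdogdogdog";
        Matcher whole = Pattern.compile("(dog){6}").matcher(input);
        if (whole.matches()) {
            // Only one capturing group exists, however often it repeats.
            System.out.println(whole.groupCount() + " group: " + whole.group(1));
        }
        System.out.println(allMatches("dog", input));
    }
}
```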

  • JAVA Regex Illegal Characters

    Hello - I am trying to find a list of all the special characters which have to be escaped in Java regex pattern matching, but I cannot find a complete list.
    I also understand that the replaceAll function has its own list of special characters in the replacement string, which have to be escaped differently.
    If anyone has access to a complete list of when to escape and how, I would greatly appreciate it!
    Thanks,
    Dan

    I also noticed this below link:
    http://java.sun.com/docs/books/tutorial/extra/regex/literals.html
    It said the following characters are meta-characters in regex API:
    ( [ { \ ^ $ | ) ? * + .
    But it also says the below:
    Note: In certain situations the special characters listed above will not be treated as metacharacters. You'll encounter this as you learn more about how regular expressions are constructed. You can, however, use this list to check whether or not a specific character will ever be considered a metacharacter. For example, the characters ! @ and # never carry a special meaning.
    Does anyone know if there would be any issues if I escaped when a character didn't need to be escaped?
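On the last question: in java.util.regex a backslash in front of a non-alphabetic character is always accepted, whether or not the character is special, while a backslash in front of a letter that is not a defined escape is an error. An alternative to escaping by hand is to let the library do it, as in the sketch below. Pattern.quote and Matcher.quoteReplacement are standard methods; the wrapper class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteExample {
    /** Splits on the delimiter taken literally, whatever characters it holds. */
    public static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    /** Replaces literal text with literal text; $ and \ lose their meaning. */
    public static String replaceLiteral(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target),
                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(splitLiteral("a.b|c", "|").length);
        System.out.println(replaceLiteral("cost: (x)", "(x)", "$5"));
    }
}
```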

  • Java Regex Pipe Delimited ?

    Hello
    I am trying to split the string which is pipe delimited. I am new to Regex and new to Java.
    My Java/Regex code line to split is:
    listColumns = aLine.split("\\|"); // my code has 2 backslash-escapes chars plus 1 pipe char but this forum does not allow me to put pipes or escapes correctly and plain text help is of NO HELP 8^(
    My input string has 3 leading and 4 trailing pipe characters
    My output from split (the 3 leading empty strings work, but the 4 trailing pipe delimiters don't):
    SplitStrings2:[]
    SplitStrings2:[]
    SplitStrings2:[]
    SplitStrings2:[col1]
    SplitStrings2:[col3]
    SplitStrings2:[col4]
    I do get 3 empty strings for the 3 leading pipes, but no empty strings for any of the 4 trailing pipe characters.
    What do I need to change in the code so that every repeated pipe results in an empty string returned by the split method?
    thanks
    YuriB

    1. The pipe is a meta-character so escape it.
    2. split() discards trailing empty strings unless you tell it otherwise by passing a limit.
    String s = "|||A|B|C||||";
    String[] array = s.split("[|]", 10);
    for (int i = 0; i < array.length; i++)
         System.out.println("" + i + ": " + array[i]);
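If the number of columns is not known in advance, a negative limit does the same job without having to pick a number like 10. A small sketch, with an invented class name:

```java
public class PipeSplit {
    /** Splits on '|' and keeps trailing empty fields (negative limit). */
    public static String[] splitKeepingEmpty(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        String s = "|||A|B|C||||";
        System.out.println(s.split("\\|").length);         // trailing empties dropped
        System.out.println(splitKeepingEmpty(s).length);   // trailing empties kept
    }
}
```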

  • Java regex problem

    Hi:
    I have the following texts in a flat file:
    scheduler is running
    system default destination: llp
    device for ps3: /dev/ps3
    device for ps: /dev/ecpp0
    device for llp: /dev/ecpp0
    How can I use java regex to print out the string after "device for " in this case the string "ps3" ,"ps" and "llp".

        static final Pattern DEVICE_PATTERN = Pattern.compile(
                                          "device for ([^:]++)" );
        String text = "";
        Matcher m = DEVICE_PATTERN.matcher( text );
        while ( (text = bufferedReader.readLine()) != null ) {
            // lookingAt() anchors the match at the start of the line
            if ( m.reset(text).lookingAt() ) {
                String device = m.group( 1 );
                System.out.println( device );
            }
        }
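The same idea as a self-contained sketch that can be run without a file; the lines are passed in as a list instead of being read from a BufferedReader. The class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DeviceNames {
    private static final Pattern DEVICE = Pattern.compile("device for ([^:]++)");

    /** Returns the name after "device for " on each line that starts that way. */
    public static List<String> devices(List<String> lines) {
        List<String> result = new ArrayList<>();
        Matcher m = DEVICE.matcher("");
        for (String line : lines) {
            // reset() reuses one Matcher; lookingAt() must match from index 0.
            if (m.reset(line).lookingAt()) {
                result.add(m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(devices(Arrays.asList(
                "scheduler is running",
                "system default destination: llp",
                "device for ps3: /dev/ps3",
                "device for ps: /dev/ecpp0",
                "device for llp: /dev/ecpp0")));
    }
}
```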

  • Sed rules to java regex

    Hi,
    what is the connection between sed regex rules and java regex rule. Is there an easy way to convert sed to java? or do i have to learn sed?....
    Thanks

    IIRC, Java regex rules are like Perl's (although the syntax for invocation differs a bit), and Perl's are basically a superset of sed's, except there's a difference with parentheses. In Perl/Java, parentheses always group and you have to backslash-quote them to make them interpreted as plain parenthesis characters, whereas in sed, you backslash-quote them to make them be interpreted as grouping indicators.
    Why? What problem are you having?

  • Regex question: replace

    Hi,
    I'm getting into java.util.regex lately. Having used Perl for regex I'm trying to get familiar with Java's regex "spirit".
    Concerning replacement we can use replaceAll or replaceFirst however:
    - what if I want to replace only the third or fourth match?
    - what if I want to replace the second to fourth matches?
    In Perl we use "regex_expression_here for 2..4;" for instance.
    If you have some interesting websites/tutorials related to Java regex, that would be great.
    Thanks for your help.
    Rgds,
    SR

    Yep,
    here is a sample of replacement in Perl
    $Line =~ s/\]/|/ for 2..4; #Replace 2nd 'til 4th delimiter (]) with pipe (|)
    ....
    Based on the reference I gave earlier
    import java.util.regex.*;

    /**
     * A rewriter does a global substitution in the strings passed to its
     * 'rewrite' method. It uses the pattern supplied to its constructor,
     * and is like 'String.replaceAll' except for the fact that its
     * replacement strings are generated by invoking a method you write,
     * rather than from another string.
     * This class is supposed to be equivalent to Ruby's 'gsub' when given
     * a block. This is the nicest syntax I've managed to come up with in
     * Java so far. It's not too bad, and might actually be preferable if
     * you want to do the same rewriting to a number of strings in the same
     * method or class.
     * See the example 'main' for a sample of how to use this class.
     * @author Elliott Hughes
     */
    public abstract class Rewriter_1 {
        private Pattern pattern;
        private Matcher matcher;

        /**
         * Constructs a rewriter using the given regular expression;
         * the syntax is the same as for 'Pattern.compile'.
         */
        public Rewriter_1(String regularExpression) {
            this.pattern = Pattern.compile(regularExpression);
        }

        /**
         * Returns the input subsequence captured by the given group
         * during the previous match operation.
         */
        public String group(int i) {
            return matcher.group(i);
        }

        /**
         * Overridden to compute a replacement for each match. Use
         * the method 'group' to access the captured groups.
         * The index is the 1-based number of the current match.
         */
        public abstract String replacement(int index);

        /**
         * Returns the result of rewriting 'original' by invoking
         * the method 'replacement' for each match of the regular
         * expression supplied to the constructor.
         */
        public String rewrite(CharSequence original) {
            this.matcher = pattern.matcher(original);
            StringBuffer result = new StringBuffer(original.length());
            int index = 0;
            while (matcher.find()) {
                matcher.appendReplacement(result, replacement(++index));
            }
            matcher.appendTail(result);
            return result.toString();
        }

        public static void main(String[] arguments) {
            // Replace the 3rd to 5th pipe with "y"; leave the others alone.
            String result = new Rewriter_1("\\|") {
                public String replacement(int index) {
                    if ((index >= 3) && (index <= 5)) {
                        return "y";
                    } else {
                        return group(0);
                    }
                }
            }.rewrite("| | | | | |");
            System.out.println(result);
        }
    }
