Regex pattern question

Hi,
I'm trying to get my feet wet with Java and regular expressions. I've done a lot of it in Perl, but need some help with Java.
I have an XML file (I'm also working through the SAX tutorial, but this question is about regex) that has multiple elements, each with a title tag:
<element lev1>10<title>element title</title>
<element lev2>20<title>another element title</title>
</element lev2>
</element lev1>
If I have the following pattern:
Pattern Title = Pattern.compile("(?i)<title>([^<]*)</title>");
that picks up the titles, but I can't distinguish which title belongs to which element. Basically what I want to have is:
Pattern coreTitle = Pattern.compile("(?i)<element lev1>(**any thing that isn't an </title> tag**)</title>");
I've looked through the tutorials and will keep looking. I'm sure it's in there somewhere, but if someone could point me in the right direction, that would be great.
thanks,
bp

Just guessing, but maybe...
Pattern.compile("(?i)<element lev1>[^<]*<title>([^<]*)</title>");
But it seems that things like parsing with SAX (or loading into a DOM) or XPath would be much better suited to parsing XML than regexp.
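A minimal sketch of the guess above, assuming the text between the opening tag and <title> never contains a '<' (true for the sample in the question). The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class Lev1Title {
    // [^<]* skips the text after the opening tag (the "10" in the sample)
    // without crossing into another tag.
    static final Pattern LEV1_TITLE = Pattern.compile(
            "<element lev1>[^<]*<title>([^<]*)</title>", Pattern.CASE_INSENSITIVE);

    /** Returns the title that directly follows the lev1 opening tag, or null if there is none. */
    static String lev1Title(String xml) {
        Matcher m = LEV1_TITLE.matcher(xml);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String xml = "<element lev1>10<title>element title</title>\n"
                   + "<element lev2>20<title>another element title</title>\n"
                   + "</element lev2>\n"
                   + "</element lev1>";
        System.out.println(lev1Title(xml));
    }
}
```

This only ties a title to the tag immediately before it. Anything more deeply nested is where SAX or XPath earns its keep.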

Similar Messages

  • Java Regex Pattern

    Hello,
    I have parsed a text file and want to use a Java regex pattern to get statuses like "warning" and "ok" (only an "ok" that follows a "warning" needs to be parsed). Does anyone have an idea how to find the "ok" that follows the warning status? Thanks in advance!
    Text example:
    121; test test; test0; ok; test test
    121; test test; test0; ok; test test
    123; test test; test1; warning; test test
    124; test test; test1; ok; test test
    125; test test; test2; warning; test test
    126; test test; test3; warning; test test
    127; test test; test4; warning; test test
    128; test test; test2; ok; test test
    129; test test; test3; ok; test test
    Java code:
    String flag = "warning";
    String line;
    while ((line = bs.readLine()) != null) {
        String[] tokens = line.split(";");
        for (int i = 1; i < tokens.length; i++) {
            Pattern pattern = Pattern.compile(flag);
            // match one token at a time; trim the space left after each ';'
            Matcher matcher = pattern.matcher(tokens[i].trim());
            if (matcher.matches()) {
                // save into a list
            }
        }
    }

    Sorry, I'll try to explain it in more detail. I want to parse this text file and save statuses like "warning" and "ok" into a list. The catch is that I only need the "ok" entries that follow a "warning": if there is a "test1 warning", then I am looking for the "test1 ok".
    121; content; test0; ok; 12444      <-- that i don't want to have
    123; content; test1; warning; 126767
    124; content; test1; ok; 1265        <-- that i need to have
    121; content; test9; ok; 12444      <-- that i don't want to have
    125; content; test2; warning; 2376
    126; content; test3; warning; 78787
    128; content; test2; ok; 877666    <-- that i need to have
    129; content; test3; ok; 877666    <-- that i need to have
    // maybe a regex pattern could deal with my problem here
    // if "warning|ok" then list all elements with the status "warning" and "ok"
    String flag = "warning";
    String line;
    while ((line = bs.readLine()) != null) {
        String[] tokens = line.split(";");
        for (int i = 1; i < tokens.length; i++) {
            Pattern pattern = Pattern.compile(flag);
            // match one token at a time; trim the space left after each ';'
            Matcher matcher = pattern.matcher(tokens[i].trim());
            if (matcher.matches()) {
                // save into a list
            }
        }
    }
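One way to read the requirement is: remember which test names have been seen with "warning", and keep an "ok" line only if its name is in that set. The sketch below assumes the name is the third field and the status the fourth, as in the sample. The class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not from the thread.
class WarningThenOk {
    /** Returns the "ok" lines whose test name was reported with "warning" earlier in the input. */
    static List<String> okAfterWarning(List<String> lines) {
        Set<String> warned = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            // split on ';' and swallow the spaces around it
            String[] fields = line.split("\\s*;\\s*");
            if (fields.length < 4) {
                continue; // skip malformed lines
            }
            String name = fields[2];
            String status = fields[3];
            if (status.equals("warning")) {
                warned.add(name);
            } else if (status.equals("ok") && warned.remove(name)) {
                // remove() is true only if a warning for this name was pending
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "121; content; test0; ok; 12444",
                "123; content; test1; warning; 126767",
                "124; content; test1; ok; 1265",
                "121; content; test9; ok; 12444",
                "125; content; test2; warning; 2376",
                "126; content; test3; warning; 78787",
                "128; content; test2; ok; 877666",
                "129; content; test3; ok; 877666");
        for (String hit : okAfterWarning(lines)) {
            System.out.println(hit);
        }
    }
}
```

A plain split does the field work here. The only state needed is the set of pending warnings, so no regex beyond the delimiter is required.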

  • Regex pattern, filter delimiter in sql code

    Hi,
    The problem I'm having is that the regex pattern below is not catching the beginning "go" and ending "go" of a string.
    "(?iu)[(?<=\\s)]\\bgo\\b(?=\\s)"
    The idea is catching the "whole word", in this case the word is "go" so if the word is at the beginning of the string or at the end, i still want to include it.
    So, for example:
    "go select * from table1 go" -> should catch 2 "go"s but catches 0
    "go go# select * from table1 --go go" -> should also catch 2 "go"s but catches 0
    "go go select * from table1 go go" -> should catch 4 "go"s but catches 2
    I have the "[(?<=\\s)]" and the "(?=\\s)" so that the word "go" when next to a special character is not included, for example "--go".
    The problem is that this also negates the beginning and ending of the string.
    Code to test example: It should split at 1st, 2nd and last "go", but only splits at the 2nd "go".
    String s = "go go select * from table1 --go go";
    String delimiter = "go";
    String[] queries = s.split("(?iu)[(?<=\\s)]\\b" + delimiter + "\\b(?=\\s)");
    for (int i = 0; i < queries.length; i++) {
         System.out.println(queries[i]);
    }
    I really need to fix this but I'm not having much success.
    Any help will be appreciated, thanks in advance.

    Yes,
    I prefer this one: Regex Powertoy (interactive regular expressions)
    It's not 100% perfect, but you can see with my example and this online tester that the 1st "go" is not matched. And this is a problem for me.
    I want to eliminate the special characters like "#go" or "-go" but i don't want to eliminate the end and start of string.
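A possible fix, offered as a sketch: wrapping a lookbehind in square brackets turns it into a character class that consumes one character, which is probably why the first "go" is lost. Negative lookarounds on \S ("not preceded by a non-space, not followed by a non-space") also succeed at the start and end of the string. The class and method names below are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class GoDelimiter {
    /** Builds a pattern that matches the delimiter only when whitespace or a string edge surrounds it. */
    static Pattern wholeWord(String delimiter) {
        return Pattern.compile("(?<!\\S)" + Pattern.quote(delimiter) + "(?!\\S)",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    /** Counts stand-alone occurrences of the delimiter in the text. */
    static int count(String sql, String delimiter) {
        Matcher m = wholeWord(delimiter).matcher(sql);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("go select * from table1 go", "go"));
        System.out.println(count("go go# select * from table1 --go go", "go"));
        System.out.println(count("go go select * from table1 go go", "go"));
        // The same pattern can be used for splitting:
        for (String part : wholeWord("go").split("go go select * from table1 --go go")) {
            System.out.println("[" + part + "]");
        }
    }
}
```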

  • Regex Pattern For this String ":=)"

    Hello All
    True, this isn't the place to ask this but here it is anyway, if you can help please do.
    Can anybody tell me the regex pattern for this String:
    ":=)"
    Thanks
    John

    Yep, cheers it's ":=\\)"
    public class Test {
         public static void main( String args[] ) {
              String s = "one:=)two:=)three:=)four";
              String ss[] = s.split( ":=\\)" );
              for( int i=0; i<ss.length; i++ )
                   System.out.println( "ss["+i+"] = {" + ss[i] + "}" );
         }
    }
    resulting in:
    ss[0] = {one}
    ss[1] = {two}
    ss[2] = {three}
    ss[3] = {four}

  • Quick regex "link" question

    I need to extract the link and the link's anchor text from Strings which take the following format:
    Bestsellers
    Or
    Find Gift
    (I.e. has more attributes other than the "href" attribute)
    I have the following regex:
    <a\s+[^<>]*?href\s*=\s*["'](.*)["']\s*>(.*?)</a>
    However, although, this works fine for the first example above, it does not match the second example correctly. Instead the "link" it matches comes out as:
    /gp/product/" id="gift" name="findGift
    When it should be:
    gp/product/
    I thought my regex pattern says "extract the link (everything between) the two quotes after "href=" but it seems to match any other attributes which may be inside the "<a>" tag.
    Could someone explain where I have gone wrong.
    Thanks

    Watch out with those .* things. Try this one:
    <a\s*href\s*=\s*['"]([^'"]*)[^>]*>([^<]*)</a>
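To see why the greedy (.*) overshoots, here is a sketch that captures the opening quote and uses a reluctant group up to the matching closing quote. The sample tags in main are guesses at what the stripped examples looked like, and the class name is invented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class LinkExtractor {
    // group 1: the quote character, group 2: the href value, group 3: the anchor text
    static final Pattern LINK = Pattern.compile(
            "<a\\s+[^<>]*?href\\s*=\\s*([\"'])(.*?)\\1[^<>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE);

    /** Returns {href, anchorText} for the first link in the input, or null if no link matches. */
    static String[] firstLink(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? new String[] { m.group(2), m.group(3) } : null;
    }

    public static void main(String[] args) {
        // Assumed inputs; the original examples were lost when the page was rendered.
        String[] samples = {
            "<a href=\"/bestsellers/\">Bestsellers</a>",
            "<a href=\"/gp/product/\" id=\"gift\" name=\"findGift\">Find Gift</a>"
        };
        for (String s : samples) {
            String[] link = firstLink(s);
            System.out.println(link[0] + " -> " + link[1]);
        }
    }
}
```

The reluctant (.*?) stops at the first quote that matches the opening one, and [^<>]* then absorbs any remaining attributes.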

  • Util.regex.Pattern documentation

    The 1.5 documentation for util.regex.Pattern defines quantifiers that are greedy, reluctant, or possessive. The definitions of these quantifiers seem to be the same. For example, X?, X??, and X?+ are each defined as "X, once or not at all." Is this a mistake? If not, what's the difference among greedy, reluctant, and possessive?

    It's not a mistake, it's just incomplete. A normal (greedy) quantifier matches as many times as it can, but will back off if necessary to achieve an overall match. A reluctant quantifier matches the minimum number of times that it has to, and only tries to match more if that's the only way to achieve an overall match. A possessive quantifier matches as many times as it can and never backs off, even if that makes an overall match impossible. Here's a demonstration:
    import java.util.regex.*;
    public class Test
    {
      public static void main(String[] args)
      {
        String input = "XXXXX";
        Pattern p1 = Pattern.compile("(X+)(X+)");
        Pattern p2 = Pattern.compile("(X+?)(X+)");
        Pattern p3 = Pattern.compile("(X++)(X+)");
        Matcher m = p1.matcher(input);
        if (m.matches())
           System.out.println("p1:\t" + m.group(1) + "\t" + m.group(2));
        m = p2.matcher(input);
        if (m.matches())
           System.out.println("p2:\t" + m.group(1) + "\t" + m.group(2));
        m = p3.matcher(input);
        if (m.matches())
           System.out.println("p3:\t" + m.group(1) + "\t" + m.group(2));
      }
    }
    which prints:
    p1:     XXXX    X
    p2:     X       XXXX
    In p1, the X+ in the first group initially matches all five X's, then hands off to the second group. The X+ there has to match at least one X, but there are none left. So the first group gives up one of its X's, the second group matches it, and Bob's your uncle.
    In p2, the X+? has to match at least one X, so it does, then hands off to the second group, which happily gobbles up the rest of the input.
    In p3, the X++ matches all the X's, but refuses to back off and give the X+ in the second group the one X it needs, so the match fails.

  • Applying REGEX-pattern into XML File

    I have the following problem:
    I have an xml-file. let's say...
    <NODE><NODE1 attr1="a1" attr2="a2">
         <NAME> abc</NAME>
         <VERSION> 1.0</VERSION>
    </NODE1>
    <NODE2 attr1="a3" attr2="a4">
         <NAME> xyz</NAME>
         <VERSION> 3.1</VERSION>
    </NODE2></NODE>
    I need to know how I can get the values of <NAME></NAME> and <VERSION></VERSION> without using DOM.
    Since my XML file is pretty big and DOM will take a lot of memory, I want to avoid it.
    Can anybody suggest a regex pattern that I can apply to the XML file (after converting it into a String)?
    Thanks in Advance

    That worked perfectly. I assumed ( insert comment here ) that the members of the Properties objects were Strings, and therefore followed the same rules where "\" characters are concerned.
    Thank you for pointing out the difference between the two objects, I am not sure how long it would have taken me to figure that out.
    Regards,
    John Gooch
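Going back to the NAME/VERSION question in this thread, a rough sketch of the regex route: it assumes both elements hold plain text only (no nested tags, no CDATA) and that each element sits on one line, as in the sample. The class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class NameVersionExtractor {
    // \1 makes the closing tag match whichever name opened the element.
    static final Pattern FIELD = Pattern.compile("<(NAME|VERSION)>\\s*([^<]*?)\\s*</\\1>");

    /** Returns entries such as "NAME=abc" for every NAME or VERSION element in the text. */
    static List<String> extract(String xml) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(xml);
        while (m.find()) {
            out.add(m.group(1) + "=" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<NODE><NODE1 attr1=\"a1\" attr2=\"a2\">\n"
                   + "     <NAME> abc</NAME>\n"
                   + "     <VERSION> 1.0</VERSION>\n"
                   + "</NODE1></NODE>";
        System.out.println(extract(xml));
    }
}
```

Because each element fits on one line in the sample, extract could also be called per line from a BufferedReader, which avoids holding the whole file in memory. For anything less regular, a streaming parser such as SAX is the safer choice.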

  • Searching Site Content Using REGEX Patterns

    Intent: Detect content in SharePoint 2013 lists and libraries that matches a REGEX pattern, like social security numbers.
    SharePoint 2013 only exposes KQL and FQL languages. 
    http://msdn.microsoft.com/en-us/library/office/jj163973.aspx
    I am comfortable writing this as an App.  I do not know how to pass the search index through a REGEX match.  Perhaps there is a way to access the data on more of a server model instead of through the client APIs?

    Hi  Eric,
    To achieve this, you can write a Content Enrichment Web Service to extract regex patterns from a managed property.
    Here is a blog you can refer to:
    SharePoint 2013 Content Enrichment: Regular Expression Data Extraction:
    http://blogs.technet.com/b/peter_dempsey/archive/2013/12/04/sharepoint-2013-content-enrichment-regular-expression-data-extraction.aspx
    Reference:
    http://msdn.microsoft.com/en-us/library/office/jj163982.aspx
    http://blogs.msdn.com/b/richard_dizeregas_blog/archive/2013/06/19/advanced-content-enrichment-in-sharepoint-2013-search.aspx
    Thanks,
    Eric
    Forum Support
    Eric Tao
    TechNet Community Support

  • Regex - Pattern for positive numbers

    Hi,
    I wanna check for positive numbers.
    My code so far:
    Pattern p = Pattern.compile("\\d+");
    Matcher m = p.matcher(str);
    boolean b = m.matches();
    But I don't know how to check for positive numbers (including 0).
    Thanks
    Jonny

    Just to make your life easier:
    package samples;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    /**
     * @author notivago
     */
    public class Positive {
        public static void main(String[] args) {
            String input = "- 12 +10 10 -12 15 -12,000 10,000 5,000.42";
            Pattern p = Pattern.compile( "\\b(?<!-\\s?|\\.|,)([0-9]+(?:,?[0-9]{3})*(?:\\.[0-9]*)?)" );
            Matcher matcher = p.matcher( input );
            while( matcher.find() ) {
                System.out.println( "Match: " + matcher.group(1) );
            }
        }
    }
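For the simpler case in the question (is the whole string a non-negative number?), matches() with \d+ already does the job, since a leading minus sign makes the match fail. A small sketch with made-up names; the decimal variant is an extra assumption about what "number" should cover.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class NonNegative {
    // One or more digits and nothing else.
    static final Pattern INTEGER = Pattern.compile("\\d+");
    // Digits with an optional fractional part.
    static final Pattern DECIMAL = Pattern.compile("\\d+(\\.\\d+)?");

    static boolean isNonNegativeInteger(String s) {
        // matches() requires the entire string to fit the pattern
        return INTEGER.matcher(s).matches();
    }

    static boolean isNonNegativeDecimal(String s) {
        return DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "0", "42", "-3", "1.5", "" }) {
            System.out.println("'" + s + "' integer=" + isNonNegativeInteger(s)
                    + " decimal=" + isNonNegativeDecimal(s));
        }
    }
}
```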

  • Regular Expression Pattern Question

    I'm building up a file URL String in my program. I'd like to replace all occurrences of "\\" in a String with "/" using the java.lang.String.replaceAll method:
    String fileName = "junk.txt";
    static String DEFAULT_FILE_URL_ROOT = "file:/" + System.getProperty("user.dir") + "/";
    String schemaLocation    = DEFAULT_FILE_URL_ROOT + fileName;
    schemaLocation.replaceAll(java.io.File.separator, "/");
    But when I run this, I get the following exception:
    java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
    ^
            at java.util.regex.Pattern.error(Pattern.java:1489)
            at java.util.regex.Pattern.compile(Pattern.java:1281)
            at java.util.regex.Pattern.<init>(Pattern.java:1030)
            at java.util.regex.Pattern.compile(Pattern.java:777)
            at java.lang.String.replaceAll(String.java:1710)
            at forum.jdom.example.DOMValidator.main(Unknown Source)
    What's the regex pattern I need to match "\\"? I can't find it. - MOD

    The file separator is a string which represents a single character, which is a backslash (on Windows).
    A single backslash in regex is an escape character which escapes the next character. But there is no next character in your expression, so it doesn't work. You need to replace what you have with the following...
    And if you really insist on using the constant then the expression would be...
    "\\" + java.io.File.separator

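Two other ways around the backslash problem above, as a sketch: Pattern.quote turns the separator into a literal pattern, and String.replace skips regex altogether. Both return a new String, so the result has to be assigned (the snippet in the question discards it). The class name is made up, and the separator is passed in so the example behaves the same on any platform.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class SeparatorToSlash {
    /** Regex route: quote the separator so a backslash is treated literally. */
    static String viaQuote(String path, String separator) {
        return path.replaceAll(Pattern.quote(separator), "/");
    }

    /** Non-regex route: String.replace works on literal character sequences. */
    static String viaLiteral(String path, String separator) {
        return path.replace(separator, "/");
    }

    public static void main(String[] args) {
        String windowsStyle = "file:/C:\\work\\project\\junk.txt";
        System.out.println(viaQuote(windowsStyle, "\\"));
        System.out.println(viaLiteral(windowsStyle, "\\"));
        // In real code the second argument would be java.io.File.separator.
    }
}
```
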
  • Display pattern questions

    I would like to be able to set up a display pattern that will add some asterisks or some other characters on each side of whatever a user enters in a field in order to flag the text. Is that possible? Or is there a better way to do this than using display patterns? I checked the Designer help but couldn't find much to answer my question.

    Amanda,
    Try to put a single quote around the parenthesis like this:
    '('$ZZ,ZZ9.99')' for your display pattern.

  • Regex Pattern help.

    My friend pedrofire (who's probably around the forums somewhere) and I are newbies. We are trying to read a log file line and process it correctly, but we are having some difficulty creating the right expression pattern.
    My log have lines like:
    User 'INEXIST' with session 'ax1zjd8yEeHh' added content '769' with uri 'http://mail.yahoo.com/'.
    User 'INEXIST' with session 'ax1zjd8yEeHh' changed folder from 'E-mails' to 'Milhagem'.
    User 'INEXIST' with session 'a8jXrY_N38ja' updated all content of folder 'Bancos'.
    i need to get the following data
    USER - [INEXIST]
    SESSION - [ax1zjd8yEeHh]
    ACTION - [added] or [changed] or [updated].
    Getting the user and the session is easy, but I am having difficulty grabbing the action, because I need to take just the action word, without blank spaces, ignoring the words "content", "folder" or "all".
    I've been trying this for hours, but for a newbie it's a little difficult.
    Any help is welcome
    Thanks
    Peter Redman

    Hi,
    How about something like:
    import java.util.regex.*;
    public class RegexpTest
    {
       private static final Pattern p = Pattern.compile(
             "^User '([^']+)' with session '([^']+)' ([^ ]+) .*$" );
       public static void main( String[] argv )
       {
          find( "User 'INEXIST' with session 'ax1zjd8yEeHh' added content '769' with uri 'http://mail.yahoo.com/'." );
          find( "User 'INEXIST' with session 'ax1zjd8yEeHh' changed folder from 'E-mails' to 'Milhagem'." );
          find( "User 'INEXIST' with session 'a8jXrY_N38ja' updated all content of folder 'Bancos'." );
       }
       public static void find( String text )
       {
          System.out.println( "Text: " + text );
          Matcher m = p.matcher( text );
          if ( ! m.matches() ) return;
          String user = m.group(1);
          String session = m.group(2);
          String action = m.group(3);
          System.out.println( "User: " + user );
          System.out.println( "Session: " + session );
          System.out.println( "Action: " + action );
       }
    }
    which results in:
    Text: User 'INEXIST' with session 'ax1zjd8yEeHh' added content '769' with uri 'http://mail.yahoo.com/'.
    User: INEXIST
    Session: ax1zjd8yEeHh
    Action: added
    Text: User 'INEXIST' with session 'ax1zjd8yEeHh' changed folder from 'E-mails' to 'Milhagem'.
    User: INEXIST
    Session: ax1zjd8yEeHh
    Action: changed
    Text: User 'INEXIST' with session 'a8jXrY_N38ja' updated all content of folder 'Bancos'.
    User: INEXIST
    Session: a8jXrY_N38ja
    Action: updated
    You should probably change the Pattern to be less explicit about what it matches, i.e. change spaces to \\s+ or something similar.
    Ol.

  • OO pattern Question

    The question:
    In the file system there are many files whose file format is FormatA.
    Now two new file formats arrive, FormatB and FormatC.
    The old FormatA files should be converted into FormatB files, and in the future FormatB may be converted into FormatC.
    How should this be designed using design patterns?

    Hi myQA,
    what exactly are you considering the client?
    The one calling readFile and writeFile?
    If you are concerned about explicitly creating FileXReaderWriter
    and thus knowing about the different file types, one could easily overcome
    this limitation by using a Factory class:
    public class ReaderWriterFactory {
        public FileReaderWriter createReaderWriter(){
            if (someCondition) { // the condition might depend on some parameter provided (filename for example) or the state of the system or ...
                 return new FileAReaderWriter();
            } else {
                 return new FileBReaderWriter();
            }
        }
    }
    If you only need this whole stuff for converting (thus you don't want to read files in order to work with the data inside) you could wrap all
    the logic behind one function call convert(String fromName, String toName), maybe with some additional information about which formats to use.
    Although this really belongs in its own class Converter, I guess one could add it to the Factory, which then shouldn't be named Factory anymore. If the other classes and interfaces aren't needed anywhere outside your Converter you should make them private to the converter.
    regards
    Spieler

  • Patterns Question

    Hey everyone, got a question regarding patterns in Illustrator and this one is sorta bugging me.
    Basically, I have a single object, a 3 leaf clover. When I import this object into the swatch palette, it can make a pattern... yet it is more of an x and y axis pattern (clovers that make 90 degree and 180 degree angles). Is there an easy way to make the clovers pattern everywhere (making 45 degree, 90 degree, and 180 degree angles)?
    Thanks in advance for your input. I appreciate it.

    Teri Petit has a sort of tutorial on how to do this. Do a search for Teri Petit Pattern Fill.

  • Java.util.regex.pattern and * - + /

    hi...
    i'm korean... so I can't speak english.. sorry..^^
    but i have a problem..
    import java.util.regex.*;
    public class Operator
    {
         public static void main(String args[])
         {
              String operator="/";
    ////////////////////////////////////////////////////////////// error point..
              Pattern pattern=Pattern.compile(operator);
              Matcher m=pattern.matcher("- ----* / */* /+");
              int count=0;
              while(m.find()) {
                   count++;
              }
              System.out.println(count);
         }
    }
    Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 0
    +
    operator : / - : ok...
    operator : * + : error...
    i had to use + *..
    what's problem??

    Are you using matches()? Then keep in mind that it requires that the entire String is matched by the RE.
    pattern.matcher("about:foobar").matches(); //will return false, as "foobar" is not matched by your pattern
    pattern.matcher("about:").matches(); //will return true
    pattern.matcher("about:foobar").find(); //will return true
    pattern.matcher("notabout:foobar").find(); // will return false
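On the PatternSyntaxException in the question: + and * are regex quantifiers, so a pattern consisting of just one of them has nothing to repeat. A minimal sketch that counts literal operators by passing them through Pattern.quote first; the class and method names are invented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class OperatorCount {
    /** Counts literal occurrences of the operator, even if it is a regex metacharacter. */
    static int count(String operator, String text) {
        // Pattern.quote makes the operator match as plain text
        Matcher m = Pattern.compile(Pattern.quote(operator)).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "- ----* / */* /+";
        for (String op : new String[] { "+", "-", "*", "/" }) {
            System.out.println(op + " : " + count(op, text));
        }
    }
}
```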
