Regex pattern

I am using (?!^)\\b which produces all tokens and delimiters from a string.split();
I would like it to also separate a ');' pair into ')' and ';'
Any way to modify my expression to do so?

Always Learning wrote:
I need help with regular expression pattern matching please.
I would like to separate words, punctuation, parenthesis, and other characters using split()
So far I have come up with string.split("(?!^)\\b|;") which separates things nicely and lops off the end of line.You can't have tested it very thoroughly since it can't possibly get even close to "separate things nicely" .
>
I would like it to ignore spaces and return just the things desired above.I really don't understand this.
>
Can you help? Seems hard to find folks with regex experience.There are plenty of us here who can help with regex but most like me expect a better requirement specification. Please spend some time providing a better specification.

Similar Messages

  • Regex pattern, filter delimiter in sql code

    Hi,
    The problem I'm having is that the regex pattern below is not catching the beginning "go" and ending "go" of a string.
    "(?iu)[(?<=\\s)]\\bgo\\b(?=\\s)"
    The idea is catching the "whole word", in this case the word is "go" so if the word is at the beginning of the string or at the end, i still want to include it.
    So, for example:
    "go select * from table1 go" -> should catch 2 "go"s but catches 0
    "go go# select * from table1 --go go" -> should also catch 2 "go"s but catches 0
    "go go select * from table1 go go" -> should catch 4 "go"s but catches 2
    I have the "[(?<=\\s)]" and the "(?=\\s)" so that the word "go" when next to a special character is not included, for example "--go".
    The problem is that this also negates the beginning and ending of the string.
    Code to test example: It should split at 1st, 2nd and last "go", but only splits at the 2nd "go".
    String s = "go go select * from table1 --go go";
    String delimiter = "go";
    String[] queries = s.split("(?iu)[(?<=\\s)]\\b" + delimiter + "\\b(?=\\s)");
    for (int i = 0; i < queries.length; i++) {
         System.out.println(queries[i]);
    I really need to fix this but I'm not having much success.
    Any help will be appreciated, thanks in advance.

    Yes,
    I prefer this one: Regex Powertoy (interactive regular expressions)
    It's not 100% perfect, but you can see with my example and this online tester that the 1st "go" is not matched. And this is a problem for me.
    I want to eliminate the special characters like "#go" or "-go" but i don't want to eliminate the end and start of string.

  • Regex Pattern For this String ":=)"

    Hello All
    True, this isn't the place to ask this but here it is anyway, if you can help please do.
    Can anybody tell me the regex pattern for this String:
    ":=)"
    Thanks
    John

    Yep, cheers it's ":=\\)"
    public class Test {
         public static void main( String args[] ) {
              String s = "one:=)two:=)three:=)four";
              String ss[] = s.split( ":=\\)" );
              for( int i=0; i<ss.length; i++ )
                   System.out.println( "ss["+i+"] = {" + ss[i] + "}" );
    }resulting in:
    ss[0] = {one}
    ss[1] = {two}
    ss[2] = {three}
    ss[3] = {four}

  • Java Regex Pattern

    Hello,
    I have parsed a text file and want to use a java regex pattern to get the status like "warning" and "ok" ("ok" should follow the "warning" then need to parser it ), does anyone have idea? How to find ok that follows the warning status? thanks in advance!
    text example
    121; test test; test0; ok; test test
    121; test test; test0; ok; test test
    123; test test; test1; warning; test test
    124; test test; test1; ok; test test
    125; test test; test2; warning; test test
    126; test test; test3; warning; test test
    127; test test; test4; warning; test test
    128; test test; test2; ok; test test
    129; test test; test3; ok; test testjava code:
    String flag= "warning";
              while ((line= bs.readLine()) != null) {
                   String[] tokens = line.split(";");
                   for(int i=1; i<tokens.length; i++){
                        Pattern pattern = Pattern.compile(flag);
                        Matcher matcher = pattern.matcher(tokens);
                        if(matcher.matches()){
    // save into a list

    sorry, I try to expain it in more details. I want to parse this text file and save the status like "warning" and "ok" into a list. The question is I only need the "ok" that follow the "warning", that means if "test1 warning" then looking for "test1 ok".
    121; content; test0; ok; 12444      <-- that i don't want to have
    123; content; test1; warning; 126767
    124; content; test1; ok; 1265        <-- that i need to have
    121; content; test9; ok; 12444      <-- that i don't want to have
    125; content; test2; warning; 2376
    126; content; test3; warning; 78787
    128; content; test2; ok; 877666    <-- that i need to have
    129; content; test3; ok; 877666    <-- that i need to have
    // here maybe a regex pattern could be deal with my problem
    // if "warning|ok" then list all element with the status "warning and ok"
    String flag= "warning";
              while ((line= bs.readLine()) != null) {
                   String[] tokens = line.split(";");
                   for(int i=1; i<tokens.length; i++){
                        Pattern pattern = Pattern.compile(flag);
                        Matcher matcher = pattern.matcher(tokens);
                        if(matcher.matches()){
    // save into a list

  • Util.regex.Pattern documentation

    The 1.5 documentation for util.regex.Pattern defines quantifiers that are greedy, reluctant, or possessive. The definitions of these quantifiers seem to be the same. For example, X?, X??, and X?+ are each defined as "X, once or not at all." Is this a mistake? If not, what's that difference among greedy, reluctant, and possessive?

    It's not a mistake, it's just incomplete. A normal (greedy) quantifier matches as many times as it can, but will back off if necessary to achieve an overall match. A reluctant quantifier matches the minimum number of times that it has to, and only tries to match more if that's the only way to achieve an overall match. A greedy quantifier matches as many times as it can and never backs off, even if that makes an overall match impossible. Here's a demonstration:import java.util.regex.*;
    public class Test
      public static void main(String[] args)
        String input = "XXXXX";
        Pattern p1 = Pattern.compile("(X+)(X+)");
        Pattern p2 = Pattern.compile("(X+?)(X+)");
        Pattern p3 = Pattern.compile("(X++)(X+)");
        Matcher m = p1.matcher(input);
        if (m.matches())
           System.out.println("p1:\t" + m.group(1) + "\t" + m.group(2));
        m = p2.matcher(input);
        if (m.matches())
           System.out.println("p2:\t" + m.group(1) + "\t" + m.group(2));
        m = p3.matcher(input);
        if (m.matches())
           System.out.println("p3:\t" + m.group(1) + "\t" + m.group(2));
    p1:     XXXX    X
    p2:     X       XXXXIn p1, the X+ in the first group initially matches all five X's, then hands off to the second group. The X+ there has to match at least one X, but there are none left. So the first group gives up one of its X's, the second group matches it, and Bob's your uncle.
    In p2, the X+? has to match at least one X, so it does, then hands off to the second group, which happily gobbles up the rest of the input.
    In p3, the X++ matches all the X's, but refuses to back off and give the X+ in the second group the one X it needs, so the match fails.

  • Applying REGEX-pattern into XML File

    I have the following problem:
    I have an xml-file. let's say...
    <NODE><NODE1 attr1="a1" attr2="a2">
         <NAME> abc</NAME>
         <VERSION> 1.0</VERSION>
    </NODE1>
    <NODE2 attr1="a3" attr2="a4">
         <NAME> xyz</NAME>
         <VERSION> 3.1</VERSION>
    </NODE2></NODE>I need to know "HOW can I get the values of <NAME></NAME> and <VERSION></VERSION> without using DOM.
    Since my xml-file is pretty big and DOM will take much Memory, i want to avoid it.
    Can anybody suggest some "Regex pattern" so that i can apply it on the xml-file (after converting into String)
    Thanks in Advance

    That worked perfectly. I assumed ( insert comment here ) that the members of the Properties objects were Strings, and therefore followed the same rules where "\" characters are concerned.
    Thank you for pointing out the difference between the two objects, I am not sure how long it would have taken me to figure that out.
    Regards,
    John Gooch

  • Searching Site Content Using REGEX Patterns

    Intent: Detect content in SharePoint 2013 lists and libraries that matches a REGEX pattern, like social security numbers.
    SharePoint 2013 only exposes KQL and FQL languages. 
    http://msdn.microsoft.com/en-us/library/office/jj163973.aspx
    I am comfortable writing this as an App.  I do not know how to pass the search index through a REGEX match.  Perhaps there is a way to access the data on more of a server model instead of through the client APIs?

    Hi  Eric,
    For achieving your demand, you can write a Content Enrichment Web Service to extract regex patterns from a managed property.
    Here is a blog you can refer to:
    SharePoint 2013 Content Enrichment: Regular Expression Data Extraction:
    http://blogs.technet.com/b/peter_dempsey/archive/2013/12/04/sharepoint-2013-content-enrichment-regular-expression-data-extraction.aspx
    Reference:
    http://msdn.microsoft.com/en-us/library/office/jj163982.aspx
    http://blogs.msdn.com/b/richard_dizeregas_blog/archive/2013/06/19/advanced-content-enrichment-in-sharepoint-2013-search.aspx
    Thanks,
    Eric
    Forum Support
    Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Subscriber Support,
    contact [email protected]
    Eric Tao
    TechNet Community Support

  • Regex - Pattern for positive numbers

    Hi,
    I wanna check for positive numbers.
    My code so far:
    Pattern p = Pattern.compile("\\d+");
    Matcher m = p.matcher(str);
    boolean b = m.matches(); But I don't know how to check for positive numbers (including 0).
    Thanks
    Jonny

    Just to make your life easier:
    package samples;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    * @author notivago
    public class Positive {
        public static void main(String[] args) {
            String input = "- 12 +10 10 -12 15 -12,000 10,000 5,000.42";
            Pattern p = Pattern.compile( "\\b(?<!-\\s?|\\.|,)([0-9]+(?:,?[0-9]{3})*(?:\\.[0-9]*)?)" );
            Matcher matcher = p.matcher( input );
            while( matcher.find() ) {
                System.out.println( "Match: " + matcher.group(1) );
    }

  • Regex Pattern help.

    Me and my friend pedrofire, that�s probably around forums somewhere, are newbies. We are trying to get a log file line and process correctly but we are found some dificculties to create the right expression pattern.
    My log have lines like:
    User 'INEXIST' with session 'ax1zjd8yEeHh' added content '769' with uri 'http://mail.yahoo.com/'.
    User 'INEXIST' with session 'ax1zjd8yEeHh' changed folder from 'E-mails' to 'Milhagem'.
    User 'INEXIST' with session 'a8jXrY_N38ja' updated all content of folder 'Bancos'.
    i need to get the following data
    USER - [INEXIST]
    SESSION - [ax1zjd8yEeHh]
    ACTION - [added] or [changed] or [updated].
    Getting the user and the session is easy, but i am having difficulties grabing the action, because i need to take just the action word without blank spaces igonring the words content or folder or all.
    I m trying this for hours, but to a newbie is a little dificult
    Any help is welcome
    Thanks
    Peter Redman

    Hi,
    How about something like:
    import java.util.regex.*;
    public class RegexpTest
       private static final Pattern p = Pattern.compile(
             "^User '([^']+)' with session '([^']+)' ([^ ]+) .*$" );
       public static void main( String[] argv )
          find( "User 'INEXIST' with session 'ax1zjd8yEeHh' added content '769' with uri 'http://mail.yahoo.com/'." );
          find( "User 'INEXIST' with session 'ax1zjd8yEeHh' changed folder from 'E-mails' to 'Milhagem'." );
          find( "User 'INEXIST' with session 'a8jXrY_N38ja' updated all content of folder 'Bancos'." );
       public static void find( String text )
          System.out.println( "Text: " + text );
          Matcher m = p.matcher( text );
          if ( ! m.matches() ) return;
          String user = m.group(1);
          String session = m.group(2);
          String action = m.group(3);
          System.out.println( "User: " + user );
          System.out.println( "Session: " + session );
          System.out.println( "Action: " + action );
       }which results in:
    Text: User 'INEXIST' with session 'ax1zjd8yEeHh' added content '769' with uri 'http://mail.yahoo.com/'.
    User: INEXIST
    Session: ax1zjd8yEeHh
    Action: added
    Text: User 'INEXIST' with session 'ax1zjd8yEeHh' changed folder from 'E-mails' to 'Milhagem'.
    User: INEXIST
    Session: ax1zjd8yEeHh
    Action: changed
    Text: User 'INEXIST' with session 'a8jXrY_N38ja' updated all content of folder 'Bancos'.
    User: INEXIST
    Session: a8jXrY_N38ja
    Action: updatedYou should probably change the Pattern to be less explicit about what it matches. i.e. changes spaces to \\s+ or something similar.
    Ol.

  • Regex pattern question

    Hi,
    I'm trying to get my feet wet wtih java and regular expressions, done a lof of it in perl, but need some help with java.
    I have an xml file (also working through the sax tutorial, but this question is related to regex)that has multiple elements, each element has a title tag:
    <element lev1>10<title>element title</title>
    <element lev2>20<title>another element title</title>
    </element lev2>
    </element lev1>If I have the following pattern:
    Pattern Title = Pattern.compile("(?i)<title>([^<]*)</title>");that picks up the titles, but I can't distinguish which title belongs to which element. Basically what I want to have is:
    Pattern coreTitle = Pattern.compile("(?i)<element lev1>(**any thing that isn't an </title> tag**)</title>");looked through the tutorials, will keep looking, I'm sure it's in there somewhere, but if someone could point me in the right direction, that would be great.
    thanks,
    bp

    Just guessing, but maybe...
    Pattern.compile("(?i)<element lev1>*<title>([^<]*)</title>");
    But it seems that things like parsing with SAX (or loading to a DOM) or XPath would be much better suited to parsing out XML then regexp.

  • Ava.util.regex.pattern and * - + /

    hi...
    i'm korean... so I can't speak english.. sorry..^^
    but i hava a problem..
    import java.util.regex.*;
    public class Operator
    /     public static void main(String args[])
              String operator="/";
    ////////////////////////////////////////////////////////////// error point..
              Pattern pattern=Pattern.compile(operator);
              Matcher m=pattern.matcher("- ----* / */* /+");
              int count=0;
              while(m.find()) {
         count++;
              System.out.println(count);
    Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 0
    +
    operator : / - : ok...
    operator : * + : error...
    i had to use + *..
    what's problem??

    Are you using matches()? Then keep in mind that it requires that the entire String is matched by the RE.
    pattern.matcher("about:foobar").matches(); //will return false, as "foobar" is not matched by your pattern
    pattern.matcher("about:").matches(); //will return true
    pattern.matcher("about:foobar").find(); //will return true
    pattern.matcher("notabout:foobar").find(); // will return false

  • Regex Pattern Match in an extremely long string

    I need to search a file containing 1 extremely long line (approximately 1 million characters), The pattern I want to search is "ABC" as long as it appears at least n times whatever user input as n. I need the position of where this pattern is found. How to best do this? I tried to break the input into blocks of 100000 characters at a time as too many characters read cause the 'java out of memory' error to occur.
    Then I converted this to a string in order to use REGEX to search. My problem is how to ensure that the last few characters of the current block is also being searched too? How to write the regex expression to do this? Will breaking the input file into multiple lines help?
    eg:
    Searching for ABC as long as it appears at least 3 times continuously ie (ABCABCABC)
    Original Line = XXXXXXXABCABCABCXXXXXXABCX
    The first block of 10 characters read is XXXXXXXABCABC
    The second block of 10 characters read is ABCXXXXXXABCX
    The search result should be position 7 and position 22

    If the sequence of characters is longer than a few hundred KB, then turning it into a String requires you to have enough heap space available in the JVM to store the entire String.
    If that is a problem, an alternative solution is to have a while loop over an InputStream that reads from the source of characters (a file, a network connection, stdin etc.) and looks for the string. Keep a ring buffer the size of the query string, and read the data from the InputStream into it. Then for each character read, compare the content of the ring buffer to the query string.
    This way you will not use more heap space than the size of the query-string, and the size of whatever buffer you use in your InputStream (8KB for the empty constructor of BufferedInputStream at the moment) plus the odds'n ends from the implementation.

  • Help in finding a regex pattern

    Dear all REGEX Gurus. 
    We have a comma separated string like this:
    132, 143, "222, 144, abc", 227, 888, "222#55#ab"
    As you might have guessed, this comes from excel when we save as CSV file, and when one of the columns in excel has the value 222,144,abc. So excel itself puts " at beginning and end.
    Now we want to split it based on ',' like this:
    132
    143
    222,144,abc
    227
    888
    222#55#ab
    So what we thought is to use a REGEX to find a pattern that has anything beginning with ", ending with ", and has a comma (,) in between. We'd replace that comma with a special character.
    System seems to perform greedy search if i search like this ",".
    I tried this:
    "([^"]+)" -> this works that it gives me all values within quotes. But I want only those which have a comma in them.
    So in above, i do not want 222#55#ab to come.
    Tried various combinations, but not able to get it work.
    Can someone advise please how to achieve it?
    Thanks in adv.

    Hi,
    I am in 4.6 version and dont have the facility to regex it for you.
    Just wrote a sample code, if useful then use it.
    DATA:lv_string(100) TYPE c.
    DATA:lv_len TYPE i.
    TYPES:BEGIN OF ty,
          field(100) TYPE c,
          END OF ty.
    DATA:it TYPE TABLE OF ty.
    FIELD-SYMBOLS:<fs> TYPE ty.
    lv_string = '132, 143, "222, 144, abc", 227, 888, "222#55#ab"'.
    CONDENSE lv_string NO-GAPS.
    SPLIT lv_string AT '"' INTO TABLE it.
    CLEAR lv_string.
    LOOP AT it ASSIGNING <fs>.
      IF <fs> IS INITIAL.
        delete it index sy-tabix.
        CONTINUE.
      ENDIF.
      lv_len = strlen( <fs> ) - 1.
      IF ( <fs>+lv_len(1) CA '"' ) OR ( <fs>+lv_len(1) CA ',' ).
        <fs>+lv_len(1) = ' '.
      ENDIF.
      IF ( <fs>+0(1) CA '"' ) OR ( <fs>+0(1) CA ',' ).
        <fs>+0(1) = ' '.
      ENDIF.
      CONDENSE <fs>.
      WRITE:/ <fs>.
    ENDLOOP.

  • Help with regex pattern matching.

    Hi everyone
    I am trying to write a regex that will extract each of the links from a piece of HTML code. The sample piece of HTML is as follows:
    <td class="content" valign="top">
         <!-- BODY CONTENT -->
    <script language="JavaScript"
    src="http://chat.livechatinc.net/licence/1023687/script.cgi?lang=en&groups=0"></script>
    <a href="makeReservation.html">Making a reservation</a><br/>
    <a href="changeAccount.html">Changing my account</a><br/>
    <a href="viewBooking.html">Viewing my bookings</a><br/>I am interested in extracting each link and the corrresponding text for that link into groups.
    So far I have the following regex <td class="content" valign="top">.*?<a href="(.*?)">(.*?)</a><br>However this regex only matches the first line in the block of links, but I need to match each line in the block of links.
    Any ideas? Any suggestions are appeciated as always.
    Thanks.

    Hi sabre,
    thanks for the reply.
    I am already using a while loop with matcher.find(), but it still only returns the first link based on my regex.
    the code is as follows.
    private static final Pattern MENU_ITEM_PATTERN = compilePattern("<td class=\"content\" valign=\"top\">.*?<a href=\"(.*?)\">(.*?)</a><br>");
    private LinkedHashMap<String,String> findHelpLinks(String body) {
        LinkedHashMap<String, String> helpLinks = new LinkedHashMap<String,String>();
        String link;
        String linkText;
          Matcher matcher = MENU_ITEM_PATTERN.matcher(body);
          while(matcher.find()){
            link = matcher.group(1);
            linkText = matcher.group(2);
            if(link != null && linkText != null){
              helpLinks.put(link,linkText);
        return helpLinks;
    private static Pattern compilePattern(String pattern) {
        return Pattern.compile(pattern, Pattern.DOTALL + Pattern.MULTILINE
            + Pattern.CASE_INSENSITIVE);
      }Any ideas?

  • What's the regex pattern for regular English chars and numbers ...

    In Java String there is a matches(String regex) method, how can I specify the regex for regular English chars, I want it to tell me if the string contains non-English chars. The string is from an email, my application wants to know if the email contains any foreign characters, such as French, Spanish, Chinese ...
    So if the string looks like this : "Hi Frank, today is (5-7-2007), my address is : '[email protected]', email me ~ ! ^_^, do you know 1+2=3 ? AT&T, 200%, *[$300] | {#123}, / _ \;"
    It should recognize it as regular English characters.
    But if it contains any foreign language (outside of a-z) chars either western(Russian, Greek ...) or eastern languages (Japanese, Chinese ...), it should return false.
    I wonder how to write this "regex" ?
    Frank

    since french, english, and spanish use thesame
    alphabet, i don't know how you will have theregex
    for "english chars but not french or spanishones"
    :)Not true. Spanish has �
    KajAlso ll.ll was in 1994 dropped from the Spanish alphabet. :)No shit?
    What about ch and rr?

Maybe you are looking for

  • MacBook Pro DVI-HDMI 1080p doesn't work correctly

    Hello, I've just bought new Samsung LCD TV LE40B550. This TV is Full HD and support 1080p. I've connected it via DVI (macbook pro) -> HDMI (TV) cable. It works ok, when I set 1080i. But when I set up 1080p, it shows few dots (pixels) on TV not correc

  • Print quality in Aperture 3 and layout export

    I searched, but did not find recent information on print quality of books ordered through Aperture 3. So, my questions are: 1. How is the quality of the books ordered through Aperture 3? Few years ago I heard that the pictures are a bit darken. Is th

  • Deauthorizing - Once per year rule?

    Hello, I have run into a problem with deauthorizing my computers on my iTunes account. I work in a research lab that is constantly setting up, working on, breaking, and reformatting machines. I install iTunes on them so I can listen to my music while

  • Printer scanner operation severely degraded in 10.6.8

    I have 2 iMacs and a MacBook, all connected via an Airport Extreme; three printers: Canon MF4350d, Epson Workforce 500, and Lexmark S800. After installing the update for 10.6.8 on August 17, 2011, on all three Macs,  printing and scanning has been aw

  • Profiles for Canon D5?

    I recently upgraded to Lightroom 3 (v3.2). I shoot with a Canon D5 and import files as raw. Under the camera calibration tab I find no profiles except the ACR 4.4. and 3.3 (neither under process 2010 or 2003). I suspect that I am doing something wron