What regular expression

I have a problem with a regular expression. The objective is to find each word that does not begin with a capital letter. I don't know what regex to use. So far I have this:
[a-z]+
but that cuts off the first (capital) letter and also returns the truncated words. I appreciate all help.
PR.

Aardenon wrote:
I have a problem with a regular expression. The objective is to find each word that does not begin with a capital letter. I don't know what regex to use. So far I have this:
[a-z]+
You need to read up on "boundary" meta-characters. Try the Pattern summary at http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#sum for starters, although you'd probably do even better to go through the tutorial at http://download.oracle.com/javase/tutorial/essential/regex/.
Winston
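
A minimal sketch of the boundary idea Winston points to, assuming that "word" means a run of ASCII letters. The class name LowercaseWords and its find method are made up for illustration. The leading \b stops the match from starting in the middle of a capitalized word, which is what the bare [a-z]+ was doing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
class LowercaseWords {
    // \b = word boundary; the first letter must be lowercase,
    // the remaining letters may be of either case.
    private static final Pattern LOWERCASE_WORD =
            Pattern.compile("\\b[a-z][A-Za-z]*\\b");

    // Returns every word in the text that starts with a lowercase letter.
    static List<String> find(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = LOWERCASE_WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [quick, fox, jumps, over, dogs]
        System.out.println(find("The quick Brown fox jumps over Lazy dogs"));
    }
}
```

Words containing digits, apostrophes, or non-ASCII letters would need a wider character class than the one assumed here.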

Similar Messages

  • Regular Expressions - Logical AND

    I know this isn't Java, and it's not really an algorithm, but I can't figure this out. I hope amongst this bright group of developers someone can help me.
    I am searching for a regular expression that will match a series of words.
    Example:
    Given the words: "ship book"
    What regular expression could be used to find both the word "ship" and the word "book"?
    I have found one expression that will do it ... ship.*book|book.*ship
    But that expression doesn't scale. Does anyone know of a better way?
    Thanks,
    BacMan

    Hi,
    How about something like:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Regexp1
    {
       // Matches only if the whole text consists of words from the list.
       private static final Pattern all = Pattern.compile(
             "^(\\s*\\b(monkey|turnip|ship|book)\\b\\s*)*$" );
       // Matches each listed word wherever it occurs as a whole word.
       private static final Pattern p = Pattern.compile(
             "\\s*\\b(monkey|turnip|ship|book)\\b\\s*" );

       public static void main( String[] argv )
       {
          find( "ship book turnip monkey" );
          find( "monkey giraffe mango" );
          find( "ship shipship ship" );
       }

       public static void find( String text )
       {
          System.out.println( "Text: " + text );
          Matcher m = p.matcher( text );
          System.out.println( "Matches all: " + all.matcher( text ).matches() );
          while ( m.find() )
             System.out.println( "Matching word: '" + m.group(1) + "'" );
       }
    }
    which will produce this when run:
    Text: ship book turnip monkey
    Matches all: true
    Matching word: 'ship'
    Matching word: 'book'
    Matching word: 'turnip'
    Matching word: 'monkey'
    Text: monkey giraffe mango
    Matches all: false
    Matching word: 'monkey'
    Text: ship shipship ship
    Matches all: false
    Matching word: 'ship'
    Matching word: 'ship'
    Pattern all tests whether the text consists only of the listed words.
    Pattern p finds each listed word and ignores the others.
    Ol.
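
For the original "logical AND" question, one approach that scales better than listing every ordering is a chain of lookaheads, one per required word. The sketch below is only an illustration under that assumption; the class AllWords and its methods are hypothetical names. Each (?=.*\bword\b) asserts that the word occurs somewhere in the text without consuming any input, so the assertions can be stacked in any order.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
class AllWords {
    // Builds ^(?=.*\bw1\b)(?=.*\bw2\b)... with one lookahead per word.
    static Pattern allOf(String... words) {
        StringBuilder sb = new StringBuilder("^");
        for (String w : words) {
            // Pattern.quote keeps any metacharacters in the word literal.
            sb.append("(?=.*\\b").append(Pattern.quote(w)).append("\\b)");
        }
        // DOTALL lets .* cross line breaks in multi-line text.
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    // True if every given word occurs as a whole word in the text.
    static boolean containsAll(String text, String... words) {
        return allOf(words).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAll("book a ship today", "ship", "book")); // true
        System.out.println(containsAll("shipshape book", "ship", "book"));    // false
    }
}
```

Adding a third or fourth word only adds one more lookahead, rather than multiplying the number of alternatives.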

  • Regular expressions its URGENT !!!

    I have a long string of regular expressions separated by "|" and I need to know which regular expression a particular string matched. How can I find that, and can I do it using java.util.regex?
    Thanks in advance

    Consider using "capturing groups". A better solution would be to split this long regular expression with alternations into smaller ones, which should considerably reduce backtracking. That also makes it easier to find which regular expression matches the target string.
    Regards.
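
The "split it into smaller ones" suggestion could look roughly like the sketch below. The three sub-patterns and their names are invented for illustration, since the thread does not show the real expressions. Each alternative is compiled separately and tried in order, so the code knows which one matched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
class WhichPattern {
    // Made-up sub-patterns; LinkedHashMap keeps them in insertion order.
    private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("number", Pattern.compile("\\d+"));
        PATTERNS.put("word", Pattern.compile("[A-Za-z]+"));
        PATTERNS.put("date", Pattern.compile("\\d{4}-\\d{2}-\\d{2}"));
    }

    // Returns the name of the first pattern matching the whole input, or null.
    static String firstMatch(String input) {
        for (Map.Entry<String, Pattern> e : PATTERNS.entrySet()) {
            if (e.getValue().matcher(input).matches()) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("12345"));      // number
        System.out.println(firstMatch("2011-03-05")); // date
        System.out.println(firstMatch("???"));        // null
    }
}
```

The capturing-group alternative keeps one big pattern of the form (a)|(b)|(c) and checks which m.group(i) is non-null after a match.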

  • Working w/ regular expressions

    Hi all,
    I'm having trouble figuring out what regular expression to use to parse my output. I want to capture a block of text that starts with a tab (\t) and ends with a line containing the word "error code."
    For example, if my output is the following: ( [\tab] denotes ASCII tab character)
    [\tab]Some command
    Anytext
    Anytext
    blah blah error code is 0
    [\tab]Some other command
    [\tab]Yet another command
    Anytext
    blah blah error code is 1
    I would like subsequent calls to matcher.find() to find these two blocks:
    [\tab]Some command
    Anytext
    Anytext
    blah blah error code is 0
    and
    [\tab]Yet another command
    Anytext
    blah blah error code is 1
    I thought the regular expression should be something like
    "[\t](.*\n)+.*error code.*"
    but the above regular expression returns the entire text instead of the two "blocks" that I want. I know that Java returns the longest match for the expression but I don't know how to exclude "error code" in the middle...
    "[\t](.*\n)+(.*error code.*){1}" ???
    Any help is greatly appreciated.
    Thanks,
    KK

    > ..but I'm not sure what is the purpose of the second part (?:\\n|\\Z).
    > Would someone care to explain this to me?
    This non-capturing group is the final delimiter. It means "the last character is a line feed or the end of input has been reached", which prevents a failure when the last line is an error line with no trailing line feed.
    A good reference on groups is the JRegex documentation and examples:
    http://jregex.sourceforge.net/.
    Good news for you: the minimal regular expression that solves your problem.
    import java.util.regex.*;

    public class ParseTest {
        public static void main(String[] args) throws Exception {
            String output =
                  "\t\tgcc -someoption file0\n"
                + "\t\tgcc -someoption file1\n"
                + "\t\tgcc -someoption file2\n"
                + "\t\tgcc -someoption file3\n"
                + "some error message\n"
                + "some more error message\n"
                + "make: 1254-004 The error code from the last command is 1.\n"
                + "make: 1254-005 Ignored error code 1 from last command.\n"
                + "\t\tgcc -someoption file4\n"
                + "\t\tgcc -someoption file5\n"
                + "error message\n"
                + "more error message\n"
                + "make: 1254-004 The error code from the last command is 1.\n"
                + "make: 1254-005 Ignored error code 1 from last command.\n"
                + "\t\tgcc -someoption file6\n"
                + "\t\tgcc -someoption file7\n"
                + "last error message\n"
                + "make: 1254-005 Ignored error code 1 from last command.";
            System.out.println(output + "\n");
            // Regular expression pattern to find only commands with error messages.
            // $1 = command
            // $2 = error message
            final String re = "\\t+(.+)\\n([^\\t]+[^\\t\\n])(?:\\n|$)";
            Pattern p = Pattern.compile(re);
            System.out.println("Pattern:\n" + p.pattern());
            Matcher m = p.matcher(output);
            for (int j = 1; m.find(); j++) {
                System.out.println("\nMatching " + j + ":\n");
                System.out.println("--------------------------------");
                System.out.println(m.group(1));
                System.out.println("--------------------------------");
                System.out.println(m.group(2));
                System.out.println("--------------------------------");
            }
        }
    }
    Regards.
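
For the blocks described in the original question (a tab-started line followed by non-tab lines up to the first line containing "error code"), a reluctant quantifier is another way to stop the match from swallowing the whole text. This is only a sketch, assuming \n line endings; the class name ErrorBlocks is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
class ErrorBlocks {
    private static final Pattern BLOCK = Pattern.compile(
            "^\\t.*\\n"                 // a line that starts with a tab
          + "(?:(?!\\t).*\\n)*?"        // as few non-tab lines as possible
          + "(?!\\t).*error code.*",    // first non-tab line with "error code"
            Pattern.MULTILINE);         // ^ matches at the start of every line

    // Returns each block from a tab-started command to its error-code line.
    static List<String> find(String output) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(output);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String output = "\tSome command\nAnytext\nAnytext\n"
                + "blah blah error code is 0\n"
                + "\tSome other command\n"
                + "\tYet another command\nAnytext\n"
                + "blah blah error code is 1\n";
        for (String block : find(output)) {
            System.out.println(block);
            System.out.println("----");
        }
    }
}
```

The negative lookahead (?!\t) is what keeps "Some other command" out of the results: its next line starts with a tab, so no block can be completed from it.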

  • What's wrong with the regular expression?

    Hi all,
    For the life of me I can not figure out what is wrong with this regular expression
    .*\bA specific phrase\.\b.*
    This is just an example; the actual phrase can be any specific phrase. My problem comes when the specific phrase ends in a period. I've escaped the period but it still gives me an error. The only time I don't get an error is when I take off the end boundary character, which will not suffice as a solution. I need to be able to capture all the text before and after said phrase. If the phrase doesn't have a period it would look like this...
    .*\bA specific phrase\b.*
    which works fine. So what is it about the \.\b combination that is not matching?
    I've been banging my head on this for a while and I'm getting nowhere.
    The application highlights text that comes from a server. The user builds custom highlights that have some options. Highlight entire line, match partial word, and ignore case. The code that builds my pattern is here
    String strHighlight = _strHighlight;
            strHighlight = strHighlight.replaceAll("\\*", "\\\\*");
            strHighlight = strHighlight.replaceAll("\\.", "\\\\.");
            String strPattern = strHighlight;
            if(_bEntireParagraph)
                if(_bPartialWord)
                    strPattern = ".*" + strHighlight + ".*";
                else               
                    strPattern = ".*\\b" + strHighlight + "\\b.*";           
            else
                if(_bPartialWord)
                    strPattern = strHighlight;
                else               
                    strPattern = "\\b" + strHighlight + "\\b";  
            if(_bIgnoreCase)
                _patHighlight = Pattern.compile(strPattern, Pattern.CASE_INSENSITIVE);
            else
                _patHighlight = Pattern.compile(strPattern);
    So, for example, say I am matching the phrase: The dog ate the cat. And that phrase came over in the following text: Look there's a dog. The dog ate the cat. "Oh my!"
    And my user has the entire line and ignore case options selected, then my regex would look like this: .*\bThe dog ate the cat\.\b.*
    That should get highlighted, but for some reason it doesn't. Correct me if I'm wrong, but doesn't the regex read as follows:
    any characters
    word boundary
    The dog ate the cat[period]
    word boundary
    any characters until newline.
    Any help will be much appreciated

    A word boundary (in the context of regexes) is a position that is either followed by a word character and not preceded by one (start of word) or preceded by a word character and not followed by one (end of word). A word character is defined as a letter, a digit, or an underscore. Since a period is not a word character, the only way the position following it could be a word boundary is if the next character is a letter, digit or underscore. But a sentence-ending period is always followed by whitespace, if anything, so it makes no sense to look for a word boundary there. I think, instead of \b, you should use negative lookarounds, like so:   strPattern = ".*(?<!\\w)" + strHighlight + "(?!\\w).*";
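
The difference can be checked with a short sketch. The class PhraseHighlight and its wholePhrase method are hypothetical names, and Pattern.quote is used here in place of the hand-written escaping in the question, which is a substitution rather than the poster's own code.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
class PhraseHighlight {
    // Matches the literal phrase only when it is not glued to word characters.
    static Pattern wholePhrase(String phrase) {
        return Pattern.compile(
                "(?<!\\w)" + Pattern.quote(phrase) + "(?!\\w)",
                Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        String text = "Look there's a dog. The dog ate the cat. \"Oh my!\"";
        // \b after the period fails: '.' and ' ' are both non-word characters.
        System.out.println(text.matches(".*\\bThe dog ate the cat\\.\\b.*")); // false
        // The lookarounds only require that no word character is adjacent.
        System.out.println(wholePhrase("The dog ate the cat.").matcher(text).find()); // true
        // Partial words are still rejected.
        System.out.println(wholePhrase("cat").matcher("concatenate").find()); // false
    }
}
```

Pattern.quote also covers metacharacters other than * and ., such as ? or parentheses in a user-supplied phrase.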

  • What is a Regular Expression?

    I have been programming for a couple of years, and I keep hearing people talk about regular expressions and Perl-like regular expressions. Java has an API for regular expressions, and I would just like someone to lay down the facts and explain to me what they are and how they are used.
    Thanks. :)

    http://developer.java.sun.com/developer/technicalArticles/releases/1.4regex/

  • Match Regular Expression does not match what Match Pattern does

    I have read through a lot of posts about how Match Pattern does not match what Match Regular Expression will due to not processing some characters.
    However, I found a problem with the other way. A simple Reg-Ex that works in Match Pattern but not Match Regular Expression.
    What I have here is just an example. I want to use Match Regular Expression so I can specify some sub-matches.
    The reg-ex is for: one or more non-numeric characters, a space, one or more numeric characters. At the start of the string.
    How can I get this working in Match Regular Expression? I am working in LabVIEW 2010f2 32 bit. Here is the code snippet and the results:
    Rob

    Robert Cole wrote:
    I think I prefer the ~ for negation since ^ is also used for beginning of the string. But we work with what we have.
    Let me offer you a tip and perhaps defend the honor of the regex a little bit.  One of my favorite features of regexes is the ability to specify character classes (and their negation).  One of the reasons I have to think about the ~ versus ^ is that I rarely use ^ in a regex alternative. 
    Some examples:
    [0-9] = \d (digit)
    [^0-9] = \D (not a digit)
    The equivalent regex for your case is: \D+ \d+

  • What is regular expression for checking a set operation

    I want a regular expression for checking the syntax of a set operation given as input.
    Example:
    the input should be
    [1,2,3]+[2,3] or [1,2]-[2] or [1,2,3]*[1,2]
    It should check the square brackets and the operator between the two set operands.

    I think you should code that from scratch. When you validate input, you usually want to tell the user what they did wrong, but a regex will only tell you whether it matches or not.
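
If a plain yes/no check is enough, the syntax in the question can still be expressed as a regex. The sketch below assumes non-negative integer elements, no whitespace, and that an empty set [] is allowed; none of that is stated in the thread, and the class name SetExpression is hypothetical. As the reply notes, it cannot tell the user what was wrong.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
class SetExpression {
    // One set: [ digits separated by commas ], possibly empty.
    private static final String SET = "\\[(?:\\d+(?:,\\d+)*)?\\]";
    // Two sets joined by exactly one of + - *
    private static final Pattern EXPR = Pattern.compile(SET + "[-+*]" + SET);

    // True if the whole input is a well-formed set operation.
    static boolean isValid(String input) {
        return EXPR.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("[1,2,3]+[2,3]")); // true
        System.out.println(isValid("[1,2]-[2]"));     // true
        System.out.println(isValid("[1,2]+[2"));      // false
    }
}
```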

  • What, no regular expressions?

    I was porting a CGI to use Oracle as a backend instead of
    PostgreSQL. When I couldn't find any regular expression
    facilities in Oracle, I reconsidered and ended up waiting for the
    next release of PostgreSQL.
    Still, Oracle was startlingly fast and can do things PostgreSQL
    can't (a VIEW based on a UNION ALL, and a column based on a
    subselect).
    Maybe Sybase has regexps, but I won't find out till I have some
    more spare time.

    JoeD. wrote:
    > Greetings all. I'm having a heckofa problem and I was hoping someone on these boards may be able to help. I have 1000s links that I need to process and I dread doing this by hand.
    Why don't you simplify your request?
    This is what I have: <a href="xxx.php?blah">text</a>
    This is what I want: <a href="xxx.html">text</a>
    It's my guess that this would be very easy, but give us examples of before and after.
    Mick
    > Many of the links are now bad links, and they are 'bad' in a very predictable way as they all contain 'php?title=', so I want to search for the bad link code and simply strip the <anchor> tag and leave the containing text as is. My guess is that this shouldn't be as difficult as it is, but I could be very wrong here. So, I've done the following in the Find/Replace dialog: Search: Specific Tag Selected Tag: A With Attribute: HREF Here is where I get confused. The 'with attribute' option has '=, <, >, !=' as the options. So I tried the equality operator using both regular expressions (*php\?title\=*) and without regular expressions (php?title), but no matter what I do, none of the links are being found. I'm very new to regex so maybe my code here is bad. I wish there was a 'containing' option that with attribute exposed, but there isn't. So: do any of y'all DW gurus have any ideas about this? I'm pulling what little hair I have left out over this.
    > Thanks, Joe

  • What is the regular expression for "__" ?

    I just read through String and Pattern API, can't figure it out.
    I want to go:
    String.replaceFirst("__", "_");
    i.e., replace the first 'double under-score' with a single underscore.
    But "__" is now treated as a regular expression. Please help! Thanks.

    > My mistake was I thought it would change the string...
    > text.replace(...)
    > should be
    > text = text.replace(...)
    A frequent mistake. I didn't notice, since I was focused on the regex aspect of the question.
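
Both points from this thread fit in a few lines. The class and method names below are made up for illustration. An underscore has no special meaning in a Java regex, so "__" matches itself, and because strings are immutable the result has to be assigned or returned.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original thread.
class UnderscoreReplace {
    // Replaces the first double underscore with a single one.
    static String collapseFirst(String text) {
        // replaceFirst returns a new string; the original is unchanged.
        return text.replaceFirst("__", "_");
    }

    // For arbitrary literal text, quote both the pattern and the replacement.
    static String replaceFirstLiteral(String text, String target, String replacement) {
        return text.replaceFirst(Pattern.quote(target),
                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String text = "a__b__c";
        text = collapseFirst(text);          // assign the result back
        System.out.println(text);            // a_b__c
        System.out.println(replaceFirstLiteral("1+1=2", "+", "$")); // 1$1=2
    }
}
```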

  • What is the regular expression for the end of a story?

    Forgive me if this is wrong forum for asking this, but I'm trying to use the Find command using GREP and I need to know the regular expression for the end of a story. (Or, the last character of a story.) Thanks in advance.

    I'd try search for .\z (that's a dot in front) which ought to find the very last character in the story, and replace with $0 and your additional text.
    You know you can use a keyboard shortcut to move your cursor to the end of any story, right? Ctrl + End on Windows, Cmd + End, I think, on Mac. Unless you want to do this to every single story in the document, I would think you might be just as well off to put your text on the clipboard, put the cursor in the story and hit the key combo followed by Ctrl/Cmd + V to paste.

  • Help in regular expression matching

    I have three expressions like
    1) [(y2009)(y2011)]
    2) [(y2008M5)(y2011M3)] or [(y2009M5)(y2010M12)]
    3) [(y2009M1d20)(y2011M12d31)]
    I want a regular expression pattern for each of the above three expressions.
    I am using:
    REGEXP_LIKE(timedomainexpression, '???[:digit:]{4}*[:digit:]{1,2}???[:digit:]{4}*[:digit:]{1,2}??', 'i');
    but it returns results for all of the above expressions, while I want a different expression for each.
    I have used * after [:digit:]{4}; when I use ? or . instead, it gives no results. Please help in this situation ASAP.
    Thanks

    I don't get your question. Can you post your desired output, and also give some sample data?
    Please consider the following when you post a question.
    1. New features keep coming in every Oracle version, so please provide your Oracle DB version to get the best possible answer.
    You can use the following query and copy and paste the output.
    select * from v$version
    2. This forum has a very good Search Feature. Please use that before posting your question, because for most of the questions that are asked the answer is already there.
    3. We don't know your DB structure or what your data looks like, so you need to let us know. The best way would be to give some sample data like this.
    I have the following table called sales
    with sales
    as
    (
          select 1 sales_id, 1 prod_id, 1001 inv_num, 120 qty from dual
          union all
          select 2 sales_id, 1 prod_id, 1002 inv_num, 25 qty from dual
    )
    select *
      from sales
    4. Rather than describing what you want in words, it is easier when you give your expected output.
    For example, in the above sales table, I want to know the total quantity and number of invoices for each product.
    The output should look like this
    Prod_id   sum_qty   count_inv
    1         145       2
    5. Whenever you get an error message, post the entire error message, with the error number, the message, and the line number.
    6. The next thing is very important to remember. Please post only well formatted code. Unformatted code is very hard to read.
    Your code format gets lost when you post it in the Oracle Forum. So in order to preserve it you need to
    use the {noformat} tags.
    The usage of the tag is like this.
    {noformat}<place your code here>{noformat}
    7. If you are posting a *Performance Related Question*, please read
       {thread:id=501834} and {thread:id=863295}.
       Following those guides will be very helpful.
    8. Please keep in mind that this is a public forum. Here no question is URGENT.
       So use of words like *URGENT* or *ASAP* (As Soon As Possible) is considered to be rude.

  • Bracket in Regular Expression constant?

    I am a bit puzzled by the behavior I am experiencing in LV 2011. I hope to get some light from experts out there.
    I am trying to parse a messy ASCII header file and after having split it into individual lines (strings), I use the "Match Regular Expression" function to remove some of the info before the substantial information.
    Some of the strings include square brackets ([, ]), which are special characters for the function, therefore, as documented in the help, one needs to precede them with a backslash.
    Example:
    I want to parse the following line:
       #PR [PR_DEV,I,2]
    One way (which I am using because of considerations related to the rest of the header) is the following:
    Note that the first string constant is using "Code Display" whereas the second one is using "Normal Display".
    Why did I not put a backslash in front of the bracket in the first string, you may ask? Well, I did, but it disappeared after I typed the other characters. And reverting to "Normal Display" did not restore it.
    Of course, the first version does not parse the input string correctly, whereas the second one does it fine.
    In other words, the custom display string (which is convenient for cryptic codes such as \s* or to distinguish between space and tab...or simply ENTER tabs!) seems to mess up with the \[ combo (likewise with the \] one).
    It is not a huge deal. I can use the "Normal Display" mode, but I tend to think that this qualifies as a hidden "feature". And again, it is still a pain in the ... when dealing with special characters such as tabs, etc...

    I think that [ is a special character which needs to be preceded by a backslash, but it is not one of the defined backslash characters (like \s). So, you need to put in two \\ to get one \ while in '\' Codes Display.
    You can put in any character by using \xx where the xx is a hex character using only upper case letters for A..F.  I converted the strings to byte arrays and tried to see what made the arrays match and the Match work.
    Lynn

  • Help with regular expression to not run shtml-hacktype on particular reqs

    Hi,
    Okay, back to 7.0 u2 not ignoring things it should. Our regular site has a lot of SHTML includes. Then we have the glassfish loadbalancer plugin configured for a back-end web application. The problem is, the application URLs end in .html and the web server is looking locally for those requests. If I disable:
    ObjectType fn="shtml-hacktype"
    in the instance obj.conf, then the requests process normally in the proxy.
    I can't do that, though, because of the includes - then our main site doesn't function properly.
    Is there a way I can surround that line with something like "if request is not /gog/*" then execute? Similar to what I had to do for the j2ee problem looking locally instead of on the back end Weblogic servers.
    <If $uri =~ '^/wp-(.*)'>
    NameTrans fn="ntrans-j2ee" name="j2ee"
    </If>
    Only I want the opposite: if $uri does not match '^/gog/(.*)' then execute the shtml-hacktype. I just don't know exactly what the expression would be.
    Thanks!

    Nevermind - I got it. RTFM :-)
    <If $uri !~ '^/gog/(.*)'>
    ObjectType fn="shtml-hacktype"
    </If>

  • Help with Regular Expression for field validation

    I'm fairly new to using regular expressions and using Acrobat. This is probably a simple question, but I've been unable to figure it out.
    I have a text field on a PDF that I would like to be 9 characters in length. The first 2 characters can only be alphanumeric, the last 7 characters can only be numeric.
    At first I was using the following, which allows all the characters to be alphanumeric:
    var re = /^[A-Za-z0-9 :\\_]$/;
    if (event.change.length > 0) {
        if (event.willCommit == false) {
            if (!re.test(event.change)) {
                event.rc = false;
            }
        }
    }
    That works fine, but it's not quite what I needed. With some assistance I changed it (see below) to fit what I was looking for. However, this didn't work; it prevents anything from being entered in the field:
    var re = /^[A-Za-z0-9]{2}\d{7}$/;
    if (event.change.length > 0) {
        if (event.willCommit == false) {
            if (!re.test(event.change)) {
                event.rc = false;
            }
        }
    }
    Any help would be greatly appreciated.
    Thanks...

    Here's a function you can call from the field's custom Keystroke script. It should be placed in a document-level JavaScript:
    function custom_ks1() {
        // Define non-committed regular expression (partial input allowed)
        var re = /^[A-Za-z0-9]{0,2}([0-9]{0,7})?$/;
        // Get all of the characters the user has entered
        var value = AFMergeChange(event);
        // Allow field to be cleared
        if (!value) return;
        if (event.willCommit) {
            // Define committed regular expression (full 9 characters required)
            re = /^[A-Za-z0-9]{2}[0-9]{7}$/;
            if (!re.test(value)) {  // If final value doesn't match, alert user
                app.alert("Your error message goes here.");
                // event.rc = false
            }
        } else {  // not committed
            // Only allow characters that match the regular expression
            event.rc = re.test(value);
        }
    }
    Call it like this:
    // Custom Keystroke script
    custom_ks1();
