Regular Expressions... finding a '('

Hi guys,
Been trying for an hour or so now, but I can't seem to find out how to locate a "(" in a regular expression, and the usual escape character '\\' doesn't seem to work.
I.e., in the text that I'm searching, the string I want to scrape out may sometimes contain a " ( .* ) " at the end of it, so I've been writing regexes like
([A-Z]) ([a-z]*) ([_]?) ([\\([? )    //more follows
I hope that's clear. It's the last part with the '(' that doesn't work.
o.0
___

Tried the double backslash, and it still doesn't work. Here's the actual regex I'm writing:
pattern = Pattern.compile("((href=\"/wiki/)([A-Z])([a-z]*)(_)([A-Z])([a-z]*)([A-Z]?)([a-z?]*)([_]?)([\\(]?)", Pattern.MULTILINE);
Edited by: Mark.ONeill on Jan 2, 2008 12:47 AM
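
For what it's worth, here is a minimal sketch of matching a literal "(" in Java. It uses a simplified pattern and a made-up sample link (both are assumptions, not the exact regex above). Two things it tries to show: the regex escape \( has to be written as \\( inside a Java string literal, and every opening group parenthesis needs a matching close. The pattern above appears to open with "((" but only closes one of the two, which would make Pattern.compile throw before any matching happens.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenDemo {
    // Hypothetical, simplified pattern: /wiki/Foo_Bar optionally followed by "_(".
    // "\\(" in the Java source becomes the regex \( which matches a literal '('.
    static final Pattern LINK = Pattern.compile(
        "href=\"/wiki/([A-Z][a-z]*)_([A-Z][a-z]*)(_\\()?");

    // Returns the first matched text, or null if nothing matches.
    static String firstMatch(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Sample input is invented for illustration.
        System.out.println(firstMatch("<a href=\"/wiki/Mark_Twain_(author)\">"));
        // prints: href="/wiki/Mark_Twain_(
    }
}
```

A character class such as [(] would also match a literal parenthesis without any backslashes, which some people find easier to read.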

Similar Messages

  • Java – Regular Expressions – Finding any non digit byte in a multiple byte

    Hello,
    I’m new to Java and regular expressions; I’m trying to write a regular expression that will find any records that contain a non-digit byte in a multiple-byte field.
    I thought the following was the correct expression but it is only finding records that contain “all” non digit bytes.
    \D{1,}
    \D = Non Digit
    {1,} = at least 1 or more
    Below is my sample data. I would like the regular expression to find all of the records that are not all numeric. However when I use the regular expression \D{1,} it is only finding the 2 records that all bytes are non digits. (i.e. “ “ and “A “)
    “ 111229”
    “2 111229”
    “20091229”
    “200912c9”
    “201#1229”
    “20101229”
    “20110229”
    “20111*29”
    “20111029”
    “20111229”
    “20B11229”
    “A “
    “A0111229”
    Please note I have also tried \D{1,}+ and \D{1,}? and they also do not return my desired results.
    Any assistance someone can provide would be greatly appreciated.

    You don't show the code you are using but I surmise you are using String.matches() which requires that the whole target must match the regular expression not just part of it. Instead you should create a Pattern and then a Matcher and use the Matcher.find() method. Check the Javadoc for Pattern and Matcher and look at the Java regex tutorial - http://docs.oracle.com/javase/tutorial/essential/regex/ .
    P.S. You can re-use the Pattern object - you don't have to create it every time you need one.
    P.P.S. Java regular expressions work with characters not bytes and characters are not not not bytes.
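
    The difference between a whole-string match and a partial match can be sketched as follows. This is only an illustration: the class and method names are made up, and the padded "A" record is an assumption about what the original field looked like.

```java
import java.util.regex.Pattern;

class NonDigitDemo {
    // Compiled once and reused, as suggested above.
    private static final Pattern NON_DIGIT = Pattern.compile("\\D");

    // True only if EVERY character is a non-digit (String.matches needs the whole string to match).
    static boolean allNonDigits(String field) {
        return field.matches("\\D{1,}");
    }

    // True if ANY character is a non-digit (Matcher.find accepts a partial match).
    static boolean hasNonDigit(String field) {
        return NON_DIGIT.matcher(field).find();
    }

    public static void main(String[] args) {
        String[] fields = {"20091229", "200912c9", "201#1229", "A       "};
        for (String f : fields) {
            System.out.println("\"" + f + "\" matches=" + allNonDigits(f)
                + " find=" + hasNonDigit(f));
        }
    }
}
```

    With find(), "200912c9" and "201#1229" are reported as well, while "20091229" is not.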

  • Regular Expressions find and replace

    Hi ,
    I have a question on using Regular Expressions in Java(java.util.regex).
    Problem Description:
    I have a string (say for example strHTML) which contains the whole HTML code of a webpage. I want to be able to search for all the image source tags and check whether they are absolute urls to the image source(for eg. <img src="www.google.com/images/logo.gif" >) or relative(for eg. <img src="../images/logo.gif" >).
    If they are relative urls to the image path, then I wish to replace them with their absolute urls throughout the webpage (in this case inside string strHTML).
    I have to do it inside a servlet and hence have to use Java.
    This is the code I tried. It doesn't match and replace, and it goes into an infinite loop, i.e. probably the pattern matches everything.
    //Change all images to actual http addresses FOR example change src="../images/logo.gif" to src="http://www.google.com/../images/logo.gif"
              String ddurl="http://www.google.com/";
    String strHTML=" < img src=\"../images/logo.gif\" alt=\"Google logo\">";
    Pattern p = Pattern.compile ("(?i)src[\\s]*=[\\s]*[\"\']([./]*.*)[\"\']");
    Matcher m = p.matcher (strHTML);
    while(m.find())
    m.replaceAll(ddurl+m.group(1));
    what is wrong in this?
    Thanks,
    Rajiv

    Right, here's the full monte (whatever that means):
    import java.util.regex.*;
    public class Test1
    {
      public static void main(String[] args)
      {
        String domain = "http://www.google.com/";
        String strHTML =
          " < img src=\"images/logo.gif\" alt=\"Google logo\">\n" +
          " <img alt=\"Google logo\" src=images/logo.gif >\n" +
          " <IMG SRC=\"/images/logo.gif\" alt=\"Google logo\">\n" +
          " <img alt=\"Google logo\" src=../images/logo.gif>\n" +
          " <img src=http://www.yahoo.com/images/logo.gif alt=\"Yahoo logo\">";
        String regex =
          "(<\\s*img.+?src\\s*=\\s*)   # Capture preliminaries in $1.  \n" +
          "(?:                         # First look for URL in quotes. \n" +
          "   ([\"\'])                 #   Capture open quote in $2.   \n" +
          "   (?!http:)                #   If it isn't absolute...     \n" +
          "   /?(.+?)                  #    ...capture URL in $3       \n" +
          "   \\2                      #   Match the closing quote     \n" +
          " |                          # Look for non-quoted URL.      \n" +
          "   (?!http:)                #   If it isn't absolute...     \n" +
          "   /?([^\\s>]+)             #    ...capture URL in $4       \n" +
          ")                           # End of the alternation.       \n";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.COMMENTS);
        Matcher m = p.matcher(strHTML);
        StringBuffer sbuf = new StringBuffer();
        while (m.find())
        {
          // Whichever alternative matched, exactly one of groups 3 and 4 is non-null.
          String relURL = m.group(3) != null ? m.group(3) : m.group(4);
          m.appendReplacement(sbuf, "$1\"" + domain + relURL + "\"");
        }
        m.appendTail(sbuf);
        System.out.println(sbuf.toString());
      }
    }
    First off, observe that I'm using free-spacing (or "COMMENTS") mode to make the regex easier to read--all the whitespace and comments will be ignored by the Pattern compiler. I also used the CASE_INSENSITIVE flag instead of an embedded (?i), just to remove some clutter. By the way, your second (?i) was redundant; the first one would remain in effect until "turned off" with a (?-i). Another way to localize a flag's effect is by using it within a non-capturing group, e.g., (?i:img).
    As jaylogan said, the best way to filter out absolute URL's is by using a negative lookahead, and that's what I've done here. The problem of optional quotes I addressed by trying to match first with quotes, then without. The all-in-one approach might work with URL's, since they can't (AFAIK) contain whitespace anyway, but the alternation method can be used to match any attribute/value pair. It's also, I feel, easier to understand and maintain. Unfortunately, it also means that you can't use replaceAll(), since you have to determine which alternative matched before doing the replacement, but the long version is still pretty simple (especially when you can just copy it from the javadoc for the appendReplacement() method, as I did).
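
    As a side note on the replaceAll() limitation mentioned above: on Java 9 and later, Matcher.replaceAll also has an overload that takes a function, so the "which alternative matched" check can live inside the replacement callback. The sketch below is an assumption-laden variant, not the code from this thread: the class and method names are invented, the pattern is condensed onto one line, and the unquoted branch additionally excludes quote characters (my adjustment) so that a quoted absolute URL is not picked up by that branch. Like the original, it only treats "http:" as absolute.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AbsolutizeDemo {
    // Group 1: everything up to the attribute value; group 3: quoted URL; group 4: unquoted URL.
    private static final Pattern SRC = Pattern.compile(
        "(<\\s*img[^>]*?src\\s*=\\s*)(?:([\"'])(?!http:)/?(.+?)\\2|(?!http:)/?([^\\s>\"']+))",
        Pattern.CASE_INSENSITIVE);

    // Requires Java 9+ for Matcher.replaceAll(Function<MatchResult, String>).
    static String absolutize(String html, String domain) {
        return SRC.matcher(html).replaceAll(mr -> {
            String rel = mr.group(3) != null ? mr.group(3) : mr.group(4);
            // quoteReplacement keeps any '$' or '\' in the URL from being read as a group reference.
            return Matcher.quoteReplacement(mr.group(1) + "\"" + domain + rel + "\"");
        });
    }

    public static void main(String[] args) {
        System.out.println(absolutize("<img src=../images/logo.gif>", "http://www.google.com/"));
        // prints: <img src="http://www.google.com/../images/logo.gif">
    }
}
```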

  • Regular expression - find if string does NOT contain text....

    I have a string that I want to tokenize. The string can contain basically anything. I want to produce tokens for each "word" found, and for each "<=" or "," found. There does not need to be whitespace around a "<=" or a "," to consider it a token. So for example:
    joe schmoe<=jack, jane
    should become
    joe
    schmoe
    <=
    jack
    jane
    As a constraint, I do not want to use StringTokenizer at all, as "its use is discouraged in new code". http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html
    Here's the code I plan on using for this:
        public String[] getWords(String input) {
            Matcher matcher = WORD_PATTERN.matcher(input);
            ArrayList<String> words = new ArrayList<String>();
            while (matcher.find()) {
                words.add(matcher.group());
            }
            return (String[]) words.toArray(new String[0]);
        }
    The trick, though, is coming up with a working regular expression. The closest I've found yet is:
    ([^\s]|^(,)|^(<=))+|,|<=
    but that produces the following:
    joe
    schmoe<=jack,
    jane
    I think what I need is to be able to find if a string does not contain the substring "<=" or "," using a regular expression. Anyone know how to do this, or another way to do this using regular expressions?

    Try:
    /*
     * Tokenizer.java
     * version 1.0
     * 01/06/2005
     */
    package samples;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    /**
     * @author notivago
     */
    public class StrangeTokenizer {
        public static void main(String[] args) {
            String text = "joe schmoe<=jack, jane";
            // Try "<=" first, then ",", then a run of word characters.
            Pattern pattern = Pattern.compile( "((?:<=)|(?:,)|(?:\\w+))");
            Matcher matcher = pattern.matcher(text);
            while( matcher.find() ) {
                System.out.println( "Item: " + matcher.group(1) );
            }
        }
    }
    May the code be with you.

  • Regular Expression - find double hyphens only

    I am wondering if there's a way to write a regular expression to find double hyphens and change them to single hyphens.  The catch is that some of the text I'm searching through have multiple hyphens.
    Example:
    str1 = "Here is my sample text with double -- and I would like to replace this with one hyphen."
    str2 = "Here is another sample with multiple hyphens ----- that I do not want to change but leave as is."
    Is there a way to change only str1 to a single hyphen and keep str2 as is?

    You are correct.  I should have been more explicit.  Here are some real examples.  I hope this helps.
    Helps what?
    Have you tried to write the regex yourself, at least?
    Adam
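
    Since no regex was posted in this thread, here is one possible approach, offered only as a sketch: match exactly two hyphens that are neither preceded nor followed by another hyphen, using a negative lookbehind and a negative lookahead. The class and method names are made up for the example.

```java
class DoubleHyphenDemo {
    // (?<!-) : not preceded by a hyphen; (?!-) : not followed by a hyphen.
    // So only an isolated "--" is replaced; longer runs are left alone.
    static String collapseDoubleHyphens(String s) {
        return s.replaceAll("(?<!-)--(?!-)", "-");
    }

    public static void main(String[] args) {
        System.out.println(collapseDoubleHyphens("text with double -- and more"));
        // prints: text with double - and more
        System.out.println(collapseDoubleHyphens("multiple hyphens ----- left as is"));
        // prints: multiple hyphens ----- left as is
    }
}
```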

  • Regular Expression Find and Replace with Wildcards

    Hi!
    For the life of me, I can't figure out the right way to do this.
    I basically have a list of last names, first names. I want the last name to have a different css style than the first name.
    So this is what I have now:
    <b>AAGAARD, TODD, S.</b><br>
    <b>AAMOT, KARI,</b> <br>
    <b>AARON, MARJORIE, C. </b> <br>
    and this is what I need to have:
    <span class="LastName">AAGAARD</span>  <span class="FirstName">, TODD, S. </span> <br />
    <span class="LastName">AAMOT</span> <span class="FirstName">, KARI,</span> <br/>
    <span class="LastName">AARON</span> <span class="FirstName">, MARJORIE, C.</span> <br/>
    Any ideas?
    Thanks!

    Make a backup first.
    In the Find field use:
    <b>(\w+),\s+([^<]+)<\/b>\s*<br>
    In the Replace field use:
    <span class="LastName">$1</span> <span class="FirstName">$2</span><br />
    Select Use regular expression. Light the blue touch paper, and click Replace All.
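
    For anyone wanting to try the same find/replace outside Dreamweaver, the pair above can be sketched in Java like this. The class and method names are invented; the regex and replacement are the ones given above, with Java string escaping added.

```java
class NameSpanDemo {
    // $1 is the last name (word characters before the first comma),
    // $2 is everything after that comma up to </b>.
    static String restyle(String html) {
        return html.replaceAll(
            "<b>(\\w+),\\s+([^<]+)</b>\\s*<br>",
            "<span class=\"LastName\">$1</span> <span class=\"FirstName\">$2</span><br />");
    }

    public static void main(String[] args) {
        System.out.println(restyle("<b>AAGAARD, TODD, S.</b><br>"));
        // prints: <span class="LastName">AAGAARD</span> <span class="FirstName">TODD, S.</span><br />
    }
}
```

    Note that this pattern consumes the comma after the last name, so the FirstName span starts with "TODD" rather than ", TODD" as in the requested output; moving the comma inside the second group would keep it.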

  • Regular expressions: find files with exactly 'n' digits in a row

    Hi there,
    I want to filter files that contain only a fixed number of digits, but not more (at least not after the digits).
    For example, I have
    01.mp3
    02.mp3
    test10.txt
    test000110101010.txt
    04.flac
    and for n=2 I want to get all files except 'test000110101010.txt'.
    The following is not working, and I'm a total newb regarding regular expressions
    ls -l | grep '^-' | awk '{print $9}' | grep '([0-9]\{2\})[^0-9]\{2\}'
    Thanks for help.
    Regards,
    drm

    Thanks!
    I wrote a python script to scan e.g. a music folder for missing files and needed to extract the file numbers from the files to get the "highest" number.
    You can get it from here: http://pastebin.com/Sg9yDHiw (Python3, expires in 1 month)
    Regards,
    drm
    Edit: found a bug
    Last edited by drm00 (2011-02-04 13:57:43)
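
    The thread does not show the regex that was finally used, so the following is only a sketch of one way to express "exactly n digits in a row", written in Java rather than grep. It assumes lookarounds are available: the run of n digits must not be preceded or followed by another digit. Class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DigitRunDemo {
    // Keeps the names that contain a run of exactly n digits somewhere.
    static List<String> withExactDigitRun(List<String> names, int n) {
        Pattern p = Pattern.compile("(?<!\\d)\\d{" + n + "}(?!\\d)");
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (p.matcher(name).find()) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
            "01.mp3", "02.mp3", "test10.txt", "test000110101010.txt", "04.flac");
        System.out.println(withExactDigitRun(files, 2));
        // prints: [01.mp3, 02.mp3, test10.txt, 04.flac]
    }
}
```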

  • Regular expression - find repeating +++ signs

    I'm trying to use regular expressions to remove duplicate +++ signs in a string. When I test my pattern using the expresso test (www.ultrapico.com) it parses the string correctly, in Java 1.5 it doesn't work. .. mp.matches() is always false. Any suggestions would be appreciated.
    finalLongstring = "TTL1,clip1+TTL2+++clip3,TTL4,clip4,TTL5,clip5+TTL6+clip6+TTL7+clip7,TTL8,clip8,TTL9,clip9,TTL10,clip10,TTL11,clip11,TTL12,clip12,TTL13,clip13+TTL14+clip14,TTL15,clip15,TTL16,clip16,TTL17,clip17,TTL18,clip18,TTL19,clip19,TTL20,clip20,TTL21,clip21,TTL22,clip22,TTL23,clip23,TTL24,clip24,TTL25,clip25,TTL26,clip26,TTL27,clip27,TTL28,clip28,TTL29,clip29";
    Pattern multiplePunctuation=null;
              multiplePunctuation=Pattern.compile("[,+]{2,6}");
              //                                     |  |
              //                                     |  2 to 6 times
              //                                     a comma or plus character
              Matcher mp=multiplePunctuation.matcher(finalLongstring);
              if(mp.matches()){
                   finalLongstring=mp.replaceAll("+");
              }

    Answered in your other thread.
    http://forum.java.sun.com/thread.jspa?threadID=5143654
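
    The linked answer is not reproduced here, so what follows is only a guess at the likely cause, sketched in Java: Matcher.matches() succeeds only when the entire string matches the pattern, which a long comma-separated string never will, so the replaceAll branch is never reached. Calling replaceAll directly scans for every run on its own. The sketch assumes the goal is to collapse runs of "+" only; the names are invented.

```java
import java.util.regex.Pattern;

class PlusRunDemo {
    // Collapses any run of two or more '+' into a single '+'.
    // No matches() check is needed; replaceAll searches the whole input itself.
    static String collapse(String s) {
        return s.replaceAll("\\+{2,}", "+");
    }

    public static void main(String[] args) {
        String s = "TTL1,clip1+TTL2+++clip3,TTL4";
        // matches() wants the WHOLE string to be 2 to 6 commas/pluses, so this is false.
        System.out.println(Pattern.compile("[,+]{2,6}").matcher(s).matches());
        System.out.println(collapse(s));
        // prints: TTL1,clip1+TTL2+clip3,TTL4
    }
}
```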

  • REGULAR EXPRESSION FIND PLEASE  ;(

    Hi forum,
    I have the following documents with the content
    doc1 = "ELECTRONICS DIGITAL CAMERA"
    doc2 = "ELECTRONICS DIGITAL CAMERA ACCESSIORIES"
    doc3 = "ELECTRONICS DIGITAL CAMERA OPTICS "
    Using a regular expression, I would like to get only the 2nd document, which has the content
    "ELECTRONICS DIGITAL CAMERA ACCESSIORIES"
    How can I achieve this?
    Karthik

    You can try this one: ((digital|camera|accessories)[\\s]*)+
    Explanations:
    (digital|camera|accessories) - this group matches any of the 3 words
    [\\s] matches a whitespace character; [\\s]* matches any number of them
    ((digital|camera|accessories)[\\s]*) - this group matches any of the 3 words, optionally followed by spaces
    ((digital|camera|accessories)[\\s]*)+ - matches any sequence of the 3 words, separated by spaces
    NOTES:
    - this regular expression matches "digitalcameraaccessories" because the * operator accepts 0 occurrences.
    If you want to avoid this situation, change the * to a +, but you will have to append a space to the searched string
    in order to make the pattern match.
    - this regular expression will also match "digital digital camera" because there is no uniqueness checking.
    Hope this helped,
    Regards.
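
    The two notes above are easy to check with a small Java sketch. The class name and the CASE_INSENSITIVE flag are my assumptions (the sample documents are upper-case); the regex itself is the one suggested above.

```java
import java.util.regex.Pattern;

class WordSequenceDemo {
    private static final Pattern WORDS = Pattern.compile(
        "((digital|camera|accessories)[\\s]*)+", Pattern.CASE_INSENSITIVE);

    // True if the whole string is a sequence of the three words.
    static boolean isWordSequence(String s) {
        return WORDS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isWordSequence("digital camera accessories")); // true
        System.out.println(isWordSequence("digitalcameraaccessories"));   // true: * allows zero spaces
        System.out.println(isWordSequence("digital digital camera"));     // true: repeats are allowed
        System.out.println(isWordSequence("digital camera optics"));      // false
    }
}
```

    Note that the sample documents spell the last word "ACCESSIORIES" and start with "ELECTRONICS", so the alternation would need those exact spellings added before it could match doc2.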

  • Regular Expression + Find and Replace

    Hey there-  I have a question about regExp and the Find and Replace.  Basically I want to search a wildcard between a href tag, how would that look, because the code below does not work.
    countryLink = "<a href=\"http://www.whateve.com\" target=\"_parent\">";
    [code]
    countryLink = "([^"]*)";
    [code]
    Thanks! Any help is appreciated!
    Also, how do i add code blocks to this forum?

    Yes, I meant the <a> tag, but thank you for displaying the href attribute solution as well.  This solved my issue.  Thanks!  Thought I would display what I did with your code in case someone was interested in using this code to convert a JavaScript string to XML.
    query this:
    countryLink = "<a href=\"http://www.whateve.com\" target=\"_parent\">";
    add this to the Find box:
    countryLink = "<a href=\\"([-\w:/.?=&;]+)\\" target=\\"_parent\\">";
    add this to the Replace box:
    <countryLink>$1</countryLink>
    creates an output of this:
    <countryLink>http://www.whateve.com</countryLink>
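
    The same idea can be sketched in Java with a capturing group. This is an illustration under assumptions: the class and method names are invented, and the input is the plain tag text rather than the backslash-escaped JavaScript source line searched above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HrefToXmlDemo {
    // Group 1 captures the href value; the character class is the one used above.
    private static final Pattern LINK = Pattern.compile(
        "<a href=\"([-\\w:/.?=&;]+)\" target=\"_parent\">");

    // Returns the URL wrapped in a <countryLink> element, or null if the tag does not match.
    static String toXml(String tag) {
        Matcher m = LINK.matcher(tag);
        return m.find() ? "<countryLink>" + m.group(1) + "</countryLink>" : null;
    }

    public static void main(String[] args) {
        System.out.println(toXml("<a href=\"http://www.whateve.com\" target=\"_parent\">"));
        // prints: <countryLink>http://www.whateve.com</countryLink>
    }
}
```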

  • NIRG LabVIEW regular expression for covering multiple requirements

    The Word document type in NI Requirements Gateway allows for comma separating the requirements in a Reference / coverage statement.  I would like to do the same within my LabVIEW code, but the type does not have the same Sub regular expression field available.  Is there any way to have a LabVIEW regular expression find coverage statements such as the following:
    [Covers: REQ-5, REQ-9, REQ-15]
    currently within LabVIEW comments I have to have 3 separate [Covers: REQ-5] type statements

    cdweiss,
    I'm very interested to know if you have any other feedback on NI Requirements Gateway.  I'd also be curious to know what products you're using with it and how extensive your requirements are.
    Feel free to email me directly at [email protected]
    Cheers,
    Eli 
    Message Edited by Elijah K on 01-19-2010 11:40 PM
    Elijah Kerry
    Senior Product Manager, LabVIEW
    Follow my Software Engineering for LabVIEW Blog

  • Help with regular expression to find a pattern in clob

    Can someone help me write a regular expression to query a clob that contains xml type data?
    I need a query to find multiple occurrences of a variable string (i.e. <EMPID-XX>, where XX can be any number). If <EMPID-01> appears twice in the clob I want the result as EMPID-01,2 and if EMPID-02 appears 4 times I want the result as EMPID-02,4.

    with
    ofx_clob as
    (select q'~
    <EMPID>1
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>2
    < UNQID>123457
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>1
    < UNQID>123458
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    ~' ofx from dual
    )
    select '<EMPID>' || to_char(ids) || '(' || to_char(count(*)) || ')' multi_empid
      from (select replace(regexp_substr(ofx,'<EMPID>\d*',1,level),'<EMPID>') ids
              from ofx_clob
            connect by level <= regexp_count(ofx,'<EMPID>')
           )
    group by ids having count(*) > 1
    MULTI_EMPID
    <EMPID>1(2)
    with
    ofx_clob as
    (select q'~
    <EMPID>1
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>2
    < UNQID>123457
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>1
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>2
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>1
    < UNQID>123458
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    ~' ofx from dual
    )
    select '<EMPID>' || listagg(to_char(ids) || '(' || to_char(count(*)) || ')',',') within group (order by ids) multi_empid
      from (select replace(regexp_substr(ofx,'<EMPID>\d*',1,level),'<EMPID>') ids
              from ofx_clob
            connect by level <= regexp_count(ofx,'<EMPID>')
           )
    group by ids having count(*) > 1
    MULTI_EMPID
    <EMPID>1(3),2(2)
    Regards
    Etbin
    Message was edited by: Etbin
    used listagg to report more than one multiple <EMPID>

  • Regular expression in FIND statement

    Hi All,
    I am writing regular expressions, but I don't properly understand how to write them.
    I have one internal table with five fields.
    Example: wa-mandt = '800'.
                 wa_number = '3768'
                 wa_path = '/usr/tmp/sapuser/3768/test.txt.'
    append wa to itab.
    Loop at itab into wa.
    Here I need to find the client and number system id from the WA using a regular expression in a single line
    endloop.
    Can anybody please explain how to write this?
    Thanks,

    Hi,
    What do you mean by FIND?
    If I got it right, you can use a READ statement with KEY f1 f2 etc. BINARY SEARCH. Mention all the fields you want in the KEY fields.
    Don't forget to SORT this itab before the loop.
    Thanks
    Kiran

  • How can I remove all content between two tags using Find/Replace regular expressions?

    This one is driving me bonkers...  I'm relatively new to regular expressions, but I'm trying to get Dreamweaver to remove all content between two tags in an XML document.  For example, let's say I have the following XML:
    <custom>
    <![CDATA[<p>Some text</p>
    <p>Some more text</p>]]>
    </custom>
    I'd like to do a Find/Replace that produces:
    <custom>
    </custom>
    In essence, I'd like to strip all of the content between two tags.  Ideally, I'd like to know how to strip the CDATA content as well, to return the following:
    <custom>
    <![CDATA[]]>
    </custom>
    I'd much appreciate any suggestions on accomplishing this.
    Many thanks!

    Thanks much for your response.  I found David's article to be a little thin with respect to examples using quantifiers in coordination with the wildcard metacharacters; however, I was able to cobble together a working expression through trial and error using the information he presented.  For posterity, here’s the solution:
    Find:
    <custom>[\d\D]*?</custom>
    Replace:
    <custom>
    <![CDATA[]]>
    </custom>
    I believe this literally translates to:
    [] = find anything in this range/character class
    \d = find any digit character (i.e. any number)
    \D = find any non-digit character (i.e. anything except numbers)
    *? = match zero or more times, but as few times as possible (i.e. match multiple characters per instance, but only match one instance at a time, or none at all)
    I’m still not sure how to effectively utilize the . wildcard.  For example, the following expression will not find content that ends with a number:
    <custom>.*?[\D]*?</ custom >
    I'm presuming this is because numbers aren't included in the \D metacharacter; however, shouldn't numbers be picked up by the .*? expression?
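
    On the open question about the . wildcard: in most regex flavors the dot does not match line breaks by default, so .*? can only cover what is left of the first line; in the expression above the rest of the content, line breaks included, then has to be matched by [\D]*?, which stops at the first digit. That is very likely what is happening in Dreamweaver too, though I have only checked the behavior in Java. The sketch below uses invented names and the XML from the question to compare the plain dot, the dot with Java's DOTALL flag, and the [\d\D] workaround.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    static final String XML =
        "<custom>\n<![CDATA[<p>Some text</p>\n<p>Some more text</p>]]>\n</custom>";
    static final String EMPTY = "<custom>\n<![CDATA[]]>\n</custom>";

    // Replaces every match of the regex (compiled with the given flags) with EMPTY.
    static String strip(String xml, String regex, int flags) {
        return Pattern.compile(regex, flags).matcher(xml).replaceAll(EMPTY);
    }

    public static void main(String[] args) {
        // Plain dot: stops at the first line break, so nothing is replaced.
        System.out.println(strip(XML, "<custom>.*?</custom>", 0).equals(XML));
        // DOTALL lets the dot cross line breaks.
        System.out.println(strip(XML, "<custom>.*?</custom>", Pattern.DOTALL).equals(EMPTY));
        // [\d\D] matches any character at all, line breaks included.
        System.out.println(strip(XML, "<custom>[\\d\\D]*?</custom>", 0).equals(EMPTY));
    }
}
```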

  • Help Finding Regular Expressions

    Does Java have regular expressions? Where can I find them in the API?
    Thanks!

    You can find them in the package java.util.regex (from version 1.4).
    If you don't have that version, there is a GNU regular expression package available.
    Søren Bak

  • Using regular expressions to find and replace code.

    Hi! Semi-newbie coder here.
    I'm trying to strip out code from multiple pages, I've tried regular expressions but I'm struggling to understand them. I also need to do it across a LOT of pages, so I need an automated way of doing it.
    The best way I can explain is with an analogy:
    I want to delete any string of characters that starts with c, ends with t and includes anything in between, so it would pick up "cat, cut, chat, coconut, can do it", whatever appears in the middle of those.
    Except, instead of c and t, I want it to find strings of code starting with <div class="advert" and ending with Vote<br> while picking up everything in between (including spaces, code, comments, etc.), and then delete that whole string including the starting and ending. Is there a regular expression I could use in Dreamweaver that could do this? Is there a way to do this at all?

    Let me begin by saying I'm a complete idiot with DW's Reg Ex.   I use Search Specific Tag whenever possible.  See screenshot below.
    Try this on your Current Document to see if it works. Then make a back-up copy of site before attempting it on Entire Local Site as you cannot "Undo" this process.
    Good luck,
    Nancy O.
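
    No regex was posted in this thread, so here is a hedged sketch of the "starts with X, ends with Y, anything in between" idea in Java. It assumes a lazy .*? with the DOTALL flag so the match can span several lines and stops at the first Vote<br>; the class name and the sample HTML are invented. Running it over many files is left out.

```java
import java.util.regex.Pattern;

class StripAdvertDemo {
    // Lazy match from the opening advert div to the nearest "Vote<br>", across line breaks.
    private static final Pattern ADVERT = Pattern.compile(
        "<div class=\"advert\".*?Vote<br>", Pattern.DOTALL);

    static String strip(String html) {
        return ADVERT.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p>keep</p><div class=\"advert\" id=\"x\">\n<!-- ad -->\nVote<br><p>also keep</p>";
        System.out.println(strip(html));
        // prints: <p>keep</p><p>also keep</p>
    }
}
```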
