Hash marks in Regular Expressions

Given this example code, what do you expect as output?

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashTest {
  /**
   * @param args
   */
  public static void main(String[] args) {
    String testChar = "#";
    String testInput = "this string is # split by a hash";
    String[] stringArray = testInput.split( testChar );
    for (String s : stringArray) {
      System.out.println( s );
    }
    Pattern p = Pattern.compile( testChar );
    Matcher m = p.matcher( testInput );
    if ( m.lookingAt() ) {
      System.out.println( m.groupCount() );
    }
    p = Pattern.compile( "(#)(.*)" );
    m = p.matcher( testInput );
    if ( m.lookingAt() ) {
      for (int i = 0; i <= m.groupCount(); i++) {
        System.out.println( m.group( i ) );
      }
    }
  }
}

The output I get is
this string is
split by a hash
Replacing the last pattern with

p = Pattern.compile( "(.*?)(#)(.*)" );

I get
this string is
split by a hash
this string is # split by a hash
this string is
split by a hash
This is very unexpected behavior and if anyone could come up with a convincing solution, I'd appreciate it.
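
For anyone puzzling over the same output: lookingAt() only succeeds when a match begins at the very first character of the input, find() accepts a match anywhere, and matches() requires the match to span the whole input. A minimal sketch that runs the three patterns from the question through all three methods (the class name AnchoringDemo and the helper tryAll are made up for this illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoringDemo {
    /** Returns {lookingAt, find, matches} for the given regex and input. */
    static boolean[] tryAll(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        return new boolean[] {
            p.matcher(input).lookingAt(), // match must start at index 0
            p.matcher(input).find(),      // match may start anywhere
            p.matcher(input).matches()    // match must cover the whole input
        };
    }

    public static void main(String[] args) {
        String input = "this string is # split by a hash";
        for (String regex : new String[] { "#", "(#)(.*)", "(.*?)(#)(.*)" }) {
            boolean[] r = tryAll(regex, input);
            System.out.println(regex + " -> lookingAt=" + r[0]
                    + ", find=" + r[1] + ", matches=" + r[2]);
        }
    }
}
```

Only the third pattern can start matching at index 0 (its leading (.*?) absorbs everything before the hash), which is why it is the only one for which lookingAt() returns true.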

As for matches(), I think it was because they were also adding regex-related wrapper methods to String. Part of the contract for the find() method is that, if you call it multiple times without resetting the matcher, each match attempt will start searching at the position where the last match ended. But that wouldn't work in String, because there was no way to ensure that the variable it's being called on this time still refers to the same string it referred to the last time. So, if it can only be meaningfully called once anyway, they might as well have it match the whole string. That's my guess.
I can't even guess what they were thinking about with lookingAt(). I found it perfectly useless until JDK 1.5, when they added the "regions" API and the usePattern(Pattern) method to Matcher. Now it's useful for scanning text with multiple regexes, which is how they use it in java.util.Scanner, and how I use it in my syntax-highlighting editor. Before, I had to prepend "\G" to every regex and, for each match attempt, create a new Matcher and call its find(int) method--yech!
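
To make the regions/usePattern idea concrete, here is a rough sketch of a scanner that keeps one Matcher, moves its region forward, and tries several patterns with lookingAt() at the current position. The token classes (WORD, NUMBER, SPACE) and the class name RegionScanner are invented for this example; it is not how java.util.Scanner or any particular editor is actually written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionScanner {
    // Hypothetical token patterns, chosen only for illustration.
    static final Pattern WORD = Pattern.compile("[A-Za-z]+");
    static final Pattern NUMBER = Pattern.compile("\\d+");
    static final Pattern SPACE = Pattern.compile("\\s+");

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Restrict the matcher to the unscanned tail of the input.
            m.region(pos, input.length());
            if (m.usePattern(SPACE).lookingAt()) {
                // Whitespace is skipped, not reported.
            } else if (m.usePattern(WORD).lookingAt()) {
                tokens.add("WORD:" + m.group());
            } else if (m.usePattern(NUMBER).lookingAt()) {
                tokens.add("NUMBER:" + m.group());
            } else {
                // No pattern applies here: report one character and move on.
                tokens.add("OTHER:" + input.charAt(pos));
                pos++;
                continue;
            }
            pos = m.end(); // end() is an index into the whole input
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("abc 123 #x"));
        // prints [WORD:abc, NUMBER:123, OTHER:#, WORD:x]
    }
}
```

Because lookingAt() is anchored at the start of the region, there is no need to prepend "\G" to each regex or to create a fresh Matcher per attempt.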

Similar Messages

  • How to find Regular Expressions in a Hash Map

    Hi,
    I have a hash map with some keys. The keys are like this: java.util.regex, javax.swing.table, javax.swing.text, java.util.jar, java.text, etc. If the user gives the search pattern "text", the output should be javax.swing.text and java.text. How can I do this using regular expressions?

    // Sample code...
    import java.util.regex.*;
    public class TestRegex {
        public static void main(String[] args) {
            String test1 = "java.util.regex";
            String test2 = "javax.swing.text";
            String test3 = "java.util.jar";
            String test4 = "java.text";
            Pattern pat = Pattern.compile(".*text.*");
            Matcher mt1 = pat.matcher(test1);
            System.out.println("1> " +mt1.matches());
            Matcher mt2 = pat.matcher(test2);
            System.out.println("2> " +mt2.matches());
            Matcher mt3 = pat.matcher(test3);
            System.out.println("3> " +mt3.matches());
            Matcher mt4 = pat.matcher(test4);
            System.out.println("4> " +mt4.matches());
        }
    }
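
The sample above tests four hard-coded strings. Applied to an actual map, the same idea could look like the sketch below, which walks the key set and keeps the keys in which the search term occurs. It uses find() so the surrounding ".*" is not needed, and Pattern.quote so that characters such as "." in the search term are taken literally. The class and method names (KeySearch, matchingKeys) and the map values are placeholders.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class KeySearch {
    /** Returns the keys of the map that contain the search term. */
    static List<String> matchingKeys(Map<String, ?> map, String term) {
        Pattern p = Pattern.compile(Pattern.quote(term));
        List<String> hits = new ArrayList<>();
        for (String key : map.keySet()) {
            if (p.matcher(key).find()) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Placeholder values; only the keys matter for the search.
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("java.util.regex", 1);
        map.put("javax.swing.table", 2);
        map.put("javax.swing.text", 3);
        map.put("java.util.jar", 4);
        map.put("java.text", 5);
        System.out.println(matchingKeys(map, "text"));
        // prints [javax.swing.text, java.text]
    }
}
```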

  • Urgent!!! Problem in regular expression for matching braces

    Hi,
    For the example below, can I write a regular expression to extract and store the key/value pairs?
    example: ((abc def) (ghi jkl) (a ((b c) (d e))) (mno pqr) (a ((abc def))))
    in the above example
    abc is key & def is value
    ghi is key & jkl is value
    a is key & ((b c) (d e)) is value
    and so on.
    can anybody pls help me in resolving this problem using regular expressions...
    Thanks in advance

    "((key1 value1) (key2 value2) (key3 ((key4 value4)
    (key5 value5))) (key6 value6) (key7 ((key8 value8)
    (key9 value9))))"
    I want to write a regular expression in java to parse
    the above string and store the result in hash table
    as below
    key1 value1
    key2 value2
    key3 ((key4 value4) (key5 value5))
    key4 value4
    key5 value5
    key6 value6
    key7 ((key8 value8) (key9 value9))
    key8 value8
    key9 value9
    Please let me know, if it is not possible with
    regular expressions, the most effective way of solving it.

    Yes, it is possible with a recursive regular expression.
    Unfortunately Java does not provide a recursive regular expression construct.
    $_ = "((key1 value1) (key2 value2) (key3 ((key4 value4) (key5 value5))) (key6 value6) (key7 ((key8 value8) (key9 value9))))";
    my $paren;
       $paren = qr/
           \(                   # Opening paren
             (?:
               [^()]+  # Not parens
             |
               (??{ $paren })  # Another balanced group (not interpolated yet)
             )*
           \)                   # Closing paren
        /x;
    my $r = qr/^(.*?)\((\w+?) (\w+?|(??{$paren}))\)\s*(.*?)$/;
    while ($_) {
         match();
    }
    # operates on $_
    sub match {
         my @v;
         @v = m/$r/;
         if (defined $v[3]) {
              $_ = $v[2];
              # Descend into a parenthesized value before printing its own pair.
              while (/\(/) {
                   match();
              }
              print "\"",$v[1],"\" \"",$v[2],"\"\n";
              $_ = $v[0].$v[3];
         }
         else { $_ = ""; }
    }
    C:\usr\schodtt\src\java\forum\n00b\regex>perl recurse.pl
    "key1" "value1"
    "key2" "value2"
    "key4" "value4"
    "key5" "value5"
    "key3" "((key4 value4) (key5 value5))"
    "key6" "value6"
    "key8" "value8"
    "key9" "value9"
    "key7" "((key8 value8) (key9 value9))"
    C:\usr\schodtt\src\java\forum\n00b\regex>
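
Since java.util.regex has no recursive construct, one way to get the same result in Java is to skip the regex and count parentheses by hand. The sketch below is one possible approach, not a translation of the Perl script: it assumes the input is well-formed, uses single spaces between key and value, and that keys contain no spaces. The names NestedPairs, parse and matchingParen are made up. It records a pair before descending into its value, so the order follows the table in the question rather than the Perl output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NestedPairs {
    /** Parses a list such as "((k1 v1) (k2 ((k3 v3))))" and records every pair. */
    static void parse(String list, Map<String, String> out) {
        String body = list.trim();
        body = body.substring(1, body.length() - 1); // strip the outer parens
        int i = 0;
        while (i < body.length()) {
            if (body.charAt(i) != '(') { // skip separators between pairs
                i++;
                continue;
            }
            int close = matchingParen(body, i);
            String pair = body.substring(i + 1, close);
            int space = pair.indexOf(' ');
            String key = pair.substring(0, space);
            String value = pair.substring(space + 1).trim();
            out.put(key, value);
            if (value.startsWith("(")) {
                parse(value, out); // the value is itself a list of pairs
            }
            i = close + 1;
        }
    }

    /** Returns the index of the ')' that balances the '(' at index open. */
    static int matchingParen(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("unbalanced parentheses");
    }

    public static void main(String[] args) {
        Map<String, String> table = new LinkedHashMap<>();
        parse("((key1 value1) (key2 value2) (key3 ((key4 value4) (key5 value5))) "
                + "(key6 value6) (key7 ((key8 value8) (key9 value9))))", table);
        for (Map.Entry<String, String> e : table.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```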

  • In a Regular expressions I can set up an "OR" statement?

    Hi, I'm using e-tester version 8.2. My problem is about regular expressions.
    I need to capture a dynamic value from a form field (My-TxtBox). This text box is prepopulated when the customer has info in the database. This is the code:
    <input name="My-TxtBox" type="text" value="XXXX" id="ID-TxtBox"
    ---- (XXXX = dynamic number prepopulated by the web app)
    So, I'm using this regular expression:
    <input name="My-TxtBox" type="text" value="(.+?)" id="ID-TxtBox"
    ----- At this point everything is fine, but there is an exception.
    I run into a big problem when the customer doesn't have data on the server. The code is NOT like this:
    <input name="My-TxtBox" type="text" value="" id="ID-TxtBox"
    When the customer doesn't have data on the server, the code is more like this:
    <input name="My-TxtBox" type="text" id="ID-TxtBox"
    ---- (please note that there is no value attribute now)
    So, is there no way to create a CDV that works for both cases? Any idea how to solve this?
    I was thinking that maybe the regular expression syntax lets you create an "OR" statement. My idea was to create a CDV
    that works for both cases: when the "value=" string is there and when it is not.
    Something like this:
    This CDV returns the dynamic value when the "value=" string is there:
    <input name="My-TxtBox" type="text" value="(.+?)" id="ID-TxtBox"
    And this CDV returns "" when there is no "value=" string:
    <input name="My-TxtBox" type="text" (.*?)id="ID-TxtBox"
    My idea is to place something like this at some point in the CDV: { value="(.+?)" OR (.*?) }
    So my dream is to create a CDV similar to this:
    <input name="My-TxtBox" type="text" { value="(.+?)" OR (.*?) }id="ID-TxtBox"
    I searched on Google but simply couldn't find an answer. Is it possible to place an OR statement in a regular expression, and what is the syntax?
    Regards. I appreciate your time.

    Hola,
    You can use a regular expression such as:
    <input name="My-TxtBox" type="text" ?v?a?l?u?e?=?"?(.*?)"? id="ID-TxtBox"
    Note that:
    * There is a question mark (?) after each character that may or may not appear in the string. It means that the character is optional.
    * Instead of (.+?) you should use (.*?).
    This matches any character zero or more times. The question mark here makes the match non-greedy, meaning that the .* will not swallow anything that belongs to the rest of the pattern (in this case the rest of the pattern is "? id="ID-TxtBox").
    * The question mark in front of the v of value is there to match a space that may or may not exist.
    A few other facts:
    * In regular expressions, parentheses determine the order of operations and mark groups. Such groups can be referenced in code (not in eTester, but in general). eTester always takes the value of the first group (first group = first set of parentheses).
    * ORs in regular expressions can be expressed with the pipe "|" (without the quotes), but you will need parentheses in this case, which would not allow you to capture the group of characters that you want.
    Regards,
    [Z]{1}uriel C?
    Edited by: Zuriel on Oct 5, 2009 3:07 PM
    Edited to avoid having the text changed by the forum formatting options.
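
For comparison, in engines that support non-capturing groups, the whole value attribute can be made optional with (?: ... )? so that the first capturing group is still the value. The sketch below shows this in Java; whether the e-tester engine accepts (?: ) is an assumption that would need checking, and the class name OptionalValue is invented here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalValue {
    // The value attribute as a whole is optional; group 1 is what is inside its quotes.
    static final Pattern FIELD = Pattern.compile(
            "<input name=\"My-TxtBox\" type=\"text\"(?: value=\"(.*?)\")? id=\"ID-TxtBox\"");

    /** Returns the value, "" for an empty value, or null when the attribute is absent. */
    static String extract(String html) {
        Matcher m = FIELD.matcher(html);
        if (!m.find()) {
            throw new IllegalArgumentException("field not found");
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(extract("<input name=\"My-TxtBox\" type=\"text\" value=\"1234\" id=\"ID-TxtBox\""));
        // prints 1234
        System.out.println(extract("<input name=\"My-TxtBox\" type=\"text\" id=\"ID-TxtBox\""));
        // prints null
    }
}
```

Unlike the character-by-character "?v?a?l?u?e?" pattern, this version cannot match a partial attribute name.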

  • Regular expression question (should be an easy one...)

    I'm using Java to build a parser. I'm getting an expression, which I split on whitespace.
    How can I build a regular expression that will let me split only on unquoted spaces? Example:
    for the expression:
    (X=33 AND Y=44) OR (Z="hello world" AND T=2)
    I will get the following values split:
    (X=33
    AND
    Y=44)
    OR
    (Z="hello world"
    AND
    T=2)
    and not:
    (Z="
    hello
    world"
    thank you very much!

    Instead of splitting on whitespace to get a list of tokens, use Matcher.find() to match the tokens themselves:

    import java.util.*;
    import java.util.regex.*;
    public class Test
    {
      public static void main(String[] args) throws Exception
      {
        String str = "(X=33 AND Y=44) OR (Z=\"hello world\" AND T=2)";
        List<String> tokens = new ArrayList<String>();
        // A token is a run of non-space, non-quote characters,
        // optionally followed by one quoted string.
        Matcher m = Pattern.compile("[^\\s\"]+(?:\".*?\")?").matcher(str);
        while (m.find())
        {
          tokens.add(m.group());
        }
        System.out.println(tokens);
      }
    }

    The regex I used is based on the assumptions that there will be at most one run of quoted text per token, that it will always appear in the right hand side of an expression, and that the closing quote will always mark the end of the token. If the rules are more complicated (as sabre150 suggested), a more complicated regex will be needed. You might be better off doing the parsing the old-fashioned way, without regexes.
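
As a rough idea of what the "old-fashioned way" could look like, the sketch below walks the string once and toggles a flag at each double quote, splitting only on whitespace seen outside quotes. It assumes quotes are never escaped or nested; the class name QuoteAwareSplitter is made up.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {
    /** Splits on whitespace that is not inside double quotes. */
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : input.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes; // entering or leaving a quoted run
            }
            if (Character.isWhitespace(c) && !inQuotes) {
                if (current.length() > 0) { // end of a token
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("(X=33 AND Y=44) OR (Z=\"hello world\" AND T=2)"));
        // prints [(X=33, AND, Y=44), OR, (Z="hello world", AND, T=2)]
    }
}
```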
                                                                                                                                                                                                                                                                                                                                                                               

  • Regular expressions in Format Definition add-on

    Hello experts,
    I have a question about regular expressions. I am a newbie with regular expressions and could use some help on this one. I tried for some 6 hours, but I can't solve it myself.
    Summary of my problem:
    In SAP Business One (patch level 42) it is possible to use bank statement processing. A file (full of regular expressions) is to be selected, so it can match certain criteria to the bank statement file. The bank statement file consists of a certain pattern (look at the attached code snippet).
    :61:071222D208,00N026
    :86:P  12345678BELASTINGDIENST       F8R03782497                $GH
    $0000009                         BETALINGSKENM. 123456789123456
    0 1234567891234560                                            
    :61:071225C758,70N078
    :86:0116664495 REGULA B.V. HELPMESTRAAT 243 B 5371 AM HARDCITY HARD
    CITY 48772-54314                                                  
    :61:071225C425,05N078
    :86:0329883585 J. MANSSHOT PATTRIOTISLAND 38 1996 PT HELMEN BIJBETA
    LING VOOR RELOOP RMP1 SET ORDERNR* 69866 / SPOEDIG LEVEREN    
    :61:071225C850,00N078
    :86:0105327212 POSE TELEFOONSTRAAT 43 6448 SL S-ROTTERDAM MIJN OR
    DERNR. 53846 REF. MAIL 21-02
    - I am in search of the right type of regular expression that is used by the Format Definition add-on (javascript, .NET, perl, JAVA, python, etc.)
    Besides that I need the regular expressions below, so the Format Definition will match the right lines from my bankfile.
    - a regular expression that selects lines starting with :61: and line :86: including next lines (if available), so in fact it has to select everything from :86: till :61: again.
    - a regular expression that selects the bank account number (position 5-14) from lines starting with :86:
    - a regular expression that selects all other info from lines starting with :86: (and following if any), so all positions that follow after the bank account number
    I am looking forward to the right solutions, I can give more info if you need any.

    Hello Hendri,
    Q1: I am in search of the right type of regular expression that is used by the Format Definition add-on (javascript, .NET, perl, JAVA, python, etc.)
    Answer: Format Definition uses .NET regular expressions.
    You may refer to the following examples. If necessary, I can send you a guide on how to use regular expressions in Format Definition. Thanks.
    Example 6
    Description:
    To match a field with an optional field in front. For example, ":61:0711211121C216,08N051NONREF" or ":61:071121C216,08N051NONREF", which comprises a record identification ":61:", a date in the form YYMMDD, another optional date MMDD, one or two characters to signify the direction of money flow, a numeric amount value and some other information. The target to be matched is the numeric amount value.
    Regular expression:
    (?<=:61:\d(\d)?[a-zA-Z]{1,2})((\d(,\d*)?)|(,\d))
    Text:
    :61:0711211121C216,08N051NONREF
    Matches:
    1
    Tips:
    1.     All the fields in front of the target field are described in the look-behind assertion enclosed by (?<= and ). In particular, the optional field is enclosed in parentheses followed by a "?" (question mark). The sub-expression for the amount is copied from example 1. You can compose your own regular expression for such cases in the form (?<=REGEX_FOR_FIELDS_IN_FRONT)(REGEX_FOR_TARGET_FIELD), in which REGEX_FOR_FIELDS_IN_FRONT and REGEX_FOR_TARGET_FIELD are respectively the regular expressions for the fields in front and the target field. Keep the parentheses therein.
    Example 7
    Description:
    Find all numbers in the free text description, which are possibly document identifications, e.g. for invoices
    Regular expression:
    (?<=\b)(?<!\.)\d+(?=\b)(?!\.)
    Text:
    :86:GIRO  6890316
    ENERGETICA NATURA BENELU
    AFRIKAWEG 14
    HULST
    3187-A1176
    TRANSACTIEDATUM* 03-07-2007
    Matches:
    6
    Tips:
    1.     The regular expression given finds all digits between word boundaries except those with a preceding or following dot; "." (dot) is escaped as \.
    2.     It may find some inaccurate matches, like the date in the text. If you want to exclude "-" (hyphen) as a preceding or following character, in the same way as for "." (dot), the regular expression becomes (?<=\b)(?<!\.)(?<!-)\d+(?=\b)(?!\.)(?!-). The matches will be:
    :86:GIRO  6890316
    ENERGETICA NATURA BENELU
    AFRIKAWEG 14
    HULST
    3187-A1176
    TRANSACTIEDATUM* 03-07-2007
    You may lose some real values like "3187" before the "-".
    Example 8
    Description:
    Find the BP account number in 9 digits with a preceding "P" or "0" in the first position of the free text description
    Regular expression:
    (?<=^(P|0))\d
    Text:
    0000006681 FORTIS ASR BETALINGSCENTRUM BV
    Matches:
    1
    Tips:
    1.     Use positive look behind assertion (?<=PRIOR_KEYWORD) to express the prior keyword.
    2.     "^" means that the match starts from the beginning of the text. If the text includes the record identification, you may include it in the look-behind assertion as well. For example,
    :86:0000006681 FORTIS ASR BETALINGSCENTRUM BV
    The regular expression becomes
    (?<=:86:(P|0))\d
    Example 9
    Description:
    Following example 8, to find the possible BP name after BP account number, which is composed of letter, dot or space.
    Regular expression:
    (?<=^(P|0)\d)[a-zA-Z. ]*
    Text:
    0000006681 FORTIS ASR BETALINGSCENTRUM BV
    Matches:
    1
    Tips:
    1.     In this case, put BP account number regular expression into the look behind assertion.
    Example 10
    Description:
    Find the possible document identifications in a sub-record of the :86: record. A sub-record is like "?00", "?10" etc. A possible document identification sub-record is made up of the following parts:
    •     keyword "RE", "RG", "R", "INV", "NR", "NO", "RECHN" or "RECHNUNG", and
    •     an optional group made up of the following:
         - a separator of either a dot, hyphen or slash, and
         - an optional space, and
         - an optional string starting with keyword "NR" or "NO" followed by a separator of either a dot, hyphen or slash, and
         - an optional space
    •     and finally the document identification in digits
    Regular expression:
    (?<=\?\d(RE|RG|R|INV|NR|NO|RECHN|RECHNUNG)((\.|-|/)\s?((NR|NO)(\.|-|/))?\s?)?)\d+
    Kind Regards
    -Yatsea
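
The look-behind form from Example 6, (?<=REGEX_FOR_FIELDS_IN_FRONT)(REGEX_FOR_TARGET_FIELD), can be tried outside the add-on as well. The sketch below uses Java rather than .NET, so it only illustrates the technique and says nothing about how Format Definition itself evaluates the pattern. The quantifiers {6}, {4} and + are assumptions inferred from the description (YYMMDD, optional MMDD, a numeric amount); they do not appear in the expression as posted. The class name AmountFinder is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AmountFinder {
    // ASSUMPTION: \d{6} = YYMMDD, (?:\d{4})? = optional MMDD, \d+ = amount digits.
    static final Pattern AMOUNT = Pattern.compile(
            "(?<=:61:\\d{6}(?:\\d{4})?[a-zA-Z]{1,2})\\d+(,\\d*)?");

    /** Returns the amount from a :61: line, or null if none is found. */
    static String amount(String line) {
        Matcher m = AMOUNT.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(amount(":61:0711211121C216,08N051NONREF")); // prints 216,08
        System.out.println(amount(":61:071121C216,08N051NONREF"));     // prints 216,08
        System.out.println(amount(":61:071222D208,00N026"));           // prints 208,00
    }
}
```

Because everything in front of the amount sits inside the look-behind, the match itself contains only the amount, which is what a tool that takes the whole match as the field value needs.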

  • Is there a way to give two different regular expressions in a Grep command?

    I am trying to search message logs from CLI. My search query which involves a regular expression is giving thousands of results. We need to further filter the results using another regular expression. Please let me know if we can put two regular expressions in the search query.
    Below is how I use the grep command. Can I use multiple regular expressions separated by a "|" (pipe) symbol?
    Enter the regular expression to grep.
    []> MID 123456

    No - unfortunately - from the CLI on the ESA/SMA - the 'grep' command is not as intuitive as the unix/linux 'grep'.  You would need to push logs off appliance, and then take advantage of an external OS in order to parse through the logs as definitively needed.
    -Robert
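
Once the logs are off the appliance, applying two regular expressions in sequence is straightforward in most languages. A minimal Java sketch, assuming the log has already been read into a list of lines; the sample lines and the second pattern are invented for illustration and are not real ESA log output, and the class name LogFilter is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class LogFilter {
    /** Keeps only the lines in which both regular expressions find a match. */
    static List<String> filter(List<String> lines, String firstRegex, String secondRegex) {
        Pattern first = Pattern.compile(firstRegex);
        Pattern second = Pattern.compile(secondRegex);
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (first.matcher(line).find() && second.matcher(line).find()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical log lines, for illustration only.
        List<String> lines = Arrays.asList(
                "MID 123456 queued for delivery",
                "MID 123456 bounced",
                "MID 999999 bounced");
        System.out.println(filter(lines, "MID 123456", "bounced|rejected"));
        // prints [MID 123456 bounced]
    }
}
```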

  • Using regular expressions

    Hi Experts,
    After going through some documentation on regular expressions in Oracle, I have tried to draw some conclusions about them. As I wasn’t very confident about how the patterns are built, I have tried to interpret them by looking at the output. It’s basically reverse engineering.
    Please let me know if my interpretations are correct. Any additions /suggestions/corrections are most welcome.
    Some of the examples may lack conclusions, please ignore those.
    select regexp_substr('1PSN/231_3253/ABc','^([[:alnum:]]*)') from dual;
    Output: 1PSN
    Interpreted as:
    ^ From the start of the source string
    ([[:alnum:]]*) zero or more occurrences of alphanumeric characters
    select regexp_substr('@@/231_3253/ABc','@*([[:alnum:]]+)') from dual;
    Output: 231
    Interpreted as:
    @* Search for zero or more occurrences of @
    ([[:alnum:]]+) followed by one or more occurrences of alphanumeric characters
    Note: In the above example Oracle looks for @ (zero or more times) immediately followed by alphanumeric characters.
    Since a '/' comes between @ and 231, the output is 0 occurrences of @ plus one or more occurrences of alphanumerics.
    select regexp_substr('1@/231_3253/ABc','@+([[:alnum:]]*)') from dual;
    Output: @
    Interpreted as:
    @+ one or more occurrences of @
    ([[:alnum:]]*) followed by 0 or more occurrences of alphanumerics
    select regexp_substr('1@/231_3253/ABc','@+([[:alnum:]]+)') from dual;
    Output: Null
    Interpreted as:
    @+ one or more occurrences of @
    ([[:alnum:]]+) followed by one or more occurrences of alphanumerics
    select regexp_substr('@1PSN/231_3253/ABc125','([[:digit:]]+)$') from dual;
    Output: 125
    Interpreted as:
    ([[:digit:]]+) one or more occurrences of digits only
    $ at the end of the string
    select regexp_substr('@1PSN/231_3253/ABc','([^[:digit:]]+)$') from dual;
    output: /ABc
    Interpreted as:
    ([^[:digit:]]+)$ one or more occurrences of non-digit literals at the end of the string
    '^' inside square brackets marks the negation of the class
    Look for http:// followed by a substring of one or more alphanumeric characters and, optionally, a period (.)
    SELECT REGEXP_SUBSTR('Go to http://www.oracle.com/products and click on database','http://([[:alnum:]]+\.?){3,4}/?') RESULT
    FROM dual;
    Output: http://www.oracle.com/
    Interpreted as:
    [[:alnum:]]+ one or more occurrences of alphanumeric characters
    \.? optionally a dot (the backslash escapes the dot; ? means optional)
    {3,4} 3 or 4 times
    /? optionally followed by a forward slash
    If you have www.oracle.co.uk, {3,4} extracts it for you as well
    Validate email:
    select case  when
           REGEXP_LIKE('[email protected]',
                       '^([[:alnum:]]+(\_?|\.))([[:alnum:]]*)@([[:alnum:]]+)(.([[:alnum:]]+)){1,2}$') then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    Interpreted as:
    ([[:alnum:]]+(\_?|\.)) one or more occurrences of alphanumerics optionally followed by an underscore or dot
    ([[:alnum:]]*) followed by 0 or more occurrences of alphanumerics
    @ followed by @
    ([[:alnum:]]+) followed by one or more occurrences of alphanumerics
    (.([[:alnum:]]+)){1,2} followed by a dot followed by alphanumerics, once or twice (e.g. .com or .co.uk)
    Output: Match Found
    Input: [email protected]
    Output: Match Found
    Input: [email protected]
    Output: No Match Found
    Truncate the part, ending with digits
    select regexp_substr('Yahoo11245@US','^.*[[:digit:]]',1) from dual;
    Output: Yahoo11245
    select regexp_substr('*Yahoo*11245@US','^.*[[:digit:]]',1) from dual;
    Output: *Yahoo*11245
    Interpreted as:
    .* zero or more occurrences of any characters (dot signifies any character)
    Replace 2 to 8 spaces with single space
    select regexp_replace('Hello   you      OPs       there','[[:space:]]{2,8}',' ')
    from dual;
    Search for control characters
    select case  when
           regexp_like('Super' || chr(13) || 'Star' ,'[[:cntrl:]]')
                  then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    Output: Match Found
    Search for lower case letters only with a string length varying from a min of 3 to max of 12
    select case  when
           regexp_like('terminator' ,'^[[:lower:]]{3,12}$')
                  then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    4th character must be a special character
    select case  when
           regexp_like('ter*minator' ,'^...[^[:alnum:]]')
                  then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    Output: Match Found
    Case Sensitive Search
    select case  when
           regexp_like('Republic Of  Africa' ,'of','c')
                  then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    Output: No match found
    c stands for case sensitive
    select case  when
           regexp_like('Republic Of  africa' ,'of','i')
                  then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    Output: Match Found
    i stands for case insensitive
    Two consecutive occurrences of the same character from a to z
    select regexp_substr('Republicc Of Africaa' ,'([a-z])\1', 1,1,'i') from dual;
    Output: cc
    Interpreted as:
    ([a-z]) a character from the set a-z, captured as group 1
    \1 backreference: the same character that group 1 matched, repeated immediately
    1 starting from the 1st character in the string
    1 first occurrence
    i case insensitive
    Three consecutive occurrences of the same character from 7 to 9
    select case  when
           regexp_like('Patch 10888 applied' ,'([7-9])\1\1')
                  then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    Output: Match Found
    Phone validator:
    select case  when
           regexp_like('123-44-5555' ,'^[0-9]{3}-[0-9]{2}-[0-9]{4}$')
                  then 'Match Found'
           else 'No Match Found'
           end
    as output from dual;
    Output: Match Found
    Input: 111-222-3333
    Output: No match found
    Interpreted as:
    ^ start of the string
    [0-9]{3} three occurrences of digits from 0-9
    - followed by hyphen
    [0-9]{2} two occurrences of digits from 0-9
    - followed by hyphen
    [0-9]{4} four occurrences of digits from 0-9
    $ end of the string
    ************************************************************************
    Source Links:
    http://www.psoug.org/reference/regexp.html
    http://www.oracle.com/technology/obe/obe10gdb/develop/regexp/regexp.htm
    Edited by: Preta on Feb 25, 2010 4:38 PM
    Corrected the example for www.oracle.com
    Edited by: Preta. Incorporated Logan's comments
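
A few of the interpretations above can be cross-checked in a second regex engine. The sketch below uses Java's java.util.regex instead of Oracle's REGEXP functions, with \p{Alnum} standing in for [[:alnum:]]. The two engines are not identical, so agreement here is a sanity check on the reading of the patterns, not a statement about Oracle. The class name OracleCrossCheck and the helper firstMatch are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OracleCrossCheck {
    /** Returns the first match of regex in input, or null if there is none. */
    static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Zero @ signs are matched, because '/' separates "@@" from "231".
        System.out.println(firstMatch("@*(\\p{Alnum}+)", 0, "@@/231_3253/ABc"));  // prints 231
        // One @, then zero alphanumerics before the '/'.
        System.out.println(firstMatch("@+(\\p{Alnum}*)", 0, "1@/231_3253/ABc"));  // prints @
        // No @ is directly followed by an alphanumeric.
        System.out.println(firstMatch("@+(\\p{Alnum}+)", 0, "1@/231_3253/ABc"));  // prints null
        // \1 repeats whatever group 1 matched.
        System.out.println(firstMatch("([a-z])\\1", Pattern.CASE_INSENSITIVE,
                "Republicc Of Africaa"));                                          // prints cc
    }
}
```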

    Hi,
    It looks like you have a good understanding of how regular expressions work.
    You can put comments like the ones in your message directly in the code. For example, your validate e-mail code could be re-written
    select      case 
             when REGEXP_LIKE ( '[email protected]'
                        , '^'          || -- Starting from the beginning of the string
                        '('          || -- Begin \1
                          '[[:alnum:]]+'|| --     1 or more alphanumerics
                          '(\_?|\.)'     || --     optional underscore or dot
                        ')'          || -- End \1
                        '([[:alnum:]]*)'|| -- 0 or more alphanumerics
                        '@'          || -- @ sign
                        '([[:alnum:]]+)'|| -- 1 or more alphanumerics
                        '('          || -- Begin \5
                          '\.'          || --   dot
                          '([[:alnum:]]+)'
                                  || --   1 or more alphanumerics
                        ')'          || -- End \5
                        '{1,2}'          || -- \5 can occur 1 or 2 times
                        '$'             -- End of string
             then 'Match Found'
                    else 'No Match Found'
                end          as output
    from      dual;
    I find this easier to debug and maintain.
    There's no denying, it does make the code very long. You be the judge of when to do this.
    You use parentheses and \ unnecessarily sometimes. That's not really an error; if you find they make the code easier to develop and maintain, use them as much as you like.
    For example, about the 4th line of the regular expression as I formatted it above:
    '(\_?|\.)'     || --     optional underscore or dot
    Underscore has no special meaning in regular expressions (only in LIKE), so you don't have to escape it.
    I might write that line:
    '(_|\.)?'     || --     optional underscore or dot
    just because I think it's clearer.
    I think you forgot a \ about 7 lines later:
    '\.'          || --   dot
    Be very careful about testing patterns that include literal dots; always make sure that a random character, like ~ , fails in a place where a dot is expected.
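
The "~" test is easy to automate. The sketch below uses Java with simplified patterns (\w+ instead of the full expression from the thread) and made-up addresses, purely to show the difference between an unescaped and an escaped dot; the class name DotCheck is invented.

```java
import java.util.regex.Pattern;

class DotCheck {
    // Simplified patterns: the first has an unescaped dot, the second escapes it.
    static final Pattern LOOSE = Pattern.compile("^\\w+@\\w+(.\\w+){1,2}$");
    static final Pattern STRICT = Pattern.compile("^\\w+@\\w+(\\.\\w+){1,2}$");

    public static void main(String[] args) {
        String good = "user@example.com"; // made-up address
        String bad = "user@example~com";  // '~' where a dot is expected
        System.out.println(LOOSE.matcher(good).matches());  // prints true
        System.out.println(LOOSE.matcher(bad).matches());   // prints true (the bug)
        System.out.println(STRICT.matcher(good).matches()); // prints true
        System.out.println(STRICT.matcher(bad).matches());  // prints false
    }
}
```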

  • Regular Expression in CF ?

    Hi
    I have the following line from an rss feed
    #weather_xml.rss.channel.item[x].description.xmlText#
    At present it outputs as
    Thursday: sunny, Max Temp: 8°C (46°F), Min Temp:
    4°C (39°F)
    Is it possible using regular expressions or another method,
    that after each comma the script will insert a <br> tag so
    all the data is not displayed on one line ?
    Any ideas ?

    rambo wrote:
    > But on the page it shows up as normal text i.e.
    > REReplace(Max Temp: 8?C (46?F), Min Temp: 4?C (39?F),
    Wind Direction: SSE,
    > Wind Speed: 14mph, Visibility: moderate, Pressure:
    1038mb, Humidity: 88%, UV
    > risk: low, Sunrise: 08:12GMT, Sunset: 16:06GMT, ',', '
    You do have to provide the basic CFML syntax of hash|pound|# signs around the function inside a <cfoutput></cfoutput> block so that the ColdFusion application server knows this is a statement to be resolved, not just text to display.
    I.E.
    <cfoutput>#replace(weather_xml.rss.channel.item[x].description.xmlText, ',', '<br>', 'ALL')#</cfoutput>

  • FM9 SDL Authoring Assistant Regular Expression Syntax?

    I'm trying to trick SDL into identifying words that are not approved by STE.
    Under "Configure|Style and Linguistic Checks|User Defined Rules" the program allows regular expressions to create custom rules.
    I have all other options in the Utility unchecked.
    I am by no means a pro at regular expressions but was able to create a pretty solid command at http://regexlib.com/RETester.aspx.
    The idea is to create an expression that looks for any word other than those separated by vertical bars.
    For the test text "this is not the way that should work. this is not the way that should work."
    \b(?:(?!should|not|way|this|is|that).)+
    returns: the work the work
    At that website, I can change the excluded words and it works every time. Change the test text, same thing, still works.
    Perfect! I put every approved word in STE into the formula, and it (SDL) only returns words at the end of a sentence that are followed by periods or question marks. So I added "\." to the exclusion list in the expression, and it only found words next to question marks. I excluded question marks and now it finds nothing. I don't understand this, as I wasn't aware that anything in the expression restricts matching to the end of the sentence.
    I have an O'Reilly book to refer to. If anyone can give me a shove in the right direction as to which set of rules to adhere to, I would appreciate it. Why did negative word matching have to be my introduction to this subject?

    I tried your expression in a couple of regex tools and it seems to parse as you wanted it to. I suspect that the SDL implementation doesn't follow the Unix/Linux standards. I haven't used the tool, and the usage documentation is nonexistent except for the limited Flash-based demo.
    The SDL knowledge base states that their regex filter uses the .NET regex flavour, and I believe the differences are explained in the "Mastering Regular Expressions" book.
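
    To make the negative-word idea concrete, here is a minimal sketch in Java, whose lookahead syntax is the same as .NET's for this kind of pattern. The pattern is a hypothetical variant, not the one from the thread: the original tests the lookahead before every character, so a match can also contain spaces and punctuation, while this version tests it once at the start of each word. The class and method names are made up for illustration, and nothing here has been tried in SDL itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnapprovedWords {
    // Hypothetical variant of the thread's pattern: the lookahead runs once at the
    // start of each word, and the inner \b stops "is" from also excluding "island".
    static final Pattern UNAPPROVED =
            Pattern.compile("\\b(?!(?:should|not|way|this|is|that)\\b)\\w+");

    // Collects every word that is not on the approved list.
    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = UNAPPROVED.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "this is not the way that should work. this is not the way that should work.";
        System.out.println(find(text)); // prints [the, work, the, work]
    }
}
```

    Because each match is a whole word, adding punctuation to the exclusion list should not be needed with this form.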

  • Regular Expressions Question

    I am using regular expressions to parse a text file generated by a set of sensors attached to a SeaBird CTD profiler. What matters is that some lines use quotes as a special marker and some don't, but I need to treat both as the same data. Here is the example:
    SeaBird Datafile (just some part):
    "# name 0 = depS: depth, salt water [m]"
    "# name 1 = t068: temperature, IPTS-68 [deg C]"
    # name 2 = sal00: salinity, PSS-78 [PSU]
    "# name 3 = sigma-t00: density, sigma-t [kg/m^3]"
    "# name 4 = flS: fluorometer, sea tech"
    And I am using this regular expression to read these lines and save those names:
    private static final String ColumnName = "(^\"#|^# ) name (\\d)+ = ((.+)\"$|(.+)$)";
    There is a possibility that the string either starts with "# and ends with ", or that it only starts with # and has no special marker at the end. Can anyone enlighten me on the correct regex? The one I posted above only works when the line is surrounded by quotes. Thanks in advance!
    Christian A. Sueiras

    Try this:
    private static final String ColumnName = "^(\"?)# name (\\d+) = (.+)\\1$";
    You match an optional quotation mark at the beginning, and capture it in group #1. At the end, you match whatever was captured in group #1: either a quotation mark, or nothing.
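
    A small sketch of how the backreference pattern from this answer behaves on the sample lines from the question. The class name and the describe helper are made up for illustration; only the regex itself comes from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColumnNameParser {
    // Group 1 captures an optional leading quote; \1 then requires the same
    // text (a quote, or nothing) just before the end of the line.
    private static final Pattern COLUMN_NAME =
            Pattern.compile("^(\"?)# name (\\d+) = (.+)\\1$");

    // Returns the column description, or null when the line is not a "# name" line.
    static String describe(String line) {
        Matcher m = COLUMN_NAME.matcher(line);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        System.out.println(describe("\"# name 0 = depS: depth, salt water [m]\""));
        System.out.println(describe("# name 2 = sal00: salinity, PSS-78 [PSU]"));
    }
}
```

    Both calls return the description without surrounding quotes, because the closing quote is consumed by \1 rather than by group 3.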

  • Online chat, Nov. 12, on Regular Expressions

    One of the new packages added in J2SE v 1.4 is java.util.regex, which provides classes for handling regular expressions. A regular expression is a string pattern that can be used to perform sophisticated string searching and replacement. Learn more about java.util.regex and regular expressions, and get questions answered in this chat with Sun engineers Michael McCloskey and Mark Reinhold. The chat is scheduled for Tuesday, November 12 at 11:00 A.M. PST/7:00 P.M. GMT.
    To join the chat, go to http://developer.java.sun.com/developer/community/chat/index.html and click on "Join the current session".

    Is there an agenda? Or can we ask anything, e.g. about the ActiveX bridge?

    There is no agenda. You're free to ask any question related to Java Plug-In Technology during the chat.

  • Regular Expressions... finding a '('

    Hi guys,
    Been trying for an hour or so now, but I can't seem to find out how to locate a "(" in a regular expression, and the usual escape characters '\\' don't seem to work.
    I.e. in the text that I'm searching, the string I want to scrape out may sometimes contain a " ( .* ) " at the end of it, so I've been writing regexes like
    ([A-Z]) ([a-z]*) ([_]?) ([\\([? )    //more follows
    I hope that's clear. It's the last part with the '(' that doesn't work.
    o.0
    ___

    Tried the double backslash, and it still doesn't work. Here's the actual regex I'm writing:
    pattern = Pattern.compile("((href=\"/wiki/)([A-Z])([a-z]*)(_)([A-Z])([a-z]*)([A-Z]?)([a-z?]*)([_]?)([\\(]?)", Pattern.MULTILINE);
    Edited by: Mark.ONeill on Jan 2, 2008 12:47 AM
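
    For what it's worth, "\\(" in a Java string literal is the right way to write a literal opening parenthesis. The pattern as posted appears to open two groups at the start, "((href=", but close only one of them, and an unclosed group makes Pattern.compile throw a PatternSyntaxException no matter how the later "(" is escaped. Below is a minimal sketch of a balanced pattern for the same job. The class name, the simplified title character class and the sample links are assumptions for illustration, not the poster's data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WikiLinkParen {
    // "\\(" in Java source is the regex \( : a literal opening parenthesis.
    // Group 1: page title; group 2: text inside an optional trailing "_(...)".
    private static final Pattern LINK =
            Pattern.compile("href=\"/wiki/([A-Za-z_]+?)(?:_\\(([^)]*)\\))?\"");

    // Returns {title, qualifier}; qualifier is null when there is no "(...)" part.
    static String[] parse(String html) {
        Matcher m = LINK.matcher(html);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] a = parse("<a href=\"/wiki/Mercury_(planet)\">");
        System.out.println(a[0] + " / " + a[1]);
        String[] b = parse("<a href=\"/wiki/Isaac_Newton\">");
        System.out.println(b[0] + " / " + b[1]);
    }
}
```

    Pattern.quote("(") or the character class "[(]" are alternative ways to match the same literal character.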

  • ACE20 Module, webservices and regular expressions.

    Hello All,
    I am trying to load-balance requests for web services in a serverfarm, but for some reason the ACE20 module is not making matches on the requests.
    We have a serverfarm Prod1 with 2 real servers and another serverfarm named WSDL with 2 other real servers.
    The idea is the following: if we receive the string /App.WebService, the ACE should redirect it to serverfarm Prod1, but if it receives /App.WebService?wsdl, it should be redirected to WSDL.
    Request with string /App.WebService --------------> ServerFarm Prod1
    Request with string /App.WebService?wsdl -----> ServerFarm WSDL
    We use regular expressions in L7 class maps to make the load balancing happen.
    class-map type http loadbalance match-all APP.WEBSERVICES-L7-SLB
      2 match http url /App\.WebService\?wsdl
    class-map type http loadbalance match-all APP-L7-SLB
      2 match http url /App\.WebService
    policy-map type loadbalance first-match L7_SLB-POLICY
      class APP.WEBSERVICES-L7-SLB
        serverfarm WSDL
      class APP-L7-SLB
        serverfarm Prod1
      class L4_SLB_DATAPOWER(9050)
        loadbalance vip inservice
        loadbalance policy L7_SLB-POLICY
        loadbalance vip icmp-reply
        appl-parameter http advanced-options HTTP_PARAM
        ssl-proxy server wildcard.test.org
        connection advanced-options TCP_PARAM
    But the ACE20 Module seems to be removing the ?wsdl from the URL and only the class-map called APP-L7-SLB is being matched.
    Any comments or suggestions on why this could be happening?
    Thanks in advance,
    Fernando

    Hello Kanwal and all,
    Finally, after a lot of reading, I found a fix for this problem. It seems that the HTTP protocol uses the question mark (?) character as a delimiter for data appended to the URL. So, suppose you get the following:
    www.test1.org/App.WebService?wsdl
    If you configured an L7 class map to parse the URL, it will only parse up to the question mark (?).
    So you need to create a parameter map that changes the URL delimiter start. Here is an example:
    parameter-map type http HTTP_PARAMETER_MAP_WSDL
      persistence-rebalance strict
      set secondary-cookie-delimiters ;!@?
      set secondary-cookie-start ;
    I used the semicolon ( ; ) as delimiter.
    Hope this helps.
    Fernando

  • Sed Request Regular Expression Format

    A quick question....
    There are lots of different syntaxes for regular expressions and lots for sed. With the sed-request and sed-response filters I have tried different syntaxes for marking word boundaries, but don't know which to use. The \b syntax is accepted but doesn't seem to do anything, and the \< and \> syntax throws errors when I start up the web server. I tried the more complex (?<!\w)(?=\w) and (?<=\w)(?!\w), but \w isn't supported. I am wondering if I just can't do this. I am trying to stop SQL injection attacks using a syntax such as
    s/\bselect\b.{1,100}?\bfrom\b.{1,100}?\bwhere\b//g
    Are word boundaries not supported?

    Actually, the entries should be \\< and \\>, which looks double-escaped to me, but with that the entries are correct:
    Input fn="insert-filter"
    method="(GET|HEAD|POST)"
    filter="sed-request"
    sed="s/</\\</g"
    sed="s/%3c/\\</g"
    sed="s/%3C/\\</g"
    sed="s/>/\\>/g"
    sed="s/%3e/\\>/g"
    sed="s/%3E/\\>/g"
    sed="s/\x3C ?iframe//g"
    sed="s/\\<src\\>[^a-zA-Z_0-9]*?\\<javascript://g"
    sed="s/\\<src\\>[^a-zA-Z_0-9]*?\\<vbscript://g"
    sed="s/\\<href\\>[^a-zA-Z_0-9]*?\\<javascript://g"
    sed="s/\\<alert\\>[^a-zA-Z_0-9]*?\x28//g"
    sed="s/\\<src\\>[^a-zA-Z_0-9]*?\\<http://g"
    sed="s/\\<type\\>[^a-zA-Z_0-9]*?\\<text\\>[^a-zA-Z_0-9]*?\\<vbscript\\>//g"
    sed="s/\\<href\\>[^a-zA-Z_0-9]*?\\<vbscript://g"
    sed="s/\\<url\\>[^a-zA-Z_0-9]*?\\<javascript://g"
    sed="s/\x3C ?script\\>//g"
    sed="s/\\<type\\>[^a-zA-Z_0-9]*?\\<text\\>[^a-zA-Z_0-9]*?\\<javascript\\>//g"
    sed="s/\\<url\\>[^a-zA-Z_0-9]*?\\<vbscript://g"
    sed="s/(asfunction|javascript|vbscript|data|mocha|livescript)://g"
    sed="s/(?i:<object[ /+\t].*?((type)|(codetype)|(classid)|(code)|(data))[ /+\t]*=)//g"
    sed="s/(?i:[ /+\t\"\'`]datasrc[ +\t]*?=.)//g"
    sed="s/(?i:<link[ /+\t].*?href[ /+\t]*=)//g"
    sed="s/(?i:<meta[ /+\t].*?http-equiv[ /+\t]*=)//g"
    sed="s/(?i:<embed[ /+\t].*?SRC.*?=)//g"
    sed="s/(?i:[ /+\t\"\'`]on\x63\x63\x63+?[ +\t]*?=.)//g"
    sed="s/(?i:<?frame.*?[ /+\t]*?src[ /+\t]*=)//g"
    sed="s/(?i:<isindex[ /+\t>])//g"
    sed="s/(?i:<form.*?>)//g"
    sed="s/(?i:<script.*?[ /+\t]*?src[ /+\t]*=)//g"
    sed="s/(?i:<script.*?>)//g"
    sed="s/\\<select\\>.{0,40}\\<user\\>//g"
    sed="s/\\<select\\>.{0,40}\\<substring\\>//g"
    sed="s/\\<select\\>.{0,40}\\<ascii\\>//g"
    sed="s/\\<user_tables\\>//g"
    sed="s/\\<user_tab_columns\\>//g"
    sed="s/\\<all_objects\\>//g"
    sed="s/\\<drop\\>//g"
    sed="s/\\<substr\\>//g"
    sed="s/\\<sysdba\\>//g"
    sed="s/\\<user_password\\>//g"
    sed="s/\\<user_users\\>//g"
    sed="s/\\<user_constraints\\>//g"
    sed="s/\\<column_name\\>//g"
    sed="s/\\<substring\\>//g"
    sed="s/\\<object_type\\>//g"
    sed="s/\\<object_id\\>//g"
    sed="s/\\<user_ind_columns\\>//g"
    sed="s/\\<column_id\\>//g"
    sed="s/\\<table_name\\>//g"
    sed="s/\\<object_name\\>//g"
    sed="s/\\<rownum\\>//g"
    sed="s/\\<user_group\\>//g"
    sed="s/\\<utl_http\\>//g"
    sed="s/\\<select\\>.*?\\<to_number\\>//g"
    sed="s/\\<group\\>.*\\<by\\>.{1,100}?\\<having\\>//g"
    sed="s/\\<select\\>.*?\\<data_type\\>//g"
    sed="s/\\<isnull\\>[^a-zA-Z_0-9]*?\x28//g"
    sed="s/\\<union\\>.{1,100}?\\<select\\>//g"
    sed="s/\\<insert\\>[^a-zA-Z_0-9]*?\\<into\\>//g"
    sed="s/\\<select\\>.{1,100}?\\<count\\>.{1,100}?\\<from\\>//g"
    sed="s/\x3B[^a-zA-Z_0-9]*?\\<drop\\>//g"
    sed="s/\\<select\\>.*?\\<to_char\\>//g"
    sed="s/\\<dbms_java\\>//g"
    sed="s/\\<nvarchar\\>//g"
    sed="s/\\<utl_file\\>//g"
    sed="s/\\<inner\\>[^a-zA-Z_0-9]*?\\<join\\>//g"
    sed="s/\\<select\\>.{1,100}?\\<from\\>.{1,100}?\\<where\\>//g"
    sed="s/\\<into\\>[^a-zA-Z_0-9]*?\\<dumpfile\\>//g"
    sed="s/\\<delete\\>[^a-zA-Z_0-9]*?\\<from\\>//g"
    sed="s/\x3B[^a-zA-Z_0-9]*?\\<shutdown\\>//g"
    sed="s/\\<dba_users\\>//g"
    sed="s/\\<select\\>.{1,100}?\\<top\\>.{1,100}?\\<from\\>//g"
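
    To illustrate what the word-boundary markers in these rules are meant to achieve, here is a minimal sketch using java.util.regex, where \b plays the role that \\< and \\> play in the entries above. This is not the web server's sed filter syntax. The class name, the test strings and the case-insensitive flag are assumptions added for illustration.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Same shape as the select/from/where rule, written with Java's \b.
    // CASE_INSENSITIVE is an addition here, not part of the original rule.
    private static final Pattern SELECT_FROM_WHERE = Pattern.compile(
            "\\bselect\\b.{1,100}?\\bfrom\\b.{1,100}?\\bwhere\\b",
            Pattern.CASE_INSENSITIVE);

    // Removes every select ... from ... where run from the input.
    static String strip(String input) {
        return SELECT_FROM_WHERE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // The keywords stand alone as words, so the run is removed.
        System.out.println(strip("id=1 select name from users where 1=1"));
        // The keywords only occur inside longer words, so nothing is removed.
        System.out.println(strip("preselected fromage nowhere"));
    }
}
```

    The second call shows the point of the boundaries: without them, "select" inside "preselected" would also trigger the rule.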
