Quoted Regular Expression

Hello All,
I'm trying to use the RegExp class in a search routine, in
which I want to match each word typed by a user to the start of a
word in the text. This means I need a regular expression like
/\bexpr/i for example, to match the word expression - well the
start of it, anyway.
The problem I have is taking the prefix entered by the user
("expr") and making it part of the regexp. Assuming the variable is
userPrefix, I can do something like
var re:RegExp = new RegExp("\\b" + userPrefix, "i");
This works fine as long as the user doesn't type any magic
characters. If the user types "\", for example, I'm left with an
invalid RE. Java has a static method Pattern.quote(), or something
like that, that takes the literal and makes it just a literal from
the point of view of the regexp pattern compiler.
Does anyone know if there's a similar utility for AS3?
Thanks,
Tim

Hello All,
I'm trying to use the RegExp class in a search routine, in
which I want to match each word typed by a user to the start of a
word in the text. This means I need a regular expression like
/\bexpr/i for example, to match the word expression - well the
start of it, anyway.
The problem I have is taking the prefix entered by the user
("expr") and making it part of the regexp. Assuming the variable is
userPrefix, I can do something like
var re:RegExp = new RegExp("\\b" + userPrefix, "i");
This works fine as long as the user doesn't type any magic
characters. If the user types "\", for example, I'm left with an
invalid RE. Java has a static method Pattern.quote(), or something
like that, that takes the literal and makes it just a literal from
the point of view of the regexp pattern compiler.
Does anyone know if there's a similar utility for AS3?
Thanks,
Tim

Similar Messages

  • Using Regular Expressions to Find Quoted Text

    I have run into a couple problems with the following code.
    1) Slash-Star and Slash-Slash commented text must be ignored.
    2) It does not detect backslashed quotes, or if that backslash is backslashed.
    Can this be accomplished with Regular Expressions, or should I implement this using if/indexOf logic?
    Thank You in advance,
    Brian
        * Finds position of next quoted string in a line
        * of source code.
        * If no strings exist, then a Pointer position of
        * (0,0) is returned.
        * @param startPos position to start search from
        * @param argText  the line of text to search
        * @returns next string position
       public Pointer getQuotedStringPosition(int startPos, String aString) {
          String argText = new String( aString );
          Pattern p = Pattern.compile("[\"][^\"]+[\"]");
          Matcher m = p.matcher( argText.substring(startPos); );
          if( m.find() )
             return new Pointer( m.start() + startPos, m.end() + startPos );
          else
             return new Pointer( 0, 0 ); // indicates nothing was found
       }

    YATArchivist was right about the regular expressions.
    I think I've got it but somebody test it if you want. Let me know what you find.
    I've included a barebones Position class as well...
    import java.util.regex.*;
    import java.io.*;
    import java.util.*;
      @author Joshua A. Logan, Jr.
    public class RegexTest
       private static final String SLASH_SLASH = "(//.*)";
       private static final String SLASH_STAR =
                               "(/\\*(?:[^\\*]|(?:\\*(?!/)))+(\\*/)?)";
       private static final Pattern COMMENT_PATTERN =
                         Pattern.compile( SLASH_SLASH + "|" + SLASH_STAR );
       private static final Pattern QUOTED_STRING_PATTERN =
                      Pattern.compile( "\"  ( (?:(\\\\.) | [^\\\"])*+  )     \"",
                                       Pattern.COMMENTS );
       // Breaking the above regular expression down, you'd have:
       //   "  ( (?: (\\ .)  |  [^\\ "]  ) *+ )   "
       //   ^          ^     ^     ^       ^      ^
       //   |          |     |     |       |      |
       //   1          2     3     4       5      6
       // which matches:
       // 1) The starting quote...
       // Followed by something that is either:
       // 2) some escaped sequence ( e.g. _\n_  or even _\"_ ),
       // 3)                ...or...
       // 4) a character that is neither a _\_ nor a _"_ .
       // 5) Keep searching this as much as possible, w/o giving up
       //                    any found text at the end.
       //        Note: the text found would be in group(1)
       // 6) Finally, find the ending quote!!
       public static Position [] getQuotedStringPosition( final String text )
          Matcher cm = COMMENT_PATTERN.matcher( text ),
                  qm = QUOTED_STRING_PATTERN.matcher( text );
          final int len = text.length();
          int startPos = 0;
          List positions = new ArrayList();
          while ( startPos < len )
             if ( cm.find(startPos) )
                int commStart = cm.start(),
                    commEnd   = cm.end();
                // are we starting @ a comment?
                if ( commStart == startPos )
                   startPos = commEnd;
                else if ( qm.find(startPos) )
                   // Search for unescaped strings in here.
                   int stringStart = qm.start(1),
                       stringEnd   = qm.end(1);
                   // Is the quote start after comment start?
                   if ( stringStart > commStart )
                      startPos = commEnd; // restart search after comment end...
                   else if ( (stringEnd > commEnd) ||
                             (stringEnd < commStart) )
                      // In this case, the "comment" is actually part of
                      // the quoted string. We found a match.
                      positions.add( new Position(text, qm.group(1),
                                                  stringStart,
                                                  stringEnd) );
                      int quoteEnd = qm.end();
                      startPos = quoteEnd;
                   else
                      throw new IllegalStateException( "illegal case" );
                else
                   startPos = commEnd;
             else
                // no comments were found. Search for unescaped strings.
                int quoteEnd = len;
                if ( qm.find( startPos ) ) {
                   quoteEnd = qm.end();
                   positions.add( new Position(text,
                                               qm.group(1),
                                               qm.start(1),
                                               qm.end(1)) );
                startPos = quoteEnd;
          return positions.isEmpty() ? Position.EMPTY_ARRAY
                                     : (Position[])positions.toArray(
                                              Position.EMPTY_ARRAY);
       public static void main( String [] args )
          try
             BufferedReader br = new BufferedReader(
                      new InputStreamReader(System.in) );
             String input = null;
             final String prompt = "\nText (q to quit): ";
             System.out.print( prompt );
             while ( (input = br.readLine()) != null )
                if ( input.equals("q") ) return;
                Position [] matches = getQuotedStringPosition( input );
                // What does it do?
                for ( int i = 0, max = matches.length; i < max; i++ )
                   System.out.println( "-->" + matches[i] );
                System.out.print( prompt );
          catch ( Exception e )
             System.out.println ( "Exception caught: " + e.getMessage () );
    class Position
       public Position( String target,
                        String match,
                        int start,
                        int end )
          this.target = target;
          this.match = match;
          this.start = start;
          this.end = end;
       public String toString()
          return "match==" + match + ",{" + start + "," + end + "}";
       final String target;
       final int start;
       final int end;
       final String match;
       public static final Position [] EMPTY_ARRAY = { };
    }

  • Using Regular Expressions to replace Quotes in Strings

    I am writing a program that generates Java files and there are Strings that are used that contain Quotes. I want to use regular expressions to replace " with \" when it is written to the file. The code I was trying to use was:
    String temp = "\"Hello\" i am a \"variable\"";
    temp = temp.replaceAll("\"","\\\\\"");
    however, this does not work and when i print out the code to the file the resulting code appears as:
    String someVar = ""Hello" i am a "variable"";
    and not as:
    String someVar = "\"Hello\" i am a \"variable\"";
    I am assumming my regular expression is wrong. If it is, could someone explain to me how to fix it so that it will work?
    Thanks in advance.

    Thanks, appearently I'm just doing something weird that I just need to look at a little bit harder.

  • Matching Quotes With Regular Expressions

    Hi, I have been attempting to develop an app which extract texts from pdfs then applies regular expression to the text. In one instance I attempt to match a curly open quote symbol which matches the . construct as well as the \W. However, it does not match \p{Punct} or \p{P} - does anyone know why this might be?
    I have read that a pattern \p{Pi} exists for matching opening brackets but am told when I run Java that no such characters class exists - is there anywhere where I can get info on all of the character classes available (not the Pattern javadoc)?
    Thanks very much,
    Ross

    . construct match anything and \W matches any non alphanumaric
    to match a curly bracket use \{
    Basicaly ( { [  has special meanings in regex so when you match them you have to use \{, \(,  \[  in java strings it means \\{, \\(, \\[                                                                                                                                                                                                                                                                                                                                                                                                                                                                               

  • In a Regular expressions I can set up an "OR" statement?

    HI, I'm using e-tester version 8.2..My problem is about regular expessions.
    I need to catch a dynamic value from a Form Field (My-TxtBox).. this textbox gets pre populated data ( when the customer has info in the database).. this is the code:
    *<input name="My-TxtBox" type="text" value="XXXX" id="ID-TxtBox"*
    ---- ( XXXX = dynamic number prepopulated by the web app)
    So, I'm using this regular expression:
    *<input name="My-TxtBox" type="text" value="(.+?)" id="ID-TxtBox"*
    -----At this point, everything is fine.. but there is an exception
    I'm getting a big problem when the customer doesn't have data in the server, the code is NOT like this
    <input name="My-TxtBox" type="text" value="" id="ID-TxtBox"
    When the customer doesn't have data in the server, the code is more like this:
    *<input name="My-TxtBox" type="text" id="ID-TxtBox"*
    ---- (please note that there is not a value parameter now)
    So, I think there is no way to create a CDV that will work for both cases? any idea to solve this?
    i was thinking that maybe in the reg exps sintax you can create an "OR" statement.. my idea was to create a CDV
    that works for both cases.. when there is the "value=" string and when there is not.
    something like this
    This CDV returns the dynamic value when there is the "value=" string =
    <input name="My-TxtBox" type="text" value="(.+?)" id="ID-TxtBox"
    And this CDV returns "" when there is no "value=" string = Without value:
    <input name="My-TxtBox" type="text" (.*?)id="ID-TxtBox"
    My idea is to place something like this in some point of the CDV = *{* value="(.+?)" *OR* (.*?) *}*
    so my dream is to create a CDV similar to this:
    <input name="My-TxtBox" type="text" { value="(.+?)" *OR* (.*?) }id="ID-TxtBox
    I was searching on google but I simply don't get an answer....it is posible to place an OR statement into a Reg Exp and how the sintax is? ..
    Regards.. I appreciate your time.

    Hola,
    You can use a regular expression such as:
    <input name="My-TxtBox" type="text" ?v?a?l?u?e?=?"?(.*?)"? id="ID-TxtBox"
    Note that:
    * Note that there is a question mark sign (?) after each letter that may appear in the string. This means that the letter may or may not appear.
    * Note that instead of using (.+?) you should use (.*?).
    This means that will match any character that appears zero or mutliple times. The question mark here means that is non-greedy, meaning that it will not include in the .* matching anything like the rest of the pattern (in this case the rest of the pattern is "? id="ID-TxtBox").
    * Note that the question mark in front of the v of value is there to match a space that may or may not exists.
    Few other facts:
    * In regular expressions the parenthesys determine sequence of operations and mark groups. Such groups can be referenced in code (not in etester but in general). In eTester it will always get the value of the first group (first group = first set of parenthesys).
    * ORs in regular expressions can be expressed with the pipe "|" (without the quotes), but you will need parenthesys in this case which would not allow you to capture the group of characters that you want.
    Regards,
    [Z]{1}uriel C?
    Edited by: Zuriel on Oct 5, 2009 3:07 PM
    Edited to avoid having the text changed by the forum formatting options.

  • Regular expression question (should be an easy one...)

    i'm using java to build a parser. im getting an expression, which i split on a white-space.
    how can i build a regular-expression that will enable me to split only on unquoted space? example:
    for the expression:
    (X=33 AND Y=44) OR (Z="hello world" AND T=2)
    I will get the following values split:
    (X=33
    AND
    Y=34)
    OR
    (Z="hello world"
    AND
    T=2)
    and not:
    (Z="
    hello
    world"
    thank you very much!

    Instead of splitting on whitespace to get a list of tokens, use Matcher.find() to match the tokens themselves: import java.util.*;
    import java.util.regex.*;
    public class Test
      public static void main(String[] args) throws Exception
        String str = "(X=33 AND Y=44) OR (Z=\"hello world\" AND T=2)";
        List<String> tokens = new ArrayList<String>();
        Matcher m = Pattern.compile("[^\\s\"]+(?:\".*?\")?").matcher(str);
        while (m.find())
          tokens.add(m.group());
        System.out.println(tokens);
    }{code} The regex I used is based on the assumptions that there will be at most one run of quoted text per token, that it will always appear in the right hand side of an expression, and that the closing quote will always mark the end of the token.  If the rules are more complicated (as sabre150 suggested), a more complicated regex will be needed.  You might be better off doing the parsing the old-fashioned way, with out regexes.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           

  • Regular expressions in Workshop 8.1

    Hello,
    I'm posting this question here because I don't see a "jdk" subcategory in this
    newsgroup and it might be problem peculiar to Workshop.
    I'm trying to use the Pattern and Matcher classes in java.util.regex (JDK 1.4.2)
    in BEA Workshop 8.1, but I'm getting "ERROR: Unknown escape code" (red squiggly
    line appears under the regex and this message is the screen tip) whenever I try
    to use the backslash to escape a special character in the Pattern.compile() and
    the Pattern.matches() methods.
    For example, it doesn't allow "\d" to mean "any digit". For this particular one,
    I can get around the problem by specifying "[0-9]", but in the case of the period
    character, I'm stuck. I cannot use "\." However, the JDK API doc (http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html)
    says the backslash is to be used for this purpose, if I'm reading it correctly.
    Is this a problem with Workshop, and is there a workaround? I need to specify
    that I require one and exactly one period.
    Any help would be most appreciated!
    Thanks.

    Yes, I had read the Java doc, but I guess I hadn't fully understood it. Now I
    do! Thanks!!
    David
    Josh Eckels <[email protected]> wrote:
    This isn't particular to Workshop, but you'll need to use two
    backslashes in your source code. Inside a string, backslash is used to
    escape the next character so that you can enter special characters like
    newlines ('\n'), tabs ('\t'), etc.
    So, in order to enter a backslash character into your string, you need
    to escape it, like '\\'.
    There's a small section on this in the java.util.regex.Pattern JavaDoc,
    under the "Backslashes, escapes, and quoting" header:
    Backslashes within string literals in Java source code are interpreted
    as required by the Java Language Specification as either Unicode escapes
    or other character escapes. It is therefore necessary to double
    backslashes in string literals that represent regular expressions to
    protect them from interpretation by the Java bytecode compiler. The
    string literal "\b", for example, matches a single backspace character
    when interpreted as a regular expression, while "\\b" matches a word
    boundary. The string literal "\(hello\)" is illegal and leads to a
    compile-time error; in order to match the string (hello) the string
    literal "\\(hello\\)" must be used.
    Josh
    David Chang wrote:
    Hello,
    I'm posting this question here because I don't see a "jdk" subcategoryin this
    newsgroup and it might be problem peculiar to Workshop.
    I'm trying to use the Pattern and Matcher classes in java.util.regex(JDK 1.4.2)
    in BEA Workshop 8.1, but I'm getting "ERROR: Unknown escape code" (redsquiggly
    line appears under the regex and this message is the screen tip) wheneverI try
    to use the backslash to escape a special character in the Pattern.compile()and
    the Pattern.matches() methods.
    For example, it doesn't allow "\d" to mean "any digit". For this particularone,
    I can get around the problem by specifying "[0-9]", but in the caseof the period
    character, I'm stuck. I cannot use "\." However, the JDK API doc
    (http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html)
    says the backslash is to be used for this purpose, if I'm reading itcorrectly.
    Is this a problem with Workshop, and is there a workaround? I needto specify
    that I require one and exactly one period.
    Any help would be most appreciated!
    Thanks.

  • Matches from regular expression into collection

    Hello,
    I need to do the following:
    I have a long string with some similar repeated data. I would like, using a regular expression, to extracts all matches in a collection. Is there a way of performing this task?
    I have look through the owa_pattern package, but as far as I found out, I can extract only a simple match. Here is an exact quote:
    "If multiple overlapping strings can match the regular expression, this function takes the longest match. " - http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/w_patt.htm
    So what can I do if I want to get all the matches?
    Thank you in anticipation. Any help would be appreciated.
    Best regards,
    beroetz

    I think your need a tokenizer-function.
    If the string +:in_str+ is delimited by +:in_delimiter+ you could try this:
    SELECT REGEXP_REPLACE(REGEXP_SUBSTR( :in_str || :in_delimiter, '(.*?)' || :in_delimiter, 1, LEVEL ), :in_delimiter, '') TOKEN
    BULK COLLECT INTO :my_nested_table
    FROM DUAL
    CONNECT BY REGEXP_INSTR( :in_str || :in_delimiter, '(.*?)' || :in_delimiter, 1, LEVEL ) > 0
    ORDER BY LEVEL ASC;
    I wrote a string-to-textarray-tokenizer (and it's pendant) some times ago, being able to cut from certain positions within the string using regular expressions and return the elements into an nested table of varchar2. It looks like:
    TYPE pos_arraytype IS TABLE OF POSITIVE ;
    TYPE text_arraytype IS TABLE OF VARCHAR2(2000);
    FUNCTION stringToTextarray(in_str IN VARCHAR2, in_pos_arr IN pos_arraytype, in_regexp_arr IN text_arraytype DEFAULT NULL, in_trim_strings IN BOOLEAN DEFAULT TRUE)
    RETURN text_arraytype ;
    in_str is the string to be tokenized
    in_pos_arr is a table of positive values of positions in the string to be cut
    in_regexp_arr is a table of regular expressions to use at each position declared by in_pos_arr
    in_trim_strings is a flag, if the cutted element should be trimmed
    using above for example:
    in_str = 'Markus van Muster 347651234XY Musterdaam ABCDE'
    in_pos_arr = (1, 13, 35, 35, 42)
    in_regexp_arr = ('(.?){12}', '([^[:digit:]]?){22}', '[[:digit:]]{4}', '[[:alpha:]]{2}', '(.?){14}')
    in_trim_strings = TRUE
    RETURN collection ('Markus','van Muster','1234','XY','Musterdaam')
    If you need the code, then tell me! I'm looking for....
    Cheers,
    Martin
    Edited by: Nuerni on 17.10.2008 08:49

  • Regular expressions in eclipse

    am trying to filter out some search results in eclipse with regular expressions, but am not having any luck. I am trying to find all jsp pages that contain html:text tags that DON'T have a maxlength attribute.
    I have tried this, but it doesn't seem to work:
    html:text .*[^m][^a][^x][^l|L][^e][^n][^g][^t][^h]How do you specify to NOT match an entire string (like maxlength, for example) while still matching another string (like html:text) in the same regexp?
    Thanks in advance.

    "[^>m]++" matches one or more of anything that's not '>' or 'm', which is always safe. If that fails (meaning a '>' or 'm' was seen), the second alternative tries to match an 'm' unless it's the beginning of "maxlength" (it could actually be the middle of "foobarmaxlength", but I'm assuming that won't happen). The whole alternation is in a group controlled by an asterisk, so it will keep cycling until either '>' or "maxlength" is seen. If it's "maxlength" rather than '>' that stops the loop, the regex fails.
    I was assuming that '>' wouldn't appear inside a tag. (It actually can, even in HTML, if it's enclosed in quotes, but nobody ever seems to do that. JSP, of course, is another story.) I think the simplest way to fix this regex is to add another alternative for quoted values:  <html:text\b([^>"m]++|"[^"]*+"|m(?!axlength))*+>BTW, it's important to use possessive quantifiers (++, +, etc.) in a pattern like this, where the general form is (X)*. Otherwise, you could get into a runaway-backtracking scenario where the regex takes forever to decide no match is possible.

  • BUG?? Syntax Highlighting Regular Expressions

    I'm working on a rather long regular expression in
    dreamweaver code view. It has about 200+ characters in it.
    I wrote it in another application and copied it over for use
    in js, and it wouldn't highlight correctly (sort of like when you
    leave off quote on a string). Naturally I thought I screwed up my
    regexp or forgot to escape a character or something so I went
    through it re-writing it and checking everything but it was
    correct. I saved the page and ran it in the browser and sure enough
    it worked.
    So I started typing the regexp from scratch this time in
    dreamweaver to see when it stopped highlighting correctly. It
    stopped at exactly 100 characters including opening and closing
    forward slashes. I tried writing another exp this time filled with
    just one letter. Again 100 characters exactly - not one more.
    Example:
    var reg1 =
    /ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss ssssssssss/
    var reg2 =
    /ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss sssssssssss/
    reg1 will highlight correctly, while reg2 won't.
    Is this a bug or am I missing something? The highlight rule
    (in Configuration\CodeColoring\CodeColoring.xml) is
    (\s*/\e*\\/
    -or-
    =\s*/ \e*\\ /
    There seems to be no limitation to length there.
    While this is not a huge deal, I might as was well be coding
    in notepad (or my other script editors which highlight this
    correctly) because the highlighting is worthless from this point
    on.
    I could make these strings instead of literals but I have a
    lot of these long expressions and would rather not go through them
    and escape all of my back slashes (there are tons) as well as
    quotes - and make them more un-maintainable as they already are.
    Anyone have this problem? Or a solution to it?

    random_acts wrote:
    > I'm working on a rather long regular expression in
    dreamweaver code view. It
    > has about 200+ characters in it.
    I have never written a regex as long as that, but was
    fascinated by your
    question, so attempted to replicate your problem.
    I gave up at 614 characters, but the syntax coloring didn't.
    I suggest
    that you submit a bug report with the details of your actual
    code:
    http://www.adobe.com/cfusion/mmform/index.cfm?name=wishform
    David Powers
    Author, "Foundation PHP for Dreamweaver 8" (friends of ED)
    Author, "Foundation PHP 5 for Flash" (friends of ED)
    http://foundationphp.com/

  • Regular Expressions for converting HTML to Structured Plain Text

    I'm writing a PL/SQL function that will convert HTML to plain text, but still preserve some of the formatting/line breaks. One of my challenges is in writing a regular expression to capture the text blocks while ignoring the markup. I'm trying to write an expression that will grab all of the text between start/end tags, but discard the tags. For example, to find all of the text between a start/end paragraph, I want to do something like:
    REGEXP_REPLACE('&lt;p style=&quot;text-align:center&#59;&quot;&gt;This is the body of the paragraph&lt;/p&gt;', '&lt;p.*&gt;(.*)&lt;/p&gt;', '\1||v_crlf' )
    where \1 returns the contents of the paragraph and v_crlf (declared earlier in the function) inserts a line break. I know there are more general expressions that will remove all tags, but I want to specifically identify the tags so I can process them appropriately. This way I can easily convert HTML to plain text for email and reporting without having to keep two versions around. Any help would be greatly appreciated. Once I get this worked out, I will repost with the function code for others to use. Thanks.
    Edited by: jritschel on Oct 26, 2010 9:58 AM

    Here's a function I wrote for an app. I'm not making in promises on it's accuracy as the app was just a proof of concept and never made it to production.
    function strip_html( p_clob in clob )
    return clob
    is
        l_out clob;
        l_test  number := 0;
        l_max_loops constant number := 20;
        i   pls_integer := 0;
    begin
        l_out := regexp_replace(p_clob,'<br>|<br />',chr(13)||chr(10),1,0,'imn');
        l_out := regexp_replace(l_out,'<p>',chr(13)||chr(10),1,0,'imn');
        l_out := replace(l_out,'<li>',chr(13)||chr(10)||'*<li>');
        l_out := regexp_replace(l_out,'<b>(.+?)</b>','*\1*',1,0,'imn');
        l_out := regexp_replace(l_out,'<u>(.+?)</u>','_\1_',1,0,'imn');
        loop
            l_test := regexp_instr(l_out,'<([A-Z][A-Z0-9]*)[^>]*>.*?</\1>',1,1,0,'imn');
            exit when l_test = 0 or i > l_max_loops;
            l_out := regexp_replace(l_out,'<([A-Z][A-Z0-9]*)[^>]*>(.*?)</\1>','\2',1,0,'imn');
            i := i + 1;
        end loop;
        return l_out;
    end strip_html;{code}
    The loop is there to handle nested HTML.
    Tyler Muth
    http://tylermuth.wordpress.com
    "Applied Oracle Security: Developing Secure Database and Middleware Environments": http://sn.im/aos.book
    Edited by: Tyler on Oct 26, 2010 10:03 AM                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           

  • Regular Expression in Apple script

    Hi,
    I need to test whether the text "I w'd like to give feedback" is available in UI or not, somehow my framework right now does not read quote( ' : between the characters w and d), so i want to use regular expression to search for the text in the UI, something like below.
    tell application "System Events"
       tell process "Process name"
                     activate
                     if (exists (checkbox "I w(*)d like to give feedback " of (first window whose description is equal to "alert"))) then
                                                           copy true to stdout
                     else
                                                           copy false to stdout
                     end if
       end tell
    end tell
    Any help would really be great.

    One could also employ Ruby Regular Expressions. One can control whether the RE match is returned as a list item, or by value.
    Here is the Ruby syntax, followed by the AppleScript to invoke it.
    return_type, my_string, my_regex = ARGV
    # cannot manipulate frozen variable. The dup method unlocks this state.
    dupe_myregex = my_regex.dup
    # remove \ escape chars so regex will work
    dupe_myregex.gsub!(/\//, "")
    # force string class to regexp class. Trailing slash can be followed by i to ignore case.
    aregex = /#{dupe_myregex}/
    if return_type.include? "L"
       aregex.match(my_string) {|m| p m.captures}
    else
       aregex.match(my_string) {|m| puts m.captures}
    end
    Now the AppleScript to use this code. Unfortunately, when compiling in AppleScript Editor, it goes wild with newlines and tabs, which badly clutters the Ruby code. Certain Ruby characters will need AppleScript escapes to work. Despite the unsightly control characters inserted by AppleScript, the code works.
    set my_string to "See Spot run. See Jane shoot Spot."
    set my_re to "/.?(Spot).+(Jane).?/"
    -- return a list containing matches
    set return_type to "L"
    -- return match values
    -- set return_type to "V"
    display dialog get_match(return_type, my_string, my_re)
    on get_match(return_type, my_string, my_re)
              set rb to ¬
            "return_type,my_string, my_regex = ARGV\n
             dupe_regex = my_regex.dup\n
             dupe_regex.gsub!(/\\//, \"\")\n
             aregex = /#{dupe_regex}/\n
             if return_type.include? \"L\"\n
                aregex.match(my_string) {|m| p m.captures}\n
             else\n
                aregex.match(my_string) {|m| puts m.captures}\n
             end"
              do shell script "/usr/bin/ruby -e " & rb's quoted form & space & ¬
      return_type's quoted form & space & my_string's quoted form & space & my_re's quoted form
    end get_match
    The above code using the "L" return_type will display ["Spot", "Jane"]. With the "V" option, it returns Spot\nJane.

  • Help with Java and Regular Expression

    Hello,
    I have one line of Java and regular expression that I got from the web :
    String[] opt;
    opt = line.split("\\s+(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
    If "line" contains :
    1 "Hello World"
    this code would give me :
    opt[0] : 1
    opt[1] : "Hello World"
    which is almost what I would like to do except for the double quotes. I wonder if someone could please help me change the regular expression so that it would return :
    opt[0] : 1
    opt[1] : Hello World
    Thank you and Best Regards,
    Chris

    It looks to me like this is a line from a space delimited file so you should use one of the free CSV parsers such as http://opencsv.sourceforge.net/ . These will handle the quotes properly even if a field actually contains a quote.

  • Pattern matching regular expressions

    I'm attempting to determine if a string matches a pattern of containing less than 100 alphanumeric characters a-z or 0-9 case insensitive. So my regular expression string looks like:
    "^[a-zA-Z0-9]{0,100}$"And I use something like...
    Pattern pattern = Pattern.compile( regexString );I'd like to modify my regex string to include the email 'at' symbol "@". So that the at symbol will be allowed. But my understanding of regex is very limited. How do I include an "or at symbol" in my regex expression?
    Thanks for your help.

    * Code by sabre150
    private static final Pattern emailMatcher;
        static
            // Build up the regular expression according to RFC821
            // http://www.ietf.org/rfc/rfc0821.txt
            // <x> ::= any one of the 128 ASCII characters (no exceptions)
            String x_ = "\u0000-\u007f";
            // <special> ::= "<" | ">" | "(" | ")" | "[" | "]" | "\" | "."
            //              | "," | ";" | ":" | "@"  """ | the control
            //              characters (ASCII codes 0 through 31 inclusive and
            //              127)
            String special_ = "<>()\\[\\]\\\\\\.,;:@\"\u0000-\u001f\u007f";
            // <c> ::= any one of the 128 ASCII characters, but not any
            //             <special> or <SP>
            String c_ = "[" + x_ + "&&" + "[^" + special_ + "]&&[^ ]]";
            // <char> ::= <c> | "\" <x>
            String char_ = "(?:" + c_ + "|\\\\[" + x_ + "])";
            // <string> ::= <char> | <char> <string>
            String string_ = char_ + "+";
            // <dot-string> ::= <string> | <string> "." <dot-string>
            String dot_string_ = string_ + "(?:\\." + string_ + ")*";
            // <q> ::= any one of the 128 ASCII characters except <CR>,
            //               <LF>, quote ("), or backslash (\)
            String q_ = "["+x_+"$$[^\r\n\"\\\\]]";
            // <qtext> ::=  "\" <x> | "\" <x> <qtext> | <q> | <q> <qtext>
            String qtext_ = "(?:\\\\[" + x_ + "]|" + q_ + ")+";
            // <quoted-string> ::=  """ <qtext> """
            String quoted_string_ = "\"" + qtext_ + "\"";
            // <local-part> ::= <dot-string> | <quoted-string>
            String local_part_ = "(?:(?:" + dot_string_ + ")|(?:" + quoted_string_ + "))";
            // <a> ::= any one of the 52 alphabetic characters A through Z
            //              in upper case and a through z in lower case
            String a_ = "[a-zA-Z]";
            // <d> ::= any one of the ten digits 0 through 9
            String d_ = "[0-9]";
            // <let-dig> ::= <a> | <d>
            String let_dig_ = "[" + a_ + d_ + "]";
            // <let-dig-hyp> ::= <a> | <d> | "-"
            String let_dig_hyp_ = "[-" + a_ + d_ + "]";
            // <ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
            // String ldh_str_ = let_dig_hyp_ + "+";
            // RFC821 looks wrong since the production "<name> ::= <a> <ldh-str> <let-dig>"
            // forces a name to have at least 3 characters and country codes such as
            // uk,ca etc would be illegal! I shall change this to make the
            // second term of <name> optional by make a zero length ldh-str allowable.
            String ldh_str_ = let_dig_hyp_ + "*";
            // <name> ::= <a> <ldh-str> <let-dig>
            String name_ = "(?:" + a_ + ldh_str_ + let_dig_ + ")";
            // <number> ::= <d> | <d> <number>
            String number_ = d_ + "+";
            // <snum> ::= one, two, or three digits representing a decimal
            //              integer value in the range 0 through 255
            String snum_ = "(?:[01]?[0-9]{2}|2[0-4][0-9]|25[0-5])";
            // <dotnum> ::= <snum> "." <snum> "." <snum> "." <snum>
            String dotnum_ = snum_ + "(?:\\." + snum_ + "){3}"; // + Dotted quad
            // <element> ::= <name> | "#" <number> | "[" <dotnum> "]"
            String element_ = "(?:" + name_ + "|#" + number_ + "|\\[" + dotnum_ + "\\])";
            // <domain> ::=  <element> | <element> "." <domain>
            String domain_ = element_ + "(?:\\." + element_ + ")*";
            // <mailbox> ::= <local-part> "@" <domain>
            String mailbox_ = local_part_ + "@" + domain_;
            emailMatcher = Pattern.compile(mailbox_);
            System.out.println("Email address regex = " + emailMatcher);
        }Wow. Sheesh, sabre150 that's pretty impressive. I like it for two reasons. First it avoids some false negatives that I would have gotten using the regex I mentioned. Like, [email protected] is a valid email address which my regex pattern has rejected and yours accepts. It's unusual but it's valid. And second I like the way you have compartmentalized each rule so that changes, if any custom changes are desired, are easier to make. Like if I want to specifically aim for a particular domain for whatever reason. And you've commented it so that it is easier to read, for someone like myself who knows almost nothing about regex.
    Thanks, Good stuff!

  • Regular expressions using jsp

    Hello Techies,
    I am having a text area inside a jsp file. When ever the end user gives input it must be submitted to the server.
    The real problem is when the end user gives special characters and double-quotes in this text area, jsp is preforming in abnormal way.
    Can u guys tell me how to resolve this issue. i.e how to solve regular expressions in jsp.
    Even though the end user gives special characters or double quotes we should take only the content.
    Looking forward for u reply.
    regards,
    Ramu

    The real problem is when the end user gives special characters and double-quotes in this text area, jsp is preforming in abnormal way.What do you mean with "jsp is preforming in abnormal way" ?
    A arbitrary JSP does not care about double-quotes.
    Can u guys tell me how to resolve this issue. i.e how to solve regular expressions in jsp.Usually the same way as you do it in a poj class.
    Nobody here knows how your JSP looks like and who handles the request when the form is submited or even who generates the model of the data displayed by your JSP.
    You have to provide us with more details.

Maybe you are looking for