Find text by regular expression

I'm struggling with a problem. I have some files that contain a string like:
"name": "someName"
I can find these using egrep like so:
egrep -r '"name": "[^"]*"' /some/folder/path/
This works admirably, except that it returns more than I really want. It gives me something like this:
/path/to/Chrome/extension/manifest.json:   "name": "Extension Name"
What I really want is just the name itself, e.g.:
Extension Name
I don't care about anything else. I'm running these searches from an AppleScript, which already knows what folder it's looking in, and just needs to know 1) if a match was found, and if so, 2) what the name was.
How can I do this? Is there a way to do a search that only returns a particular portion of the results, or would I need to run the results through something else to filter it further?
Thanks in advance!

MrHoffman wrote:
If I've guessed correctly at what you're doing
Heh... I imagine you have. I wasn't sure who might frequent this forum, so didn't want to make any assumptions.
You may or may not know that I have an EtreCheck-like script that I use in conjunction with my Adware Removal Tool. It has proven to be extremely helpful in locating new adware, but it is still somewhat limited in getting information on Firefox and Chrome extensions. It has proven difficult to get names, especially for Chrome extensions. I'm working hard on trying to improve it.
An excerpt from a sample manifest.json file in a Chrome extension - modified slightly for clarity - looks like this:
   "manifest_version": 2,
   "name": "Some extension name",
   "offline_enabled": true,
   "version": "6.3"
I'm using egrep to search for this "name" key so that I don't have to make too many assumptions about the internal structure of the extension folder. I know the manifest.json file is in there somewhere, and will have this key in it, so this is an easy way to find that name. To just display the name, the simple egrep I'm using is adequate, but in some cases there's special data in that name string that indicates where to find a localized name, so I need to be able to do something special in those cases, and need just the name string and nothing else.
The solution turns out to be a combination of several responses. The addition of the -o and -h flags gets me close to what I want, and passing it through cut with a double-quote as the delimiter works to trim it down perfectly. So now I can do:
egrep -roh '"name": "([^"]*)"' /path/to/folder/ | cut -d \" -f 4
The egrep will return:
"name": "Some extension name"
Passing it through cut with the above parameters gives exactly what I need... just:
Some extension name
Thanks to all for the advice!
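
For comparison, here is a minimal Java sketch of the same idea, in case the search ever moves out of the shell. It reuses the pattern from the egrep call and reads capture group 1 instead of piping through cut. The class and method names (NameExtractor, extractNames) are made up for illustration and are not part of the original script.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameExtractor {
    // Same pattern as the egrep call; group 1 is the text between the quotes.
    private static final Pattern NAME = Pattern.compile("\"name\": \"([^\"]*)\"");

    // Returns every name value found in the given text, in order of appearance.
    static List<String> extractNames(String text) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME.matcher(text);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String manifest = "   \"manifest_version\": 2,\n"
                + "   \"name\": \"Some extension name\",\n"
                + "   \"version\": \"6.3\"\n";
        System.out.println(extractNames(manifest));
    }
}
```

An empty list answers "was a match found?", and the list contents answer "what was the name?".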

Similar Messages

  • Find text using regular expression and add highlight annotation

    Hi Friends,
    Is it possible to find text using a regular expression and add a highlight annotation using a plugin?

    A plugin can use the PDWordFinder to get a list of the words on a page, and their location. That's all that the API offers for searching. Of course, you can use a regular expression library to work with that word list.

  • Finding URLs using regular expression.

    I have a requirement where the user will type some text containing URLs like "Please visit this site http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747. Thank you". This text has to be modified as below before saving it to the database.
    "Please visit this site <a href='http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747'>http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747</a>. Thank you"
    I am using the regular expression (http|https)://.+?\\s, which marks the end of the URL with a whitespace character. This pattern doesn't work if the URL is located at the end of the string, since there will be no space at the end.
    For example, if the string is "Please visit this site http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747" the regex will fail.
    My actual problem is to find the URL irrespective of its position within the string.
    Pattern urlPattern = Pattern.compile("(http|https)://.+?\\s", Pattern.CASE_INSENSITIVE);
    Matcher matcher = urlPattern.matcher(plainText);
    Map stringIndexMap = new HashMap();
    //Searching the input string for urlPattern...
    while(matcher.find()) {
        String urlString = matcher.group();
        //Storing the urls in a hashmap with their indices as keys....
        stringIndexMap.put(new Integer(matcher.start()), urlString.trim());
    }
    Set keySet = stringIndexMap.keySet();
    Iterator it = keySet.iterator();
    //Iterating over the hashmap containing urls...
    while(it.hasNext()) {
        String urlString = (String) stringIndexMap.get(it.next());
        /*
         * Replacing the url string in the input text with <a href="#" onclick="window.open('<urlString>')"
         * using String index
         */
        clickableURLString.replace(clickableURLString.indexOf(urlString),
            clickableURLString.indexOf(urlString) + urlString.length(),
            "<a href=\"#\" onclick=\"window.open('" + urlString
            + "')\">" + urlString + "</a>");
    }
    return clickableURLString.toString();

    In a regex, '$' matches the end of the input.
    import java.util.regex.*;
    public class Prasanna{
      public static void main(String[] args){
        String text
            = "Please visit this site http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747";
    //    String regex = "(http|https)://.+?(?:\\s|$)"; // this works
        String regex = "(http|https)://[^ ]+";          // this also works
        Pattern pat = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher mat = pat.matcher(text);
        while (mat.find()){
          System.out.println(mat.group());
        }
      }
    }
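
    Building on the reply above, the whole find-and-wrap step can probably be done in a single replaceAll call instead of the index bookkeeping in the question. The sketch below is only one way to do it: the class name Linkifier is made up, and the lookahead that leaves a trailing full stop or comma outside the link is an assumption about what the poster wants, based on the expected output shown in the question.

```java
import java.util.regex.Pattern;

class Linkifier {
    // Lazy \S+? plus a lookahead: the match stops before optional trailing
    // punctuation that is followed by whitespace or by the end of the input.
    private static final Pattern URL = Pattern.compile(
            "https?://\\S+?(?=[.,;:!?]?(?:\\s|$))", Pattern.CASE_INSENSITIVE);

    // Wraps every URL in an anchor tag; $0 in the replacement is the whole match.
    static String linkify(String text) {
        return URL.matcher(text).replaceAll("<a href='$0'>$0</a>");
    }

    public static void main(String[] args) {
        System.out.println(linkify("See http://example.com/a?b=1. Bye"));
        System.out.println(linkify("Go to https://example.com/x"));
    }
}
```

    Because the lookahead accepts either whitespace or the end of the input, a URL at the very end of the string is handled the same way as one in the middle.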

  • Find/replace and regular expression problem

    Hello, I'm using find and replace with a regular expression
    for the first time. I have the option checked and it's finding my text,
    but it's missing (not highlighting) the ')' at the end of the line.
    Here's my code:
    [($[0-9]+<font size="-2">US</font>)]
    It's supposed to find everything inside the square brackets,
    but it misses the closing parenthesis after </font>. I need
    to find this string and replace it with nothing to remove the string
    from any/all pages. Is there a reason why it's missing the closing
    parenthesis? I was actually able to add a few more parentheses
    (e.g. "))))") before OR after the closing square bracket and it
    still found the original text minus the closing bracket, and the
    extra parentheses didn't prevent the text from being found.
    Any help is appreciated!
    James...

    WyattEA wrote:
    > Hello, i'm using find and replace with a regular expression for the first time. I
    > have it checkmarked and it's finding my text but it's missing (not
    > highlighting) the ')' at the end of the line. Here's my code:
    >
    > [($[0-9]+<font size="-2">US</font>)]
    That's not how square brackets work.
    Try:
    \(\$\d+<font size="-2">US</font>\)
    A left paren, followed by the dollar sign, followed by at least one
    digit, followed by <font size="-2">US</font>, followed by a right paren.
    Mick
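
    To illustrate the point about square brackets: in most regex flavours, [...] is a character class that matches a single character from the set, which is why the original pattern never matched the string as a whole. The Java sketch below applies the escaped pattern from the reply and replaces each match with nothing. The class name PriceTagRemover is made up, and Java's regex engine is used here only as a stand-in for the editor's find-and-replace.

```java
import java.util.regex.Pattern;

class PriceTagRemover {
    // Parentheses and the dollar sign are escaped so they match literally.
    private static final Pattern PRICE = Pattern.compile(
            "\\(\\$\\d+<font size=\"-2\">US</font>\\)");

    // Removes every "($<digits><font size="-2">US</font>)" fragment from the input.
    static String strip(String html) {
        return PRICE.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("Only ($25<font size=\"-2\">US</font>) today"));
    }
}
```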

  • How to find substring with regular expression?

    How can I find a substring in a string with a regular expression?
    Example:
    I have an original string "<tr><th>RecordId: </th><td valign=middle>A4711</td></tr>"
    Now I want to extract the value "A4711" from this string with a regular expression. Everything except "A4711" is fixed; the id "A4711" itself is dynamic. How is it possible to get the substring "A4711" of the original string with a regular expression?

    I wrote a little method based on the information above to get such results:
        /**
         * Get all substrings of a string that match a regular expression.
         * @param original String to inspect.
         * @param regExp Regular expression as search criteria.
         * @return All matches of <i>regExp</i> or null if one input parameter is null.
         */
        public static String[] getSubstrings(String original, String regExp) {
            String[] result = null;
            if (original != null && regExp != null) {
                Pattern pattern = Pattern.compile(regExp);
                Matcher matcher = pattern.matcher(original);
                boolean matchFound = matcher.find();
                Vector matches = new Vector();
                while (matchFound) {
                    String match = matcher.group();        
                    matches.addElement(match);
                    matchFound = matcher.find();
                }//next match
                int count = matches.size();
                result = new String[count];
                for (int i = 0; i < count; i++) {
                    result[i] = (String) matches.elementAt(i);
                }//next match
            }//else: input unavailable
            return result;
        }//getSubstrings()
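
    The method above returns whole matches. To get only the dynamic id out of the fixed markup, a capturing group is one option. This is a minimal sketch under the assumption that the surrounding markup is exactly as in the question; the class and method names (RecordIdExtractor, extractId) are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordIdExtractor {
    // The fixed markup is matched literally; group 1 captures the dynamic id.
    private static final Pattern RECORD_ID = Pattern.compile(
            "<th>RecordId: </th><td valign=middle>([^<]+)</td>");

    // Returns the id, or null when the markup is not found.
    static String extractId(String html) {
        Matcher m = RECORD_ID.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String row = "<tr><th>RecordId: </th><td valign=middle>A4711</td></tr>";
        System.out.println(extractId(row)); // prints A4711
    }
}
```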

  • VISA Find resource function regular expression

    Hi guys,
    I've been trying to find out which serial port a GPS receiver is connected to, using the VISA Find Resource function, with no luck. The idea is to use a regular expression similar to
    ASRL?*INSTR{VI_ATTR_ASRL_BAUD == 9600}
    but instead of looking for baud rate, I want to search the value
    ASRL3 (COM3 - GNSS Receiver)
    as seen in MAX/VISA Test Panel. The attribute name is VI_ATTR_INTF_INST_NAME.
    Something like ASRL?*INSTR{VI_ATTR_INTF_INST_NAME == ASRL? (COM? - GNSS Receiver)} should work, but it doesn't.
    How should I write the expression?
    Thanks!
    Best regards,
    Néstor
    LabVIEW 2011 + Windows 7 32bits SP1

    Hi Dennis,
    thanks for answering.
    I haven't assigned the name, MAX did it. I just opened MAX and there it was, COM3 (in Devices and interfaces/ASRL3/settings/name). I suppose it takes the Windows serial port name directly. What name do you mean exactly?
    Anyway, what I'm interested in getting is the port description, which would tell me the device connected to that port. I could build a small loop and look at the interface description of each serial device, but I found the VISA Find Resource function and it seems a simpler and more direct way to get what I want.
    If I can list all serial devices with a 9600 baud rate as in the previous example, why not do the same with the instrument name? I clearly see it when I open the VISA Test Panel. Am I maybe missing something?
    Best regards,
    Néstor
    LabVIEW 2011 + Windows 7 32bits SP1

  • Find/Replace Using Regular Expressions

    Can someone help me with this...I am using Regular expressions to
    FIND:
    http.*lid=([^&"]*)[^"]*
    REPLACE:
    $set(\1,ID_id,code)$
    So that in the following it will change this:
    a href="http://www.test.com/shc/s/home_10153_12605?lid=Search" rilt="Search"
    To this:
    a href="$set(Search,ID_id,code)$" rilt="Search"
    Those expressions work in Notepad++, but when I use Dreamweaver it just replaces the http... with "$set(\1,ID_id,code)$" and doesn't reference the "Search"
    Any help?
    Thanks

    Let me begin by saying I'm a complete idiot with DW's Reg Ex.   I use Search Specific Tag whenever possible.  See screenshot below.
    Try this on your Current Document to see if it works. Then make a back-up copy of site before attempting it on Entire Local Site as you cannot "Undo" this process.
    Good luck,
    Nancy O.
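
    The notation for a back-reference in the replacement text differs between tools, which is one possible reason the same expression behaves differently in two editors. As an illustration only, here is how the same find/replace looks with Java's java.util.regex, where the captured group is written $1 and a literal dollar sign in the replacement has to be escaped. Whether Dreamweaver expects the same notation is not confirmed here, and the class name LidRewriter is made up.

```java
import java.util.regex.Pattern;

class LidRewriter {
    // Same FIND expression as in the question; group 1 is the lid value.
    private static final Pattern LINK = Pattern.compile("http.*lid=([^&\"]*)[^\"]*");

    // In a Java replacement string, $1 is group 1 and \$ is a literal dollar sign.
    static String rewrite(String html) {
        return LINK.matcher(html).replaceAll("\\$set($1,ID_id,code)\\$");
    }

    public static void main(String[] args) {
        String in = "a href=\"http://www.test.com/shc/s/home_10153_12605?lid=Search\" rilt=\"Search\"";
        System.out.println(rewrite(in));
    }
}
```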

  • How i can validate a entered text against regular expressions ?

    Thank you for reading my post.
    How can I validate entered text to check its syntax and ensure that it is a domain name?
    I think I should use a regular expression, but I do not know how to do this.
    Thank you

    If you want to validate on the client side, you need to create a JavaScript validation function and add it to the "onBlur" attribute of the TextField component (set it via JavaScript->onBlur in the property sheet). To add the actual JavaScript, you need to edit the JSP page. If you need to do it on the server side, create a custom validator.
    http://developers.sun.com/prodtech/javatools/jscreator/learning/tutorials/2/customvalidator.html
    - Winston
    http://blogs.sun.com/winston
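
    For the server-side route, the syntax check itself could be a single regular expression. The sketch below is an assumption about what "domain name" should mean here: dot-separated labels of letters, digits and inner hyphens, each at most 63 characters, a purely alphabetic last label, and at most 253 characters overall. It does not handle internationalized names, and the class name DomainNameValidator is made up; it is not the custom validator API from the linked tutorial.

```java
import java.util.regex.Pattern;

class DomainNameValidator {
    // (?=.{1,253}$)  overall length limit
    // label          starts and ends with a letter or digit, hyphens only inside
    // last label     letters only, 2 to 63 characters
    private static final Pattern DOMAIN = Pattern.compile(
            "^(?=.{1,253}$)(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\\.)+[A-Za-z]{2,63}$");

    static boolean isValid(String candidate) {
        return candidate != null && DOMAIN.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("example.com"));  // prints true
        System.out.println(isValid("-bad.com"));     // prints false
    }
}
```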

  • Regular expressions in Format Definition add-on

    Hello experts,
    I have a question about regular expressions. I am a newbie in regular expressions and I could use some help on this one. I tried for some 6 hours, but I can't solve it myself.
    Summary of my problem:
    In SAP Business One (patch level 42) it is possible to use bank statement processing. A file (full of regular expressions) is to be selected, so it can match certain criteria to the bank statement file. The bank statement file consists of a certain pattern (look at the attached code snippet).
    :61:071222D208,00N026
    :86:P  12345678BELASTINGDIENST       F8R03782497                $GH
    $0000009                         BETALINGSKENM. 123456789123456
    0 1234567891234560                                            
    :61:071225C758,70N078
    :86:0116664495 REGULA B.V. HELPMESTRAAT 243 B 5371 AM HARDCITY HARD
    CITY 48772-54314                                                  
    :61:071225C425,05N078
    :86:0329883585 J. MANSSHOT PATTRIOTISLAND 38 1996 PT HELMEN BIJBETA
    LING VOOR RELOOP RMP1 SET ORDERNR* 69866 / SPOEDIG LEVEREN    
    :61:071225C850,00N078
    :86:0105327212 POSE TELEFOONSTRAAT 43 6448 SL S-ROTTERDAM MIJN OR
    DERNR. 53846 REF. MAIL 21-02
    - I am in search of the right type of regular expression that is used by the Format Definition add-on (javascript, .NET, perl, JAVA, python, etc.)
    Besides that I need the regular expressions below, so the Format Definition will match the right lines from my bankfile.
    - a regular expression that selects lines starting with :61: and line :86: including next lines (if available), so in fact it has to select everything from :86: till :61: again.
    - a regular expression that selects the bank account number (position 5-14) from lines starting with :86:
    - a regular expression that selects all other info from lines starting with :86: (and following if any), so all positions that follow after the bank account number
    I am looking forward to the right solutions, I can give more info if you need any.

    Hello Hendri,
    Q1: I am in search of the right type of regular expression that is used by the Format Definition add-on (javascript, .NET, perl, JAVA, python, etc.)
    Answer: Format Definition uses .NET regular expressions.
    You may refer to the following examples. If necessary, I can send you a guide about how to use regular expressions in Format Definition. Thanks.
    Example 6
    Description:
    To match a field with an optional field in front. For example, ":61:0711211121C216,08N051NONREF" or ":61:071121C216,08N051NONREF", which comprises a record identification ":61:", a date in the form of YYMMDD, another optional date MMDD, one or two characters to signify the direction of money flow, a numeric amount value and some other information. The target to be matched is the numeric amount value.
    Regular expression:
    (?<=:61:\d(\d)?[a-zA-Z]{1,2})((\d(,\d*)?)|(,\d))
    Text:
    :61:0711211121C216,08N051NONREF
    Matches:
    1
    Tips:
    1.     All the fields in front of the target field are described in the look behind assertion embraced by (?<= and ). Especially, the optional field is embraced by parentheses and then a "?" (question mark). The sub expression for amount is copied from example 1. You can compose your own regular expression for such cases in the form of (?<=REGEX_FOR_FIELDS_IN_FRONT)(REGEX_FOR_TARGET_FIELD), in which REGEX_FOR_FIELDS_IN_FRONT and REGEX_FOR_TARGET_FIELD are respectively the regular expression for the fields in front and the target field. Keep the parentheses therein.
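
    The look-behind technique from Example 6 can be tried outside the add-on. The sketch below uses Java rather than .NET, and it assumes quantifiers that the description implies but that the expression as printed above does not show: {6} for the YYMMDD date, {4} for the optional MMDD date, and + on the amount digits. Treat those, and the class name StatementAmount, as assumptions rather than the add-on's actual expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatementAmount {
    // Look-behind: ":61:", six date digits, optional four date digits,
    // one or two letters for the direction. Target: the amount that follows.
    private static final Pattern AMOUNT = Pattern.compile(
            "(?<=:61:\\d{6}(?:\\d{4})?[a-zA-Z]{1,2})(?:\\d+(?:,\\d*)?|,\\d+)");

    // Returns the amount text of a :61: line, or null if none is found.
    static String amount(String line) {
        Matcher m = AMOUNT.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(amount(":61:0711211121C216,08N051NONREF")); // prints 216,08
        System.out.println(amount(":61:071121C216,08N051NONREF"));     // prints 216,08
    }
}
```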
    Example 7
    Description:
    Find all numbers in the free text description, which are possibly document identifications, e.g. for invoices
    Regular expression:
    (?<=\b)(?<!\.)\d+(?=\b)(?!\.)
    Text:
    :86:GIRO  6890316
    ENERGETICA NATURA BENELU
    AFRIKAWEG 14
    HULST
    3187-A1176
    TRANSACTIEDATUM* 03-07-2007
    Matches:
    6
    Tips:
    1.     The regular expression given finds all digits between word boundaries except those with a prior dot or following dot; "." (dot) is escaped as \.
    2.     It may find some inaccurate matches, like the date in the text. If you want to exclude "-" (hyphen) as the prior or following character, in the same way as for "." (dot), the regular expression becomes (?<=\b)(?<!\.)(?<!-)\d+(?=\b)(?!\.)(?!-). The matches will be:
    :86:GIRO  6890316
    ENERGETICA NATURA BENELU
    AFRIKAWEG 14
    HULST
    3187-A1176
    TRANSACTIEDATUM* 03-07-2007
    You may lose some real values like "3187" before the "-".
    Example 8
    Description:
    Find BP account number in 9 digits with a prior "P" or "0" in the first position of free text description
    Regular expression:
    (?<=^(P|0))\d
    Text:
    0000006681 FORTIS ASR BETALINGSCENTRUM BV
    Matches:
    1
    Tips:
    1.     Use positive look behind assertion (?<=PRIOR_KEYWORD) to express the prior keyword.
    2.     "^" means that the match starts from the beginning of the text. If the text includes the record identification, you may include it also in the look behind assertion. For example,
    :86:0000006681 FORTIS ASR BETALINGSCENTRUM BV
    The regular expression becomes
    (?<=:86:(P|0))\d
    Example 9
    Description:
    Following example 8, to find the possible BP name after BP account number, which is composed of letter, dot or space.
    Regular expression:
    (?<=^(P|0)\d)[a-zA-Z. ]*
    Text:
    0000006681 FORTIS ASR BETALINGSCENTRUM BV
    Matches:
    1
    Tips:
    1.     In this case, put BP account number regular expression into the look behind assertion.
    Example 10
    Description:
    Find the possible document identifications in a sub-record of :86: record. Sub-record is like "?00", "?10" etc.  A possible document identification sub-record is made up of the following parts:
    •     keyword "RE", "RG", "R", "INV", "NR", "NO", "RECHN" or "RECHNUNG", and
    •     an optional group made up of following:
         a separator of either a dot, hyphen or slash, and
         an optional space, and
         an optional string starting with keyword "NR" or "NO" followed by a separator of either a dot, hyphen or slash, and
         an optional space
    •     and finally document identification in digits
    Regular expression:
    (?<=\?\d(RE|RG|R|INV|NR|NO|RECHN|RECHNUNG)((\.|-|/)\s?((NR|NO)(\.|-|/))?\s?)?)\d+
    Kind Regards
    -Yatsea

  • Regular expressions in ActionScript??

    I have been looking at the Adobe publication Programming ActionScript (pdf) and it
    refers to the ECMA-262 3rd edition specification. But the specification doesn't seem to
    state exactly what type of regular expression engine and version is used.
    Is it POSIX, or Perl-compatible regular expressions (or both)?
    I have read and used the classic O'Reilly text Mastering Regular Expressions
    and coded regular expressions in javascript/php/etc (anywhere regular expressions
    could be used: Apache configuration files, other server config files, etc.)
    The type of engine used makes a difference where performance is
    concerned, as well as in the range of syntax valid in a particular implementation.
    Thank You
    JK

    http://www.regular-expressions.info/javascript.html

  • Carriage Return - Regular Expression

    Hi guys,
    I'm looking for an effective method to speed up the extreme optimization process in my work (so that I finally don't have to do it manually).
    The particular issue is to find a good regular expression to replace the carriage returns in the source code with nothing.
    I searched on the net, and many sources converge on the RegExr tool: http://gskinner.com/RegExr/
    I tried to set up an expression to solve my problem but it doesn't work. The expression that was generated by the tool is:
    Find: /\r/g Replace: (none) 
    When I enter the expression in the Dreamweaver Find & Replace panel (with the regular expression option checked and match case, etc. unchecked), it doesn't seem to produce any change in the code.
    I'm sure that I'm doing something wrong.
    Does anyone have a suggestion?
    Thanks all for help.

    I don't understand the point of this.  There's very little to be gained from removing white space from code. And if you do this to JavaScript, you'll very likely break the code.
    Safer method, go to Edit > Preferences > Code Format > click on Tag Libraries and define how you want your code formatted.  Then  apply with Command > Apply Source Formatting.
    Nancy O.

  • Regular Expression Challenge

    Hello, everybody!
    I'm trying to come up with a regular expression to validate an identifier that starts and ends with double quotes. This is what I have so far:
    \"[a-zA-Z_][\w]*\"
    This regex is working. It matches identifiers like:
    "name"
    "name_1"
    "_name123"
    but the challenge that I couldn't solve is this: I would like to augment this regex to also accept escaped double quotes inside the string itself, like this:
    "na""me"     (correct, double quote inside escaped by another double quote)
    "name""""_1" (correct, two double quotes inside escaped by double quotes)
    """_name123" (correct, double quote inside escaped by another double quote)
    and reject strings that miss the escape double quote, like the following:
    "na"me"
    "name""" "_1"
    "" "_name123"
    As you can see, besides the start and end quotes, if there's a double quote inside the identifier, it has to be followed immediately by another double quote. It has to be a pair, no matter how many and where. I tried to find such a regular expression but I didn't have success. I couldn't find a way to say that a double quote inside the identifier has to be followed by another double quote.
    Any help would be appreciated.
    Thank you in advance.
    Marcos

    Marcos_AntonioPS wrote:
    r035198x wrote:
    Marcos_AntonioPS wrote:
    ..If you are not willing/able to learn about them and make an attempt then you will find it difficult to get help here.
    You will also find it difficult to get help if you insult regulars when they try to advise you on following the posting guidelines by choosing an appropriate thread title.
    r035198x, I didn't insult regulars. But you insulted me. I consider saying that I was trying to make someone here do 'my homework' an insult. I think that the worst insults in life are the ones that we say in a sarcastic manner, like yours.
    Marcos
    I can't think of one instance where a 'challenge' has been issued in these forums that was not an indirect request for someone's homework to be done. Yours could be the first but even then it stinks of "I can't be bothered to learn about look-ahead so can some kind person solve my problem for me?". Issuing a 'challenge' like this is an insult to us. It assumes we are gullible enough to do your work for you.
    You can get somewhere towards a solution using look-ahead but you will have to place restrictions since regex can't count.
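
    For the specific rule in the question (every inner double quote must come as a pair), one approach that needs neither look-ahead nor counting is to treat "" as a single alternative next to an ordinary identifier character. The Java sketch below is one reading of the examples in the question; in particular it assumes, from the """_name123" example, that an escaped pair may also appear in the first position. The class name QuotedIdentifier is made up.

```java
import java.util.regex.Pattern;

class QuotedIdentifier {
    // opening quote, first unit (letter, underscore or a "" pair),
    // then any number of units (word character or a "" pair), closing quote
    private static final Pattern IDENT = Pattern.compile(
            "\"(?:[a-zA-Z_]|\"\")(?:\\w|\"\")*\"");

    // matches() requires the whole input to fit the pattern.
    static boolean isValid(String s) {
        return IDENT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("\"na\"\"me\""));  // prints true
        System.out.println(isValid("\"na\"me\""));    // prints false
    }
}
```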

  • Using Regular Expressions to Find Quoted Text

    I have run into a couple problems with the following code.
    1) Slash-Star and Slash-Slash commented text must be ignored.
    2) It does not detect backslashed quotes, or if that backslash is backslashed.
    Can this be accomplished with Regular Expressions, or should I implement this using if/indexOf logic?
    Thank You in advance,
    Brian
       /**
        * Finds position of next quoted string in a line
        * of source code.
        * If no strings exist, then a Pointer position of
        * (0,0) is returned.
        * @param startPos position to start search from
        * @param aString  the line of text to search
        * @return next string position
        */
       public Pointer getQuotedStringPosition(int startPos, String aString) {
          String argText = new String( aString );
          Pattern p = Pattern.compile("[\"][^\"]+[\"]");
          Matcher m = p.matcher( argText.substring(startPos) );
          if( m.find() )
             return new Pointer( m.start() + startPos, m.end() + startPos );
          else
             return new Pointer( 0, 0 ); // indicates nothing was found
       }

    YATArchivist was right about the regular expressions.
    I think I've got it but somebody test it if you want. Let me know what you find.
    I've included a barebones Position class as well...
    import java.util.regex.*;
    import java.io.*;
    import java.util.*;
    /**
     * @author Joshua A. Logan, Jr.
     */
    public class RegexTest
    {
       private static final String SLASH_SLASH = "(//.*)";
       private static final String SLASH_STAR =
                               "(/\\*(?:[^\\*]|(?:\\*(?!/)))+(\\*/)?)";
       private static final Pattern COMMENT_PATTERN =
                         Pattern.compile( SLASH_SLASH + "|" + SLASH_STAR );
       private static final Pattern QUOTED_STRING_PATTERN =
                      Pattern.compile( "\"  ( (?:(\\\\.) | [^\\\"])*+  )     \"",
                                       Pattern.COMMENTS );
       // Breaking the above regular expression down, you'd have:
       //   "  ( (?: (\\ .)  |  [^\\ "]  ) *+ )   "
       //   ^          ^     ^     ^       ^      ^
       //   |          |     |     |       |      |
       //   1          2     3     4       5      6
       // which matches:
       // 1) The starting quote...
       // Followed by something that is either:
       // 2) some escaped sequence ( e.g. _\n_  or even _\"_ ),
       // 3)                ...or...
       // 4) a character that is neither a _\_ nor a _"_ .
       // 5) Keep searching this as much as possible, w/o giving up
       //                    any found text at the end.
       //        Note: the text found would be in group(1)
       // 6) Finally, find the ending quote!!
       public static Position [] getQuotedStringPosition( final String text )
       {
          Matcher cm = COMMENT_PATTERN.matcher( text ),
                  qm = QUOTED_STRING_PATTERN.matcher( text );
          final int len = text.length();
          int startPos = 0;
          List positions = new ArrayList();
          while ( startPos < len )
          {
             if ( cm.find(startPos) )
             {
                int commStart = cm.start(),
                    commEnd   = cm.end();
                // are we starting @ a comment?
                if ( commStart == startPos )
                {
                   startPos = commEnd;
                }
                else if ( qm.find(startPos) )
                {
                   // Search for unescaped strings in here.
                   int stringStart = qm.start(1),
                       stringEnd   = qm.end(1);
                   // Is the quote start after comment start?
                   if ( stringStart > commStart )
                   {
                      startPos = commEnd; // restart search after comment end...
                   }
                   else if ( (stringEnd > commEnd) ||
                             (stringEnd < commStart) )
                   {
                      // In this case, the "comment" is actually part of
                      // the quoted string. We found a match.
                      positions.add( new Position(text, qm.group(1),
                                                  stringStart,
                                                  stringEnd) );
                      int quoteEnd = qm.end();
                      startPos = quoteEnd;
                   }
                   else
                   {
                      throw new IllegalStateException( "illegal case" );
                   }
                }
                else
                {
                   startPos = commEnd;
                }
             }
             else
             {
                // no comments were found. Search for unescaped strings.
                int quoteEnd = len;
                if ( qm.find( startPos ) ) {
                   quoteEnd = qm.end();
                   positions.add( new Position(text,
                                               qm.group(1),
                                               qm.start(1),
                                               qm.end(1)) );
                }
                startPos = quoteEnd;
             }
          }
          return positions.isEmpty() ? Position.EMPTY_ARRAY
                                     : (Position[])positions.toArray(
                                              Position.EMPTY_ARRAY);
       }
       public static void main( String [] args )
       {
          try
          {
             BufferedReader br = new BufferedReader(
                      new InputStreamReader(System.in) );
             String input = null;
             final String prompt = "\nText (q to quit): ";
             System.out.print( prompt );
             while ( (input = br.readLine()) != null )
             {
                if ( input.equals("q") ) return;
                Position [] matches = getQuotedStringPosition( input );
                // What does it do?
                for ( int i = 0, max = matches.length; i < max; i++ )
                {
                   System.out.println( "-->" + matches[i] );
                }
                System.out.print( prompt );
             }
          }
          catch ( Exception e )
          {
             System.out.println ( "Exception caught: " + e.getMessage () );
          }
       }
    }
    class Position
    {
       public Position( String target,
                        String match,
                        int start,
                        int end )
       {
          this.target = target;
          this.match = match;
          this.start = start;
          this.end = end;
       }
       public String toString()
       {
          return "match==" + match + ",{" + start + "," + end + "}";
       }
       final String target;
       final int start;
       final int end;
       final String match;
       public static final Position [] EMPTY_ARRAY = { };
    }

  • Regular expression - find if string does NOT contain text....

    I have a string that I want to tokenize. The string can contain basically anything. I want to produce tokens for each "word" found, and for each "<=" or "," found. There does not need to be whitespace around a "<=" or a "," to consider it a token. So for example:
    joe schmoe<=jack, jane
    should become
    joe
    schmoe
    <=
    jack
    jane
    As a constraint, I do not want to use StringTokenizer at all, as "its use is discouraged in new code". http://java.sun.com/j2se/1.4.2/docs/api/java/util/StringTokenizer.html
    Here's the code I plan on using for this:
        public String[] getWords(String input) {
            Matcher matcher = WORD_PATTERN.matcher(input);
            ArrayList<String> words = new ArrayList<String>();
            while (matcher.find()) {
                words.add(matcher.group());
            }
            return (String[]) words.toArray(new String[0]);
        }
    The trick, though, is coming up with a working regular expression. The closest I've found yet is:
    ([^\s]|^(,)|^(<=))+|,|<=
    but that produces the following:
    joe
    schmoe<=jack,
    jane
    I think what I need is to be able to find if a string does not contain the substring "<=" or "," using a regular expression. Anyone know how to do this, or another way to do this using regular expressions?

    Try:
    /*
     * Tokenizer.java
     * version 1.0
     * 01/06/2005
     */
    package samples;

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    /**
     * @author notivago
     */
    public class StrangeTokenizer {
        public static void main(String[] args) {
            String text = "joe schmoe<=jack, jane";
            // Match "<=", "," or a run of word characters, tried in that order.
            Pattern pattern = Pattern.compile( "((?:<=)|(?:,)|(?:\\w+))");
            Matcher matcher = pattern.matcher(text);
            while( matcher.find() ) {
                System.out.println( "Item: " + matcher.group(1) );
            }
        }
    }
    May the code be with you.
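
    The reply above treats a "word" as a run of \w characters. Since the question says the string can contain basically anything, a negative lookahead is one way to express "a word does not contain <= or ,". What follows is only a sketch under that assumption: the class name LookaheadTokenizer and the method tokenize are made up for illustration and are not part of the original thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread: tokenizes on "<=", "," and whitespace.
class LookaheadTokenizer {
    // Either "<=" or "," as a token of its own, or a run of characters that
    // are neither whitespace nor "," and that do not begin a "<=" sequence.
    private static final Pattern TOKEN =
            Pattern.compile("<=|,|(?:(?!<=)[^\\s,])+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<String>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints one token per line for the sample string from the question.
        for (String token : tokenize("joe schmoe<=jack, jane")) {
            System.out.println(token);
        }
    }
}
```

    With this pattern the "," is returned as a token of its own, and a "<" that is not followed by "=" stays inside the surrounding word. If commas should be dropped instead, they can be filtered out of the returned list.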

  • Find text Between tags with a Regular Expression

    I am trying to find specific text (table names) within a
    <cfquery> tag in all my cfm files. I am using the extended find
    function in Homesite (I think Dreamweaver has the same
    functionality). This expression works:
    <[Cc][fF][qQ][uU][eE][rR][Yy]
    [^>]*>[^>]*(EventName|AttendeeName)[^>]*</[Cc][fF][qQ][uU][eE][rR][Yy]>
    for finding the text EventName or AttendeeName. However, if
    there are other cf tags like <cfif> within the
    <cfquery>, then the tag/text is not found.
    Can anyone help? It is a useful expression to have if you are
    trying to transfer applications developed on a Windows machine to,
    say, a Linux machine, and have table name case sensitivity issues with
    MySQL.

    quote:
    Originally posted by: Newsgroup User
    Thanks for all the help. Comments below.
    > Thanks, but it:
    > 1) Captures everything between the first and last query in a script if there
    > is more than one cfquery in the script
    Oops: sorry. Stick a question mark after the asterisks to stop the matches
    being greedy.
    Used this:
    <[Cc][fF][qQ][uU][eE][rR][Yy].*?(EventName|AttendeeName)[^>].*?</[Cc][fF][qQ][uU][eE][rR][Yy]>
    and got some finds again with multiple queries and some errors as mentioned below.
    > 2) It produces some regular expression errors in Homesite.
    Can't help you there. Sounds like HS's regex processor is bung: there's
    nothing non-standard or tricky about that regex (which might cause
    compatibility issues; JS vs PERL vs Java, etc).
    HS on the whole is bung (IMO). Have you considered using a text editor
    that is... err... *current*? ;-)
    No, can you suggest one. Just use HS for years and it does most of what I want.
    What sort of errors is it giving?
    Regular expression error No 17. Bad expression format or internal error.
    > The reason for this is I am developing on a windows machine with mysql and
    > want to use the application online on a linux machine where table names are
    > case sensitive. My code was not always faithful to that since in windows you
    > can be sloppy!
    Have you seen this:
    http://dev.mysql.com/doc/refman/5.0/en/identifier-case-sensitivity.html
    It might be a better approach anyhow.
    Adam
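
    For anyone who can run the search outside Homesite, the same idea can be sketched in Java in two steps: first isolate each cfquery block with a reluctant match, then look for the table names inside the block, so nested tags such as <cfif> no longer get in the way. This is only an illustration under assumptions: the class name CfQueryTableFinder and the method findTableNames are made up here, and the flags are those of java.util.regex, which Homesite's own regex engine may not support.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the thread.
class CfQueryTableFinder {
    // One <cfquery ...>...</cfquery> block. The reluctant .*? stops at the
    // first closing tag, and DOTALL lets it span line breaks.
    private static final Pattern QUERY_BLOCK = Pattern.compile(
            "<cfquery\\b[^>]*>(.*?)</cfquery>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // The table names from the question, matched in any letter case.
    private static final Pattern TABLE_NAME = Pattern.compile(
            "\\b(EventName|AttendeeName)\\b", Pattern.CASE_INSENSITIVE);

    // Returns every table-name occurrence found inside a cfquery block,
    // spelled exactly as it appears in the source.
    static List<String> findTableNames(String source) {
        List<String> found = new ArrayList<String>();
        Matcher block = QUERY_BLOCK.matcher(source);
        while (block.find()) {
            Matcher name = TABLE_NAME.matcher(block.group(1));
            while (name.find()) {
                found.add(name.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String source =
                "<cfquery name=\"q1\">SELECT * FROM eventname "
                + "<cfif x>WHERE id = 1</cfif></cfquery>";
        System.out.println(findTableNames(source));
    }
}
```

    Because the names are returned with the casing used in the file, comparing them against the real table names shows where the code was not faithful to case.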
