Escaping entire String in a Regular Expression

How does one escape an entire String in a Pattern? I thought that prefixing it with "\\Q" and postfixing with "\\E" would do the trick, but this is functioning strangely. It also ignores the possibility of a "\\Q" and/or "\\E" within the String. I guess one could escape every meta-character, but this seems like overkill. Is there a utility method somewhere? Why is it that
    System.out.println("xx\\xx".replaceAll("\\Qx\\x\\E", "a"));
yields xax, but
    System.out.println("xx\\xx".replaceAll("\\Q\\\\E", "a"));
outputs xx\xx?

It looks like you've uncovered a bug. Specifically, when the Pattern parser sees the backslash that you're trying to match, it looks at the next character to see if it's an 'E'. It isn't, so the parser consumes both characters, then starts looking for the \E sequence again. The next character, of course, is the 'E', but the parser just sees it as a literal 'E' now. End result: instead of a Pattern for a single backslash, you get a Pattern for a backslash, followed by a backslash, followed by an 'E', as demonstrated here:
    System.out.println("xx\\\\Exx".replaceAll("\\Q\\\\E", "a"));
prints xxaxx.
Since you're escaping the whole regex, you can work around the bug by leaving off the \E:
    System.out.println("xx\\xx".replaceAll("\\Q\\", "a"));
prints xxaxx.
Or you can do what I do and use this method to escape strings instead of \Q ... \E:
  public static String quotemeta(String str) {
    if (str.length() == 0) {
      return "";
    }
    StringBuffer buf = new StringBuffer();
    for (int i = 0; i < str.length(); i++) {
      char c = str.charAt(i);
      // Prefix each regex metacharacter with a backslash.
      if ("\\[](){}.*+?$^|".indexOf(c) != -1) {
        buf.append('\\');
      }
      buf.append(c);
    }
    return buf.toString();
  }
(This is why I've never run afoul of this bug, though I use regexes a lot.)
I'll submit this to BugParade if you like. While I'm at it, I can submit an RFE to have them add a quotemeta method to Pattern.
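
For readers on Java 5 or later: java.util.regex.Pattern has a static quote(String) method, and Matcher has quoteReplacement(String), which together cover the use case the quotemeta method above was written for. The sketch below is only an illustration; the class name QuoteDemo and the helper replaceLiteral are made up for this example and are not from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: replaces every literal occurrence of `literal`
    // in `input`, treating neither the search text nor the replacement
    // as regex syntax.
    static String replaceLiteral(String input, String literal, String replacement) {
        return input.replaceAll(Pattern.quote(literal),
                                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // A single backslash as the search text (the case from the question).
        System.out.println(replaceLiteral("xx\\xx", "\\", "a"));   // xxaxx
        // Search text that itself contains \E.
        System.out.println(replaceLiteral("a\\Eb", "\\E", "-"));    // a-b
        // '$' in the replacement is taken literally as well.
        System.out.println(replaceLiteral("1+1=2", "+", "$"));      // 1$1=2
    }
}
```

Pattern.quote also copes with a "\\E" inside the input, which was the other concern raised in the question.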

Similar Messages

  • Converting String Characters into Regular Expression Automatically?

    Hi guys.... is there any program or sample coding which is available to convert string characters into regular expression automatically when the program is run?
    Example:
    String Character Input: fnffffffffffnnnnnnnnnffffffnnfnnnnnnnnnfnnfnfnfffnfnfnfnfnfnnnnd
    When the program runs, it automatically convert into this :
    Regular Expression Output: f*d

    hey guys.... i am sorry for not providing all the information that you guys need as i was rushing off to urgent meeting... for my string characters i only have a to n.. all these characters are collected from sensors and stored inside database... from many demos i have done... i found out that every demo has different strings of characters collected and these string of characters will not match with the regular expressions that i had created due to several unwanted inputs and stuff... i have a lot of different types of plan activities and therefore a lot of regular expressions.... if i put [a-z|0-9]*... it will capture all characters but in the same time it will be showing 1 plan only.... therefore, i am finding ways to get the strings i collected and let it form into regular expression by themselves in the program so that it will appear as different plans as output with comparing with the regular expression that i had created.... is there any way to do so?
    please post again if there is any questions u are still not familiar with... thank you...

  • How to create a list of string if a regular expression is given ?

    Hi folks,
    I have a regular expression say abcd[a-z]\\\.[0-9] . ( please ignore one '\')
    For this string i know that
    following string matches successfully
    1. abca.0
    2. abcb.1
    3. abcz.9 ......etc n number of combination are possible.
    Is there any algorithm which will create some random strings from a regular expression?
    input to algorithm : some string pattern
    output to algorithm : some matching strings ( can be a single or an array of matching strings)
    Thanks in advance..
    Sethu
    Edited by: Sethumadhavan on Apr 16, 2008 6:32 AM

    Can u please give little more explanation...
    If i get some some values i can exit with the values ... and from the values i got i can ignore the duplicates ...
    But i am not getting the basic algorithm to get list of strings.....( DFA? or NFA?)
    thanks
    sethu

  • String validation without regular expressions

    Hello all
    I'm facing a little problem, basically i have to make a method that validates an input String "a name"
    Numbers and symbols are not allowed, but white spaces are.
    The method has to be implemented without the use of JFormattedTextField or regular expressions.
    What i'm doing right now is this:
    public boolean validate(String name){
       char[] arr=name.toCharArray();
        for(Char c:arr){
          if(!Character.isLetter(c)){
           return false;
          }
        }
      return true;
    }
    That isLetter() method is very useful but it sees the white spaces are "non letters".
    I am a bit lost at this point, i'm trying a lot of methods of String and Character but nothing seems to work
    do you have any advice?
    Thx

    enrico wrote:
    That isLetter() method is very useful but it sees the white spaces are "non letters".
    I am a bit lost at this point, i'm trying a lot of methods of String and Character but nothing seems to work
    do you have any advice?
    Yes: don't try to do it all in one expression. 'If' statements allow you to use '&&' and '||' to connect expressions, so use them.
    Second: Work out what you want to do BEFORE you start programming.
    In this case, you need to know exactly which characters you want to allow, and when (see baftos' examples above).
    Third: for(Char c:arr){ is meaningless (unless you've defined a class called 'Char').
    Accuracy is important.
    Winston
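
A minimal sketch of the advice above, combining two Character checks with '&&' instead of relying on isLetter() alone. The class name NameValidator is made up for this example, and rejecting empty or blank names is an assumption; the thread does not say how those should be treated.

```java
class NameValidator {
    // Returns true when `name` contains only letters and whitespace.
    // Assumption: null, empty, and all-blank names are rejected.
    static boolean isValidName(String name) {
        if (name == null || name.trim().isEmpty()) {
            return false;
        }
        for (char c : name.toCharArray()) {
            // Reject anything that is neither a letter nor whitespace.
            if (!Character.isLetter(c) && !Character.isWhitespace(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidName("Mary Ann Smith")); // true
        System.out.println(isValidName("J0hn"));           // false
    }
}
```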

  • String splitting with regular expressions

    Hello everyone
    I need some help in splitting the string using regular expressions
    Suppose my String is : abc def "ghi jkl mno" pqr stu
    After splitting, the resulting string array should contain the elements
    abc
    def
    ghi jkl mno
    pqr
    stu
    What should my regular expression be?

    Since this is essentially the same as parsing CSV data, you might want to download a CSV parser and adapt it to your need. But if you want to use regexes, split() is not the way to go. This approach should work for your sample data:
    Pattern p = Pattern.compile("\"[^\"]*+\"|\\S+");
    Matcher m = p.matcher(input);
    while (m.find()) {
      System.out.println(m.group());
    }
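
Note that m.group() in the loop above returns the quoted token with its quotation marks still attached, while the requested output lists ghi jkl mno without them. One possible variation, sketched here with capturing groups, returns the text inside the quotes instead. The class name QuotedTokenizer is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTokenizer {
    // Group 1: contents of a double-quoted run; group 2: a bare token.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups participates in each match.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [abc, def, ghi jkl mno, pqr, stu]
        System.out.println(tokenize("abc def \"ghi jkl mno\" pqr stu"));
    }
}
```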

  • Splitting html ul tags and their content into string arrays using regular expression

    <ul data-role="listview" data-filter="true" data-inset="true">
    <li data-role="list-divider"></li><li><a href="#"><h3>
    my title
    </h3><p><strong></strong></p></a></li>
    </ul>
    <ul data-role="listview" data-filter="true" data-inset="true">
    <li data-role="list-divider"></li><li>test.</li>
    </ul>
    I need to be able to split this HTML into two arrays, each holding an entire <ul></ul> tag. Please help.
    Thanks.

    Hi friend.
    This forum is to discuss problems of C# development. Your question is not related to the topic of this forum.
    You'll need to post it in the dedicated Archived Forums N-R  > Regular Expressions
     for better support. Thanks for understanding.
    Best Regards,
    Kristin

  • Need advice on negating a whole string line with regular expression

    Hi All,
    I am not able to ignore / get rid of the following line, even though my Java 6 (Windows XP) String pattern matching does not cater for it:
    *% Cleared: 61%*
    Below is the existing Java String Pattern matching in the simple program:
    Pattern pattern = Pattern.compile("(^.*[A-Z][a-z]*){1,2} \\d{0,4}/?\\d{0,4} ([A-Z][a-z]*){1,2} St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La \\d br [h|u|t] \\$\\d+,\\d+|\\$\\d*\\,\\d+,\\d+ ([A-Z][a-z]*){1,}.*$");
    This pattern is working for valid strings.
    The following pattern has included "^(?!.*\.\.).*$" into the existing one but had no luck still:
    Pattern pattern = Pattern.compile("^(?!.*\.\.).*$|((^.*[A-Z][a-z]*){1,2} \\d{0,4}/?\\d{0,4} ([A-Z][a-z]*){1,2} St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La \\d br [h|u|t] \\$\\d+,\\d+|\\$\\d*\\,\\d+,\\d+ ([A-Z][a-z]*){1,}.*$)");
    This picked up other rubbish including "*% Cleared: 61%*".
    I am looking for a single regular expression that applies to the whole line.
    I am quite new to regular expressions but have read through Regular Expressions Cookbook (O'Reilly, 2009) and am still not familiar with advanced features such as lookahead / lookbehind...
    Your assistance would be appreciated.
    Thanks,
    Jack

    Hi Winston,
    I am still digesting the material from the regular expression book and will take sometime to become proficient with it.
    It seems that using groupCount() to eliminate the unwanted text does not work in this case, since all the lines returned the same value, i.e. the 3 posted earlier. This may be because the patterns are complex and only a few parts were grouped together. Otherwise, could you provide an example using the string posted as opposed to a hypothetical one? In the meantime, at least one solution has been found by defining an additional special pattern “\\A[^%].*\\Z” and then combining / intersecting the existing and the new special pattern to get the best of both worlds. Another approach that should also work is to evaluate the size of String.split() and only accept those lines with a minimum number of tokens.
    Anyhow, I have come across another minor stumbling block in the meantime with the following line, where some hidden characters are preventing the existing pattern from reading it:
    o;?Mervan Bay 40 Boyde St 7 br t $250,000 X West Park AE
    Below is the existing regular expression that works for other lines with the same pattern but not for special hidden characters such as “o;?”:
    \\A([A-Z][a-z]*){1,2} [0-9]{0,4}/?[0-9]{0,4}-?[0-9]{0,4} ([A-Z][a-z]*){1,2} St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La [0-9] br [h|u|t] \\$\\d+,\\d+|\\$\\d*\\,\\d+,\\d+ ([A-Z][a-z]*){1,}\\Z
    Is it possible to come up with a regular expression to ignore them so that this line could be picked up? I would also like to know whether I could combine the special pattern “\\A[^%].*\\Z” with the existing one as opposed to using 2 separate patterns altogether?
    Many thanks,
    Jack
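
One thing that may be worth checking in the patterns above, offered as an observation rather than a confirmed diagnosis: alternation ('|') has the lowest precedence in a regex, so an ungrouped St|Rd|Av|... splits the whole expression into independent alternatives, and "Rd" alone can then match anywhere in a line. Likewise, [h|u|t] is a character class that also accepts a literal '|'. The sketch below uses a deliberately simplified pattern (not the full one from the post) to show the difference that a non-capturing group makes. The class and field names are made up for this example.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Ungrouped: parsed as three alternatives,
    //   "\A[A-Z][a-z]* St"  or  "Rd"  or  "Av\Z"
    static final Pattern UNGROUPED = Pattern.compile("\\A[A-Z][a-z]* St|Rd|Av\\Z");

    // Grouped: the alternation is confined to the street-type suffix.
    static final Pattern GROUPED = Pattern.compile("\\A[A-Z][a-z]* (?:St|Rd|Av)\\Z");

    static boolean found(Pattern p, String line) {
        return p.matcher(line).find();
    }

    public static void main(String[] args) {
        String junk = "% Cleared: Rd 61%";
        System.out.println(found(UNGROUPED, junk));      // true: "Rd" matches on its own
        System.out.println(found(GROUPED, junk));        // false
        System.out.println(found(GROUPED, "Boyde St"));  // true
    }
}
```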

  • String replace using regular expressions

    I'm not very good at regular expressions, but I would like my script to replace
         <a href="somepage.html">
    by
    <a href="event:somepage">
    How do I do this?  Thanks in advance!

    Replacing a string that matches a certain pattern with another string is one of the more common RegEx tasks. There is documentation on using them here:
    http://livedocs.adobe.com/flex/3/html/help.html?content=12_Using_Regular_Expressions_01.html
    hth,
    matt horn
    flex docs

  • String.matches() question - regular expression help

    How come the following code's if condition returns false?
    String someFile="Dr. Phil.pdf";
    if (someFile.matches("[.][Pp][Dd][Ff]$")) {
      System.out.println("File is a pdf file.");
    }
    When I change the matches method to matches(".*[Pp][Dd][Ff]$") it works, so does that mean it has to match the entire string to return true? If so, how can I determine if a partial match occurred?
    If partial matching isn't feasible, then can someone help me determine if this is the best matching pattern to use:
    matches(".*[.][Pp][Dd][Ff]$")
    Thanks.

    The documentation is your friend.
    [String.matches(regex)|http://java.sun.com/javase/6/docs/api/java/lang/String.html#matches(java.lang.String)] says:
    An invocation of this method of the form str.matches(regex) yields exactly the same result as the expression
    Pattern.matches(regex, str)
    And [Pattern.matches(regex, str)|http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html#matches(java.lang.String, java.lang.CharSequence)] says:
    behaves in exactly the same way as the expression
    Pattern.compile(regex).matcher(input).matches()
    And [Matcher.matches()|http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#matches()] says:
    Attempts to match the entire region against the pattern.
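
For the partial-match part of the question, Matcher.find() looks for the pattern anywhere in the input, so the original suffix pattern can be kept without a leading ".*". A minimal sketch follows; the class name PdfCheck is made up for this example, and using the CASE_INSENSITIVE flag in place of [Pp][Dd][Ff] is a stylistic choice, not something from the thread.

```java
import java.util.regex.Pattern;

class PdfCheck {
    // "[.]pdf" at the end of the input, ignoring case.
    private static final Pattern PDF_SUFFIX =
            Pattern.compile("[.]pdf$", Pattern.CASE_INSENSITIVE);

    static boolean hasPdfSuffix(String fileName) {
        // find() succeeds if any part of the input matches the pattern.
        return PDF_SUFFIX.matcher(fileName).find();
    }

    public static void main(String[] args) {
        String someFile = "Dr. Phil.pdf";
        System.out.println(someFile.matches("[.][Pp][Dd][Ff]$")); // false: whole string must match
        System.out.println(hasPdfSuffix(someFile));               // true
    }
}
```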

  • Change particular characters in a string by using regular expressions ...

    Hello Everyone,
    I am trying to write a function using Oracle's regular expression function REGEXP_REPLACE, but I have not succeeded so far.
    My problem is as follows: I have a text in a column, for example 'sdfsdf Sdfdfs Sdfd', and I want to replace all s and S characters with X so that the text looks like 'XdfXdf XdfdfX Xdfd'.
    Is it possible by using regular expressions in Oracle?
    Can you give me some clues ?
    Thank you

    SQL> SELECT
      2  regexp_replace('sdfsdf Sdfdfs Sdfd','s|S','X') from dual;
    REGEXP_REPLACE('SD
    XdfXdf XdfdfX Xdfd
    Regards,
    Achyut

  • String extract using regular expression

    Hi
    I have text like this: "<a>45</a><ct>Hi</ct><R>45 85</R><H>Here</H>". I want to extract, using a regular expression or any other technique, the text between <R> and </R>, and I also need to replace the space between 45 and 85 with a pipe, like "45|85".
    Edited by: vishnu prakash on Mar 2, 2012 4:42 AM

    Hi,
    Here's one way:
    REPLACE ( REGEXP_REPLACE ( txt
                             , '.*<R>(.*)</R>.*'
                             , '\1'
                             )
            , ' '
            , '|'
            )
    This assumes there is only one <R> tag in txt.
    Always say which version of Oracle you're using. The expression above will work in Oracle 10 and up, but starting in Oracle 11 you can use REGEXP_SUBSTR rather than the less intuitive REGEXP_REPLACE.
    Edited by: Frank Kulash on Mar 2, 2012 7:48 AM

  • String Manipulation using Regular Expression

    Hello Guys,
    I am stuck in a situation where I want to extract specific data from a column of a table.
    Below are the values of a particular column. I want to drop the text in brackets (along with the brackets themselves) and extensions such as .pdf and .doc.
    Tris(dibenzylideneacetone)dipalladium (0) 451CDHA.pdf
    AM57001A(ASRM549CDH).DOC
    AM23021A Identity of sulfate (draft)
    PG-1183.E.2 (0.25 mg FCT)
    AS149656A (DEV AERO APPL HFA WHT PROVENTIL)
    Stability report (RSR) Annex2 semi-solid form (internal information)
    TSE(Batch#USLF000332)-242CDH, Lancaster synthesis.pdf
    TR3018520A Addendum 1 (PN 3018520)
    AM10311A Particle size air-jet sieving (constant sieving) (draft)
    ASE00099B Addendum (PN E000099) 90 mesh
    AM37101_312-99 (Z11c) Palladium by DCP.doc
    PS21001A_1H-NMR.doc (PN 332-00)
    AM68311A (Q-One CP 33021.02) Attachment
    AM68202-1A (BioReliance no. 02.102006) Attachment
    I want below output for above values for column 
    Trisdipalladium451CDHA
    AM57001A
    AM23021A Identity of sulfate
    PG-1183.E.2
    Thanks in advance

    Like this?
    with t
    as
    (
    select 'Tris(dibenzylideneacetone)dipalladium (0) 451CDHA.pdf' str from dual
    union all
    select 'AM57001A(ASRM549CDH).DOC' str from dual
    union all
    select 'AM23021A Identity of sulfate (draft)' str from dual
    union all
    select 'PG-1183.E.2 (0.25 mg FCT)' str from dual
    union all
    select 'AS149656A (DEV AERO APPL HFA WHT PROVENTIL)' str from dual
    union all
    select 'Stability report (RSR) Annex2 semi-solid form (internal information)' str from dual
    union all
    select 'TSE(Batch#USLF000332)-242CDH, Lancaster synthesis.pdf' str from dual
    union all
    select 'TR3018520A Addendum 1 (PN 3018520)' str from dual
    union all
    select 'AM10311A Particle size air-jet sieving (constant sieving) (draft)' str from dual
    union all
    select 'ASE00099B Addendum (PN E000099) 90 mesh' str from dual
    union all
    select 'AM37101_312-99 (Z11c) Palladium by DCP.doc' str from dual
    union all
    select 'PS21001A_1H-NMR.doc (PN 332-00)' str from dual
    union all
    select 'AM68311A (Q-One CP 33021.02) Attachment' str from dual
    union all
    select 'AM68202-1A (BioReliance no. 02.102006) Attachment' str from dual
    )
    select str
         , regexp_replace(str, '(\([^)]+\))|(\..{3})') str_new
      from t;
    STR                                                                    STR_NEW
    Tris(dibenzylideneacetone)dipalladium (0) 451CDHA.pdf                  Trisdipalladium  451CDHA
    AM57001A(ASRM549CDH).DOC                                              AM57001A
    AM23021A Identity of sulfate (draft)                                  AM23021A Identity of sulfate
    PG-1183.E.2 (0.25 mg FCT)                                              PG-1183
    AS149656A (DEV AERO APPL HFA WHT PROVENTIL)                            AS149656A
    Stability report (RSR) Annex2 semi-solid form (internal information)  Stability report  Annex2 semi-solid form
    TSE(Batch#USLF000332)-242CDH, Lancaster synthesis.pdf                  TSE-242CDH, Lancaster synthesis
    TR3018520A Addendum 1 (PN 3018520)                                    TR3018520A Addendum 1
    AM10311A Particle size air-jet sieving (constant sieving) (draft)      AM10311A Particle size air-jet sieving
    ASE00099B Addendum (PN E000099) 90 mesh                                ASE00099B Addendum  90 mesh
    AM37101_312-99 (Z11c) Palladium by DCP.doc                            AM37101_312-99  Palladium by DCP
    PS21001A_1H-NMR.doc (PN 332-00)                                        PS21001A_1H-NMR
    AM68311A (Q-One CP 33021.02) Attachment                                AM68311A  Attachment
    AM68202-1A (BioReliance no. 02.102006) Attachment                      AM68202-1A  Attachment
    14 rows selected.

  • Regular expression - escape characters

    Hi. Is there an escape character for "?", "[", "]", "{", "}" for regular expression? I tried to do the following: "[^[]?{}]*" (the string cannot contain a question mark, left or right bracket, or left or right curly brace). However, I get an error stating unexpected character.
    thanks,
    Paul.

    You should only have to escape the characters that cause a problem in the character class, rather than everything so the following should work.
    "[^\\[\\]?{}]*"

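A quick way to check the escaped character class from the reply above, assuming the pattern is being used from Java (the thread does not name the language, so that is an assumption). In a Java character class, '?', '{' and '}' are literal, while '[' and ']' need a backslash. The class name ForbiddenChars is made up for this example.

```java
class ForbiddenChars {
    // True when `s` contains none of: ? [ ] { }
    static boolean isClean(String s) {
        return s.matches("[^\\[\\]?{}]*");
    }

    public static void main(String[] args) {
        System.out.println(isClean("hello world")); // true
        System.out.println(isClean("what?"));       // false
        System.out.println(isClean("a[0]"));        // false
    }
}
```
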
  • How to split a string with regular expression

    Hi.
    I need to split a string with a regular expression.
    Example
    String = "this is; a test";rune haavik;12345;
    And I want the output to be:
    "this is; a test"
    rune haavik
    12345
    If I use this code:
    private void test1() {
      String str = "\"this is; a test\";rune haavik;12345;";
      int i = 0;
      String[] tmp = str.split(";");
      while (i < tmp.length) {
        System.out.println(tmp[i]);
        i++;
      }
    }
    Then it also splits inside the quoted text.
    Regards
    Rune haavik

    Rune haavik:
    The most effective way to achieve the end result is, I believe, to read the characters one by one, using a flag that indicates if we are inside quotation or not.
    Well, if we are in a mind game, then the following should do.
           String[] tmp = str.split(";(?![^\"]*\";)");
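
The character-by-character approach mentioned at the start of the reply could look roughly like the sketch below: walk the string once and toggle a flag whenever a quotation mark is seen, splitting only on delimiters found outside quotes. The class name QuoteAwareSplitter is made up for this example. Keeping the quotation marks in the output and dropping a trailing empty field are assumptions chosen to match the desired output in the question.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {
    static List<String> split(String input, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;   // entering or leaving a quoted run
                current.append(c);      // keep the quote, as in the desired output
            } else if (c == delimiter && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);   // start the next field
            } else {
                current.append(c);
            }
        }
        // Add the last field unless the input ended with a delimiter.
        if (current.length() > 0) {
            fields.add(current.toString());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Prints ["this is; a test", rune haavik, 12345]
        System.out.println(split("\"this is; a test\";rune haavik;12345;", ';'));
    }
}
```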

  • Regular expressions... they are not regular! =)

    So,
    I've been pulling my hair out with regular expressions. I'm sure there is a logical explanation to this, but i've read a bunch of explanations and i THOUGHT i understood this, but i don't. Here goes:
    I have a string "2010PETE". I tried matching it to "\\d{1,}" (this is how i entered it in Java). This returns FALSE. HOWEVER, it seems to me the above should be TRUE, because the documentation says that a greedy quantifier with {n,} matches the preceding element AT LEAST n times, where in this case n=1. So i interpret this as "if a digit (\\d) is found at least once within the string, then this string matches the regular expression". This does NOT seem to be the case.
    Can someone clear this up for me?

    THANK YOU. i think that is what i was missing, the part about
    "would only match if the input consisted of at least one digit, possibly multiple digits, and nothing else."
    I read the documentation and some of it didn't seem to be clear on that point.
    i'll play around with this and see how far i can get. if i still have questions i will post some code for sure, and try to get a nice, rounded set of examples.
    thanks!
    ONE OTHER QUESTION I JUST THOUGHT OF: does the .matches() method match expressions when some substring of the String matches, or does it have to match the entire String? So, if i have the String "123ABC", and i ask to match "1 or more letters" will it fail because there are non-letters in the String, but then pass if i add "1 or more letters AND 1 or more digits"? so, in the latter every character in the String is accounted for in the search, as opposed to the first. Is that correct, or are there ways to JUST match some substring in the String instead of the whole thing? i WILL make some examples too... but does that make sense?
    Edited by: pedron on Jan 12, 2012 3:23 PM
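
On the follow-up question: java.util.regex.Matcher offers three ways to apply a pattern, and only matches() requires the whole input to match. A small sketch using the strings from this thread; the class and method names are made up for this example.

```java
import java.util.regex.Pattern;

class MatchModes {
    // Entire input must match the pattern.
    static boolean whole(String regex, String s) {
        return Pattern.compile(regex).matcher(s).matches();
    }

    // Pattern must match starting at the beginning, but not necessarily to the end.
    static boolean prefix(String regex, String s) {
        return Pattern.compile(regex).matcher(s).lookingAt();
    }

    // Pattern may match any substring of the input.
    static boolean anywhere(String regex, String s) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(whole("\\d{1,}", "2010PETE"));        // false
        System.out.println(prefix("\\d{1,}", "2010PETE"));       // true
        System.out.println(anywhere("[A-Za-z]+", "123ABC"));     // true
        System.out.println(whole("[A-Za-z]+", "123ABC"));        // false
        System.out.println(whole("\\d+[A-Za-z]+", "123ABC"));    // true
    }
}
```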
