Escaping entire String in a Regular Expression

How does one escape an entire String in a Pattern? I thought that prefixing it with "\\Q" and postfixing with "\\E" would do the trick but this is functioning strangely. It also ignores the possibility of a "\\Q" and/or "\\E" within the String. I guess one could escape every meta-character but this seems like overkill. Is there a utility method somewhere? Why is it that:
System.out.println("xx\\xx".replaceAll("\\Qx\\x\\E", "a"));
yields xax, but
System.out.println("xx\\xx".replaceAll("\\Q\\\\E", "a"));
outputs xx\xx?

It looks like you've uncovered a bug. Specifically, when the Pattern parser sees the backslash that you're trying to match, it looks at the next character to see if it's an 'E'. It isn't, so the parser consumes both characters, then starts looking for the \E sequence again. The next character, of course, is the 'E', but the parser just sees it as a literal 'E' now. End result: instead of a Pattern for a single backslash, you get a Pattern for a backslash, followed by a backslash, followed by an 'E', as demonstrated here:
    System.out.println("xx\\\\Exx".replaceAll("\\Q\\\\E", "a"));
prints xxaxx.
Since you're escaping the whole regex, you can work around the bug by leaving off the \E:
    System.out.println("xx\\xx".replaceAll("\\Q\\", "a"));
prints xxaxx.
Or you can do what I do and use this method to escape strings instead of \Q ... \E:
public static String quotemeta(String str) {
    if (str.length() == 0) {
      return "";
    }
    StringBuffer buf = new StringBuffer();
    for (int i = 0; i < str.length(); i++) {
      char c = str.charAt(i);
      // Prefix every regex metacharacter with a backslash.
      if ("\\[](){}.*+?$^|".indexOf(c) != -1) {
        buf.append('\\');
      }
      buf.append(c);
    }
    return buf.toString();
}
(This is why I've never run afoul of this bug, though I use regexes a lot.)
I'll submit this to BugParade if you like. While I'm at it, I can submit an RFE to have them add a quotemeta method to Pattern.
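For readers on Java 5 or later: the standard library has since gained Pattern.quote(String), which wraps a string so that it is matched literally (including strings that themselves contain \E), and Matcher.quoteReplacement(String), which does the same for the replacement side of replaceAll. A minimal sketch follows; the class name QuoteDemo and the helper replaceLiteral are made up for illustration and are not part of the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: replaces every literal occurrence of `literal`
    // in `input` with `replacement`, treating neither as regex syntax.
    static String replaceLiteral(String input, String literal, String replacement) {
        return input.replaceAll(Pattern.quote(literal),
                                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // A single backslash as the search text.
        System.out.println(replaceLiteral("xx\\xx", "\\", "a"));  // xxaxx
        // Search text that itself contains \E.
        System.out.println(replaceLiteral("a\\Eb", "\\E", "-"));  // a-b
        // Metacharacters on both the pattern and the replacement side.
        System.out.println(replaceLiteral("1+1", "+", "$"));      // 1$1
    }
}
```

When the goal is only to replace literal text, String.replace(CharSequence, CharSequence) avoids regex syntax altogether.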

Similar Messages

Converting String Characters into Regular Expression Automatically?

Hi guys, is there any program or sample code available that converts a string of characters into a regular expression automatically when the program is run?
Example:
String Character Input: fnffffffffffnnnnnnnnnffffffnnfnnnnnnnnnfnnfnfnfffnfnfnfnfnfnnnnd
When the program runs, it automatically converts it into this:
Regular Expression Output: f*d

Hey guys, I am sorry for not providing all the information you need; I was rushing off to an urgent meeting. My strings only contain the characters a to n. All these characters are collected from sensors and stored in a database. From the many demos I have done, I found that every demo produces a different string of characters, and these strings do not match the regular expressions I had created because of several unwanted inputs. I have a lot of different types of plan activities and therefore a lot of regular expressions. If I use [a-z|0-9]*, it captures all characters but at the same time shows only 1 plan. So I am looking for a way to take the strings I collected and have the program form them into regular expressions by itself, so that they appear as different plans in the output when compared with the regular expressions I had created. Is there any way to do so?
Please post again if anything is still unclear. Thank you.

How to create a list of string if a regular expression is given ?

Hi folks,
I have a regular expression say abcd[a-z]\\\.[0-9] . ( please ignore one '\')
For this expression I know that the following strings match successfully:
1. abca.0
2. abcb.1
3. abcz.9 ... etc.; any number of combinations are possible.
Is there any algorithm which will create some random strings from a regular expression?
Input to the algorithm: some string pattern
Output of the algorithm: some matching strings (a single string or an array of matching strings)
Thanks in advance..
Sethu
Edited by: Sethumadhavan on Apr 16, 2008 6:32 AM

Can you please give a little more explanation?
If I get some values I can exit with those values, and from the values I got I can ignore the duplicates.
But I am not getting the basic algorithm to get the list of strings (DFA? or NFA?).
thanks
sethu
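A general generator would have to parse the regular expression (for instance into an NFA or DFA) and walk the result, which is more than a short example can show. The sketch below sidesteps parsing entirely: it assumes the pattern is a fixed sequence of positions, each with a known set of allowed characters, described by hand. It uses the shape of the three listed examples (abc, one lowercase letter, a dot, one digit). The class name SimpleRegexSampler and the method sample are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class SimpleRegexSampler {
    // Builds one random string by picking a single character from the set
    // of characters allowed at each position. No regex parsing is done;
    // the caller supplies the per-position sets.
    static String sample(List<String> positions, Random rng) {
        StringBuilder out = new StringBuilder();
        for (String allowed : positions) {
            out.append(allowed.charAt(rng.nextInt(allowed.length())));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-written description of abc[a-z]\.[0-9] (an assumption, see above).
        List<String> positions = Arrays.asList(
            "a", "b", "c", "abcdefghijklmnopqrstuvwxyz", ".", "0123456789");
        Random rng = new Random();
        for (int i = 0; i < 3; i++) {
            String s = sample(positions, rng);
            // Check each generated string against the real regex engine.
            System.out.println(s + " matches: " + s.matches("abc[a-z]\\.[0-9]"));
        }
    }
}
```

Quantifiers such as * or + make the set of matching strings infinite, so any generator that supports them needs a length limit; this sketch does not handle them at all.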

String validation without regular expressions

Hello all
I'm facing a little problem, basically i have to make a method that validates an input String "a name"
Numbers and symbols are not allowed, but white spaces are.
The method has to be implemented without the use of JFormattedTextField or regular expressions.
What i'm doing right now is this:
public boolean validate(String name){
   char[] arr=name.toCharArray();
   for(Char c:arr){
      if(!Character.isLetter(c)){
         return false;
      }
   }
   return true;
}
That isLetter() method is very useful but it sees the white spaces are "non letters".
I am a bit lost at this point, i'm trying a lot of methods of String and Character but nothing seems to work
do you have any advice?
Thx

enrico wrote:
That isLetter() method is very useful but it sees the white spaces are "non letters".
I am a bit lost at this point, i'm trying a lot of methods of String and Character but nothing seems to work
do you have any advice?
Yes: don't try to do it all in one expression. 'if' statements allow you to use '&&' and '||' to connect expressions, so use them.
Second: Work out what you want to do BEFORE you start programming.
In this case, you need to know exactly which characters you want to allow, and when (see baftos' examples above).
Third:
for(Char c:arr){
is meaningless (unless you've defined a class called 'Char').
Accuracy is important.
Winston
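Putting that advice together, a minimal sketch might look like the following. It assumes the rule is simply "every character is a letter or whitespace" and that an empty name is acceptable; both are assumptions, since the thread never pins the rules down. The class name NameValidator is made up.

```java
class NameValidator {
    // Returns true when every character is a letter or whitespace.
    // Assumption: the empty string counts as valid.
    static boolean validate(String name) {
        for (char c : name.toCharArray()) {   // note: char, not Char
            if (!Character.isLetter(c) && !Character.isWhitespace(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(validate("Mary Ann")); // true
        System.out.println(validate("R2D2"));     // false
    }
}
```

A stricter rule (for example, no leading space, or no two spaces in a row) would need extra state in the loop, which is why deciding the exact rules first matters.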

String splitting with regular expressions

Hello everyone
I need some help in splitting the string using regular expressions
Suppose my String is : abc def "ghi jkl mno" pqr stu
After splitting, the resulting string array should contain the elements
abc
def
ghi jkl mno
pqr
stu
What should my regular expression be?

Since this is essentially the same as parsing CSV data, you might want to download a CSV parser and adapt it to your need. But if you want to use regexes, split() is not the way to go. This approach should work for your sample data:
Pattern p = Pattern.compile("\"[^\"]*+\"|\\S+");
Matcher m = p.matcher(input);
while (m.find()) {
  // Prints each token; quoted tokens still include their quotes.
  System.out.println(m.group());
}
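The pattern above returns the quoted token with its surrounding quotes. If the quotes should be dropped, as in the expected output in the question, one option is to add capturing groups and read whichever group matched. This is a sketch, not the original poster's code; the names QuotedTokenizer and tokenize are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTokenizer {
    // Group 1: contents of a double-quoted token; group 2: a bare token.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups participates in each match.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // [abc, def, ghi jkl mno, pqr, stu]
        System.out.println(tokenize("abc def \"ghi jkl mno\" pqr stu"));
    }
}
```

Escaped quotes inside a quoted token are not handled; for data like that a real CSV parser remains the safer choice.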

Splitting html ul tags and their content into string arrays using regular expression

<ul data-role="listview" data-filter="true" data-inset="true">
<li data-role="list-divider"></li><li><a href="#"><h3>
my title
</h3><p><strong></strong></p></a></li>
</ul>
<ul data-role="listview" data-filter="true" data-inset="true">
<li data-role="list-divider"></li><li>test.</li>
</ul>
I need to be able to split this HTML into two arrays holding the entire <ul></ul> tag. Please help.
Thanks.

Hi friend.
This forum is to discuss problems of C# development. Your question is not related to the topic of this forum.
You'll need to post it in the dedicated Archived Forums N-R > Regular Expressions
for better support. Thanks for understanding.
Best Regards,
Kristin

Need advice on negating a whole string line with regular expression

Hi All,
I am not able to ignore / get rid of the following line, even though my Java 6 (Windows XP) String pattern does not cater for it:
*% Cleared: 61%*
Below is the existing Java String pattern in the simple program:
Pattern pattern = Pattern.compile("(^.*[A-Z][a-z]*){1,2} \\d{0,4}/?\\d{0,4} ([A-Z][a-z]*){1,2} St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La \\d br [h|u|t] \\$\\d+,\\d+|\\$\\d*\\,\\d+,\\d+ ([A-Z][a-z]*){1,}.*$");
This pattern is working for valid strings.
The following pattern adds "^(?!.*\.\.).*$" to the existing one, but still had no luck:
Pattern pattern = Pattern.compile("^(?!.*\\.\\.).*$|((^.*[A-Z][a-z]*){1,2} \\d{0,4}/?\\d{0,4} ([A-Z][a-z]*){1,2} St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La \\d br [h|u|t] \\$\\d+,\\d+|\\$\\d*\\,\\d+,\\d+ ([A-Z][a-z]*){1,}.*$)");
This picked up other rubbish, including "*% Cleared: 61%*".
I am looking for a single regular expression that applies to the whole line.
I am quite new to regular expressions; I have read through Regular Expressions Cookbook (O'Reilly, 2009) but am still not familiar with advanced features such as lookahead / lookbehind.
Your assistance would be appreciated.
Thanks,
Jack

Hi Winston,
I am still digesting the material from the regular expression book and it will take some time to become proficient with it.
It seems that using groupCount() to eliminate the unwanted text does not work in this case, since all the lines returned the same value, i.e. the 3 posted earlier. This may be because the patterns are complex and only a few were grouped together. Otherwise, could you provide an example using the string posted as opposed to a hypothetical one? In the meantime, at least one solution has been found by defining an additional special pattern “\\A[^%].*\\Z”, then combining / intersecting both the existing and the new special pattern to get the best of both worlds. Another approach that should also work is to evaluate the size of String.split() and only accept those lines with a minimum number of tokens.
Anyhow, I have come across another minor stumbling block in the meantime with the following line, where some hidden characters are preventing the existing pattern from reading it:
o;?Mervan Bay 40 Boyde St 7 br t $250,000 X West Park AE
Below is the existing regular expression that works for other lines with the same pattern but not for special hidden characters such as “o;?”:
\\A([A-Z][a-z]*){1,2} [0-9]{0,4}/?[0-9]{0,4}-?[0-9]{0,4} ([A-Z][a-z]*){1,2} St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La [0-9] br [h|u|t] \\$\\d+,\\d+|\\$\\d*\\,\\d+,\\d+ ([A-Z][a-z]*){1,}\\Z
Is it possible to come up with a regular expression to ignore them so that this line could be picked up? I would also like to know whether I could combine the special pattern “\\A[^%].*\\Z” with the existing one, as opposed to using 2 separate patterns altogether?
Many thanks,
Jack
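One thing worth checking in the patterns above (an observation, not something confirmed in the thread): alternation has the lowest precedence in a regex, so St|Rd|Av|... without surrounding parentheses splits the whole expression into top-level alternatives. A bare Cl alternative then matches inside "Cleared" whenever the program searches with find(), which would explain why the junk line is accepted. The sketch below uses shortened, made-up patterns rather than the full ones from the posts, and the class name AlternationDemo is hypothetical.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Ungrouped: the top-level alternatives are "[A-Z][a-z]* St", "Rd",
    // "Av", "Cl" and "Pl \d br", each of which can match on its own.
    static final Pattern UNGROUPED =
        Pattern.compile("[A-Z][a-z]* St|Rd|Av|Cl|Pl \\d br");

    // Grouped: the street-type alternatives are confined to (?:...).
    static final Pattern GROUPED =
        Pattern.compile("[A-Z][a-z]* (?:St|Rd|Av|Cl|Pl) \\d br");

    static boolean found(Pattern p, String line) {
        return p.matcher(line).find();
    }

    public static void main(String[] args) {
        String junk = "% Cleared: 61%";
        String valid = "Boyde St 7 br";
        System.out.println(found(UNGROUPED, junk)); // true ("Cl" in "Cleared")
        System.out.println(found(GROUPED, junk));   // false
        System.out.println(found(GROUPED, valid));  // true
    }
}
```

The same applies to [h|u|t]: inside a character class the | is a literal character, so [hut] is probably what was meant.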

String replace using regular expressions

I'm not very good at regular expressions, but I would like my script to replace
<a href="somepage.html">
by
<a href="event:somepage">
How do I do this? Thanks in advance!

Replacing a string that matches a certain pattern with another string is one of the more common RegEx tasks. There is documentation on using them here:
http://livedocs.adobe.com/flex/3/html/help.html?content=12_Using_Regular_Expressions_01.html
hth,
matt horn
flex docs

String.matches() question - regular expression help

How come the following code's if condition returns false?
String someFile="Dr. Phil.pdf";
if (someFile.matches("[.][Pp][Dd][Ff]$")) {
System.out.println("File is a pdf file.");
}
When I change the matches method to matches(".*[Pp][Dd][Ff]$") it works, so does that mean it has to match the entire string to return true? If so, how can I determine if a partial match occurred?
If partial matching isn't feasible, then can someone help me determine if this is the best matching pattern to use:
matches(".*[.][Pp][Dd][Ff]$")
Thanks.

The documentation is your friend.
String.matches(regex) (http://java.sun.com/javase/6/docs/api/java/lang/String.html#matches(java.lang.String)) says:
An invocation of this method of the form str.matches(regex) yields exactly the same result as the expression
Pattern.matches(regex, str)
And Pattern.matches(regex, str) (http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html#matches(java.lang.String, java.lang.CharSequence)) says:
behaves in exactly the same way as the expression
Pattern.compile(regex).matcher(input).matches()
And Matcher.matches() (http://java.sun.com/javase/6/docs/api/java/util/regex/Matcher.html#matches()) says:
Attempts to match the entire region against the pattern.
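For the partial-match part of the question, Matcher.find() looks for the pattern anywhere in the input, so the leading .* is no longer needed. A small sketch, with the hypothetical names PdfCheck and isPdf, and with Pattern.CASE_INSENSITIVE standing in for [Pp][Dd][Ff]:

```java
import java.util.regex.Pattern;

class PdfCheck {
    // Compiled once; matches ".pdf" in any letter case at the end of the input.
    private static final Pattern PDF_SUFFIX =
        Pattern.compile("\\.pdf$", Pattern.CASE_INSENSITIVE);

    static boolean isPdf(String fileName) {
        // find() succeeds if any part of the input matches the pattern.
        return PDF_SUFFIX.matcher(fileName).find();
    }

    public static void main(String[] args) {
        System.out.println(isPdf("Dr. Phil.pdf"));  // true
        System.out.println(isPdf("Dr. Phil.PDF"));  // true
        System.out.println(isPdf("pdf.txt"));       // false
        // The original call fails because matches() needs the whole string.
        System.out.println("Dr. Phil.pdf".matches("[.][Pp][Dd][Ff]$")); // false
    }
}
```

For a plain suffix test like this one, fileName.toLowerCase().endsWith(".pdf") would also do the job without any regex.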

Change particular characters in a string by using regular expressions ...

Hello Everyone,
I am trying to write a function using Oracle's regular expression function REGEXP_REPLACE, but I have not succeeded so far.
My problem is as follows: I have text in a column, for example 'sdfsdf Sdfdfs Sdfd', and I want to replace all s and S characters with X so that the text looks like 'XdfXdf XdfdfX Xdfd'.
Is it possible using regular expressions in Oracle?
Can you give me some clues?
Thank you

SQL> SELECT regexp_replace('sdfsdf Sdfdfs Sdfd','s|S','X') FROM dual;
REGEXP_REPLACE('SD
XdfXdf XdfdfX Xdfd
Regards,
Achyut

String extract using regular expression

Hi
I have text like this: "<a>45</a><ct>Hi</ct><R>45 85</R><H>Here</H>". I want to extract, using a regular expression or any other technique, the text between <R> and </R>, and also replace the space between 45 and 85 with a pipe, like "45|85".
Edited by: vishnu prakash on Mar 2, 2012 4:42 AM

Hi,
Here's one way:
REPLACE ( REGEXP_REPLACE ( txt
                         , '.*<R>(.*)</R>.*'
                         , '\1'
                         )
        , ' '
        , '|'
        )
This assumes there is only one <R> tag in txt.
Always say which version of Oracle you're using. The expression above will work in Oracle 10 and up, but starting in Oracle 11 you can use REGEXP_SUBSTR rather than the less intuitive REGEXP_REPLACE.
Edited by: Frank Kulash on Mar 2, 2012 7:48 AM

String Manipulation using Regular Expression

Hello Guys,
I stuck in a situation wherein I want to extract specific data from a column of the table .
Below are the values for a particular column. I want to strip everything that is in brackets, together with the brackets themselves, and also file extensions such as .pdf and .doc.
Tris(dibenzylideneacetone)dipalladium (0) 451CDHA.pdf
AM57001A(ASRM549CDH).DOC
AM23021A Identity of sulfate (draft)
PG-1183.E.2 (0.25 mg FCT)
AS149656A (DEV AERO APPL HFA WHT PROVENTIL)
Stability report (RSR) Annex2 semi-solid form (internal information)
TSE(Batch#USLF000332)-242CDH, Lancaster synthesis.pdf
TR3018520A Addendum 1 (PN 3018520)
AM10311A Particle size air-jet sieving (constant sieving) (draft)
ASE00099B Addendum (PN E000099) 90 mesh
AM37101_312-99 (Z11c) Palladium by DCP.doc
PS21001A_1H-NMR.doc (PN 332-00)
AM68311A (Q-One CP 33021.02) Attachment
AM68202-1A (BioReliance no. 02.102006) Attachment
I want below output for above values for column
Trisdipalladium451CDHA
AM57001A
AM23021A Identity of sulfate
PG-1183.E.2
Thanks in advance

Like this?
SQL> with t
as
(
select 'Tris(dibenzylideneacetone)dipalladium (0) 451CDHA.pdf' str from dual
union all
select 'AM57001A(ASRM549CDH).DOC' str from dual
union all
select 'AM23021A Identity of sulfate (draft)' str from dual
union all
select 'PG-1183.E.2 (0.25 mg FCT)' str from dual
union all
select 'AS149656A (DEV AERO APPL HFA WHT PROVENTIL)' str from dual
union all
select 'Stability report (RSR) Annex2 semi-solid form (internal information)' str from dual
union all
select 'TSE(Batch#USLF000332)-242CDH, Lancaster synthesis.pdf' str from dual
union all
select 'TR3018520A Addendum 1 (PN 3018520)' str from dual
union all
select 'AM10311A Particle size air-jet sieving (constant sieving) (draft)' str from dual
union all
select 'ASE00099B Addendum (PN E000099) 90 mesh' str from dual
union all
select 'AM37101_312-99 (Z11c) Palladium by DCP.doc' str from dual
union all
select 'PS21001A_1H-NMR.doc (PN 332-00)' str from dual
union all
select 'AM68311A (Q-One CP 33021.02) Attachment' str from dual
union all
select 'AM68202-1A (BioReliance no. 02.102006) Attachment' str from dual
)
select str
     , regexp_replace(str, '(\([^)]+\))|(\..{3})') str_new
  from t;
STR                                                                    STR_NEW
Tris(dibenzylideneacetone)dipalladium (0) 451CDHA.pdf                  Trisdipalladium 451CDHA
AM57001A(ASRM549CDH).DOC                                              AM57001A
AM23021A Identity of sulfate (draft)                                  AM23021A Identity of sulfate
PG-1183.E.2 (0.25 mg FCT)                                              PG-1183
AS149656A (DEV AERO APPL HFA WHT PROVENTIL)                            AS149656A
Stability report (RSR) Annex2 semi-solid form (internal information) Stability report Annex2 semi-solid form
TSE(Batch#USLF000332)-242CDH, Lancaster synthesis.pdf                  TSE-242CDH, Lancaster synthesis
TR3018520A Addendum 1 (PN 3018520)                                    TR3018520A Addendum 1
AM10311A Particle size air-jet sieving (constant sieving) (draft)      AM10311A Particle size air-jet sieving
ASE00099B Addendum (PN E000099) 90 mesh                                ASE00099B Addendum 90 mesh
AM37101_312-99 (Z11c) Palladium by DCP.doc                            AM37101_312-99 Palladium by DCP
PS21001A_1H-NMR.doc (PN 332-00)                                        PS21001A_1H-NMR
AM68311A (Q-One CP 33021.02) Attachment                                AM68311A Attachment
AM68202-1A (BioReliance no. 02.102006) Attachment                      AM68202-1A Attachment
14 rows selected.

Regular expression - escape characters

Hi. Is there an escape character for "?", "[", "]", "{", "}" for regular expression? I tried to do the following: "[^[]?{}]*" (the string cannot contain a question mark, left or right bracket, or left or right curly brace). However, I get an error stating unexpected character.
thanks,
Paul.

You should only have to escape the characters that cause a problem in the character class, rather than everything so the following should work.
"[^\\[\\]?{}]*"

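To check the corrected character class, a short sketch (the class name BracketFreeCheck and the method isClean are hypothetical): inside a Java character class only [ and ] need the backslash here, while ?, { and } are taken literally.

```java
import java.util.regex.Pattern;

class BracketFreeCheck {
    // Zero or more characters, none of which is [ ] ? { or }.
    private static final Pattern NO_SPECIALS = Pattern.compile("[^\\[\\]?{}]*");

    static boolean isClean(String s) {
        // matches() requires the whole string to consist of allowed characters.
        return NO_SPECIALS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isClean("hello world")); // true
        System.out.println(isClean("a[b"));         // false
        System.out.println(isClean("what?"));       // false
    }
}
```

Because the quantifier is *, the empty string is also accepted; use + instead if at least one character is required.
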
How to split a string with regular expression

Hi.
I need to split a string with a regular expression.
Example
String = "this is; a test";rune haavik;12345;
And I want the output to be:
"this is; a test"
rune haavik
12345
If I use this code:
private void test1() {
    String str = "\"this is; a test\";rune haavik;12345;";
    int i = 0;
    String[] tmp = str.split(";");
    while (i < tmp.length) {
        System.out.println(tmp[i]);
        i++;
    }
}
Then it also splits inside the "" text.
Regards
Rune haavik

Rune haavik:
The most effective way to achieve the end result is, I believe, to read the characters one by one, using a flag that indicates if we are inside quotation or not.
Well, if we are in a mind game, then the following should do.
String[] tmp = str.split(";(?![^\"]*\";)");
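The character-by-character approach mentioned first could be sketched as follows. This is one possible reading of it, not the poster's code: quotes are kept in the output (as in the expected output in the question), a trailing empty field is dropped, and escaped quotes are not handled. The names QuoteAwareSplitter and split are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {
    // Splits on the delimiter, except where it appears between double quotes.
    static List<String> split(String input, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;   // the flag: are we inside "..."?
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;       // toggle on every quote
                current.append(c);          // keep the quote in the field
            } else if (c == delimiter && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);       // start the next field
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {         // drop a trailing empty field
            fields.add(current.toString());
        }
        return fields;
    }

    public static void main(String[] args) {
        String str = "\"this is; a test\";rune haavik;12345;";
        for (String field : split(str, ';')) {
            System.out.println(field);
        }
    }
}
```

Unlike the lookahead regex, this version does not depend on a quoted field being followed directly by the delimiter.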

Regular expressions... they are not regular! =)

So,
I've been pulling my hair out with regular expressions. I'm sure there is a logical explanation to this, but i've read a bunch of explanations and i THOUGHT i understood this, but i don't. Here goes:
I have a string "2010PETE". I tried matching it to "\\d{1,}" (this is how i entered it in Java). This returns FALSE. HOWEVER, it seems to me the above should be TRUE because it says that a greedy quantifier with {1,} searches for the preceding character AT LEAST N times, where in this case n=1, so i interpret this as "If a digit (\\d) is found at least once within the string, then this string matches the regular expression". This does NOT seem to be the case.
Can someone clear this up for me?

THANK YOU. i think that is what i was missing, the part about
"would only match if the input consisted of at least one digit, possibly multiple digits, and nothing else."
I read the documentation and some of it didn't seem to be clear on that point.
i'll play around with this and see how far i can get. if i still have questions i will post some code for sure, and try to get a nice, rounded set of examples.
thanks!
ONE OTHER QUESTION I JUST THOUGHT OF: does the .matches() method match expressions when some substring of the String matches, or does it have to match the entire String? So, if i have the String "123ABC", and i ask to match "1 or more letters" will it fail because there are non-letters in the String, but then pass if i add "1 or more letters AND 1 or more digits"? so, in the latter every character in the String is accounted for in the search, as opposed to the first. Is that correct, or are there ways to JUST match some substring in the String instead of the whole thing? i WILL make some examples too... but does that make sense?
Edited by: pedron on Jan 12, 2012 3:23 PM
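On the last question: in java.util.regex the answer depends on which Matcher method is called. matches() needs the whole input to match, lookingAt() needs a match starting at the beginning, and find() accepts a match anywhere. The sketch below runs the examples from this thread through all three; the class and method names (MatchModes, whole, prefix, anywhere) are made up for illustration.

```java
import java.util.regex.Pattern;

class MatchModes {
    // Whole input must match (this is what String.matches() does).
    static boolean whole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Match must start at the beginning but may stop early.
    static boolean prefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // Match may occur anywhere in the input.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(whole("\\d{1,}", "2010PETE"));      // false
        System.out.println(prefix("\\d{1,}", "2010PETE"));     // true
        System.out.println(whole("[A-Za-z]+", "123ABC"));      // false
        System.out.println(anywhere("[A-Za-z]+", "123ABC"));   // true
        System.out.println(whole("\\d+[A-Za-z]+", "123ABC"));  // true
    }
}
```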
