Encoding in Regular Expressions
Can anyone tell me how to use encoding in regular expressions to read Chinese characters? Thanks.
Sarang.
You need to read
http://java.sun.com/docs/books/tutorial/extra/regex/index.html
and then you can use \uxxxx escapes to define the characters you need. A very simple example is:
Pattern p = Pattern.compile("[\u0030-\u0070]+");
Matcher m = p.matcher("A\u0055BCDE");
if (m.matches()) {
    System.out.println(m.group());
}
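For Chinese text specifically, here is a minimal sketch (an illustration, not taken from the tutorial linked above). It assumes Java 7 or later, where a pattern can use the script class \p{IsHan}. The class name HanMatcher and the method findHanRuns are made up for this example. The regular expression itself has no encoding: it works on the characters of a String, so the encoding only matters at the point where bytes are decoded into a String.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HanMatcher {
    // \p{IsHan} matches any character in the Han script (Java 7+).
    private static final Pattern HAN = Pattern.compile("\\p{IsHan}+");

    /** Returns every run of consecutive Han characters found in text. */
    static List<String> findHanRuns(String text) {
        List<String> runs = new ArrayList<>();
        Matcher m = HAN.matcher(text);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // Simulate input that arrives as UTF-8 bytes (e.g. from a file or socket).
        byte[] utf8 = "abc \u4E2D\u6587 def \u6C49\u5B57".getBytes(StandardCharsets.UTF_8);
        // Decode with an explicit charset; the regex then sees ordinary chars.
        String text = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(findHanRuns(text));
    }
}
```

If the input is decoded with the wrong charset, the String already contains the wrong characters and no pattern flag will fix that, so the charset is best stated explicitly where the data is read.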
Similar Messages
-
Regular expressions and characters such as č, ť ...
Hi. I have a problem with regular expressions. The following piece of code works only with strings without diacritics. If the pattern is a character with a diacritic, the matcher does not find any occurrence of it. The code:
String pat = "č"; // "a"
String text = "naj čalamada";
Pattern pattern = Pattern.compile(pat, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) { }
If I switch the pattern to the commented-out character 'a', its occurrences are found, but the occurrence of 'č' is not. :(
Can you tell me what is wrong?
Using the flags Pattern.CASE_INSENSITIVE | Pattern.CANON_EQ changes nothing.
Thanks for all replies.
The file encoding is the same as the system encoding, and it is UTF-8.
How can I specify the encoding as an input parameter when I compile it?
A piece of code or concrete functions would help me more...
-
Best Regular expression ?
What is the best regular expression to represent the following string in Java?
<?xml version = '1.0' encoding = 'UTF-8'?>
Thanks in advance!
What about that exact same string? Depending on your regular expression implementation, you'll probably need to escape some characters, so something like: <\?xml version = '1.0' encoding = 'UTF-8'\?>
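In Java, the escaping can be left to the library. A minimal sketch, assuming the goal is to match that literal declaration exactly: Pattern.quote turns a string into a pattern that matches it character for character, so '?' and '.' lose their special meaning. The names XmlDeclMatch and isExactDecl are hypothetical.

```java
import java.util.regex.Pattern;

class XmlDeclMatch {
    static final String DECL = "<?xml version = '1.0' encoding = 'UTF-8'?>";

    // Pattern.quote escapes every regex metacharacter in DECL.
    private static final Pattern EXACT = Pattern.compile(Pattern.quote(DECL));

    /** True only if s is exactly the declaration above. */
    static boolean isExactDecl(String s) {
        return EXACT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isExactDecl(DECL));
        // The '.' is literal here, so "1x0" is rejected.
        System.out.println(isExactDecl("<?xml version = '1x0' encoding = 'UTF-8'?>"));
    }
}
```

For a purely literal comparison like this, String.equals would do the same job; quoting is mainly useful when the literal is one part of a larger pattern.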
-
Email Regular Expression with a String.Match()
I'm currently using a RichTextEditor for a user to build HTML
for a site. However, I want the application to scan for emails and
encode them so they are protected from spam bots when they go to
the live site. I've written a regular expression to find an email
and it seems to work, but it only returns one email at a time from
the string. I have had to revert to a while loop to traverse the
string until I'm satisfied. I don't particularly like that method
and would like to just do one String.match() query to retrieve all
of the emails. Can anyone see something here that I'm missing?
Try adding the global flag (g):
var emailPattern:RegExp =
/[a-z][\w.-]+@\w[\w.-]+\.[\w.-]*[a-z][a-z]+/g;
TS
-
How to display regular expression text
I cannot seem to get the text of a regular expression to display in a text input. The regex is defined as:
[Bindable]
private var myRegExp:RegExp=/^.*(?=.{10,32})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[.!@#$%*^&()+]).*$/;
In my form, I want to display the actual expression. The code I am using to do so is:
<mx:FormItem label="regExpLabel">
<mx:TextInput id="regExpTextInput" text="{myRegExp}" />
</mx:FormItem>
When I run the app, the data displayed in the regExpTextInput field is /^.*(?=.{10,32})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[.!@#$%*^&()+]).*$/
What I want to display is only the regular expression, not including the slashes required by Flex at the beginning and end of the string. The data I want displayed is ^.*(?=.{10,32})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[.!@#$%*^&()+]).*$
I tried changing my variable type to String and leaving out the beginning and ending slashes, but that did not work either. Any thoughts?
Thanks!
Hi, I do not fully understand your requirement, but here is a demo of using a regular expression. In this application I allow only those characters to be entered in the text box that are valid in a file name. I hope this helps.
I think that for your requirement you first have to declare it as a String. Second, put \\ before every special character if you are writing it in the script, as I have done. Please let me know if you have any issue with the code below.
<?xml version="1.0" encoding="utf-8"?>
<mx:Application xmlns:mx="http://www.adobe.com/2006/mxml" layout="absolute">
<mx:Script>
<![CDATA[
[Bindable]
private var regExpForFile : String
= "0-9A-Za-z\\&\\`\\~\\!\\@\\#\\$\\%\\^\\(\\)\\-\\_\\=\\+\\]\\}\[\\'\\;\\,\\.\\{ ";
]]>
</mx:Script>
<mx:VBox horizontalAlign="center" width="350">
<mx:Label text="Enter File Name"/>
<mx:TextInput id="fileName" width="170" height="20" restrict="{regExpForFile}"/>
</mx:VBox>
</mx:Application>
With regards,
Shardul Singh Bartwal
-
Strange regular expression difference when using an URL
Hello all,
I took some time and was finally able to come up with a regular expression that expresses the condition of a parameter in a URL. Basically, I want to remove the parameter and everything that belongs to it from the URL so I can construct another link free of this parameter.
I want it to match either a number (todeletetable=10) or a list of numbers separated by commas (todeletetable=10,11,12,13,) in the URL.
I used the following pattern:
Pattern.compile("&todeletetable=(\\d+,)+&|&todeletetable=\\d+&");
When testing this locally, it works very well.
String toMatch = "method=allocate&todeletetable=153622,153623,153579,&sortDirection=asc"
matcher.replaceAll("&") -> returns "method=allocate&sortDirection=asc"
However, when working with HTML links, the comma is encoded as %2C. Therefore, I added this expression to the pattern:
Pattern.compile("&todeletetable=(\\d+,)+&|&todeletetable=(\\d+%2C)+&|&todeletetable=\\d+&");
String toMatch = "method=allocate&todeletetable=153622%2C153623%2C153579%2C&sortDirection=asc"
matcher.replaceAll("&") -> returns "method=allocate&sortDirection=asc" locally, but over the web I get "method=allocate&%2C153623%2C153579%2C&sortDirection=asc"
It seems that the %2C is not detected by the regular expression in the case of a URL. I don't know how to avoid this.
Please help me if you can with any hint or solution.
Thank you
Raphas
Use your original pattern, but URL-decode the URL first using the class URLDecoder.
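A minimal sketch of that suggestion, with some assumptions stated up front: the class StripParam and the method strip are hypothetical names, the query string is assumed to be UTF-8, and the pattern is a simplified one of my own rather than the poster's original. It decodes first, then removes the parameter together with its leading '&'.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.regex.Pattern;

class StripParam {
    // Matches "&todeletetable=" followed by digits and commas, up to (but not
    // including) the next '&' or the end of the string.
    private static final Pattern PARAM =
            Pattern.compile("&todeletetable=[\\d,]*(?=&|$)");

    /** Decodes the query string, then removes the todeletetable parameter. */
    static String strip(String query) {
        try {
            String decoded = URLDecoder.decode(query, "UTF-8"); // %2C becomes ','
            return PARAM.matcher(decoded).replaceAll("");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(strip(
            "method=allocate&todeletetable=153622%2C153623%2C153579%2C&sortDirection=asc"));
    }
}
```

Two caveats: this sketch does not handle the parameter appearing first in the query string (with no leading '&'), and decoding the whole query string also decodes escapes such as %26 inside other values, so the result may need re-encoding before it is used as a link.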
P.S. I have not checked your regex, but it looks overcomplicated.
-
Logical AND in Java Regular Expressions
I'm trying to implement logical AND using Java Regular Expressions.
I couldn't figure out how to do it after reading Java docs and textbooks. I can do something like "abc.*def", which means that I'm looking for strings which have "abc", then anything, then "def", but it is not "pure" logical AND - I will not find "def.*abc" this way.
Any ideas how to do it?
Baken
First off, it looks like you're really talking about an "OR", not an "AND": you want it to match abc.*def OR def.*abc, right? If you tried to match abc.*def AND def.*abc, nothing would ever match that, as no string can begin with both "abc" and "def", just like no numeric value can be both 2 and 5.
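If the requirement is read as "the string contains abc and also contains def, in either order", that can be written as a single pattern with two lookaheads, or without any regex at all. A minimal sketch of both, with hypothetical names (ContainsBoth, viaRegex, viaContains):

```java
import java.util.regex.Pattern;

class ContainsBoth {
    // Each lookahead checks one condition from the start of the input without
    // consuming anything; DOTALL lets '.' cross line breaks.
    private static final Pattern BOTH =
            Pattern.compile("^(?=.*abc)(?=.*def).*$", Pattern.DOTALL);

    static boolean viaRegex(String s) {
        return BOTH.matcher(s).matches();
    }

    // The same test with plain String methods.
    static boolean viaContains(String s) {
        return s.contains("abc") && s.contains("def");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abc...def", "def...abc", "abc only"}) {
            System.out.println(s + " -> " + viaRegex(s) + " / " + viaContains(s));
        }
    }
}
```

The lookahead form is mainly useful when the test has to live inside a larger pattern; on its own, the contains version is easier to read.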
Anyway, maybe regex isn't the right tool for this job. Can you not simply programmatically match it yourself using String methods? You want it to match if the string "starts with" abc and "ends with" def, or vice versa. Just write some simple code.
-
Hello..
I wanted to write a regular expression to match the following string:
<!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->
<p> <b>NEW ORLEANS, Louisiana (CNN) </b>
-- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi, residents say much of America has forgotten their plight.
</p> <!--startclickprintexclude-->
I tried doing:
Matcher matcher= Pattern.compile("<!--endclickprintexclude--> <p><b>([^<^>]+?)</p><!--startclickprintexclude-->", Pattern.CASE_INSENSITIVE).matcher(story);
It's not working...
Is there any other solution?
There's probably a better way to do this, but here's a way that works.
import java.util.regex.*;

public class RegexTester {
    public static void main(String[] args) {
        String text =
            "<!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->" +
            "<p> <b>NEW ORLEANS, Louisiana (CNN) </b>" +
            "-- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi," +
            "residents say much of America has forgotten their plight." +
            "</p> <!--startclickprintexclude-->";
        // Capture the text between a '>' and the next '<': runs of non-whitespace
        // characters other than '<' and '>', optionally surrounded by whitespace.
        String regex = ">((?:\\s*[\\S&&[^<>]]+\\s*)*?)<";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(text);
        while (m.find()) {
            System.out.println("Match: '" + m.group(1) + "'");
        }
    }
}
-
Help in regular expression matching
I have three expressions like
1) [(y2009)(y2011)]
2) [(y2008M5)(y2011M3)] or [(y2009M5)(y2010M12)]
3) [(y2009M1d20)(y2011M12d31)]
I want a regular expression pattern for each of the above three expressions.
I am using:
REGEXP_LIKE(timedomainexpression, '???[:digit:]{4}*[:digit:]{1,2}???[:digit:]{4}*[:digit:]{1,2}??', 'i');
but it returns results for all of the above expressions, while I want a different expression for each.
I have used * after [:digit:]{4}; when I use ? or . instead, it returns no results. Please help with this situation ASAP.
Thanks
I don't get your question. Can you post your desired output and also give some sample data?
Please consider the following when you post a question.
1. New features keep coming in every Oracle version, so please provide your Oracle DB version to get the best possible answer.
You can use the following query and copy and paste the output.
select * from v$version
2. This forum has a very good search feature. Please use it before posting your question, because for most of the questions that are asked the answer is already there.
3. We don't know your DB structure or what your data looks like, so you need to let us know. The best way is to give some sample data like this.
I have the following table called sales
with sales
as
(
select 1 sales_id, 1 prod_id, 1001 inv_num, 120 qty from dual
union all
select 2 sales_id, 1 prod_id, 1002 inv_num, 25 qty from dual
)
select *
from sales
4. Rather than describing what you want in words, it is easier when you give your expected output.
For example, in the above sales table, I want to know the total quantity and number of invoices for each product.
The output should look like this
Prod_id sum_qty count_inv
1 145 2
5. Whenever you get an error message, post the entire error message, with the error number, the message and the line number.
6. The next thing is very important to remember. Please post only well-formatted code. Unformatted code is very hard to read.
Your code format gets lost when you post it in the Oracle forum. So in order to preserve it you need to
use the {code} tags.
The usage of the tag is like this.
{code}
<place your code here>
{code}
7. If you are posting a *Performance Related Question*, please read
{thread:id=501834} and {thread:id=863295}.
Following those guides will be very helpful.
8. Please keep in mind that this is a public forum. Here no question is URGENT.
So the use of words like *URGENT* or *ASAP* (As Soon As Possible) is considered rude.
-
Hi
I want to retrieve the data if it contains a character, a space or '-' through a select query.
Please help me write a regular expression that combines the three.
Thanks!!
VT wrote:
Hi,
Try this
SELECT *
FROM <TABLE> WHERE REGEXP_LIKE(<COLUMN>, '[a-z -][A-Z -]');
cheers
VT
That won't work, as it expects at least two characters: the first must be a-z (lower case), a space or "-", and it must be followed by A-Z (upper case), a space or "-".
The correct way is either:
[a-zA-Z -]
or
[[:alpha:] -]
Using the alpha set is often preferable, as it adapts to different character sets/languages rather than restricting the match to just the a-zA-Z ranges.
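The difference between the two patterns is not specific to Oracle. A small Java sketch (java.util.regex, with hypothetical names CharClassDemo and found) shows the same behaviour. Matcher.find() searches anywhere in the input, much like REGEXP_LIKE does with an unanchored pattern.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Two character classes in a row: needs two adjacent matching characters.
    static final Pattern TWO_CHARS = Pattern.compile("[a-z -][A-Z -]");
    // One character class: a single letter, space or '-' is enough.
    static final Pattern ONE_CHAR = Pattern.compile("[a-zA-Z -]");

    /** True if the pattern occurs anywhere in s. */
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123a", "aB", "12-34", "12345"}) {
            System.out.println(s + ": two=" + found(TWO_CHARS, s)
                    + " one=" + found(ONE_CHAR, s));
        }
    }
}
```

In a character class the '-' is placed last so that it is read as a literal hyphen rather than as a range operator.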
Generating a reference for your own database characterset/language can be useful...
SQL> select level-1 as asc_code, decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), CHR(level-1)) as chr,
2 decode(chr(level-1), regexp_substr(chr(level-1), '[[:graph:]]'), 1) is_graph,
3 decode(chr(level-1), regexp_substr(chr(level-1), '[[:blank:]]'), 1) is_blank,
4 decode(chr(level-1), regexp_substr(chr(level-1), '[[:alnum:]]'), 1) is_alnum,
5 decode(chr(level-1), regexp_substr(chr(level-1), '[[:alpha:]]'), 1) is_alpha,
6 decode(chr(level-1), regexp_substr(chr(level-1), '[[:digit:]]'), 1) is_digit,
7 decode(chr(level-1), regexp_substr(chr(level-1), '[[:cntrl:]]'), 1) is_cntrl,
8 decode(chr(level-1), regexp_substr(chr(level-1), '[[:lower:]]'), 1) is_lower,
9 decode(chr(level-1), regexp_substr(chr(level-1), '[[:upper:]]'), 1) is_upper,
10 decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), 1) is_print,
11 decode(chr(level-1), regexp_substr(chr(level-1), '[[:punct:]]'), 1) is_punct,
12 decode(chr(level-1), regexp_substr(chr(level-1), '[[:space:]]'), 1) is_space,
13 decode(chr(level-1), regexp_substr(chr(level-1), '[[:xdigit:]]'), 1) is_xdigit
14 from dual
15 connect by level <= 256
16 /
ASC_CODE C IS_GRAPH IS_BLANK IS_ALNUM IS_ALPHA IS_DIGIT IS_CNTRL IS_LOWER IS_UPPER IS_PRINT IS_PUNCT IS_SPACE IS_XDIGIT
0 1
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 1 1
10 1 1
11 1 1
12 1 1
13 1 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
21 1
22 1
23 1
24 1
25 1
26 1
27 1
28 1
29 1
30 1
31 1
32 1 1 1
33 ! 1 1 1
34 " 1 1 1
35 # 1 1 1
36 $ 1 1 1
37 % 1 1 1
38 & 1 1 1
39 ' 1 1 1
40 ( 1 1 1
41 ) 1 1 1
42 * 1 1 1
43 + 1 1 1
44 , 1 1 1
45 - 1 1 1
46 . 1 1 1
47 / 1 1 1
48 0 1 1 1 1 1
49 1 1 1 1 1 1
50 2 1 1 1 1 1
51 3 1 1 1 1 1
52 4 1 1 1 1 1
53 5 1 1 1 1 1
54 6 1 1 1 1 1
55 7 1 1 1 1 1
56 8 1 1 1 1 1
57 9 1 1 1 1 1
58 : 1 1 1
59 ; 1 1 1
60 < 1 1 1
61 = 1 1 1
62 > 1 1 1
63 ? 1 1 1
64 @ 1 1 1
65 A 1 1 1 1 1 1
66 B 1 1 1 1 1 1
67 C 1 1 1 1 1 1
68 D 1 1 1 1 1 1
69 E 1 1 1 1 1 1
70 F 1 1 1 1 1 1
71 G 1 1 1 1 1
72 H 1 1 1 1 1
73 I 1 1 1 1 1
74 J 1 1 1 1 1
75 K 1 1 1 1 1
76 L 1 1 1 1 1
77 M 1 1 1 1 1
78 N 1 1 1 1 1
79 O 1 1 1 1 1
80 P 1 1 1 1 1
81 Q 1 1 1 1 1
82 R 1 1 1 1 1
83 S 1 1 1 1 1
84 T 1 1 1 1 1
85 U 1 1 1 1 1
86 V 1 1 1 1 1
87 W 1 1 1 1 1
88 X 1 1 1 1 1
89 Y 1 1 1 1 1
90 Z 1 1 1 1 1
91 [ 1 1 1
92 \ 1 1 1
93 ] 1 1 1
94 ^ 1 1 1
95 _ 1 1 1
96 ` 1 1 1
97 a 1 1 1 1 1 1
98 b 1 1 1 1 1 1
99 c 1 1 1 1 1 1
100 d 1 1 1 1 1 1
101 e 1 1 1 1 1 1
102 f 1 1 1 1 1 1
103 g 1 1 1 1 1
104 h 1 1 1 1 1
105 i 1 1 1 1 1
106 j 1 1 1 1 1
107 k 1 1 1 1 1
108 l 1 1 1 1 1
109 m 1 1 1 1 1
110 n 1 1 1 1 1
111 o 1 1 1 1 1
112 p 1 1 1 1 1
113 q 1 1 1 1 1
114 r 1 1 1 1 1
115 s 1 1 1 1 1
116 t 1 1 1 1 1
117 u 1 1 1 1 1
118 v 1 1 1 1 1
119 w 1 1 1 1 1
120 x 1 1 1 1 1
121 y 1 1 1 1 1
122 z 1 1 1 1 1
123 { 1 1 1
124 | 1 1 1
125 } 1 1 1
126 ~ 1 1 1
127 1
128 Ç 1 1 1
etc.
-
Help in query using regular expression
HI,
I need a help to get the below output using regular expression query. Please help me.
SELECT REGEXP_SUBSTR ('PWRPKG(P/W+P/L+CC)', '[^+]+', 1, lvl) val, lvl
FROM DUAL,(SELECT LEVEL lvl FROM DUAL
CONNECT BY LEVEL <=(SELECT MAX ( LENGTH ('PWRPKG(P/W+P/L+CC)') - LENGTH (REPLACE ('PWRPKG(P/W+P/L+CC)','+',NULL))+ 1) FROM DUAL));
I need the output as
correct result:
==============
val lvl
P/W 1
P/L 2
CC 3
But when I tried the above query, it did not return this result. Please help me find where I made a mistake.
Thanks in advance
Frank gave you a solution in your other thread. You could simplify it if you are on 11g:
SQL> select * from table_x
2 /
TXT
TECHPKG(INTELLI CC+FRT SONAR)
PWRPKG(P/W+P/L+CC)
select txt,
       regexp_substr(
                     txt,
                     '(.*\()*([^+)]+)',
                     1,
                     column_value,
                     null,
                     2
                    ) element,
       column_value element_number
from table_x,
     table(
           cast(
                multiset(
                         select level
                         from dual
                         connect by level <= regexp_count(txt,'\+') + 1
                        )
                as sys.OdciNumberList
               )
          )
order by rowid,
         column_value
/
TXT ELEMENT ELEMENT_NUMBER
TECHPKG(INTELLI CC+FRT SONAR) INTELLI CC 1
TECHPKG(INTELLI CC+FRT SONAR) FRT SONAR 2
PWRPKG(P/W+P/L+CC) P/W 1
PWRPKG(P/W+P/L+CC) P/L 2
PWRPKG(P/W+P/L+CC) CC 3
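The reason the original '[^+]+' query misses is that it splits the whole string on '+', so the first token still carries "PWRPKG(" and the last one the closing ")". The same two-step idea (take what is inside the parentheses, then split on '+') is sketched below in Java for illustration only; the names PackageElements and elements are hypothetical, and the sketch assumes at most one parenthesised group per string.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PackageElements {
    // Captures everything between the first '(' and the next ')'.
    private static final Pattern INSIDE_PARENS = Pattern.compile("\\(([^)]*)\\)");

    /** Returns the '+'-separated elements found inside the parentheses. */
    static List<String> elements(String txt) {
        Matcher m = INSIDE_PARENS.matcher(txt);
        // Fall back to the whole string when there are no parentheses.
        String body = m.find() ? m.group(1) : txt;
        return Arrays.asList(body.split("\\+"));
    }

    public static void main(String[] args) {
        System.out.println(elements("PWRPKG(P/W+P/L+CC)"));
        System.out.println(elements("TECHPKG(INTELLI CC+FRT SONAR)"));
    }
}
```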
SQL>
SY.
-
Query help in regular expression
Hi all,
SELECT * FROM emp11
WHERE INSTR(ENAME,'A',1,2) >0;
Please let me know the equivalent query using regular expressions.
I have tried this after going through the Oracle regular expressions documentation.
SELECT * FROM emp11
WHERE regexp_LIKE(ename,'A{2}')
Any help in this regard would be highly appreciated.
Thanks,
P Prakash
Please go here:
Introduction to regular expressions ...
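On the question itself: 'A{2}' matches two adjacent A characters only, whereas INSTR(ENAME,'A',1,2) > 0 is true whenever a second A occurs anywhere after the first. A pattern such as 'A.*A' expresses the latter. The difference is sketched here in Java with hypothetical names (SecondOccurrence, ADJACENT, ANYWHERE); the sample names are made up for illustration.

```java
import java.util.regex.Pattern;

class SecondOccurrence {
    // "AA": two A's directly next to each other.
    static final Pattern ADJACENT = Pattern.compile("A{2}");
    // An A, anything in between, then another A.
    static final Pattern ANYWHERE = Pattern.compile("A.*A");

    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"ADAMS", "ALLEN", "AARON"}) {
            System.out.println(name + ": A{2}=" + found(ADJACENT, name)
                    + " A.*A=" + found(ANYWHERE, name));
        }
    }
}
```

Only the second pattern agrees with the INSTR test for a name like ADAMS, where the two A's are separated by another letter.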
Thanks,
P Prakash
-
Urgent!!! Problem in regular expression for matching braces
Hi,
For the example below, can I write a regular expression to extract the key/value pairs?
example: ((abc def) (ghi jkl) (a ((b c) (d e))) (mno pqr) (a ((abc def))))
In the above example:
abc is the key & def is the value
ghi is the key & jkl is the value
a is the key & ((b c) (d e)) is the value
and so on.
Can anybody please help me resolve this problem using regular expressions?
Thanks in advance
"((key1 value1) (key2 value2) (key3 ((key4 value4) (key5 value5))) (key6 value6) (key7 ((key8 value8) (key9 value9))))"
I want to write a regular expression in Java to parse the above string and store the result in a hash table as below
key1 value1
key2 value2
key3 ((key4 value4) (key5 value5))
key4 value4
key5 value5
key6 value6
key7 ((key8 value8) (key9 value9))
key8 value8
key9 value9
Please let me know, if it is not possible with regular expressions, an effective way of solving it.
Yes, it is possible with a recursive regular expression.
Unfortunately, Java does not provide a recursive regular expression construct.
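Without recursion in the pattern language, a Java solution can fall back on a small hand-written parser that tracks parenthesis depth. The following is a minimal sketch under stated assumptions, not a definitive implementation: the names NestedPairs, matchingParen, parseList and parse are hypothetical, keys contain no spaces, every pair has the form "(key value)", and the input is well balanced.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NestedPairs {
    /** Returns the index of the ')' that closes the '(' at position open. */
    static int matchingParen(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth == 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("unbalanced parentheses");
    }

    /** Parses "((k v) (k v) ...)" and records every pair, nested ones included. */
    static void parseList(String list, Map<String, String> out) {
        String body = list.substring(1, list.length() - 1); // strip outer parens
        int i = 0;
        while (i < body.length()) {
            if (body.charAt(i) != '(') { // skip separators between pairs
                i++;
                continue;
            }
            int close = matchingParen(body, i);
            String pair = body.substring(i + 1, close); // "key value"
            int space = pair.indexOf(' ');
            String key = pair.substring(0, space);
            String value = pair.substring(space + 1).trim();
            out.put(key, value);
            if (value.startsWith("(")) { // the value is itself a list of pairs
                parseList(value, out);
            }
            i = close + 1;
        }
    }

    static Map<String, String> parse(String input) {
        Map<String, String> out = new LinkedHashMap<>(); // keeps insertion order
        parseList(input.trim(), out);
        return out;
    }

    public static void main(String[] args) {
        parse("((key1 value1) (key3 ((key4 value4) (key5 value5))) (key6 value6))")
                .forEach((k, v) -> System.out.println(k + " " + v));
    }
}
```

A Map keeps one value per key, so an input that repeats a key (such as "a" in the first example of the question) would need a list of pairs or a multimap instead.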
In Perl, the recursive version looks like this:
$_ = "((key1 value1) (key2 value2) (key3 ((key4 value4) (key5 value5))) (key6 value6) (key7 ((key8 value8) (key9 value9))))";
my $paren;
$paren = qr/
  \(
  (?:
    [^()]+          # Not parens
  |
    (??{ $paren })  # Another balanced group (not interpolated yet)
  )*
  \)
/x;
# $1 = text before the pair, $2 = key, $3 = value (a word or a balanced group), $4 = rest
my $r = qr/^(.*?)\((\w+?) (\w+?|(??{$paren}))\)\s*(.*?)$/;
while ($_) {
  match();
}
# operates on $_
sub match {
  my @v;
  @v = m/$r/;
  if (defined $v[3]) {
    # descend into the value first, so nested pairs are printed before their parent
    $_ = $v[2];
    while (/\(/) {
      match();
    }
    print "\"",$v[1],"\" \"",$v[2],"\"\n";
    $_ = $v[0].$v[3];
  }
  else { $_ = ""; }
}
C:\usr\schodtt\src\java\forum\n00b\regex>perl recurse.pl
"key1" "value1"
"key2" "value2"
"key4" "value4"
"key5" "value5"
"key3" "((key4 value4) (key5 value5))"
"key6" "value6"
"key8" "value8"
"key9" "value9"
"key7" "((key8 value8) (key9 value9))"
C:\usr\schodtt\src\java\forum\n00b\regex>
-
Bracket in Regular Expression constant?
I am a bit puzzled by the behavior I am experiencing in LV 2011. I hope to get some light from experts out there.
I am trying to parse a messy ASCII header file and after having split it into individual lines (strings), I use the "Match Regular Expression" function to remove some of the info before the substantial information.
Some of the strings include square brackets ([, ]), which are special characters for the function, therefore, as documented in the help, one needs to precede them with a backslash.
Example:
I want to parse the following line:
#PR [PR_DEV,I,2]
One way (which I am using because of considerations related to the rest of the header) is the following:
Note that the first string constant is using "Code Display" whereas the second one is using "Normal Display".
Why did I not put a backslash in front of the bracket in the first string, you may ask? Well, I did, but it disappeared after I typed the other characters. And reverting to "Normal Display" did not restore it.
Of course, the first version does not parse the input string correctly, whereas the second one does it fine.
In other words, the custom display string (which is convenient for cryptic codes such as \s* or to distinguish between space and tab...or simply ENTER tabs!) seems to mess up with the \[ combo (likewise with the \] one).
It is not a huge deal. I can use the "Normal Display" mode, but I tend to think that this qualifies as a hidden "feature". And again, it is still a pain in the ... when dealing with special characters such as tabs, etc...
I think that [ is a special character that needs to be preceded by a backslash, but \[ is not one of the defined backslash codes (like \s). So you need to type two backslashes (\\) to get one \ while in '\' Codes Display.
You can enter any character by using \xx, where xx is a hexadecimal code using only upper-case letters for A..F. I converted the strings to byte arrays to see what made the arrays match and the Match work.
Lynn
-
Need help with regular expression
I'm trying to use the java.util.regex package to extract URLs from html files.
The URLs that I am interested in extracting from the HTML look like the following:
<font color="#008000">http://forum.java.sun.com -
So, the URL is always preceded by:
<font color="#008000">
and then followed by a space character and then a hyphen character. I want to be able to put all these URLs in a Vector object. This doesn't seem like it should be too difficult, but for some reason I can't get anywhere with it. Any help would be greatly appreciated. Thanks!
Hi gupta, I am not sure of the Java syntax, but I can tell you about the regular expression. Try this:
<font color="#008000">(http:\/\/[a-zA-Z0-9.]+) [-]
I don't know the Java methods to call, just the regular expression.
Sanjay Acharya
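The Java side of this could look roughly as follows. It is a sketch that reuses the expression from the reply above; the names UrlExtractor and extract are hypothetical, and a List is used where the question mentions a Vector (a Vector could be filled the same way).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlExtractor {
    // The font tag, then the URL captured in group 1, then " -".
    private static final Pattern URL = Pattern.compile(
            "<font color=\"#008000\">(http://[a-zA-Z0-9.]+) -");

    /** Collects every URL that appears in the expected markup. */
    static List<String> extract(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(html);
        while (m.find()) {
            urls.add(m.group(1)); // group 1 is the URL without the tag
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<font color=\"#008000\">http://forum.java.sun.com - Java forums";
        System.out.println(extract(html));
    }
}
```

The character class [a-zA-Z0-9.] only covers host names; URLs with a path or query string would need a wider class, for example \S+ in place of [a-zA-Z0-9.]+.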