Regular Expressions in ABAP
Hi, all!
Are there any possibilities to make use of regular expressions in 4.6C (FMs, classes)?
Regards,
Maxim.
Hi Maxim and everyone else who may read this,
try the following code, but be patient and leave my (c) where it is:
You may also want to look at the specifics of JavaScript RegEx.
Yours,
Johannes
* an Example Call:
DATA return_value TYPE string.
DATA: match type ztmatch,
lastindex TYPE i,
leftcontext TYPE string,
rightcontext TYPE string,
index TYPE i,
searchstring TYPE string,
modifier TYPE string,
regex TYPE string,
found TYPE boolean,
error_message type string.
regex = 'b+(a)*(b+)'.
searchstring = 'abbbbabbaa'.
modifier = ''.
CALL METHOD ztr_bw_tools=>regex
IMPORTING
LASTINDEX = lastindex
LEFTCONTEXT = leftcontext
RIGHTCONTEXT = rightcontext
INDEX = index
FOUND = found
MATCH = match
RETURN_VALUE = return_value
ERROR_MESSAGE = error_message
CHANGING
SEARCHSTRING = searchstring
MODIFIER = modifier
REGEX = regex.
Changing SEARCHSTRING TYPE STRING DEFAULT '' "string to be regex applicated
Changing MODIFIER TYPE STRING DEFAULT '' "/gims/
Changing REGEX TYPE STRING DEFAULT '' "regular expression
Exporting LASTINDEX TYPE I
Exporting LEFTCONTEXT TYPE STRING
Exporting RIGHTCONTEXT TYPE STRING
Exporting INDEX TYPE I
Exporting FOUND TYPE BOOLEAN "boolean variable (X=true, -=false, space=unknown)
Exporting MATCH TYPE ZTMATCH "For use with regular expressions
Exporting RETURN_VALUE TYPE STRING
Exporting ERROR_MESSAGE TYPE STRING
method REGEX .
* (c) by Johannes Rumpf - 2006 -
* Matching-Table of part matches of brackets
*DATA: BEGIN OF ztmatch,
* comp TYPE string,
* END OF ztmatch.
DATA source TYPE string.
DATA js_processor TYPE REF TO cl_java_script.
js_processor = cl_java_script=>create( ).
* JavaScript --> ABAP variable mapping
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'regex'
CHANGING data = regex ).
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'searchstring'
CHANGING data = searchstring ).
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'modifier'
CHANGING data = modifier ).
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'index'
CHANGING data = index ).
js_processor->bind( EXPORTING name_obj = 'abap'
name_prop = 'match'
CHANGING data = match ).
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'lastindex'
CHANGING data = lastindex ).
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'leftcontext'
CHANGING data = leftcontext ).
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'rightcontext'
CHANGING data = rightcontext ).
js_processor->bind( EXPORTING name_obj = ' '
name_prop = 'found'
CHANGING data = found ).
* add an empty line
DATA: wa like line of match.
wa-comp = ' '.
append wa to match.
* JavaScript Code *REGEX*
* lastindex, leftcontext and rightcontext are derived from m.index and m[0],
* because the array returned by exec() carries only index and input.
CONCATENATE
'var re = new RegExp(regex, modifier);'
'var m = re.exec(searchstring);'
' if (m == null) {'
' found = false;'
' } else {'
' found = true; '
' index = m.index;'
' lastindex = m.index + m[0].length;'
' leftcontext = searchstring.substring(0, m.index);'
' rightcontext = searchstring.substring(m.index + m[0].length);'
' var len = abap.match.length;'
' for (i = 0; i < m.length; i++) {'
' abap.match[len-1].comp = m[i];'
' abap.match.appendLine();'
' len++;'
' }'
' }'
INTO source SEPARATED BY cl_abap_char_utilities=>cr_lf.
return_value = js_processor->evaluate( source ).
error_message = js_processor->LAST_ERROR_MESSAGE.
endmethod.
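For readers who cannot run CL_JAVA_SCRIPT, the values this method is meant to return can be reproduced with a small Java sketch. It is only an illustration and not part of the ABAP class: the class name RegexContextDemo and the method firstMatch are made up here, and Java's regex engine is assumed to behave like the JavaScript one for this simple pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexContextDemo {
    /**
     * Returns [index, lastindex, leftcontext, rightcontext, match, group 1, group 2, ...]
     * for the first match, or an empty list if the pattern does not match.
     */
    static List<String> firstMatch(String regex, String searchString) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(searchString);
        if (!m.find()) {
            return out;
        }
        out.add(String.valueOf(m.start()));              // index of the match
        out.add(String.valueOf(m.end()));                // position right after the match
        out.add(searchString.substring(0, m.start()));   // text before the match
        out.add(searchString.substring(m.end()));        // text after the match
        for (int i = 0; i <= m.groupCount(); i++) {
            out.add(m.group(i));                         // whole match, then each bracket group
        }
        return out;
    }

    public static void main(String[] args) {
        // Same regex and search string as in the example call above
        System.out.println(firstMatch("b+(a)*(b+)", "abbbbabbaa"));
        // prints [1, 8, a, aa, bbbbabb, a, bb]
    }
}
```

The list layout mirrors the exporting parameters INDEX, LASTINDEX, LEFTCONTEXT, RIGHTCONTEXT and the MATCH table.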
Similar Messages
-
Regular Expressions - unsetting greedy possible?
Hi,
I'm currently working on a parser and got some problems with the regular expressions in ABAP.
Let's say I want to calculate (2+2)*(3+3).
The RegExp \(.+\) finds everything between brackets - the problem is, that the engine finds everything between the first opening and the last closing bracket (actually it should find the first opening and the first closing bracket).
Is there a way to tell the engine to work ungreedy?
Thanks for your help
Chris
Hi Prashant,
unfortunately this won't work either.
I'd better give some more information on the topic to increase understanding.
In order to calculate this string mathematically I created a function that works recursively. It calculates the (math) value of a string.
So let's say we want to calculate (2+2)*(3+3); the function is supposed to work this way:
math ( "(2+2)*(3+3)")
-> math ("2+2") (Calculating and returning value: 4)
The formula now is "4*(3+3)"
-> math("3+3") (Calculating and returning value: 6)
The formula now is "4*6"
-> math("4*6") (calculating and returning value 24)
Since ABAP does not know ungreedy searches in regular expressions, the function would instead work this way:
math ( "(2+2)*(3+3)")
->math( "2+2)*(3+3" ) (using the wrong brackets...)
... leading to a math error.
Your solution, Prashant, would work for the first recursive call. Then, the formula would be "(22)*(32)" again.
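One common way around a missing ungreedy mode is a negated character class: instead of "anything" between the brackets, allow only characters that are not brackets. The sketch below shows the difference in Java; the class name InnermostParens and the method firstMatch are made up for this illustration, and it is an assumption that the same bracket expression behaves identically in the ABAP engine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InnermostParens {
    /** Returns the first substring matched by regex, or null if there is none. */
    static String firstMatch(String regex, String formula) {
        Matcher m = Pattern.compile(regex).matcher(formula);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String formula = "(2+2)*(3+3)";
        // Greedy: runs from the first opening to the last closing bracket
        System.out.println(firstMatch("\\(.+\\)", formula));     // (2+2)*(3+3)
        // Negated class: cannot cross a bracket, so it stops at the first ")"
        System.out.println(firstMatch("\\([^()]+\\)", formula)); // (2+2)
    }
}
```

Because the class excludes both bracket characters, the match is always an innermost bracket pair, which is what a recursive evaluation needs first.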
Thanks though
Regards
Christian -
Hi All,
Can someone help me with a regular expression to match and return a pattern from a string
The pattern is anything within parenthesis and three characters long!
Eg: (B6P)
That is, I have to get the first match from the string.
Eg2.
From the string 'ADFGDFG(4GH)ghjghj(HH6)FGHghfhgfGFHJ(DFGFG)GFHFG sdfg dfgdf'
I need to get (4GH)
Eg3
From the string 'SDFG(SD6GD)FGDFGDFGsdfgdfg(ghj)(fghgffh)gfhgfhgfDFGDFHG'
I need to get (ghj)
Can someone help? Any quick help will be greatly appreciated.
Thanks, Sudeep..
Sudeep,
you can use that logic to build a subroutine for your problem like:
REPORT Z_STRING_MANIPULATION.
PARAMETERS: p_string TYPE string,
p_count TYPE I,
p_length TYPE I.
PERFORM z_return_string USING p_count p_length CHANGING p_string.
WRITE: p_string.
FORM z_return_string USING p_count TYPE I
p_length TYPE I
CHANGING p_return TYPE string.
p_return = p_return+p_count(p_length).
ENDFORM.
Another solution (if I understood you correctly) is as follows:
REPORT Z_STRING_MANIPULATION.
DATA: pos TYPE i,
v_dif TYPE i,
v_return TYPE string,
pos_2 TYPE i.
PARAMETERS: p_text TYPE string DEFAULT 'SDFG(SD6GD)FGDFGDFGsdfgdfg(ghj)(fghgffh)gfhgfhgfDFGDFHG'.
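DO.
SEARCH p_text FOR '('.
IF sy-subrc <> 0.
* no opening bracket left: stop instead of looping forever
EXIT.
ENDIF.
pos = sy-fdpos.
SEARCH p_text FOR ')'.
IF sy-subrc <> 0.
* no closing bracket left
EXIT.
ENDIF.
pos_2 = sy-fdpos.
* three characters between the brackets means the offsets differ by 4
v_dif = pos_2 - pos.
IF v_dif = 4.
v_return = p_text+pos(5).
WRITE: v_return.
exit.
ELSE.
* cut off everything up to and including the closing bracket and retry
pos_2 = pos_2 + 1.
p_text = p_text+pos_2.
ENDIF.
ENDDO.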
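The same requirement can also be written as a single pattern: an opening bracket, exactly three characters that are not brackets, and a closing bracket. A minimal sketch in Java follows; the class name ThreeCharParens and the method firstThreeCharGroup are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreeCharParens {
    // "(" + exactly three non-bracket characters + ")"
    private static final Pattern THREE = Pattern.compile("\\([^()]{3}\\)");

    /** Returns the first bracketed group that is exactly three characters long, or null. */
    static String firstThreeCharGroup(String text) {
        Matcher m = THREE.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstThreeCharGroup(
            "ADFGDFG(4GH)ghjghj(HH6)FGHghfhgfGFHJ(DFGFG)GFHFG sdfg dfgdf")); // (4GH)
        System.out.println(firstThreeCharGroup(
            "SDFG(SD6GD)FGDFGDFGsdfgdfg(ghj)(fghgffh)gfhgfhgfDFGDFHG"));     // (ghj)
    }
}
```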
Regards. -
Validate Email by regular Expression... Need Help
Dear All,
Requirement:
validate the email ID entered & throw error message, if it is invalid.
DATA c_mailpattern TYPE c LENGTH 60 VALUE
'[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4} '.
** If @ is present, more than once. Error out
find ALL OCCURRENCES OF '@' in P_email
MATCH COUNT v_count.
if v_count > 1.
v_badpattern = 1.
endif.
** If , is present, once, Error out
find ALL OCCURRENCES OF ',' in P_Email
MATCH COUNT v_count.
if v_count > 0.
v_badpattern = v_badpattern + 1.
endif.
FIND REGEX c_mailpattern IN P_Email IGNORING CASE .
IF sy-subrc <> 0 OR v_badpattern > 0.
Write:/ p_EMAIL, 'has invalid Email format'.
ENDIF.
Though this works fine, the tester needs me to reject an address whose domain name is something like "app.com.com" as an invalid email ID.
The above regex does not catch such a case.
I searched & found
{messageID=3706355}
{messageID=1657369}
https://wiki.sdn.sap.com/wiki/display/Snippets/E-MAIL+Validation
doesn't help.
I found this regex in a perl program.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
Can I get help to modify this into ABAP String?
1) I can't get past the highlighted characters using escape characters like #* or ''. Can someone help me assign this regex string to a string variable?
2) This regex is longer than allowed length for a literal.
It can be split into 2 strings, then concatenated & checked.
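The idea of splitting the long pattern into two parts and concatenating them before use can be sketched as follows. This is written in Java, not ABAP, purely to show the technique; the class name EmailPattern and the method isValid are made up for illustration.

```java
import java.util.regex.Pattern;

class EmailPattern {
    // Part 1: the local part in front of the "@"
    private static final String LOCAL =
        "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*";
    // Part 2: "@" followed by one or more dot-separated domain labels
    private static final String DOMAIN =
        "@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?";

    // The two halves are concatenated before compiling
    private static final Pattern EMAIL =
        Pattern.compile(LOCAL + DOMAIN, Pattern.CASE_INSENSITIVE);

    /** True if the whole input has the shape described by the pattern. */
    static boolean isValid(String email) {
        return EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("user@app.com"));     // true
        System.out.println(isValid("a,b@app.com"));      // false: comma is not allowed
        System.out.println(isValid("a@@app.com"));       // false: second @
        System.out.println(isValid("user@app.com.com")); // true: still accepted
    }
}
```

Matching the whole string makes the separate counts for "@" and "," unnecessary. The last call shows that this pattern, too, only checks the shape of the address; a repeated label such as "com.com" would need an additional check.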
Edited by: Mallikarjuna J on May 16, 2011 8:23 PM
Edited by: Mallikarjuna J on May 16, 2011 8:26 PM
Thanks Sebastian, Pratik & Keshav for the replies.
SX_INTERNET_ADDRESS_TO_NORMAL doesn't validate a wrong email ID. It only splits the internet address into mail & domain.
Prathik,
just .com.com is not the point, Bad input could be .net.ent or .net.com or so....
Amol, Thanks, but I keep receiving Error, not found in the 41 line response I get
I think we need to check not line 2 but line 28.
Taking cue from Prathik, I'm planning to put this
*** ls_input_mail-mail is the email-id entered by user.
************ Check for Valid Regular Expression
***** DOT(.) is allowed more than once,
***** @ is allowed only once,
***** , is not allowed.
** If @ is present, more than once. Error out
find ALL OCCURRENCES OF '@' in ls_input_mail-mail
MATCH COUNT v_count.
if v_count > 1.
v_badpattern = 1.
endif.
** If , is present, once, Error out
find ALL OCCURRENCES OF ',' in ls_input_mail-mail
MATCH COUNT v_count.
if v_count > 0.
v_badpattern = v_badpattern + 1.
endif.
** Find if domain part i.e., after @ has errors.
SPLIT ls_input_mail-mail at '@' into v_mailpart v_domain.
* there's a dot in the domain.
if v_domain CA '.' .
* last 2 char can only be country name, not anything else.
SPLIT v_domain at '.' into v_domain1 v_domain2.
* v_domain2 can only be a country name, else error out
select single landx from t005 into v_country
where landx = v_domain2.
if sy-subrc <> 0.
v_badpattern = v_badpattern + 1.
endif.
ENDIF.
FIND REGEX c_mailpattern IN ls_input_mail-mail IGNORING CASE .
IF sy-subrc <> 0 OR v_badpattern > 0.
Write:/ ls_input_mail-mail, 'has invalid email format'.
ENDIF.
However, I was wondering if there was a way to use escape characters and make the string below a valid regex variable to check the email ID.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
Nevertheless, Thanks Friends for all your inputs.
Edited by: Mallikarjuna J on May 17, 2011 2:23 PM -
Logical AND in Java Regular Expressions
I'm trying to implement logical AND using Java Regular Expressions.
I couldn't figure out how to do it after reading Java docs and textbooks. I can do something like "abc.*def", which means that I'm looking for strings which have "abc", then anything, then "def", but it is not "pure" logical AND - I will not find "def.*abc" this way.
Any ideas, how to do it ?
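One way to express "contains abc AND contains def, in any order" in a single Java pattern is to chain lookaheads, since each lookahead is checked from the same position without consuming input. A minimal sketch (the class name BothWords and the method containsBoth are made up for illustration):

```java
import java.util.regex.Pattern;

class BothWords {
    // Each (?=...) must succeed from the start of the input; neither consumes characters
    private static final Pattern BOTH = Pattern.compile("^(?=.*abc)(?=.*def)");

    /** True if the input contains both "abc" and "def", in either order. */
    static boolean containsBoth(String s) {
        return BOTH.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBoth("xx abc yy def")); // true
        System.out.println(containsBoth("def then abc"));  // true
        System.out.println(containsBoth("only abc here")); // false
    }
}
```

Note that "." does not match line terminators by default, so multi-line input would need Pattern.DOTALL. For fixed words like these, s.contains("abc") && s.contains("def") gives the same answer without a regex.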
Baken
First off, looks like you're really talking about an "OR", not an "AND" - you want it to match abc.*def OR def.*abc right? If you tried to match abc.*def AND def.*abc nothing would ever match that, as no string can begin with both "abc" and "def", just like no numeric value can be both 2 and 5.
Anyway, maybe regex isn't the right tool for this job. Can you not simply programmatically match it yourself using String methods? You want it to match if the string "starts with" abc and "ends with" def, or vice-versa. Just write some simple code. -
Hello..
I wanted to write a regular expression to match the following string..
<!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->
<p> <b>NEW ORLEANS, Louisiana (CNN) </b>
-- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi, residents say much of America has forgotten their plight.
</p> <!--startclickprintexclude-->
I tried doing..
Matcher matcher= Pattern.compile("<!--endclickprintexclude--> <p><b>([^<^>]+?)</p><!--startclickprintexclude-->", Pattern.CASE_INSENSITIVE).matcher(story);
It's not working...
Is there any other solution?
There's probably a better way to do this but here's a way that works.
import java.util.regex.*;
public class RegexTester{
    public static void main(String[] args){
        String text =
            "<!--endclickprintexclude--><!--startclickprintexclude--> <!--endclickprintexclude-->" +
            "<p> <b>NEW ORLEANS, Louisiana (CNN) </b>" +
            "-- Two years after Hurricane Katrina devastated coastal areas of Louisiana and Mississippi, " +
            "residents say much of America has forgotten their plight." +
            "</p> <!--startclickprintexclude-->";
        // Capture each run of text that sits between a '>' and the next '<'
        String regex = ">((?:\\s*[\\S&&[^<>]]+\\s*)*?)<";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(text);
        while(m.find()){
            System.out.println("Match: '" + m.group(1) + "'");
        }
    }
}
-
Help in regular expression matching
I have three expressions like
1) [(y2009)(y2011)]
2) [(y2008M5)(y2011M3)] or [(y2009M5)(y2010M12)]
3) [(y2009M1d20)(y2011M12d31)]
i want regular expression pattern for the above three expressions
I am using :
REGEXP_LIKE(timedomainexpression, '???[:digit:]{4}*[:digit:]{1,2}???[:digit:]{4}*[:digit:]{1,2}??', 'i');
but it's giving results for all of the above expressions, while I want a different expression for each.
I have used * after [:digit:]{4}; when I use ? or . instead, it gives no results. Please help in this situation ASAP.
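To tell the three shapes apart, each one needs its own pattern that describes the whole string, with the literal brackets escaped. The sketch below does this in Java; the class name TimeDomainFormats, the method classify and the returned labels are made up for illustration, and it assumes months and days have one or two digits. In Oracle the character class would have to be written [[:digit:]] (inside a bracket expression), not [:digit:] on its own.

```java
import java.util.regex.Pattern;

class TimeDomainFormats {
    // One "(y2009...)" element per format; each pattern repeats it twice inside "[...]"
    private static final Pattern YEAR =
        Pattern.compile("\\[(\\(y\\d{4}\\)){2}\\]");
    private static final Pattern YEAR_MONTH =
        Pattern.compile("\\[(\\(y\\d{4}M\\d{1,2}\\)){2}\\]");
    private static final Pattern YEAR_MONTH_DAY =
        Pattern.compile("\\[(\\(y\\d{4}M\\d{1,2}d\\d{1,2}\\)){2}\\]");

    /** Returns which of the three formats the whole expression has. */
    static String classify(String expr) {
        if (YEAR.matcher(expr).matches()) return "year";
        if (YEAR_MONTH.matcher(expr).matches()) return "year-month";
        if (YEAR_MONTH_DAY.matcher(expr).matches()) return "year-month-day";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("[(y2009)(y2011)]"));             // year
        System.out.println(classify("[(y2008M5)(y2011M3)]"));         // year-month
        System.out.println(classify("[(y2009M1d20)(y2011M12d31)]"));  // year-month-day
    }
}
```

Because matches() requires the pattern to cover the whole input, the year-only pattern cannot accidentally accept the longer formats.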
Thanks
I don't get your question. Can you post your desired output, and also give some sample data?
Please consider the following when you post a question.
1. New features keep coming in every Oracle version, so please provide your Oracle DB version to get the best possible answer.
You can use the following query and copy and paste the output.
select * from v$version
2. This forum has a very good search feature. Please use it before posting your question, because for most of the questions
that are asked the answer is already there.
3. We don't know your DB structure or what your data looks like, so you need to let us know. The best way would be to give some sample data like this.
I have the following table called sales
with sales
as
(
select 1 sales_id, 1 prod_id, 1001 inv_num, 120 qty from dual
union all
select 2 sales_id, 1 prod_id, 1002 inv_num, 25 qty from dual
)
select *
from sales
4. Rather than telling what you want in words, it is easier when you give your expected output.
For example, in the above sales table, I want to know the total quantity and number of invoices for each product.
The output should look like this
Prod_id sum_qty count_inv
1 145 2
5. Whenever you get an error message, post the entire error message, with the error number, the message and the line number.
6. The next thing is very important to remember. Please post only well-formatted code. Unformatted code is very hard to read.
Your code format gets lost when you post it in the Oracle Forum. So in order to preserve it you need to
use the {code} tags.
The usage of the tag is like this.
{code}
<place your code here>
{code}
7. If you are posting a *Performance Related Question*, please read
{thread:id=501834} and {thread:id=863295}.
Following those guides will be very helpful.
8. Please keep in mind that this is a public forum. Here no question is URGENT.
So use of words like *URGENT* or *ASAP* (As Soon As Possible) is considered rude. -
Hi
I want to retrieve the data if the data contains a character or a space or '-' thru select query .
Please help me in writing the combination of 3 with regular expression.
Thanks!!
VT wrote:
Hi,
Try this
SELECT *
FROM <TABLE> WHERE REGEXP_LIKE(<COLUMN>, '[a-z -][A-Z -]');
cheers
VT
That won't work, as it expects at least two characters: the first has to be a-z (lower case), a space or "-", followed by A-Z (upper case), a space or "-".
The correct way is either:
[a-zA-Z -]
or
[[:alpha:] -]
Using the alpha set is often preferable, as it can work differently with different character sets/languages rather than restricting to just the a-zA-Z ranges.
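The difference between the two-class and the one-class version is easy to check outside the database. The following Java sketch is only an illustration: the class name CharClassDemo and the method contains are made up, and Java spells the POSIX alpha class as \p{Alpha} instead of [:alpha:].

```java
import java.util.regex.Pattern;

class CharClassDemo {
    /** True if the pattern matches anywhere in the value (like REGEXP_LIKE). */
    static boolean contains(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        String twoClasses = "[a-z -][A-Z -]"; // needs two adjacent characters
        String oneClass   = "[a-zA-Z -]";     // one letter, space or hyphen is enough
        String alphaClass = "[\\p{Alpha} -]"; // same idea with a named class

        System.out.println(contains(twoClasses, "x"));   // false: only one character
        System.out.println(contains(oneClass, "x"));     // true
        System.out.println(contains(oneClass, "12-3"));  // true: contains a hyphen
        System.out.println(contains(oneClass, "123"));   // false
        System.out.println(contains(alphaClass, "7A7")); // true
    }
}
```

In both classes the hyphen is placed last, so that it is read as a literal character and not as a range.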
Generating a reference for your own database characterset/language can be useful...
SQL> select level-1 as asc_code, decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), CHR(level-1)) as chr,
2 decode(chr(level-1), regexp_substr(chr(level-1), '[[:graph:]]'), 1) is_graph,
3 decode(chr(level-1), regexp_substr(chr(level-1), '[[:blank:]]'), 1) is_blank,
4 decode(chr(level-1), regexp_substr(chr(level-1), '[[:alnum:]]'), 1) is_alnum,
5 decode(chr(level-1), regexp_substr(chr(level-1), '[[:alpha:]]'), 1) is_alpha,
6 decode(chr(level-1), regexp_substr(chr(level-1), '[[:digit:]]'), 1) is_digit,
7 decode(chr(level-1), regexp_substr(chr(level-1), '[[:cntrl:]]'), 1) is_cntrl,
8 decode(chr(level-1), regexp_substr(chr(level-1), '[[:lower:]]'), 1) is_lower,
9 decode(chr(level-1), regexp_substr(chr(level-1), '[[:upper:]]'), 1) is_upper,
10 decode(chr(level-1), regexp_substr(chr(level-1), '[[:print:]]'), 1) is_print,
11 decode(chr(level-1), regexp_substr(chr(level-1), '[[:punct:]]'), 1) is_punct,
12 decode(chr(level-1), regexp_substr(chr(level-1), '[[:space:]]'), 1) is_space,
13 decode(chr(level-1), regexp_substr(chr(level-1), '[[:xdigit:]]'), 1) is_xdigit
14 from dual
15 connect by level <= 256
16 /
ASC_CODE C IS_GRAPH IS_BLANK IS_ALNUM IS_ALPHA IS_DIGIT IS_CNTRL IS_LOWER IS_UPPER IS_PRINT IS_PUNCT IS_SPACE IS_XDIGIT
0 1
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 1 1
10 1 1
11 1 1
12 1 1
13 1 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
21 1
22 1
23 1
24 1
25 1
26 1
27 1
28 1
29 1
30 1
31 1
32 1 1 1
33 ! 1 1 1
34 " 1 1 1
35 # 1 1 1
36 $ 1 1 1
37 % 1 1 1
38 & 1 1 1
39 ' 1 1 1
40 ( 1 1 1
41 ) 1 1 1
42 * 1 1 1
43 + 1 1 1
44 , 1 1 1
45 - 1 1 1
46 . 1 1 1
47 / 1 1 1
48 0 1 1 1 1 1
49 1 1 1 1 1 1
50 2 1 1 1 1 1
51 3 1 1 1 1 1
52 4 1 1 1 1 1
53 5 1 1 1 1 1
54 6 1 1 1 1 1
55 7 1 1 1 1 1
56 8 1 1 1 1 1
57 9 1 1 1 1 1
58 : 1 1 1
59 ; 1 1 1
60 < 1 1 1
61 = 1 1 1
62 > 1 1 1
63 ? 1 1 1
64 @ 1 1 1
65 A 1 1 1 1 1 1
66 B 1 1 1 1 1 1
67 C 1 1 1 1 1 1
68 D 1 1 1 1 1 1
69 E 1 1 1 1 1 1
70 F 1 1 1 1 1 1
71 G 1 1 1 1 1
72 H 1 1 1 1 1
73 I 1 1 1 1 1
74 J 1 1 1 1 1
75 K 1 1 1 1 1
76 L 1 1 1 1 1
77 M 1 1 1 1 1
78 N 1 1 1 1 1
79 O 1 1 1 1 1
80 P 1 1 1 1 1
81 Q 1 1 1 1 1
82 R 1 1 1 1 1
83 S 1 1 1 1 1
84 T 1 1 1 1 1
85 U 1 1 1 1 1
86 V 1 1 1 1 1
87 W 1 1 1 1 1
88 X 1 1 1 1 1
89 Y 1 1 1 1 1
90 Z 1 1 1 1 1
91 [ 1 1 1
92 \ 1 1 1
93 ] 1 1 1
94 ^ 1 1 1
95 _ 1 1 1
96 ` 1 1 1
97 a 1 1 1 1 1 1
98 b 1 1 1 1 1 1
99 c 1 1 1 1 1 1
100 d 1 1 1 1 1 1
101 e 1 1 1 1 1 1
102 f 1 1 1 1 1 1
103 g 1 1 1 1 1
104 h 1 1 1 1 1
105 i 1 1 1 1 1
106 j 1 1 1 1 1
107 k 1 1 1 1 1
108 l 1 1 1 1 1
109 m 1 1 1 1 1
110 n 1 1 1 1 1
111 o 1 1 1 1 1
112 p 1 1 1 1 1
113 q 1 1 1 1 1
114 r 1 1 1 1 1
115 s 1 1 1 1 1
116 t 1 1 1 1 1
117 u 1 1 1 1 1
118 v 1 1 1 1 1
119 w 1 1 1 1 1
120 x 1 1 1 1 1
121 y 1 1 1 1 1
122 z 1 1 1 1 1
123 { 1 1 1
124 | 1 1 1
125 } 1 1 1
126 ~ 1 1 1
127 1
128 Ç 1 1 1
etc.
-
Help in query using regular expression
HI,
I need a help to get the below output using regular expression query. Please help me.
SELECT REGEXP_SUBSTR ('PWRPKG(P/W+P/L+CC)', '[^+]+', 1, lvl) val, lvl
FROM DUAL,(SELECT LEVEL lvl FROM DUAL
CONNECT BY LEVEL <=(SELECT MAX ( LENGTH ('PWRPKG(P/W+P/L+CC)') - LENGTH (REPLACE ('PWRPKG(P/W+P/L+CC)','+',NULL))+ 1) FROM DUAL));
I need the output as
correct result:
==============
val lvl
P/W 1
P/L 2
CC 3
But when I tried the above query, it did not return this result. Please help me find where I made a mistake.
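The likely cause is that the query splits the whole string on "+", so the first piece still carries "PWRPKG(" and the last one the closing ")". Taking the text inside the brackets first and splitting afterwards avoids that. A minimal sketch of this two-step idea in Java (the class name PackageElements and the method elements are made up for illustration):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PackageElements {
    // Group 1 is everything between the first "(" and the next ")"
    private static final Pattern INSIDE = Pattern.compile("\\(([^()]*)\\)");

    /** Returns the "+"-separated elements inside the brackets, in order. */
    static List<String> elements(String txt) {
        Matcher m = INSIDE.matcher(txt);
        if (!m.find()) {
            return Collections.emptyList();
        }
        // "+" is a regex metacharacter, so it has to be escaped for split()
        return Arrays.asList(m.group(1).split("\\+"));
    }

    public static void main(String[] args) {
        System.out.println(elements("PWRPKG(P/W+P/L+CC)"));            // [P/W, P/L, CC]
        System.out.println(elements("TECHPKG(INTELLI CC+FRT SONAR)")); // [INTELLI CC, FRT SONAR]
    }
}
```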
Thanks in advance
Frank gave you a solution in your other thread. You could simplify it if you are on 11g:
SQL> select * from table_x
2 /
TXT
TECHPKG(INTELLI CC+FRT SONAR)
PWRPKG(P/W+P/L+CC)
select txt,
regexp_substr(
txt,
'(.*\()*([^+)]+)',
1,
column_value,
null,
2
) element,
column_value element_number
from table_x,
table(
cast(
multiset(
select level
from dual
connect by level <= regexp_count(txt,'\+') + 1
)
as sys.OdciNumberList
)
)
order by rowid,
column_value
/
TXT ELEMENT ELEMENT_NUMBER
TECHPKG(INTELLI CC+FRT SONAR) INTELLI CC 1
TECHPKG(INTELLI CC+FRT SONAR) FRT SONAR 2
PWRPKG(P/W+P/L+CC) P/W 1
PWRPKG(P/W+P/L+CC) P/L 2
PWRPKG(P/W+P/L+CC) CC 3
SQL>
SY. -
Query help in regular expression
Hi all,
SELECT * FROM emp11
WHERE INSTR(ENAME,'A',1,2) >0;
Please let me know the equivalent query using regular expressions.
i have tried this after going through oracle regular expressions documentation.
SELECT * FROM emp11
WHERE regexp_LIKE(ename,'A{2}')
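Note that 'A{2}' only matches two A's directly next to each other, while INSTR(ENAME,'A',1,2) > 0 is true as soon as a second A occurs anywhere. The pattern that corresponds to that is 'A.*A'. The difference can be checked with a small Java sketch (the class name SecondOccurrence and the method matches are made up for illustration):

```java
import java.util.regex.Pattern;

class SecondOccurrence {
    /** True if the pattern matches anywhere in the name (like REGEXP_LIKE). */
    static boolean matches(String regex, String name) {
        return Pattern.compile(regex).matcher(name).find();
    }

    public static void main(String[] args) {
        // ADAMS has two A's, but they are not adjacent
        System.out.println(matches("A{2}", "ADAMS"));  // false
        System.out.println(matches("A.*A", "ADAMS"));  // true
        // ALLEN has only one A
        System.out.println(matches("A.*A", "ALLEN"));  // false
    }
}
```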
Any help in this regard would be highly appreciated .
Thanks,
P Prakash
please go here
Introduction to regular expressions ...
Thanks,
P Prakash -
Urgent!!! Problem in regular expression for matching braces
Hi,
For the example below, can I write a regular expression to store getting key, value pairs.
example: ((abc def) (ghi jkl) (a ((b c) (d e))) (mno pqr) (a ((abc def))))
in the above example
abc is key & def is value
ghi is key & jkl is value
a is key & ((b c) (d e)) is value
and so on.
can anybody pls help me in resolving this problem using regular expressions...
Thanks in advance
"((key1 value1) (key2 value2) (key3 ((key4 value4)
(key5 value5))) (key6 value6) (key7 ((key8 value8)
(key9 value9))))"
I want to write a regular expression in java to parse
the above string and store the result in hash table
as below
key1 value1
key2 value2
key3 ((key4 value4) (key5 value5))
key4 value4
key5 value5
key6 value6
key7 ((key8 value8) (key9 value9))
key8 value8
key9 value9
Please let me know, if it is not possible with
regular expressions, the most effective way of solving it.
Yes, it is possible with a recursive regular expression.
Unfortunately Java does not provide a recursive regular expression construct.
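In Java itself, a short hand-written scanner that counts bracket depth can do the job without any regex. The following is only a sketch under the assumption that the input is well formed (balanced brackets, one space between key and value); the class name NestedPairs and its methods are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NestedPairs {
    /** Parses "((k1 v1) (k2 ((k3 v3))))" into key/value pairs, nested pairs included. */
    static Map<String, String> parse(String list) {
        Map<String, String> out = new LinkedHashMap<>();
        collect(list.trim(), out);
        return out;
    }

    private static void collect(String list, Map<String, String> out) {
        // Drop the outer brackets of the list, then scan its pairs
        String body = list.substring(1, list.length() - 1);
        int depth = 0;
        int start = -1;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '(') {
                if (depth == 0) {
                    start = i + 1;          // first character of a top-level pair
                }
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {           // the top-level pair is complete
                    String pair = body.substring(start, i);
                    int space = pair.indexOf(' ');
                    String key = pair.substring(0, space);
                    String value = pair.substring(space + 1).trim();
                    out.put(key, value);
                    if (value.startsWith("(")) {
                        collect(value, out); // the value is itself a list of pairs
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        String input = "((key1 value1) (key2 value2) (key3 ((key4 value4) (key5 value5))))";
        parse(input).forEach((k, v) -> System.out.println(k + " " + v));
    }
}
```

A LinkedHashMap is used instead of a Hashtable only to keep the pairs in the order in which they are found; any Map would work.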
In Perl, which does offer such a construct in the form of (??{ ... }), it can be written like this:
$_ = "((key1 value1) (key2 value2) (key3 ((key4 value4) (key5 value5))) (key6 value6) (key7 ((key8 value8) (key9 value9))))";
my $paren;
$paren = qr/
\(
(
[^()]+ # Not parens
|
(??{ $paren }) # Another balanced group (not interpolated yet)
)*
\)
/x;
my $r = qr/^(.*?)\((\w+?) (\w+?|(??{$paren}))\)\s*(.*?)$/;
while ($_) {
match()
}
# operates on $_
sub match {
my @v;
@v = m/$r/;
if (defined $v[3]) {
$_ = $v[2];
while (/\(/) {
match();
}
print "\"",$v[1],"\" \"",$v[2],"\"\n";
$_ = $v[0].$v[3];
}
else { $_ = ""; }
}
C:\usr\schodtt\src\java\forum\n00b\regex>perl recurse.pl
"key1" "value1"
"key2" "value2"
"key4" "value4"
"key5" "value5"
"key3" "((key4 value4) (key5 value5))"
"key6" "value6"
"key8" "value8"
"key9" "value9"
"key7" "((key8 value8) (key9 value9))"
C:\usr\schodtt\src\java\forum\n00b\regex> -
Bracket in Regular Expression constant?
I am a bit puzzled by the behavior I am experiencing in LV 2011. I hope to get some light from experts out there.
I am trying to parse a messy ASCII header file and after having split it into individual lines (strings), I use the "Match Regular Expression" function to remove some of the info before the substantial information.
Some of the strings include square brackets ([, ]), which are special characters for the function, therefore, as documented in the help, one needs to precede them with a backslash.
Example:
I want to parse the following line:
#PR [PR_DEV,I,2]
One way (which I am using because of considerations related to the rest of the header) is the following:
Note that the first string constant is using "Code Display" whereas the second one is using "Normal Display".
Why did I not put a backslash in front of the bracket in the first string, you may ask? Well, I did, but it disappeared after I typed the other characters. And reverting to "Normal Display" did not restore it.
Of course, the first version does not parse the input string correctly, whereas the second one does it fine.
In other words, the custom display string (which is convenient for cryptic codes such as \s* or to distinguish between space and tab...or simply ENTER tabs!) seems to mess up with the \[ combo (likewise with the \] one).
It is not a huge deal. I can use the "Normal Display" mode, but I tend to think that this qualifies as a hidden "feature". And again, it is still a pain in the ... when dealing with special characters such as tabs, etc...
I think that [ is a special character which needs to be preceded by a backslash, but it is not one of the defined backslash characters (like \s). So, you need to put in two \\ to get one \ while in '\' Codes Display.
You can put in any character by using \xx where the xx is a hex character using only upper case letters for A..F. I converted the strings to byte arrays and tried to see what made the arrays match and the Match work.
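The same two-layer escaping shows up in textual languages, which may make the behaviour easier to picture. In the Java sketch below (not LabVIEW; the class name BracketEscape and the method fields are made up for illustration), the string literal needs "\\[" so that the regex engine receives "\[" and treats the bracket literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketEscape {
    // In source code: \\[ ; what the regex engine sees: \[ ; what it matches: [
    private static final Pattern LINE =
        Pattern.compile("#PR \\[(\\w+),(\\w+),(\\d+)\\]");

    /** Returns the three comma-separated fields inside the brackets, or null if no match. */
    static String[] fields(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] f = fields("#PR [PR_DEV,I,2]");
        System.out.println(String.join(" | ", f)); // PR_DEV | I | 2
    }
}
```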
Lynn -
Need help with regular expression
I'm trying to use the java.util.regex package to extract URLs from html files.
The URLs that I am interested in extracting from the HTML look like the following:
<font color="#008000">http://forum.java.sun.com -
So, the URL is always preceded by:
<font color="#008000">
and then followed by a space character and then a hyphen character. I want to be able to put all these URLs in a Vector object. This doesn't seem like it should be too difficult but for some reason I can't get anywhere with it. Any help would be greatly appreciated. Thanks!
Hi gupta, I am not sure of the Java syntax but I can tell you about the regular expression... try this...
<font color="#008000">(http:\/\/[a-zA-Z0-9.]+) [-]
i dont know the java methods to call...just the reg exp...
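On the Java side, the calls that go with such an expression are Pattern.compile, Matcher.find and Matcher.group. A minimal sketch follows; the class name UrlExtractor and the method extract are made up for illustration, the URL part is loosened to "any run of non-space characters", and an ArrayList is used where the question mentions a Vector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlExtractor {
    // Fixed font tag, then the URL (group 1), then a space and a hyphen
    private static final Pattern URL =
        Pattern.compile("<font color=\"#008000\">(http://\\S+) -");

    /** Collects every URL that follows the green font tag and precedes " -". */
    static List<String> extract(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<p><font color=\"#008000\">http://forum.java.sun.com - </font></p>"
                    + "<p><font color=\"#008000\">http://example.org/a?b=1 - </font></p>";
        System.out.println(extract(html));
        // prints [http://forum.java.sun.com, http://example.org/a?b=1]
    }
}
```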
Sanjay Acharya -
Little help with regular expression?
Greetings all,
I have a simple regular expression "(\\w+)\\s(\\w+)\\s(.+)"
Which I want to match against the strings like "Acetobacter pasteurianus LMD22.1"
But this always fails whenever there is a dot (.) character like "LMD22.1" in above string.
How to solve this ?
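As far as the pattern itself goes, a dot in the last word should not be a problem, because ".+" accepts any character. The sketch below checks exactly the expression and string quoted above (the class name SpeciesPattern and the method split are made up for illustration); if this works on your side too, the failure is presumably somewhere else, for example in how the string reaches the matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpeciesPattern {
    private static final Pattern NAME = Pattern.compile("(\\w+)\\s(\\w+)\\s(.+)");

    /** Returns genus, species and strain, or null if the whole input does not match. */
    static String[] split(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("Acetobacter pasteurianus LMD22.1");
        System.out.println(String.join(" | ", parts));
        // prints Acetobacter | pasteurianus | LMD22.1
    }
}
```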
Thanks in advance.
Shouldn't that be Acinetobacter?
edit: nope, I'm wrong, you're right.
Edited by: Encephalopathic on Apr 7, 2009 7:34 PM -
Help regarding regular expression
HI All ,
Please see the following string
String s = "IF ((NOT NUM4 IS ALPHABETIC ) AND NUM3 IS ALPHABETIC-UPPER AND (NUM5 IS GREATER OR EQUAL TO 3) AND (NUM5 IS NOT GREATER THAN 3) AND (NUM3 GREATER THAN 46) AND (NUM5 GREATER THAN NUM3) OR NUM3 LESS THAN 78) .";
My problem is: I want to capture the parts of this line which contain ALPHABETIC or ALPHABETIC-UPPER, for example: NOT NUM4 IS ALPHABETIC, NUM3 IS ALPHABETIC-UPPER. From those I have to capture the words NUM4 and NUM3, which are in these phrases only, from the whole string wherever they occur along with the phrase. Can anyone help me out by suggesting something? NUM4 and NUM3 are variable names.
I suspect you're right, Sabre, but I can't resist...
import java.util.regex.*;
/**
 * A rewriter does a global substitution in the strings passed to its
 * 'rewrite' method. It uses the pattern supplied to its constructor, and is
 * like 'String.replaceAll' except for the fact that its replacement strings
 * are generated by invoking a method you write, rather than from another
 * string. This class is supposed to be equivalent to Ruby's 'gsub' when given
 * a block. This is the nicest syntax I've managed to come up with in Java so
 * far. It's not too bad, and might actually be preferable if you want to do
 * the same rewriting to a number of strings in the same method or class. See
 * the example 'main' for a sample of how to use this class.
 * @author Elliott Hughes
 */
public abstract class Rewriter
{
    private Pattern pattern;
    private Matcher matcher;
    /**
     * Constructs a rewriter using the given regular expression; the syntax is
     * the same as for 'Pattern.compile'.
     */
    public Rewriter(String regularExpression)
    {
        this.pattern = Pattern.compile(regularExpression);
    }
    /**
     * Returns the input subsequence captured by the given group during the
     * previous match operation.
     */
    public String group(int i)
    {
        return matcher.group(i);
    }
    /**
     * Overridden to compute a replacement for each match. Use the method
     * 'group' to access the captured groups.
     */
    public abstract String replacement();
    /**
     * Returns the result of rewriting 'original' by invoking the method
     * 'replacement' for each match of the regular expression supplied to the
     * constructor.
     */
    public String rewrite(CharSequence original)
    {
        this.matcher = pattern.matcher(original);
        StringBuffer result = new StringBuffer(original.length());
        while (matcher.find())
        {
            matcher.appendReplacement(result, "");
            result.append(replacement());
        }
        matcher.appendTail(result);
        return result.toString();
    }
    public static void main(String[] args)
    {
        String s = "IF ((NOT NUM4 IS ALPHABETIC ) " +
            "AND NUM3 IS ALPHABETIC-UPPER " +
            "AND (NUM5 IS GREATER OR EQUAL TO 3) " +
            "AND (NUM5 IS NOT GREATER THAN 3) " +
            "AND (NUM3 GREATER THAN 46) " +
            "AND NUM645 IS ALPHABETIC " +
            "AND (NUM5 GREATER THAN NUM3) " +
            "OR NUM3 LESS THAN 78 " +
            "AND NUM34 IS ALPHABETIC-UPPER " +
            "AND NUM92 IS ALPHABETIC-LOWER " +
            "AND NUM0987 IS ALPHABETIC-LOWER) .";
        String result =
            new Rewriter("(NUM\\d+) +IS +(ALPHABETIC(?:-(?:UPPER|LOWER))?)")
            {
                public String replacement()
                {
                    String type = group(2);
                    if (type.endsWith("UPPER"))
                        return "Character.isUpper(" + group(1) + ")";
                    else if (type.endsWith("LOWER"))
                        return "Character.isLower(" + group(1) + ")";
                    else
                        return "Character.isLetter(" + group(1) + ")";
                }
            }.rewrite(s);
        System.out.println(result);
    }
}
Is it possible to create a template that will allow me to add a region when i use the template on my pages? I have a region that allows the user to create content in the region, but now i would like to have another region next to it. thanks Angie