Regular expression usage question

Hi there.
I have a 200-byte variable EBCDIC record which I need to break down into fields. The fields are positional and are either text, binary numbers, packed-decimal numbers, or 64-byte long numbers.
My question is: can regular expressions handle this kind of complex data?
I want to convert each field into its corresponding format: EBCDIC into ASCII text, binary into a Java Integer, and so on.
The reason for using regular expressions is that the record format could change, and a regular expression would be easier to modify without having to change the code.
Any words of advice are highly appreciated.
Please advise.
Regards,
Ulises

Regular expressions? I don't think so.
If you have a situation where positions 1-3 might be a binary number like client number, and the format might change so it moves to positions 12-14, then you could certainly write a record-format class to encapsulate that sort of information. In fact that would be a very good idea. But I can't imagine how a regular expression would help in getting a number out of three bytes, for example.
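A record-format class along those lines could be sketched as follows. This is only an illustration under stated assumptions, not code from the thread: the names FieldSpec and RecordDecoder, the big-endian byte order, the COMP-3 style packed-decimal layout (sign in the last nibble, 0xD meaning negative) and the IBM037 code page are all hypothetical and need to be adjusted to the real record.

```java
import java.nio.charset.Charset;

// Hypothetical field description: a format change means editing
// offsets and lengths here, not the decoding logic below.
final class FieldSpec {
    final String name;
    final int offset;
    final int length;

    FieldSpec(String name, int offset, int length) {
        this.name = name;
        this.offset = offset;
        this.length = length;
    }
}

final class RecordDecoder {
    // Big-endian binary number of up to 4 bytes.
    static int readBinary(byte[] record, FieldSpec f) {
        int value = 0;
        for (int i = 0; i < f.length; i++) {
            value = (value << 8) | (record[f.offset + i] & 0xFF);
        }
        return value;
    }

    // Packed decimal: two digits per byte, except the last byte,
    // whose low nibble is the sign (assumed: 0xD means negative).
    static long readPacked(byte[] record, FieldSpec f) {
        long value = 0;
        for (int i = 0; i < f.length; i++) {
            int b = record[f.offset + i] & 0xFF;
            value = value * 10 + (b >>> 4);
            if (i < f.length - 1) {
                value = value * 10 + (b & 0x0F);
            } else if ((b & 0x0F) == 0x0D) {
                value = -value;
            }
        }
        return value;
    }

    // EBCDIC text; "IBM037" is an assumed code page.
    static String readText(byte[] record, FieldSpec f) {
        return new String(record, f.offset, f.length, Charset.forName("IBM037"));
    }
}
```

If the client number moves from one position to another, only the FieldSpec for it changes, for example new FieldSpec("client", 11, 3) instead of new FieldSpec("client", 0, 3).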

Similar Messages

Jakarta regular expression package question

hi everyone,
I'm trying to figure out how to use the Jakarta regular expression package to return an array of matches.
For example, let's say I had a StringBuffer object consisting of, say, 30,000 characters, and I needed to return all the matches that start with < and end with > (like all the XML tags).
Does anybody know how I could use the package to return an array of matches?
thanks in advance
stev

okay -- perhaps I should rephrase my question for this package -- perhaps it is more popular?
Does anyone know how I can pass a regular expression to it and retrieve a String array?
In my specific situation, I have a StringBuffer with about 10,000 characters in it.
Any ideas how I would use regular expressions to look through this StringBuffer object?
thanks
stephen
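For comparison, here is how the find-all-matches loop might look with the JDK's own java.util.regex classes (available from JDK 1.4). This is a substitute for the Jakarta package asked about, not its API; the class name TagFinder and the pattern <[^>]*> are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagFinder {
    // Returns every substring that starts with '<' and ends at the next '>'.
    // Accepts any CharSequence, so a StringBuffer can be passed directly.
    static String[] findTags(CharSequence text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile("<[^>]*>").matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches.toArray(new String[0]);
    }
}
```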

Regular Expression Pattern Question

I'm building up a file URL String in my program. I'd like to replace all occurrences of "\\" in a String with "/" using the java.lang.String.replaceAll method:
String fileName = "junk.txt";
static String DEFAULT_FILE_URL_ROOT = "file:/" + System.getProperty("user.dir") + "/";
String schemaLocation = DEFAULT_FILE_URL_ROOT + fileName;
schemaLocation.replaceAll(java.io.File.separator, "/");
But when I run this, I get the following exception:
java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
^
 at java.util.regex.Pattern.error(Pattern.java:1489)
 at java.util.regex.Pattern.compile(Pattern.java:1281)
 at java.util.regex.Pattern.<init>(Pattern.java:1030)
 at java.util.regex.Pattern.compile(Pattern.java:777)
 at java.lang.String.replaceAll(String.java:1710)
 at forum.jdom.example.DOMValidator.main(Unknown Source)
What's the regex pattern I need to match "\\"? I can't find it. - MOD

The file separator is a string consisting of a single character, which is a backslash (on Windows).
A single backslash in a regex is an escape character which escapes the next character. But there is no next character in your expression, so it doesn't work. You need to replace what you have with the following...
And if you really insist on using the constant then the expression would be...
"\\" + java.io.File.separator

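A minimal sketch of three ways to do the replacement, assuming the goal is simply to turn every backslash into a forward slash. The class and method names are hypothetical. In a regex a literal backslash is written \\, which becomes "\\\\" inside Java source.

```java
import java.util.regex.Pattern;

final class SeparatorFix {
    // Regex route: the pattern \\ (written "\\\\" in source) matches one backslash.
    static String withRegex(String path) {
        return path.replaceAll("\\\\", "/");
    }

    // Literal route: String.replace does no regex parsing at all.
    static String withLiteral(String path) {
        return path.replace('\\', '/');
    }

    // Quoting route: Pattern.quote makes any separator safe to use as a pattern.
    static String withQuotedSeparator(String path, String separator) {
        return path.replaceAll(Pattern.quote(separator), "/");
    }
}
```

Note that Java strings are immutable, so the result has to be assigned, for example schemaLocation = SeparatorFix.withLiteral(schemaLocation); calling replaceAll without using its return value changes nothing.
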
Regular Expression / String question?

OK, so I want to do a String.split and split by a
(.*)
But since parentheses are regex metacharacters, how would I search for that without messing up the call?

CButz wrote:
so
\(.*\)
should work?
Yup.
But note that if the regex is a Java string literal (that is, in your source code in double quotes, as opposed to being read from user input or a config file), you'll need to double the backslashes.
"\\(.*\\)"
Also, note that that regex will match (abc)def(ghi) as one group. The abc)def(ghi will match the .*

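The greedy-match caveat can be seen in a small sketch. The class name and the sample input are made up for illustration; the reluctant quantifier .*? is one possible way to stop at the first closing parenthesis and is not something the thread itself proposes.

```java
import java.util.Arrays;

final class ParenSplit {
    // Greedy: .* runs to the last ')' in the input.
    static String[] splitGreedy(String s) {
        return s.split("\\(.*\\)");
    }

    // Reluctant: .*? stops at the first ')' after each '('.
    static String[] splitReluctant(String s) {
        return s.split("\\(.*?\\)");
    }

    public static void main(String[] args) {
        String s = "x(abc)def(ghi)y";
        System.out.println(Arrays.toString(splitGreedy(s)));    // [x, y]
        System.out.println(Arrays.toString(splitReluctant(s))); // [x, def, y]
    }
}
```
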
Regular Expression - Quick Question

Hi Experts,
SELECT REGEXP_SUBSTR('123@123','[^@@]+',1,1) FROM DUAL;
Result:
123 -- Wrong one.
SELECT REGEXP_SUBSTR('123@@123','[^@@]+',1,1) FROM DUAL;
Result:
123 -- Right one.
How can I achieve this?
Thanks,

michaels2 wrote:
SQL> with t as (
select '123@123' str from dual union
select '123@@123' from dual
)
select str, regexp_replace(str,'@@.*') x from t;
STR      X
123@123  123@123
123@@123 123
2 rows selected.

No, Michael, he needs something like:
SQL> with t as
(
select '123@@456@@789@@000' str from dual
)
select regexp_substr(replace(str,'@@','#'),'[^#]+',1,level)
from t
connect by level<=(length(str)-length(replace(str,'@@','')))/2+1;
REGEXP_SUBSTR(R
123
456
789
000
Because he has to extract tokens separated by @@, but I can't find a neat solution...
It seems like Oracle is ignoring the grouping operator:
SQL> with t as
(select '123@123' str from dual union
select '123@@123' from dual)
select str, regexp_substr(str,'[^(@@)]+')
from t;
STR      REGEXP_S
123@123  123
123@@123 123
In the first row it should extract the full string, not just the first three characters...
Max
[My Italian Oracle blog| http://oracleitalia.wordpress.com/2010/02/07/aggiornare-una-tabella-con-listruzione-merge/]
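A note on the last result: inside a bracket expression, ( and ) are ordinary characters rather than a grouping operator, so [^(@@)]+ means "one or more characters other than (, @ and )". That would explain why the match stops at the first @. The sketch below reproduces the effect in Java rather than Oracle, purely as an illustration; the class and method names are assumptions, and splitting on the two-character delimiter is shown as one way to get the tokens.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketDemo {
    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Splits on the two-character delimiter "@@"; a single '@' stays in the token.
    static String[] tokens(String input) {
        return input.split("@@");
    }
}
```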

Evaluate Regular expression complexity

Hi all,
I have a problem with regular expression usage in my application.
I'm using regular expressions to identify objects and fetch them to be served, depending on an input string which has to be matched.
Each object has a property representing a regular expression that must match for the object to be a candidate for fetching.
My program receives an external input string, then loops over the full object collection, identifying the objects whose regular expression matches the input string.
For example:
obj1) key = "J*SDK"
obj2) key = "Ja*6*"
obj3) key = "JEE*"
If the input string is "Java 6.0 SDK", obj1 and obj2 are candidates, while obj3 is discarded.
Up to now everything is fine. Now here is my question:
I want only one object as output, and I want the one that best matches my input string.
This means that
-> obj1 matches 4 characters and has only one wildcard
-> obj2 matches 3 characters and has two wildcards
so obj2 is discarded, since its regular expression is more complex than obj1's.
My problem is HOW to correctly evaluate such complexity for each candidate object, so that I can choose my best object.
Is there some formal rule / API for this?
I'd like to account for all wildcards in the regex, but doing this "by hand" would surely result in some bug due to a missed case, so a "third party" API or a grammar rule would be useful.
Hoping for your help.
Regards,
Michele Sacchetti
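For keys that only use * as a wildcard, as in the example above, one simple heuristic can be sketched like this. It is an assumption for illustration, not a formal rule: literal characters count in favour of a key and wildcards against it. The names WildcardKey, toPattern, score and best are hypothetical.

```java
import java.util.regex.Pattern;

final class WildcardKey {
    // Translates a key such as "J*SDK" into a regex: '*' becomes ".*",
    // everything else is quoted so that it matches literally.
    static Pattern toPattern(String key) {
        StringBuilder regex = new StringBuilder();
        String[] parts = key.split("\\*", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    // Assumed specificity score: number of literals minus number of wildcards.
    static int score(String key) {
        int wildcards = 0;
        for (char c : key.toCharArray()) {
            if (c == '*') {
                wildcards++;
            }
        }
        return (key.length() - wildcards) - wildcards;
    }

    // Returns the matching key with the highest score, or null if none matches.
    static String best(String input, String... keys) {
        String winner = null;
        for (String key : keys) {
            if (toPattern(key).matcher(input).matches()
                    && (winner == null || score(key) > score(winner))) {
                winner = key;
            }
        }
        return winner;
    }
}
```

With the three keys from the post and the input "Java 6.0 SDK", "J*SDK" scores 3 and "Ja*6*" scores 1, so "J*SDK" is chosen. This does not extend to full regular expressions.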

OK, after days and days of research I came up with this solution:
1) I used this package (http://www.brics.dk/automaton) for regular expressions, which lets me access the internal state automaton data
2) use the getShortestExample() method to retrieve the shortest string matching the given regexp
3) evaluate the Levenshtein distance between the given string and the one to be matched
PROs:
1) the regexp logic is fully handled by the same state machine which takes care of pattern matching in the first phase
2) the library provides me with a non-regexp string to be used for comparison (e.g. Levenshtein distance evaluation)
CONs:
1) the getShortestExample method is unaware of the string to be matched, so if I use "aab|aaa" to match "aab", the method returns the shortest string that sorts first alphabetically, that is "aaa", so I get an LD of 1 even though it should be 0. But it's quite a good deal for my application.
@endasil: your solution relies on grouping and is based on pre-parsing done manually, so it basically goes back to the "manual" parsing I wanted to avoid.
Another approach I wanted to try but had to give up on was to use ANTLR (www.antlr.org) to create a parser for regular expressions and then evaluate the resulting "tree size" of the parser, but I wasn't able to find a formal description of regexp grammar on the net.
Do you have any suggestions or comments on my solution (or other approaches to try)?
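For readers unfamiliar with step 3, the Levenshtein distance can be computed with the standard dynamic-programming recurrence. The sketch below is a generic implementation with a hypothetical class name; it does not use the automaton library, and the "shortest example" string is assumed to come from elsewhere.

```java
final class EditDistance {
    // Minimum number of single-character insertions, deletions and
    // substitutions needed to turn a into b (two-row DP table).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

For the "aab|aaa" case described above, levenshtein("aaa", "aab") is 1, which is the off-by-one the poster mentions.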

Regular Expressions - Questions to SME

Ralph Benzinger presented an online meetup on the topic of Regular Expressions. The presentation (slides only) can be found here: http://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/866072ca-0b01-0010-54b1-9c02a45ba8aa
Unfortunately the recording is not going to be available, but Ralph has been generous enough to agree to answer questions posted to this sticky thread.
cheers,
Marilyn

Hello Peter,
You're welcome!
Alas, I was unable to locate the regex documentation on help.sap.com either. In fact, I'm not even sure it has already been updated for 2004s. I recommend that you use the online documentation within the system, e.g., from transactions SE38 or SE80. Do an index search for "regex", and you'll be directed to REGEX, FIND and REGEX, REPLACE, both of which have extensive subsections on regexes.
The class cx_sy_regex is an exception class that is thrown by FIND, REPLACE and cl_abap_regex in case of an invalid regex, such as ".\1" (there is no capture group that back reference \1 can refer to). If the pattern is known statically, the syntax check will report this error, but for statements like "FIND REGEX pat IN text.", the actual pattern is only known at runtime.
The cx_sy_matcher class (and its subclasses) similarly indicate some invalid states, for example trying to call "cl_abap_matcher->replace_found( )" when the matcher has no current match to replace (e.g., replace_found( ) called twice in a row).
Please let me know if I can provide some additional information.
Regards
Ralph

Question about match regular expression

Colleagues,
Very stupid question. I would like to get the substring between "..." symbols. For example, the string 02 July from: Explosion occurred on "02 July", 2008.
How can I do this with a single Match Regular Expression?
For example, the expression ".*" will give me "02 July":
But I would like to get it without the " symbols!
I tried "[~"]*" and "[~"].*", then read this and this, all without success... But I'm sure it should be possible. Can you help me?
Andrey.
PS
This regular expression should give exactly the same output as the following construction:

I'm only using 7.0 now, but you can do this with Scan from String...
%[^"]"%[^"]"%[^"]
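As a regex-only alternative, in PCRE-style syntax the negated class is written [^"] rather than [~"], and a capture group around it yields the text without the quotes, provided the engine exposes submatches. The sketch below shows the idea in Java rather than LabVIEW, so it is an illustration of the pattern only; the class and method names are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedText {
    // Returns the text between the first pair of double quotes,
    // or null if the input contains no quoted run.
    static String firstQuoted(String input) {
        Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(input);
        return m.find() ? m.group(1) : null;
    }
}
```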

Help: Regular Expression question??

Hello,
How can I extract the following content using Java Regular expression?
<tr bgcolor="#333333">
 <td class="title" colspan="4" height="18"> SUPER_1 - SUPER_2</td>
</tr>
<tr bgcolor="#333333">
 <td class="match-light" width="45" height="18"> </td>
 <td class="match-light" colspan="3" width="286" align="right">March 19 </td>
</tr>
<tr>
 <td colspan="4" height="1"></td>
</tr>
<tr bgcolor="#cfcfcf">
 <td width="45" height="18"> FT</td>
 <td width="118" align="right">SUPER_3</td>
 <td width="50" align="center"><a class="scorelink" target="details" onclick="showDetails();">999 - 888</a></td>
 <td width="118">SUPER_4</td>
</tr>
From the above content, how can I define a regular expression to extract "SUPER_1", "SUPER_2", "March 19", "SUPER_3", "999", "888" and "SUPER_4"?
Please help.
Best regards,
Eric

Kayaman wrote:
Why not use a better way than regex, like an actual HTML parser (or XML if you have it well-formed)? People seem to love parsing (or rather, asking for help with parsing) HTML with regex for some unknown reason.
Indeed.
Read this (hilarious):
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Regular Expression Question

Hi all,
I am struggling with Java regular expressions, and I hope you guys can help me out. I want to use the String method matches to find any string of the form "xxxx.xxxx", where xxxx can contain only English letters (both upper and lower case). I will use this kind of expression to represent the join condition of a SQL statement in my Java class, like "tableA.name = tableB.name", where the names should contain English letters only. I tried MyString.matches("^[A-Z] + \\. + ^[A-Z]") in my Java program, but it doesn't seem to work. Can you figure out the right expression for me? Many thanks
Transistor

Thanks for your prompt response. I tried your code; however, it doesn't work.
I used your code like this:
if ( searchCriteria.getStringPair().getValue().trim().matches("[A-Za-z]+\\.[A-Za-z]+") ) {...some action }
It seems the Java program never gets into this if block.
Please remember that I want to match anything like "xxxxx.xxxxx", where xxxxx can be a word.
Myriads of thanks
Transistor
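One possible explanation, offered as a guess since the actual input is not shown: String.matches requires the whole string to match, so "[A-Za-z]+\\.[A-Za-z]+" accepts "tableA.name" but rejects "tableA.name = tableB.name". The sketch below contrasts the two; the class name, method names and the join pattern are assumptions for illustration.

```java
import java.util.regex.Pattern;

final class JoinCondition {
    // A single qualified name such as tableA.name
    private static final Pattern QUALIFIED_NAME =
            Pattern.compile("[A-Za-z]+\\.[A-Za-z]+");

    // Two qualified names joined by '=' with optional whitespace.
    private static final Pattern JOIN =
            Pattern.compile("[A-Za-z]+\\.[A-Za-z]+\\s*=\\s*[A-Za-z]+\\.[A-Za-z]+");

    static boolean isQualifiedName(String s) {
        return QUALIFIED_NAME.matcher(s.trim()).matches();
    }

    static boolean isJoin(String s) {
        return JOIN.matcher(s.trim()).matches();
    }
}
```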

Regular Expression Question, Repetition Operators

These are my success entries for a field;
123456,
123456,123456,
123456,123456,123456,
123456,123456,123456,123456,
Six digits followed by "," can be repeated an unlimited number of times.
I found this in the documentation: "Repetition Operators; {m,} Match at least m times", and for my need I tried this regular expression: "^[[[:digit:]]{6},]{1,}$", but it didn't work :(
Any comments?
Thank you very much :)
Tonguc

repeating exactly 6
{6}
repeating at least 1
+
repeating at least 6
{6,}
ok, your problem is [ instead of (
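Put together, the hints above give ^([[:digit:]]{6},)+$ in POSIX syntax: square brackets define a set of characters, while parentheses group a sequence that the repetition operator then applies to. A minimal Java sketch of the same pattern follows, written with \d instead of [[:digit:]]; the class name is hypothetical.

```java
import java.util.regex.Pattern;

final class SixDigitList {
    // One or more groups of exactly six digits, each followed by a comma.
    private static final Pattern LIST = Pattern.compile("^(\\d{6},)+$");

    static boolean isValid(String s) {
        return LIST.matcher(s).matches();
    }
}
```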

Regular Expression question I think

My application is receiving HTML as a string and I'm trying to simplify it before displaying. The string could contain one or more substrings similar to this:
<span style="cursor:pointer" onmouseout="hideTooltip()" onmouseover="createTooltip( this,'The quartile that your firm\'s value falls into. Each quartile contains 25% of the values in the Peer Group. The 1st Quartile is always the best. The 4th Quartile is always the worst.', ( findPosX( this ) - 150))">Quartile
The text will not be consistent, and it may be in the string as many as 4 times. What I'd like to do is replaceAll so that I end up with
Quartile
Is there a way to do this with a regular expression? I've tried replaceAll("","") but that takes out everything to the end of the String. I want it to stop at >.
Any way?

replaceAll("","")
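The patterns in this thread appear to have been lost when the page was rendered, so the following is only a guess at the kind of expression meant. It assumes the goal is to drop the span tags and keep their text, and that no attribute value contains a > character. The negated class [^>]* is what makes the match stop at the first >, where .* would run on to the last one. The class name is hypothetical.

```java
final class SpanStripper {
    // Removes opening <span ...> and closing </span> tags,
    // keeping the text between them.
    static String strip(String html) {
        return html.replaceAll("</?span[^>]*>", "");
    }
}
```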

Regular expression question (should be an easy one...)

I'm using Java to build a parser. I'm getting an expression, which I split on whitespace.
How can I build a regular expression that will let me split only on unquoted spaces? Example:
for the expression:
(X=33 AND Y=44) OR (Z="hello world" AND T=2)
I will get the following values split:
(X=33
AND
Y=44)
OR
(Z="hello world"
AND
T=2)
and not:
(Z="
hello
world"
thank you very much!

Instead of splitting on whitespace to get a list of tokens, use Matcher.find() to match the tokens themselves:
import java.util.*;
import java.util.regex.*;
public class Test {
    public static void main(String[] args) throws Exception {
        String str = "(X=33 AND Y=44) OR (Z=\"hello world\" AND T=2)";
        List<String> tokens = new ArrayList<String>();
        // A token is a run of non-space, non-quote characters,
        // optionally followed by one quoted run.
        Matcher m = Pattern.compile("[^\\s\"]+(?:\".*?\")?").matcher(str);
        while (m.find()) {
            tokens.add(m.group());
        }
        System.out.println(tokens);
    }
}
The regex I used is based on the assumptions that there will be at most one run of quoted text per token, that it will always appear on the right-hand side of an expression, and that the closing quote will always mark the end of the token. If the rules are more complicated (as sabre150 suggested), a more complicated regex will be needed. You might be better off doing the parsing the old-fashioned way, without regexes.

Question about Regular Expressions, please help!

I have created an app which reads files and extracts certain data using regular expressions in JDK1.4 using Pattern and Matcher classes.
However, it needs to run on JDK 1.2.2 (don't ask). The regular expression classes (Pattern and Matcher) are not available in 1.2.2, so I am looking for something similar which I can use.
I need something that loops through all the matches found in the file, the way Matcher works, i.e.
while (matcher.find())
// do this
Help!

http://jakarta.apache.org/regexp/

Question in regular expressions

Hi,
I have this string (abdcerpabdcerpabdcerpaabdcerpabdcerp)
and I want to match the string abdcerp, which may be followed by one or more a's.
So I want to get these results:
abdcerp
abdcerp
abdcerpa
abdcerp
abdcerp
I know regular expressions very well, but I failed to come up with one that can do this. I tried the regex (abdcerpa*?) but it failed. It's not working and I don't know why: it's not getting abdcerpa, it only gets abdcerp.
Can anyone help by telling me why this regex is not good, or by giving me a regex that does this? It needs to be tested, because I already know the concepts and have tried different approaches and regexes, but failed.
thanks
bye
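On the why: a reluctant quantifier at the very end of a pattern, as in a*?, is satisfied by matching nothing, so it never takes the trailing a. Making it greedy is not enough here either, because a* would also swallow the a that begins the next abdcerp. One possible fix, sketched below in Java (the thread does not say which regex engine is in use, and the class name is hypothetical), is to accept a trailing a only when it does not start another abdcerp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TrailingA {
    // "abdcerp" followed by any number of 'a' characters that are
    // not themselves the start of the next "abdcerp".
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("abdcerp(?:a(?!bdcerp))*").matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

On the sample string this yields abdcerp, abdcerp, abdcerpa, abdcerp, abdcerp.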

That forum was retired and is now read-only. According to the announcement;
Any future posts on this topic should be put in the
.NET Framework Class Libraries forum.
Regards, Dave Patrick ....
Microsoft Certified Professional
Microsoft MVP [Windows]
Disclaimer: This posting is provided "AS IS" with no warranties or guarantees , and confers no rights.
