Regular expression confusion
Hi,
I want to use regular expressions to parse a text data file that has the following structure:
key1=value<EOL>
key2=value<EOL>
keyn=value<EOL>
<EOR><EOL>
key1=value<EOL>
keyn=value<EOL>
<EOR><EOL>
etc.
...where <EOR> is a user-specified string that defines the end of the record and <EOL> is a user-defined line delimiter (e.g. \n). I use the regular expression [^\n]+ to extract a single line; it returns a string up to but exclusive of the delimiter itself. I want to be able to do the same for a delimiter that is a string rather than a single character. For example, I would like to extract an entire record from the file using a regular expression; that is, all the characters (including line delimiters) up to but exclusive of the <EOR>, where <EOR> is a string such as "EOR".
Is there a pattern similar to [^\n]+ where I could specify a string rather than a single character?
Given the following data file (newline characters are shown for clarity)...
MAKE=FORD\n
MODEL=MUSTANG\n
YEAR=1969\n
EOR\n
MAKE=DODGE\n
MODEL=CHARGER\n
YEAR=1973\n
EOR\n
MAKE=CHEVROLET\n
MODEL=CORVETTE\n
YEAR=1977\n
<end of file>
I want to know of a regular expression that will extract an entire record such that the group() method would return, for example, the following string when applied to the start of the file:
MAKE=FORD\n
MODEL=MUSTANG\n
YEAR=1969\n
I might be missing the point of what you want to do here, but I would probably approach it this way (modifying alice's example.)
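import java.util.regex.*;

public class Test {
    public static void main(String[] args) {
        String str = "MAKE=FORD\n" +
                     "MODEL=MUSTANG\n" +
                     "YEAR=1969\n" +
                     "EOR\n" +
                     "MAKE=DODGE\n" +
                     "MODEL=CHARGER\n" +
                     "YEAR=1973\n" +
                     "EOR\n" +
                     "MAKE=CHEVROLET\n" +
                     "MODEL=CORVETTE\n" +
                     "YEAR=1977\n";
        // DOTALL lets '.' match line terminators, so the reluctant (.*?) can span
        // several lines; it stops at the first "EOR" or at the end of input (\z).
        Pattern p = Pattern.compile("(.*?)(EOR|\\z)", Pattern.MULTILINE | Pattern.DOTALL);
        Matcher m = p.matcher(str);
        // Group 1 is the record text; the text after an "EOR" starts with the "\n" that followed it.
        while (m.find()) {
            System.out.println();
            System.out.println(m.group(1));
        }
    }
}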
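As a variation on the reply above, here is a minimal sketch that takes the record and line delimiters as parameters, as the question asks. The class name RecordSplitter and the method records are made up for illustration. It assumes the delimiter line (here "EOR" followed by "\n") never appears inside a value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordSplitter {
    // Hypothetical helper: returns each record's text, excluding the delimiter line.
    static List<String> records(String data, String eor, String eol) {
        // Pattern.quote makes the user-supplied delimiter strings literal.
        String end = Pattern.quote(eor + eol);
        // (.+?) is reluctant: it grows one character at a time and stops at the
        // first delimiter, or at the end of input (\z) for the last record.
        Pattern p = Pattern.compile("(.+?)(?:" + end + "|\\z)", Pattern.DOTALL);
        Matcher m = p.matcher(data);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "MAKE=FORD\nMODEL=MUSTANG\nYEAR=1969\nEOR\n"
                    + "MAKE=DODGE\nMODEL=CHARGER\nYEAR=1973\nEOR\n"
                    + "MAKE=CHEVROLET\nMODEL=CORVETTE\nYEAR=1977\n";
        for (String record : records(data, "EOR", "\n")) {
            System.out.println(record);
        }
    }
}
```

Because (.+?) needs at least one character, no empty record is produced at the end of the input.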
Similar Messages
-
Confusion in a regular expression
Hi,
I am having trouble understanding the regular expression "J.*\\d[0-35-9]-\\d\\d-\\d\\d". The author says that it matches the text "Joe's Birthday is 12-17-77".
My problem is: what is [0-35-9] here? I have used [1-9] before, which matches any digit between 1 and 9, but I don't understand the meaning of [0-35-9].
Can anybody help me?
Best,
Arsalan
[0-35-9] is 0123 56789 but not 4: it combines the two ranges 0-3 and 5-9.
Thanks, Gino86! I can now understand this regular expression.
-
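For the [0-35-9] thread above, a small sketch that checks each digit against the class and then tries the full expression. The class name CharClassDemo is made up for illustration.

```java
public class CharClassDemo {
    // True when the single character c belongs to the class [0-35-9].
    static boolean inClass(char c) {
        return String.valueOf(c).matches("[0-35-9]");
    }

    public static void main(String[] args) {
        // Two ranges side by side: 0-3 and 5-9, so only 4 is left out.
        for (char c = '0'; c <= '9'; c++) {
            System.out.println(c + " -> " + inClass(c));
        }
        String regex = "J.*\\d[0-35-9]-\\d\\d-\\d\\d";
        System.out.println("Joe's Birthday is 12-17-77".matches(regex));
        System.out.println("Joe's Birthday is 14-17-77".matches(regex));
    }
}
```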
Namburi,
When you said you used the Reg Exp tool, did you use it only as
preconfigured by the iMT migrate application wizard?
Because the default configuration of the regular expression tool will only
target the files in your ND project directories. If you wish to target
classes outside of the normal directory scope, you have to either modify the
"Source Directory" property OR create another instance of the regular
expression tool. See the "Tool" menu in the iMT to create additional tool
instances which can each be configured to target different sets of files
using different sets of rules.
Usually, I utilize 3 different sets of rules files on a given migration:
spider2jato.xml
these are the generic conversion rules (but includes the optimized rules for
ViewBean and Model based code, i.e. these rules do not utilize the
RequestManager since it is not needed for code running inside the ViewBean
or Model classes)
I run these rules against all files.
See the file download section of this forum for periodic updates to these
rules.
nonProjectFileRules.xml
these include rules that add the necessary
RequestManager.getRequestContext(). etc prefixes to many of the common
calls.
I run these rules against user module and any other classes that are
not ModuleServlet, ContainerView, or Model classes.
appXRules.xml
these rules include application specific changes that I discover while
working on the project. A common thing here is changing import statements
(since the migration tool moves ND project code into different jato
packaging structure, you sometime need to adjust imports in non-project
classes that previously imported ND project specific packages)
So you see, you are not limited to one set of rules at all. Just be careful
to keep track of your backups (the regexp tool provides several options in
its Expert Properties related to back up strategies).
----- Original Message -----
From: <vnamboori@y...>
Sent: Wednesday, August 08, 2001 6:08 AM
Subject: [iPlanet-JATO] Re: Use Of models in utility classes - Pease don't
forget about the regular expression potential
Thanks Matt, Mike, Todd
This is a great input for our migration. Though we used the existing
Regular Expression Mapping tool, we did not change this to meet our
own needs as mentioned by Mike.
We would certainly incorporate this to ease our migration.
Namburi
--- In iPlanet-JATO@y..., "Todd Fast" <toddwork@c...> wrote:
All--
Great response. By the way, the Regular Expression Tool uses the Perl5 RE
syntax as implemented by Apache OROMatcher. If you're doing lots of these
sorts of migration changes manually, you should definitely buy the O'Reilly
book "Mastering Regular Expressions" and generate some rules to automate the
conversion. Although they are definitely confusing at first, regular
expressions are fairly easy to understand with some documentation, and are
superbly effective at tackling this kind of migration task.
Todd
----- Original Message -----
From: "Mike Frisino" <Michael.Frisino@S...>
Sent: Tuesday, August 07, 2001 5:20 PM
Subject: Re: [iPlanet-JATO] Use Of models in utility classes -Pease don't
forget about the regular expression potential
Also, (and Matt's document may mention this)
Please bear in mind that this statement is not totally correct:
Since the migration tool does not do much of conversion for these
utilities we have to do manually.
Remember, the iMT is a SUITE of tools. There is the extraction tool, and
the translation tool, and the regular expression tool, and several other
smaller tools (like the jar and compilation tools). It is correct to state
that the extraction and translation tools only significantly convert the
primary ND project objects (the pages, the data objects, and the project
classes). The extraction and translation tools do minimum translation of the
User Module objects (i.e. they repackage the user module classes in the new
jato module packages). It is correct that for all other utility classes
which are not formally part of the ND project, the extraction and
translation tools do not perform any migration.
However, the regular expression tool can "migrate" any arbitrary file
(utility classes etc) to the degree that the regular expression rules
correlate to the code present in the arbitrary file. So first and foremost,
if you have a lot of spider code in your non-project classes you should
consider using the regular expression tool and if warranted adding
additional rules to reduce the amount of manual adjustments that need to be
made. I can't stress this enough. We can even help you write the regular
expression rules if you simply identify the code pattern you wish to
convert. Just because there is not already a regular expression rule to
match your need does not mean it can't be written. We have not nearly
exhausted the possibilities.
For example if you say, we need to convert
CSpider.getDataObject("X");
To
RequestManager.getRequestContext().getModelManager().getModel(XModel.class);
Maybe we or somebody else in the list can help write that regular
expression if it has not already been written. For instance in the last
updated spider2jato.xml file there is already a CSpider.getCommonPage("X")
rule:
<!--getPage to getViewBean-->
<mapping-rule>
<mapping-rule-primarymatch>
<![CDATA[CSpider[.\s]*getPage[\s]*\(\"([^"]*)\"]]>
</mapping-rule-primarymatch>
<mapping-rule-replacement>
<mapping-rule-match>
<![CDATA[CSpider[.\s]*getPage[\s]*\(\"([^"]*)\"]]>
</mapping-rule-match>
<mapping-rule-substitute>
<![CDATA[getViewBean($1ViewBean.class]]>
</mapping-rule-substitute>
</mapping-rule-replacement>
</mapping-rule>
Following this example a getDataObject to getModel would look
like this:
<mapping-rule>
<mapping-rule-primarymatch>
<![CDATA[CSpider[.\s]*getDataObject[\s]*\(\"([^"]*)\"]]>
</mapping-rule-primarymatch>
<mapping-rule-replacement>
<mapping-rule-match>
<![CDATA[CSpider[.\s]*getDataObject[\s]*\(\"([^"]*)\"]]>
</mapping-rule-match>
<mapping-rule-substitute>
<![CDATA[getModel($1Model.class]]>
</mapping-rule-substitute>
</mapping-rule-replacement>
</mapping-rule>
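To try the getDataObject expression outside the iMT, here is a rough sketch using java.util.regex. This is an assumption for illustration only: the tool itself uses the Perl5 syntax via OROMatcher, and the class name GetModelRuleSketch is made up. The match expression and the substitute text are copied from the rule above.

```java
import java.util.regex.Pattern;

public class GetModelRuleSketch {
    // Same expression as the mapping rule, written as a Java string literal.
    private static final Pattern GET_DATA_OBJECT =
            Pattern.compile("CSpider[.\\s]*getDataObject[\\s]*\\(\"([^\"]*)\"");

    static String apply(String source) {
        // $1 is the data object name captured between the quotes.
        return GET_DATA_OBJECT.matcher(source).replaceAll("getModel($1Model.class");
    }

    public static void main(String[] args) {
        // The rule leaves the trailing ");" of the original call in place.
        System.out.println(apply("CSpider.getDataObject(\"Foo\");"));
    }
}
```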
In fact, one migration developer already wrote that rule and submitted it
for inclusion in the basic set. I will post another upgrade to the basic
regular expression rule set; look for a "file uploaded" posting. Also,
please consider contributing any additional generic rules that you have
written for inclusion in the basic set.
Please note that in some cases (Utility classes in particular) the rule
application may be more effective as TWO sequential rules rather than one
monolithic rule. Again using the example above, it will convert
CSpider.getDataObject("Foo");
To
getModel(FooModel.class);
Now that is the most effective conversion for that code if that code is in
a page or data object class file. But if that code is in a Utility class you
really want:
RequestManager.getRequestContext().getModelManager().getModel(FooModel.class
So to go from
getModel(FooModel.class);
To
RequestManager.getRequestContext().getModelManager().getModel(FooModel.class
You would apply a second rule AND you would ONLY run this rule against
your utility classes so that you would not otherwise affect your ViewBean
and Model classes which are completely fine with the simple getModel call.
<mapping-rule>
<mapping-rule-primarymatch>
<![CDATA[getModel\(]]>
</mapping-rule-primarymatch>
<mapping-rule-replacement>
<mapping-rule-match>
<![CDATA[getModel\(]]>
</mapping-rule-match>
<mapping-rule-substitute>
<![CDATA[RequestManager.getRequestContext().getModelManager().getModel(]]>
</mapping-rule-substitute>
</mapping-rule-replacement>
</mapping-rule>
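The two-step idea can be sketched with java.util.regex as well. Again this is an illustration under assumptions, not the tool's own OROMatcher engine, and the class and method names are hypothetical. Step one is the generic getDataObject rule; step two is the utility-class rule shown just above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UtilityClassRuleSketch {
    private static final Pattern STEP_ONE =
            Pattern.compile("CSpider[.\\s]*getDataObject[\\s]*\\(\"([^\"]*)\"");
    private static final Pattern STEP_TWO = Pattern.compile("getModel\\(");

    // Generic rule: run against all files.
    static String generic(String source) {
        return STEP_ONE.matcher(source).replaceAll("getModel($1Model.class");
    }

    // Second rule: run only against utility classes.
    static String utilityOnly(String source) {
        String prefix = "RequestManager.getRequestContext().getModelManager().getModel(";
        // quoteReplacement keeps the substitute text literal.
        return STEP_TWO.matcher(source).replaceAll(Matcher.quoteReplacement(prefix));
    }

    public static void main(String[] args) {
        String afterOne = generic("CSpider.getDataObject(\"Foo\");");
        System.out.println(afterOne);
        System.out.println(utilityOnly(afterOne));
    }
}
```

Note that the second rule still matches its own output, so it should be run only once per file.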
A similar rule can be applied to getSession and other CSpider API calls.
For instance here is the rule for converting getSession calls to leverage
the RequestManager.
<mapping-rule>
<mapping-rule-primarymatch>
<![CDATA[getSession\(\)\.]]>
</mapping-rule-primarymatch>
<mapping-rule-replacement>
<mapping-rule-match>
<![CDATA[getSession\(\)\.]]>
</mapping-rule-match>
<mapping-rule-substitute>
<![CDATA[RequestManager.getSession().]]>
</mapping-rule-substitute>
</mapping-rule-replacement>
</mapping-rule>
----- Original Message -----
From: "Matthew Stevens" <matthew.stevens@e...>
Sent: Tuesday, August 07, 2001 12:56 PM
Subject: RE: [iPlanet-JATO] Use Of models in utility classes
Namburi,
I will post a document to the group site this evening which has the details
on various tactics of migrating these type of utilities. Essentially, you
either need to convert these utilities to Models themselves or keep the
utilities as is and simply use the
RequestManager.getRequestContext().getModelManager().getModel()
to statically access Models.
For CSpSelect.executeImmediate() I have an example of a custom helper method
as a replacement which uses JDBC results instead of CSpDBResult.
matt
-----Original Message-----
From: vnamboori@y...
Sent: Tuesday, August 07, 2001 3:24 PM
Subject: [iPlanet-JATO] Use Of models in utility classes
Hi All,
In the present ND project we have lots of utility classes. These
classes are in a different directory and are not part of the ND pages.
In these classes we access the dataobjects and do the manipulations.
So we access dataobjects directly like
CSpider.getDataObject("do....");
and then execute it.
Since the migration tool does not do much of conversion for these
utilities we have to do manually.
My question is: can we access the models post-migration in the
same way, or do we need requestContext?
We have lots of utility classes which are DataObject intensive. Can
someone suggest a better way to migrate this kind of code?
Thanks
Namburi
-
Regular Expressions and Double Byte Characters ?
Is it possible to use Java Regular Expressions to parse
a file that will contain double byte characters ?
For example, I want a regular expression to match the following line
tag="double byte stuff" id="double byte stuff"
The comments on the bytes/strings were helpful. Thanks.
But I'm still confused as to what matching pattern could be used.
For example a pattern like:
[A-Za-z]
I assume would not match any double byte characters.
I also assume the following won't work either:
[\\p{Alpha}]
because it is a POSIX class (US-ASCII only).
So how do you say "match the tag, then take any characters,
double byte, ASCII, whatever, then match the text tag", per the
original example?
-
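For the double byte question above, one possible approach is sketched below. It relies on the fact that Java strings hold Unicode text, so a negated class such as [^"]* accepts any character except the quote, whatever its script. The class name UnicodeTagDemo and the sample values are made up for illustration, and the file is assumed to have been read with the correct charset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeTagDemo {
    // [^"]* accepts any run of characters other than the closing quote.
    static final Pattern TAG = Pattern.compile("tag=\"([^\"]*)\" id=\"([^\"]*)\"");

    public static void main(String[] args) {
        // Unicode escapes stand in for Japanese and Korean sample text.
        String line = "tag=\"\u65e5\u672c\u8a9e\" id=\"\ud55c\uae00\"";
        Matcher m = TAG.matcher(line);
        if (m.matches()) {
            System.out.println("tag length = " + m.group(1).length());
            System.out.println("id length  = " + m.group(2).length());
        }
        // \p{Alpha} is ASCII-only by default; \p{L} covers letters of any script.
        System.out.println("\u65e5".matches("\\p{Alpha}"));
        System.out.println("\u65e5".matches("\\p{L}"));
    }
}
```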
Problems with java regular expressions
Hi everybody,
Could someone please help me sort out an issue with Java regular expressions? I have been using regular expressions in Python for years and I cannot figure out how to do what I am trying to do in Java.
For example, I have this code in java:
import java.util.regex.*;

String text = "abc";
Pattern p = Pattern.compile("(a)b(c)");
Matcher m = p.matcher(text);
if (m.matches()) {
    int count = m.groupCount();
    System.out.println("Groups found " + String.valueOf(count));
    for (int i = 0; i < count; i++) {
        System.out.println("group " + String.valueOf(i) + " " + m.group(i));
    }
}
My expectation is that group 0 would capture "abc", group 1 "a" and group 2 "c". Yet I get this:
Groups found 2
group 0 abc
group 1 a
I have tried other patterns and input text but the issue remains the same: no matter what, I cannot capture the last parenthesized expression found in the pattern. I tried the same example with Jakarta Regexp 1.5 and that works without any problems; I get what I expect.
I am using Java 1.5.0 on Mac OS X 10.4.
Thanks to all who can help.
paulcw wrote:
If the group count is X, then there are X plus one groups to go through: 0 for the whole match, then 1 through X for the individual groups.
It does seem confusing that the designers chose to exclude the zero-group from group count, but the documentation is clear.
Matcher.groupCount():
Group zero denotes the entire pattern by convention. It is not included in this count.
-
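Following the groupCount() explanation above, a short sketch of the loop with an inclusive upper bound. The class name GroupCountDemo and the helper groups are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {
    // Returns group 0 (the whole match) followed by groups 1..groupCount().
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        if (m.matches()) {
            // Note "<=": groupCount() does not include group 0.
            for (int i = 0; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(groups("(a)b(c)", "abc"));
    }
}
```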
I have some sample code for string editing using regular expressions but I'm a little confused as to its behavior. Here's what I have:
public class WordFixer {
    protected final String PUNC_MATCH = "[\\d\\p{Punct}]+";
    protected final String PUNC_PREFIX = "^" + PUNC_MATCH;
    protected final String PUNC_SUFFIX = PUNC_MATCH + "$";

    public String fixPrefix(String w) {
        return w.replaceFirst(PUNC_PREFIX, "");
    }

    public String fixSuffix(String w) {
        return w.replaceFirst(PUNC_SUFFIX, "");
    }
}
This replaces all leading and trailing punctuation with "" and it works. However, changing the replaceFirst's to just replace doesn't work. I don't understand why that is. Doesn't the ^ mean in the front, and doesn't the $ mean in the back, so shouldn't this work by just changing replaceFirst to replace?
JFactor2004 wrote:
However, changing the replaceFirst's to just replace doesn't work. I don't understand why that is.
The first parameter of replaceFirst(...) is a regex String, and the ^, $ and [...] constructs all have a special meaning in the regex language. The replace(...) method takes plain Strings as parameters, so the String "^[\\d\\p{Punct}]+" is interpreted as just that, without any special meaning.
JFactor2004 wrote:
Doesn't the ^ mean in the front, and doesn't the $ mean in the back,
Yes, ^ means the start of the String (sometimes the start of a line) and $ means the end of the String (sometimes the end of a line).
JFactor2004 wrote:
so shouldn't this work by just changing replaceFirst to replace?
Nope, see the explanation above.
-
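The difference described in the WordFixer thread above can be seen side by side in a short sketch. The class name ReplaceDemo and the sample word are made up for illustration.

```java
public class ReplaceDemo {
    static final String PUNC_MATCH = "[\\d\\p{Punct}]+";

    public static void main(String[] args) {
        String w = "...hello!!!";
        // replace() treats its first argument as literal text, so nothing matches.
        System.out.println(w.replace("^" + PUNC_MATCH, ""));
        // replaceFirst() compiles its first argument as a regular expression.
        System.out.println(w.replaceFirst("^" + PUNC_MATCH, ""));
        System.out.println(w.replaceFirst(PUNC_MATCH + "$", ""));
    }
}
```

If every match should go rather than just the first, replaceAll is the regex-aware counterpart to use.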
Regular Expression Character Sets with Pattern and Matcher
Hi,
I am a little bit confused about a regular expression I am writing; it works in other languages but not in Java.
The regular expression is meant to match LaTeX commands from a file, and is as follows:
\\begin{command}([.|\n\r\s]*)\\end{command}
This does not work in Java but does in PHP, C, etc...
The part that is strange is the . character. If placed as .* it works but if placed as [.]* it doesn't. Does this mean that . cannot be placed in a character range in Java?
Any help very much appreciated.
Kind Regards
Paul Bain
In PHP it seems that the "." still works as an all-character operator inside character classes.
The regular expression posted did not work, but it does if I do:
\\begin{command}((.|[\n\r\s])*)?\\end{command}
Basically what I'm trying to match is a block of LaTeX, so the \\begin{command} and \\end{command} are LaTeX, not regex, although the \\ is a single backslash in LaTeX. I basically want to match any block which starts with the begin command and ends with the end command, so really the part of the regular expression that counts is the bit in the middle, ((.|[\n\r\s])*)?
Am I right in saying that the "?" will prevent the engine from matching from the first \\begin to the last \\end in the following example:
\\begin{command}
some stuff
\\end{command}
\\begin{command}
some stuff
\\end{command}
-
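On the LaTeX question above: in Java a "." inside a character class is a literal dot, and a "?" placed after a closing parenthesis makes the group optional rather than reluctant. One way to get one match per block is a reluctant .*? together with DOTALL, sketched below. The class name LatexBlockDemo is made up, and the braces are escaped so that they are not read as a quantifier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LatexBlockDemo {
    // Regex: \\begin\{command\}(.*?)\\end\{command\}
    // DOTALL lets '.' cross line breaks; *? stops at the nearest \end{command}.
    static final Pattern BLOCK = Pattern.compile(
            "\\\\begin\\{command\\}(.*?)\\\\end\\{command\\}", Pattern.DOTALL);

    static List<String> bodies(String latex) {
        List<String> out = new ArrayList<>();
        Matcher m = BLOCK.matcher(latex);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String latex = "\\begin{command}\nfirst\n\\end{command}\n"
                     + "\\begin{command}\nsecond\n\\end{command}";
        System.out.println(bodies(latex).size() + " blocks found");
    }
}
```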
Using Regular Expressions for Completion
I'm trying to build a text completer for a simple little editor. The general idea is that I have a regular expression which describes the syntax of an expression and a set of strings which are all semantically valid cases of the expression (the latter of which is not particularly important to my problem). I would like to be able to determine, using the expression described, whether or not a section of text is capable of beginning a syntactically valid expression, not matching it.
For example, given the expression
"#[A-Za-z0-9]+#" the string "#name#" is syntactically valid, whereas the string "#_blarg" is not. What I would like to do is be able to determine that "#partial" has the potential to match the pattern with more input, even if it doesn't yet. Specifically, the eventual use will be in such a case as the string X=#partial+3. If the cursor is positioned before the "+" and my user presses the completion keystroke, I want to recognize that "#partial" is the text to complete. Also, positioning the cursor immediately after the "=" and pressing the keystroke will do nothing, since nothing before the "=" is capable of matching the pattern properly.
Is this possible? I don't have to use this exact approach, but it is important that I be able to use the regular expression in detecting a partially completed expression. If I can, the set of regular expressions which already exist in the code can be used to drive the auto completer. Otherwise, I'll have to write a special recognition module for each case; that wouldn't be pretty.
Thanks for your time! I'll provide other information upon request, if it'd help. :)
Thank you both for discussing this; it has definitely helped me in reaching a better understanding of uncle_alice's answer to my problem. I've adjusted my code to use this approach and, for the most part, it seems to work.
I say "for the most part" because I am compiling Patterns with the case insensitivity flag. This appears to do horrible, horrible things. Take a look at the following code, modified from uncle_alice's example:
String[] str = {"#test#hello", "#tes", "blargblarg", "", "#test#", "S"};
String rgx = "#[A-Za-z0-9]+#";
Pattern pc = Pattern.compile(rgx);
Pattern pi = Pattern.compile(rgx, Pattern.CASE_INSENSITIVE);
for (String s : str) {
    System.out.println(" For string: " + s);
    for (Pattern p : new Pattern[]{pc, pi}) { // once for each pattern
        Matcher m = p.matcher(s);
        if (m.matches()) {
            System.out.printf("Matched '%s'", m.group());
        } else {
            System.out.print("No match");
        }
        // hitEnd() reports whether the last match attempt ran into the end of the input
        System.out.println("; hitEnd = " + m.hitEnd());
    }
}
That produces the following output:
For string: #test#hello
No match; hitEnd = false
No match; hitEnd = true
For string: #tes
No match; hitEnd = true
No match; hitEnd = true
For string: blargblarg
No match; hitEnd = false
No match; hitEnd = true
For string:
No match; hitEnd = true
No match; hitEnd = true
For string: #test#
Matched '#test#'; hitEnd = false
Matched '#test#'; hitEnd = false
For string: S
No match; hitEnd = false
No match; hitEnd = true
It would seem that, with the case-insensitive flag set, hitEnd always returns true unless a match is found. Why is this? I find it quite confusing.
I can adjust my design to accommodate this if the problem cannot be circumvented; however, I'd like to understand what is going wrong here. :)
Cheers! Thanks so much for all your help!
-
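The hitEnd() approach discussed in the completion thread above can be wrapped in a small helper. This is a sketch under assumptions: the names PartialMatchSketch, Status and classify are made up, and it uses the case-sensitive pattern only, since the thread reports surprising results with CASE_INSENSITIVE.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PartialMatchSketch {
    enum Status { COMPLETE, PARTIAL, IMPOSSIBLE }

    static Status classify(Pattern p, String text) {
        Matcher m = p.matcher(text);
        if (m.matches()) {
            return Status.COMPLETE;
        }
        // hitEnd() is true when the failed attempt ran out of input,
        // meaning that more characters could still lead to a match.
        return m.hitEnd() ? Status.PARTIAL : Status.IMPOSSIBLE;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("#[A-Za-z0-9]+#");
        for (String s : new String[]{"#test#", "#tes", "#_blarg"}) {
            System.out.println(s + " -> " + classify(p, s));
        }
    }
}
```

An editor could call classify on the text just before the cursor and offer completions only for the PARTIAL case.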
Regular Expression Rage.
Hello,
I'm currently being driven insane by the regular expressions in Java. I want to use the matches(String aString) method of String to find out whether the user has entered a wildcard filename such as *.txt or *.bmp.
The problem is I can't seem to put a * or . into the expression without it being treated as a special character. I want to match *. followed by one or more characters. It would seem logical that this should be "\*\..*"; however, the compiler complains that \* and \. are illegal escape characters. In the API, "\{" is given as an example of an escape character, but when I try it I get the same compiler error.
Also, I was trying to match a single backslash, which I thought should be "\\". This compiles okay, but when I run the program I get a runtime error from Pattern. More confusingly, if I use ".*\\.*" it will match any input.
Am I doing something fundamentally wrong? Some of my other matching patterns seem to work just fine.
Well, for the Java compiler you need to escape \, making it \\.
Additionally, to match a literal backslash the regexp engine itself requires \ to be escaped, so you escape it once for the regexp engine and once for the Java compiler, making it \\\\ in Java source. -
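To make the double escaping concrete, here is a minimal sketch. The class and method names are made up for illustration, and it assumes the goal stated in the question: match *. followed by one or more characters, and separately detect a single backslash.

```java
import java.util.regex.Pattern;

class WildcardEscapes {
    // Regex text seen by the engine: \*\..+
    // In Java source every backslash has to be doubled.
    static boolean isWildcardName(String input) {
        return input.matches("\\*\\..+");
    }

    // Regex text seen by the engine: \\ (one literal backslash).
    // In Java source that becomes four backslashes.
    static boolean containsBackslash(String input) {
        return Pattern.compile("\\\\").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isWildcardName("*.txt"));        // true
        System.out.println(isWildcardName("a.txt"));        // false
        System.out.println(containsBackslash("C:\\temp"));  // true
    }
}
```

The runtime error reported for "\\" fits this picture: that literal hands the regex engine a single backslash, which is an incomplete escape. Likewise ".*\\.*" reaches the engine as .*\.* (anything, followed by zero or more literal dots), which would explain why it matched every input.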
Regular Expressions with Call Policy on VCSe
Hi Guys,
I am working on firming up the call policy on my VCS Expressway to better intercept the SIP spam requests it is getting from internet IP addresses. Right now those spam requests are getting rejected by the loop detection, but I want to intercept them before they even do a search on the Expressway. It seems that the call policy rules I create without regular expressions are functioning fine, but I don't think I have the syntax correct for the regular expressions.
The goal of this rule is to reject any incoming SIP request that has a destination alias format of 7 to 17 digits followed by @VCSe_IP. So, for example, it would reject the following attempts with one rule: 0123456@VCSe_IP and 0123456789101112@VCSe_IP.
The policy I created is this: source pattern: unauthenticated user, Destination pattern: \d{7,17}@xx\.xx\.xx\.xx (where xx is the individual octets of the VCSe IP address), Action: reject
However, the above policy does not seem to be rejecting the calls before they do a search. I have checked the above expression with the check pattern tool on the VCSe, and it comes up with a successful match when I try the destination alias of a request that made it through, hence my confusion. Any help you guys could provide would be appreciated.
Thanks,
Steven
Steven,
Default Zone access rules do not relate to this at all and you can keep those set to 'No'.
How exactly are you placing the test calls when attempting to verify this?
I created the following CPL rule on my X7 VCS (With 10.10.10.10 being the IP address of my VCS):
Source pattern:
Destination pattern: \d{7,17}@10\.10\.10\.10
Action: Reject
I then proceeded with placing a SIP call from an unregistered C20, calling the URI '[email protected]' while running a diagnostics log on my VCS with Network log level set to 'DEBUG', and captured the following in that log:
Incoming INVITE:
2013-02-22T16:03:36+01:00 vcs02 tvcs: UTCTime="2013-02-22 15:03:36,598" Module="network.sip" Level="INFO": Src-ip="10.x.x.x" Src-port="5060" Detail="Receive Request Method=INVITE, Request-URI=sip:[email protected], Call-ID=9dd19ad75b1063ecf716461b149e9e2a"
2013-02-22T16:03:36+01:00 vcs02 tvcs: UTCTime="2013-02-22 15:03:36,598" Module="network.sip" Level="DEBUG": Src-ip="10.x.x.x" Src-port="5060"
SIPMSG:
|INVITE sip:[email protected] SIP/2.0
Call processing logic, showing CPL matching:
2013-02-22T16:03:36+01:00 vcs02 tvcs: Event="Search Attempted" Service="SIP" Src-alias-type="SIP" Src-alias="10.x.x.x" Dst-alias-type="SIP" Dst-alias="sip:[email protected]" Call-serial-number="0886391c-7d01-11e2-adf5-0050569a08fd" Tag="08863aac-7d01-11e2-bd2e-0050569a08fd" Detail="searchtype:INVITE" Level="1" UTCTime="2013-02-22 15:03:36,601"
2013-02-22T16:03:36+01:00 vcs02 tvcs: Event="Call Attempted" Service="SIP" Src-ip="10.x.x.x" Src-port="5060" Src-alias-type="SIP" Src-alias="sip:10.x.x.x" Dst-alias-type="SIP" Dst-alias="sip:[email protected]" Call-serial-number="0886391c-7d01-11e2-adf5-0050569a08fd" Tag="08863aac-7d01-11e2-bd2e-0050569a08fd" Protocol="UDP" Auth="NO" Level="1" UTCTime="2013-02-22 15:03:36,601"
2013-02-22T16:03:36+01:00 vcs02 tvcs: UTCTime="2013-02-22 15:03:36,602" Module="network.cpl" Level="DEBUG": Remote-ip="10.x.x.x" Remote-port="5060" Detail="CPL: "
2013-02-22T16:03:36+01:00 vcs02 tvcs: UTCTime="2013-02-22 15:03:36,602" Module="network.cpl" Level="DEBUG": Remote-ip="10.x.x.x" Remote-port="5060" Detail="CPL: "
2013-02-22T16:03:36+01:00 vcs02 tvcs: UTCTime="2013-02-22 15:03:36,602" Module="network.cpl" Level="DEBUG": Remote-ip="10.x.x.x" Remote-port="5060" Detail="CPL: matched "
2013-02-22T16:03:36+01:00 vcs02 tvcs: UTCTime="2013-02-22 15:03:36,602" Module="network.cpl" Level="DEBUG": Remote-ip="10.x.x.x" Remote-port="5060" Detail="CPL: "
VCS responds to INVITE with 403 Forbidden:
2013-02-22T16:03:36+01:00 vcs02 tvcs: UTCTime="2013-02-22 15:03:36,616" Module="network.sip" Level="DEBUG": Dst-ip="10.x.x.x" Dst-port="5060"
SIPMSG:
|SIP/2.0 403 Forbidden
As you can see, on my VCS everything seems to work as expected. I'd recommend you capture a similar diagnostics log on your own VCS to check what is different in your test call compared to the output above. -
Match beginning of line with Regular Expression
I'm confused about Dreamweaver's treatment of the characters
^ and $ (beginning of line, end of line) in regex searches. It
seems that these characters match the beginning of the file, not
the beginning of the various lines in the file. I would expect it
to work the other way around. A search like:
(^.)
should match every line in the file, so that a find/replace
could be performed at the beginning of each line, like this:
HELLO$1
which would add 'HELLO' at the start of each line in the
file.
Instead, this action only matches the first character of the
file, sticks 'HELLO' in front of it, and then quits (or moves on to
the next file). The endline character $ behaves in a similar
fashion, matching only the end of the file, not the end of each
line.
I've searched, and all the literature about regular
expressions in Dreamweaver seems to indicate that I'm expecting the
correct behavior:
www.adobe.com/devnet/dreamweaver/articles/regular_expressions_03.html
quote:
^ Beginning of input or line. ^T matches "T" in "This good earth" but not in "Uncle Tom's Cabin"
$ End of input or line. h$ matches "h" in "teach" but not in "teacher"
Thanks for any insight, folks.
Hi Winston,
I am still digesting the material from the regular expression book, and it will take some time to become proficient with it.
It seems that using groupCount() to eliminate the unwanted text does not work in this case, since all the lines returned the same value, i.e. the 3 posted earlier. This may be because the patterns are complex and only a few parts were grouped together. Otherwise, could you provide an example using the string posted, as opposed to a hypothetical one? In the meantime, at least one solution has been found: defining an additional special pattern “\\A[^%].*\\Z” and then combining / intersecting the existing and the new special pattern to get the best of both worlds. Another approach that should also work is to evaluate the size of the String.split() result and only accept those lines with a minimum number of tokens.
Anyhow, I have come across another minor stumbling block in the meantime with the following line, where some hidden characters are preventing the existing pattern from reading it:
o;?Mervan Bay 40 Boyde St 7 br t $250,000 X West Park AE
Below is the existing regular expression, which works for other lines with the same pattern but not for lines with special hidden characters such as “o;?”:
\\A([A-Z][a-z]*){1,2} [0-9]{0,4}/?[0-9]{0,4}-?[0-9]{0,4} ([A-Z][a-z]*){1,2} St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La [0-9] br [h|u|t] \\$\\d+,\\d+|\\$\\d*\\,\\d+,\\d+ ([A-Z][a-z]*){1,}\\Z
Is it possible to come up with a regular expression that ignores them so that this line could be picked up? I would also like to know whether I could combine the special pattern “\\A[^%].*\\Z” with the existing one, as opposed to using two separate patterns.
Many thanks,
Jack -
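One possible way to approach the hidden-prefix problem above is sketched below. It rests on two assumptions that the post does not confirm: that the hidden characters are a byte-order mark (U+FEFF) left at the start of the line, and that a simplified stand-in pattern is acceptable for illustration (it is not the poster's exact rule set). The sketch also groups the street-type alternation, because an ungrouped St|Rd|... splits the whole expression into top-level alternatives, and it uses [hut] because [h|u|t] also accepts a literal | character.

```java
import java.util.regex.Pattern;

class ListingLineDemo {
    // Simplified stand-in for the pattern in the post (hypothetical, for illustration only).
    static final Pattern LISTING = Pattern.compile(
        "[A-Z][a-z]*(?: [A-Z][a-z]*)?"                       // suburb: one or two capitalised words
        + " [0-9]{1,4}"                                      // street number
        + " [A-Z][a-z]*(?: [A-Z][a-z]*)?"                    // street name
        + " (?:St|Rd|Av|Sq|Cl|Pl|Cr|Gr|Dr|Hwy|Pde|Wy|La)"    // grouped alternation
        + " [0-9] br [hut]"                                  // bedrooms and a one-letter type code
        + " \\$\\d{1,3}(?:,\\d{3})+"                         // price such as $250,000
        + " .*");                                            // remainder of the line

    // Assumption: the hidden prefix is a single byte-order mark character.
    static String stripBom(String line) {
        return line.startsWith("\uFEFF") ? line.substring(1) : line;
    }

    // Strips the assumed prefix, then requires the whole line to match.
    static boolean isListing(String line) {
        return LISTING.matcher(stripBom(line)).matches();
    }

    public static void main(String[] args) {
        String line = "Mervan Bay 40 Boyde St 7 br t $250,000 X West Park AE";
        System.out.println(isListing(line));                            // true
        System.out.println(isListing("\uFEFF" + line));                 // true
        System.out.println(LISTING.matcher("\uFEFF" + line).matches()); // false
    }
}
```

If the prefix turns out to be something other than a byte-order mark, the same idea applies: normalise the line first, then match it with one pattern, rather than trying to make the pattern tolerate arbitrary leading characters.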
AutoVue 20.2.1 - ABV - Regular Expressions
All,
We have implemented the ABV/HotSpots feature and there is a regular expression we have defined that should work but isn't.
Regular Expression:
^[0-9]{5}$
Text:
00345
Testing it at http://www.regular-expressions.info/javascriptexample.html, it seems to work fine: a00345, 00345a, 0a0345 and test 00345 ALL fail, which is what I want.
But when I set up the hotspot definition with the above, it doesn't match the cases in the PDF document.
If I remove the ^ and $ then it does match, but it also matches cases I do not want (example: DVA-AB-12345).
I want 5 digits and that is it, nothing before or after.
I am confused as to why the regular expression tester is saying it is fine and then AutoVue HotSpots fails it.
I have thought about case sensitivity, but whether it is true or false we are using numbers, so it shouldn't affect things.
Can anyone help?
Nick
Hi Nick,
Currently, there is no way for you to specify which characters to highlight within the regular expression.
Theoretically, it could be done through regex groups: (^|\s)(\d{4})(\s|$), highlight group # 2.
We would need to enhance our API to accept the group index to highlight.
Please log an enhancement request for this feature. I'm sure it will be very useful!
Thanks,
George -
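To illustrate the group idea from the reply above, here is a minimal sketch in plain Java. It uses \d{5} to match the five digits asked for in the question, and it shows only how java.util.regex exposes group 2; how AutoVue applies a hotspot pattern internally is not covered by the thread, so nothing here should be read as AutoVue behaviour. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupHighlightDemo {
    // Five digits delimited by whitespace or the ends of the input; group 2 is the number itself.
    static final Pattern FIVE_DIGITS = Pattern.compile("(^|\\s)(\\d{5})(\\s|$)");

    // Returns the first standalone five-digit number, or null when there is none.
    static String firstFiveDigitNumber(String text) {
        Matcher m = FIVE_DIGITS.matcher(text);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstFiveDigitNumber("00345"));          // 00345
        System.out.println(firstFiveDigitNumber("part 00345 rev")); // 00345
        System.out.println(firstFiveDigitNumber("DVA-AB-12345"));   // null
    }
}
```

Note that, unlike ^[0-9]{5}$ applied to a whole string, this form accepts a number that is surrounded by other whitespace-separated text, which is the trade-off of searching inside a larger document.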
Format string using Regular Expression
Input string output format...
SELECT q'<select ab_c "ABC", efg "EFG" from dual>' str FROM DUAL
Output:
STR
select ab_c "ABC", efg "EFG" from dual
Required output format using regular expression...
STR
select 'ab_c' "ABC", 'efg' "EFG" from dual
Regular expressions have many limitations as parsing tools, and you didn't specify the rules you wanted. This expression puts quotes around the non-blank string before a quoted string:
SELECT regexp_replace(q'<select ab_c "ABC", efg "EFG" from dual>',
'([^" ]+)( +"[^ ]*")' , '''\1''\2' ) str FROM DUAL;
STR
select 'ab_c' "ABC", 'efg' "EFG" from dual
It is not robust - a missing " will confuse it, and you should be using bind variables anyway. -
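For anyone who wants to try the same substitution outside the database, the pattern from the reply above can be exercised with Java's String.replaceAll. This is only a sketch for experimenting with the regex; the class and method names are made up, and Java's regex engine is not guaranteed to agree with Oracle's on every input.

```java
class QuoteColumnsDemo {
    // Wraps each token that directly precedes a double-quoted alias in single quotes.
    // Group 1 is the token, group 2 is the spacing plus the quoted alias.
    static String quoteColumns(String sql) {
        return sql.replaceAll("([^\" ]+)( +\"[^ ]*\")", "'$1'$2");
    }

    public static void main(String[] args) {
        // prints: select 'ab_c' "ABC", 'efg' "EFG" from dual
        System.out.println(quoteColumns("select ab_c \"ABC\", efg \"EFG\" from dual"));
    }
}
```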
Regular expression to match incremental or consecutive digits
I need to process a string containing all digits to ensure that it does not contain either
(a) a group of 5 or more consecutive identical digits eg. 11111
(b) a group of 5 or more incremental/decremental digits eg 12345, 98765
Also, is there a processing difference between the reluctant and greedy qualifiers?
Case (a) seems to be easy:
Pattern pattern = Pattern.compile("[0-9]{5,}?");
but what about (b)? From the documentation it seems that using capturing groups might be helpful, but they are confusing me.
Finally, how do I merge multiple pattern matching strings into one overall regular expression so I can make one pass on the input to see whether it is valid or not?
Thanks
Chris
Give this code a try:
public class Test {
    // Returns true if str contains a run of 5 identical, ascending or descending digits.
    static boolean check(String str) {
        loop : for (int x = 0, y; x < str.length() - 4; x += y) {
            y = 1;
            int dif = (str.charAt(x+y) - str.charAt(x)); // assuming you don't want 90123
            if (dif >= -1 && dif <= 1) {
                for (; y < 4; y++) {
                    if ((str.charAt(x+y+1) - str.charAt(x+y)) != dif) {
                        continue loop; // run broken; restart the scan at position x+y
                    }
                }
                return true;
            }
        }
        return false;
    }
    public static void main(String[] args) {
        System.out.println(check(args[0]));
    }
}
-
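Since the consecutive-digits question asked specifically about capturing groups and about merging several patterns into one, here is a regex-based sketch as an alternative to the loop in the reply. It is one possible approach, not the method from the thread: case (a) uses a backreference to the captured digit, case (b) is spelled out as an alternation of every five-digit ascending or descending window, and like the reply it assumes wrap-around runs such as 90123 are allowed. The names are hypothetical.

```java
import java.util.regex.Pattern;

class DigitRunCheck {
    // (a) one captured digit followed by four more copies of itself
    static final String REPEATED = "(\\d)\\1{4}";

    // (a) and (b) merged into a single pattern with alternation
    static final Pattern FORBIDDEN = Pattern.compile(REPEATED + "|" + sequentialRuns());

    // (b) builds 01234|43210|12345|54321|...|56789|98765
    static String sequentialRuns() {
        String digits = "0123456789";
        StringBuilder alternation = new StringBuilder();
        for (int start = 0; start + 5 <= digits.length(); start++) {
            String up = digits.substring(start, start + 5);
            String down = new StringBuilder(up).reverse().toString();
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(up).append('|').append(down);
        }
        return alternation.toString();
    }

    // One pass over the input: true if any forbidden run occurs anywhere in it.
    static boolean containsForbiddenRun(String s) {
        return FORBIDDEN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsForbiddenRun("0011111")); // true
        System.out.println(containsForbiddenRun("9912345")); // true
        System.out.println(containsForbiddenRun("1212121")); // false
    }
}
```

On the greedy versus reluctant question: for a yes/no test with find() the two quantifier styles give the same answer here; they differ only in how much text a successful match consumes. Also note that [0-9]{5,} on its own matches any five digits, not five identical ones, which is why the backreference \1 is needed for case (a).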
Regular expression to convert pascal case to underscores
I am looking for a way to use the SQL regular expression support to convert some pascal text into underscore separated words.
For example:
IsComplianceActionPossible -> Is_Compliance_Action_Possible.
What I am confused on is how to get the capital letter back in the output.
select regexp_replace('IsComplianceActionPossible','[A-Z]','_') from dual
REGEXP_REPLACE('ISCOMPLIAN
_s_ompliance_ction_ossible
I know I am missing something simple. Any help is appreciated.
Regards, Tony
ebrian wrote:
Another option:
SQL> select regexp_replace('IsComplianceActionPossible','(.)([A-Z])','\1_\2') from dual;
REGEXP_REPLACE('ISCOMPLIANCEA
Is_Compliance_Action_Possible
Most likely it would fit OP's needs; however, it will not work on one-letter words in the middle:
SQL> select regexp_replace('HereIAm','(.)([A-Z])','\1_\2') from dual
2 /
REGEXP_R
Here_IAm
SQL> select ltrim(regexp_replace('HereIAm','([A-Z])','_\1'),'_') from dual;
LTRIM(REG
Here_I_Am
SQL> SY.
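The same conversion can be sketched in Java with lookarounds, which sidesteps both the lost capital letter and the one-letter-word case shown above. This is an illustrative alternative rather than a translation of the SQL; whether the database's regex flavour supports lookarounds is not something the thread establishes. The names are hypothetical.

```java
class PascalToUnderscore {
    // Inserts "_" at every position that has a letter before it and an uppercase letter after it.
    // Both lookarounds are zero-width, so no character of the input is consumed or lost.
    static String toUnderscores(String pascal) {
        return pascal.replaceAll("(?<=[A-Za-z])(?=[A-Z])", "_");
    }

    public static void main(String[] args) {
        System.out.println(toUnderscores("IsComplianceActionPossible")); // Is_Compliance_Action_Possible
        System.out.println(toUnderscores("HereIAm"));                    // Here_I_Am
    }
}
```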