Regex \\d* vs \\d\\d

I have this issue with regex.
String s = "10 apples 20 oranges 30 pears";
String regx = "(\\d*\\s[a-z]*)*";
When I use the above pattern in any regular expression code for the given string 's', the whole string "10 apples 20 oranges 30 pears" is matched.
If I change regx to "(\\d\\d\\s[a-z]*)*" then it matches "10 apples" only.
Why does it happen like this? I know * means zero or more, but isn't \\d\\d the same as \\d* in this code?
Thanks

Right. You can see it happening if you remove the outer '*' and step through the matches with find():
import java.util.regex.*;
public class Test {
  public static void main(String[] args) {
    String s = "10 apples 20 oranges 30 pears";
    Pattern p = Pattern.compile("(\\d*\\s[a-z]*)");
    Matcher m = p.matcher(s);
    while (m.find()) {
      System.out.printf(">%s<%n", m.group(1));
    }
  }
}
Because everything else is controlled by asterisks, the regex is only required to match one whitespace character, and on the second and fourth passes, that's what it does:
>10 apples<
> <
>20 oranges<
> <
>30 pears<
In the second version of the regex, \\d\\d cannot match the space between items, so you need to allow for the optional trailing space:
"(\\d\\d\\s[a-z]+\\s?)+"
In general, when you need to match at least one of something, use '+' instead of '*'. That will help prevent confusing situations like this.
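To make the difference concrete, here is a minimal sketch that runs the three patterns from this thread against the same input. The class and method names (DigitQuantifierDemo, findAll, firstMatch) are made up for illustration; only the patterns and the input come from the posts above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DigitQuantifierDemo {
    /** Collects every successive find() hit of the regex in the input. */
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    /** Returns the first find() hit, or null when there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "10 apples 20 oranges 30 pears";
        // \d* may match nothing, so a lone space is a valid hit between items.
        System.out.println(findAll("\\d*\\s[a-z]*", s));
        // With the outer '*', those lone spaces let the repetition cover the whole string.
        System.out.println(firstMatch("(\\d*\\s[a-z]*)*", s));
        // \d\d cannot start at the space after "apples", so the repetition stops there.
        System.out.println(firstMatch("(\\d\\d\\s[a-z]*)*", s));
        // Allowing an optional trailing space lets the group repeat again.
        System.out.println(firstMatch("(\\d\\d\\s[a-z]+\\s?)+", s));
    }
}
```

The second and third calls differ only in \\d* versus \\d\\d, which is the whole question: \\d* is "zero or more digits", while \\d\\d is "exactly two".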

Similar Messages

Problems with String.split(regex)

Hi! I'm reading from a text file. I do it like this: I read lines in a loop, then I split each line into words, and in a for loop I process all the words. Here's the code:
BufferedReader reader = new BufferedReader(new FileReader(file));
String line;
while ((line = reader.readLine()) != null) {
    String[] tokens = line.split(delimiters);
    for (String key : tokens) {
        doSthToToken();
    }
}
reader.close();
The problem is that if it reads an empty line, such as one where there is only a "\n", the tokens array has length == 1, and it processes a string == "". I think the problem is that my regex delimiters is wrong:
String delimiters = "\\p{Space}|\\p{Punct}";
Could anybody tell me what to do?

Ok, so what do you suggest?
I suggest you don't worry about it.
Or if you are worried then you need to test the two different solutions and do some timings yourself.
And how do you know the regex lib is so slow and badly written?
First of all, slowness is all relative. If something takes 1 millisecond vs 4 milliseconds, is the user going to notice? Of course not, which is why you are wasting your time trying to optimize an equals() method.
A general rule is that any code that is written to be extremely flexible will also be slower than any code that is written for a specific function. Regex are used for complex pattern matching. StringTokenizer was written specifically to split a string based on given delimiters. I must admit I haven't tested both in your exact scenario, but I have tested it in other simple scenarios which is where I got my number.
By the way I was able to write my own "SimpleTokenizer" which was about 30% faster than the StringTokenizer because I was able to make some assumptions. For example I only allowed a single delimiter to be specified vs multiple delimiter handled by the StringTokenizer. Therefore my code could be very specific and efficient. Now think about the code for a complex Regex and how general it must be.
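On the original question of the empty token: in Java, splitting an empty string returns a one-element array containing "", whatever the delimiter regex is, and a line that starts with a delimiter yields a leading "". A minimal sketch of one way to deal with that follows. It assumes it is acceptable to treat runs of delimiters as one separator and to drop empty tokens; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenSplitDemo {
    // One or more whitespace or punctuation characters act as a single separator.
    static final String DELIMITERS = "[\\p{Space}\\p{Punct}]+";

    /** Splits a line into words and drops the empty strings split() can produce. */
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        for (String token : line.split(DELIMITERS)) {
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("".split(DELIMITERS).length); // 1: the array holds one empty string
        System.out.println(tokens(""));                  // []
        System.out.println(tokens("  Hello, world!"));   // [Hello, world]
    }
}
```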

Can we eliminate " from a String using regEx?

Hi,
I have a process and we are calling an external web service.
Within code, I am setting the response from web service(string) to bpm object and using the same object in jsp for presentation.
For Ex.
bpmObject.empName = response from web service.
within the jsp, have a java script where in I am validating the value.
var empNm= "<f:fieldValue att="bpmObject.empName" onlyValue="true"/>";
It works fine but some times when response from web service is having '"' - double quote, it is giving error.
For Ex. bpmObject.empName = FirstName "SecondName" then the javascript error occurs. how can we remove the double quote from this?
Tried bpmObject.empName.replace(from : "\"", @to : " ") but it is still giving the same output FirstName "SecondName"
It works fine when I test by using a String str = "FirstName\"SecondName\"".replace(from : "\"", @to : " ") is giving correct result i.e. FirstName SecondName
Can RegEx help in this?
Thanks

I know it's better to handle this in the JSP/JavaScript:
var empNm= "<f:fieldValue att="bpmObject.empName" onlyValue="true"/>";
but the above line always gives a JavaScript error.
How can I handle this in the BPM code?
Edited by: Sreekant on Mar 3, 2011 1:44 AM
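No regex is needed for this. As a rough sketch in plain Java (not the BPM scripting language used above, whose replace semantics are not confirmed here), the value can either have its quotes removed or be escaped before it is written into the JavaScript string literal. The class and method names are hypothetical, and the escaping covers only the common cases, not every character that can break out of a script block. Note that in Java the result of replace must be assigned to something, because strings are immutable; whether the same applies to the BPM object is an assumption worth checking.

```java
final class JsStringEscaper {
    /** Removes double quotes entirely. */
    static String stripQuotes(String value) {
        return value.replace("\"", "");
    }

    /** Escapes the characters that most often break a quoted JavaScript literal. */
    static String escapeForJsLiteral(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\'': sb.append("\\'");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "FirstName \"SecondName\"";
        System.out.println(stripQuotes(name));        // FirstName SecondName
        System.out.println(escapeForJsLiteral(name)); // FirstName \"SecondName\"
    }
}
```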

Regex find and replace

I have inherited a boatload of code that I need to "tweak".
Currently, it contains many hundreds of references to a 2d array and references constants that I want to change to function calls.
i.e.
v(Svc,FutWTMargin)
I want that to be changed into:
v(Svc,getcol("FutWTMargin"))
Now the bit in quotes "FutWTMargin" has many variations, but the structure of the original 2d array references is consistent throughout - it's just that there are several hundred of them that I need to change.
Can someone help out with a regex that can change the FutWTMargin part to getcol("FutWTMargin") regardless of what the FutWTMargin text might actually say?
Note to self... must learn regex at some point!
Cheers,
Rob
http://robgt.com/ [Tutorials and Extensions]
Firebox stuff:
http://robgt.com/firebox
Skype stuff:
http://robgt.com/skype
Dell stuff:
http://robgt.com/dell
SatNav stuff:
http://robgt.com/satnav

Thanks Mick!
Cheers,
Rob
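For readers who land here without Mick's reply, one possible shape of such a replacement is sketched below in Java. It assumes both arguments are plain identifiers (letters, digits, underscore) and that the array is always called v; the class and method names are made up. The same pattern should carry over to an editor's regex find/replace, though the backreference syntax there may differ.

```java
import java.util.regex.Pattern;

final class GetcolRewrite {
    // v(<first>,<second>) where both arguments are plain identifiers.
    private static final Pattern CALL =
            Pattern.compile("\\bv\\((\\w+),\\s*(\\w+)\\)");

    /** Wraps the second argument of every v(a,b) reference in getcol("..."). */
    static String rewrite(String source) {
        return CALL.matcher(source).replaceAll("v($1,getcol(\"$2\"))");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("x = v(Svc,FutWTMargin)"));
        // References that are already converted no longer match, so a second pass is harmless.
        System.out.println(rewrite("x = v(Svc,getcol(\"FutWTMargin\"))"));
    }
}
```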

Find Replace from Textfile with regex

Hello.
I'm wondering if anyone knows about an existing script that does a find/replace by list like the script "FindChangeByList.jsx" that comes with every InDesign installation.
This consists of two parts: the script itself with the functionality, and a simple text file containing one-liners that each describe a find/replace operation, optionally with regex.
the Textfile:
//FindChangeList.txt
//A support file for the InDesign CS4 JavaScript FindChangeByList.jsx
//This data file is tab-delimited, with carriage returns separating records.
//The format of each record in the file is:
//findType<tab>findProperties<tab>changeProperties<tab>findChangeOptions<tab>description
//Where:
//<tab> is a tab character
//findType is "text", "grep", or "glyph" (this sets the type of find/change operation to use).
//findProperties is a properties record (as text) of the find preferences.
//changeProperties is a properties record (as text) of the change preferences.
//findChangeOptions is a properties record (as text) of the find/change options.
//description is a description of the find/change operation
//Very simple example:
//text          {findWhat:"--"}          {changeTo:"^_"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all double dashes and replace with an em dash.
//More complex example:
//text          {findWhat:"^9^9.^9^9"}          {appliedCharacterStyle:"price"}          {include footnotes:true, include master pages:true, include hidden layers:true, whole word:false}          Find $10.00 to $99.99 and apply the character style "price".
//All InDesign search metacharacters are allowed in the "findWhat" and "changeTo" properties for findTextPreferences and changeTextPreferences.
//If you enter backslashes in the findWhat property of the findGrepPreferences object, they must be "escaped"
//as shown in the example below:
//{findWhat:"\\s+"}
grep          {findWhat:"  +"}          {changeTo:" "}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all double spaces and replace with single spaces.
grep          {findWhat:"\r "}          {changeTo:"\r"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all returns followed by a space And replace with single returns.
grep          {findWhat:" \r"}          {changeTo:"\r"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all returns followed by a space and replace with single returns.
grep          {findWhat:"\t\t+"}          {changeTo:"\t"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all double tab characters and replace with single tab characters.
grep          {findWhat:"\r\t"}          {changeTo:"\r"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all returns followed by a tab character and replace with single returns.
grep          {findWhat:"\t\r"}          {changeTo:"\r"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all returns followed by a tab character and replace with single returns.
grep          {findWhat:"\r\r+"}          {changeTo:"\r"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all double returns and replace with single returns.
text          {findWhat:" - "}          {changeTo:"^="}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all space-dash-space and replace with an en dash.
text          {findWhat:"--"}          {changeTo:"^_"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all dash-dash and replace with an em dash.
The script:
//FindChangeByList.jsx
//An InDesign CS5.5 JavaScript
//@@@BUILDINFO@@@ "FindChangeByList.jsx" 3.0.0 15 December 2009
//Loads a series of tab-delimited strings from a text file, then performs a series
//of find/change operations based on the strings read from the file.
//The data file is tab-delimited, with carriage returns separating records.
//The format of each record in the file is:
//findType<tab>findProperties<tab>changeProperties<tab>findChangeOptions<tab>description
//Where:
//<tab> is a tab character
//findType is "text", "grep", or "glyph" (this sets the type of find/change operation to use).
//findProperties is a properties record (as text) of the find preferences.
//changeProperties is a properties record (as text) of the change preferences.
//findChangeOptions is a properties record (as text) of the find/change options.
//description is a description of the find/change operation
//Very simple example:
//text          {findWhat:"--"}          {changeTo:"^_"}          {includeFootnotes:true, includeMasterPages:true, includeHiddenLayers:true, wholeWord:false}          Find all double dashes and replace with an em dash.
//More complex example:
//text          {findWhat:"^9^9.^9^9"}          {appliedCharacterStyle:"price"}          {include footnotes:true, include master pages:true, include hidden layers:true, whole word:false}          Find $10.00 to $99.99 and apply the character style "price".
//All InDesign search metacharacters are allowed in the "findWhat" and "changeTo" properties for findTextPreferences and changeTextPreferences.
//If you enter backslashes in the findWhat property of the findGrepPreferences object, they must be "escaped"
//as shown in the example below:
//{findWhat:"\\s+"}
//For more on InDesign scripting, go to http://www.adobe.com/products/indesign/scripting/index.html
//or visit the InDesign Scripting User to User forum at http://www.adobeforums.com
main();
function main(){
    var myObject;
    //Make certain that user interaction (display of dialogs, etc.) is turned on.
    app.scriptPreferences.userInteractionLevel = UserInteractionLevels.interactWithAll;
    if(app.documents.length > 0){
        if(app.selection.length > 0){
            switch(app.selection[0].constructor.name){
                case "InsertionPoint":
                case "Character":
                case "Word":
                case "TextStyleRange":
                case "Line":
                case "Paragraph":
                case "TextColumn":
                case "Text":
                case "Cell":
                case "Column":
                case "Row":
                case "Table":
                    myDisplayDialog();
                    break;
                default:
                    //Something was selected, but it wasn't a text object, so search the document.
                    myFindChangeByList(app.documents.item(0));
            }
        }
        else{
            //Nothing was selected, so simply search the document.
            myFindChangeByList(app.documents.item(0));
        }
    }
    else{
        alert("No documents are open. Please open a document and try again.");
    }
}
function myDisplayDialog(){
    var myObject;
    var myDialog = app.dialogs.add({name:"FindChangeByList"});
    with(myDialog.dialogColumns.add()){
        with(dialogRows.add()){
            with(dialogColumns.add()){
                staticTexts.add({staticLabel:"Search Range:"});
            }
            var myRangeButtons = radiobuttonGroups.add();
            with(myRangeButtons){
                radiobuttonControls.add({staticLabel:"Document", checkedState:true});
                radiobuttonControls.add({staticLabel:"Selected Story"});
                if(app.selection[0].contents != ""){
                    radiobuttonControls.add({staticLabel:"Selection", checkedState:true});
                }
            }
        }
    }
    var myResult = myDialog.show();
    if(myResult == true){
        switch(myRangeButtons.selectedButton){
            case 0:
                myObject = app.documents.item(0);
                break;
            case 1:
                myObject = app.selection[0].parentStory;
                break;
            case 2:
                myObject = app.selection[0];
                break;
        }
        myDialog.destroy();
        myFindChangeByList(myObject);
    }
    else{
        myDialog.destroy();
    }
}
function myFindChangeByList(myObject){
    var myScriptFileName, myFindChangeFile, myFindChangeFileName, myScriptFile, myResult;
    var myFindChangeArray, myFindPreferences, myChangePreferences, myFindLimit, myStory;
    var myStartCharacter, myEndCharacter;
    var myFindChangeFile = myFindFile("/FindChangeSupport/FindChangeList.txt");
    if(myFindChangeFile != null){
        myFindChangeFile = File(myFindChangeFile);
        var myResult = myFindChangeFile.open("r", undefined, undefined);
        if(myResult == true){
            //Loop through the find/change operations.
            do{
                myLine = myFindChangeFile.readln();
                //Ignore comment lines and blank lines.
                if((myLine.substring(0,4)=="text")||(myLine.substring(0,4)=="grep")|| (myLine.substring(0,5)=="glyph")){
                    myFindChangeArray = myLine.split("\t");
                    //The first field in the line is the findType string.
                    myFindType = myFindChangeArray[0];
                    //The second field in the line is the FindPreferences string.
                    myFindPreferences = myFindChangeArray[1];
                    //The third field in the line is the ChangePreferences string.
                    myChangePreferences = myFindChangeArray[2];
                    //The fourth field is the range--used only by text find/change.
                    myFindChangeOptions = myFindChangeArray[3];
                    switch(myFindType){
                        case "text":
                            myFindText(myObject, myFindPreferences, myChangePreferences, myFindChangeOptions);
                            break;
                        case "grep":
                            myFindGrep(myObject, myFindPreferences, myChangePreferences, myFindChangeOptions);
                            break;
                        case "glyph":
                            myFindGlyph(myObject, myFindPreferences, myChangePreferences, myFindChangeOptions);
                            break;
                    }
                }
            } while(myFindChangeFile.eof == false);
            myFindChangeFile.close();
        }
    }
}
function myFindText(myObject, myFindPreferences, myChangePreferences, myFindChangeOptions){
    //Reset the find/change preferences before each search.
    app.changeTextPreferences = NothingEnum.nothing;
    app.findTextPreferences = NothingEnum.nothing;
    var myString = "app.findTextPreferences.properties = "+ myFindPreferences + ";";
    myString += "app.changeTextPreferences.properties = " + myChangePreferences + ";";
    myString += "app.findChangeTextOptions.properties = " + myFindChangeOptions + ";";
    app.doScript(myString, ScriptLanguage.javascript);
    myFoundItems = myObject.changeText();
    //Reset the find/change preferences after each search.
    app.changeTextPreferences = NothingEnum.nothing;
    app.findTextPreferences = NothingEnum.nothing;
}
function myFindGrep(myObject, myFindPreferences, myChangePreferences, myFindChangeOptions){
    //Reset the find/change grep preferences before each search.
    app.changeGrepPreferences = NothingEnum.nothing;
    app.findGrepPreferences = NothingEnum.nothing;
    var myString = "app.findGrepPreferences.properties = "+ myFindPreferences + ";";
    myString += "app.changeGrepPreferences.properties = " + myChangePreferences + ";";
    myString += "app.findChangeGrepOptions.properties = " + myFindChangeOptions + ";";
    app.doScript(myString, ScriptLanguage.javascript);
    var myFoundItems = myObject.changeGrep();
    //Reset the find/change grep preferences after each search.
    app.changeGrepPreferences = NothingEnum.nothing;
    app.findGrepPreferences = NothingEnum.nothing;
}
function myFindGlyph(myObject, myFindPreferences, myChangePreferences, myFindChangeOptions){
    //Reset the find/change glyph preferences before each search.
    app.changeGlyphPreferences = NothingEnum.nothing;
    app.findGlyphPreferences = NothingEnum.nothing;
    var myString = "app.findGlyphPreferences.properties = "+ myFindPreferences + ";";
    myString += "app.changeGlyphPreferences.properties = " + myChangePreferences + ";";
    myString += "app.findChangeGlyphOptions.properties = " + myFindChangeOptions + ";";
    app.doScript(myString, ScriptLanguage.javascript);
    var myFoundItems = myObject.changeGlyph();
    //Reset the find/change glyph preferences after each search.
    app.changeGlyphPreferences = NothingEnum.nothing;
    app.findGlyphPreferences = NothingEnum.nothing;
}
function myFindFile(myFilePath){
    var myScriptFile = myGetScriptPath();
    var myScriptFile = File(myScriptFile);
    var myScriptFolder = myScriptFile.path;
    myFilePath = myScriptFolder + myFilePath;
    if(File(myFilePath).exists == false){
        //Display a dialog.
        myFilePath = File.openDialog("Choose the file containing your find/change list");
    }
    return myFilePath;
}
function myGetScriptPath(){
    try{
        myFile = app.activeScript;
    }
    catch(myError){
        myFile = myError.fileName;
    }
    return myFile;
}
This is a very useful and easy-to-maintain script: even people who can't write scripts (but know how to use regex) can use it to do complex mass search-and-replace operations.
I would love to find something like this for FrameMaker 12 (as I can't write scripts myself).
regards
daniel

I have visited that site. The first item in the external link says: "You can also configure Firefox to automatically search for text when you type any characters outside of a text field. When typing in a text field these characters should show up in the text field and not trigger the Quick Find bar. "
What I am looking for is the exact opposite. Once my first search is entered in the text box, and the info comes back, I want to start typing the next symbol, and have it automatically show up in the text box, not the Quick Find box. That is how it was working up until a couple of months ago.

How to check special characters in java code using Java.util.regex package

String guid="first_Name;Last_Name";
Pattern p5 = Pattern.compile("\\p{Punct}");
Matcher m5 =p5.matcher(guid);
boolean test=m5.matches();
I want to find out whether there are any special characters in the String guid using regex,
but the above code always returns false. Please suggest.

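Pattern.compile("[^\\w]");
The above will match any non [a-zA-Z0-9_] character. Use it with find() rather than matches(), because matches() only succeeds when the entire string matches the pattern.
Or you could do
Pattern.compile("[^\\s\\w]");
This should match anything that is not a word character and is not whitespace.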
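A small sketch of the find() versus matches() difference, using the string from the question. The class and method names are hypothetical. Note that \\p{Punct} includes the underscore, while [^\\w] does not, so the two patterns disagree on names like first_Name.

```java
import java.util.regex.Pattern;

final class PunctCheck {
    private static final Pattern PUNCT = Pattern.compile("\\p{Punct}");
    private static final Pattern NON_WORD = Pattern.compile("[^\\w]");

    /** True when any punctuation character occurs somewhere in the string. */
    static boolean containsPunct(String s) {
        return PUNCT.matcher(s).find();
    }

    /** True only when the whole string is exactly one punctuation character. */
    static boolean isSinglePunct(String s) {
        return PUNCT.matcher(s).matches();
    }

    /** True when any character outside [a-zA-Z0-9_] occurs in the string. */
    static boolean containsNonWord(String s) {
        return NON_WORD.matcher(s).find();
    }

    public static void main(String[] args) {
        String guid = "first_Name;Last_Name";
        System.out.println(isSinglePunct(guid));             // false: this is what the question observed
        System.out.println(containsPunct(guid));             // true
        System.out.println(containsNonWord("first_Name"));   // false: '_' counts as a word character
    }
}
```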

Reading a String Literally - Finding the "\" Character with a Regex

How do I search for "\" characters in a string? Such as..
String text = "Temp\temp.txt";
The problem is that Java will read "\t" as a tab, so that
System.out.println(text);
Will return
Temp emp.txt
Also, searching for the regex "\\\\" will return a null result, presumably because Java interprets the "\t" as a tab character, not as a literal "\" followed by a literal "t".
How do I get Java to read the string without interpreting it?
Thanks in advance

Try also this:
public static void main(String[] args) {
    String test1 = "String with a\\t which is not a tab but a t preceeded with a \\ character";
    String test2 = "String with a \t which is a tab";
    System.out.println("This are the Strings as user sees / enters them:");
    System.out.println(test1);
    System.out.println(test2);
    System.out.println("");
    System.out.println("***********************************************************");
    System.out.println("");
    System.out
            .println("Splitting first string using a \\\\\\\\ regex which will be interpreted by the regex engine as \\\\ which will represent a \'\\\' character:");
    System.out.println("");
    for (String s : test1.split("\\\\")) {
        System.out.println(s);
    }
    System.out.println("");
    System.out.println("***********************************************************");
    System.out.println("");
    System.out.println("Splitting the second string just the same way:");
    System.out.println("");
    for (String s : test2.split("\\\\")) {
        System.out.println(s);
    }
}
and read the console output.
It should be:
This are the Strings as user sees / enters them:
String with a\t which is not a tab but a t preceeded with a \ character
String with a       which is a tab
Splitting first string using a \\\\ regex which will be interpreted by the regex engine as \\ which will represent a '\' character:
String with a
t which is not a tab but a t preceeded with a
character
Splitting the second string just the same way:
String with a       which is a tab
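The underlying point is that the escaping happens in the source code, not at run time: "Temp\temp.txt" never contains a backslash, so there is nothing for a regex to find. A short sketch of the two levels of escaping follows; the class and method names are made up for illustration. Pattern.quote is shown as an alternative to hand-escaping the regex.

```java
import java.util.regex.Pattern;

final class BackslashDemo {
    /** Splits on a literal backslash using a hand-escaped regex. */
    static String[] splitEscaped(String s) {
        return s.split("\\\\");
    }

    /** Same split, letting Pattern.quote do the regex-level escaping. */
    static String[] splitQuoted(String s) {
        return s.split(Pattern.quote("\\"));
    }

    public static void main(String[] args) {
        String withTab = "Temp\temp.txt";        // \t is a tab: the string holds no backslash
        String withBackslash = "Temp\\temp.txt"; // \\ in source is one backslash in the string
        System.out.println(withTab.contains("\\"));          // false
        System.out.println(withBackslash.length());          // 13
        System.out.println(splitEscaped(withBackslash)[1]);  // temp.txt
    }
}
```

Text read from a file or typed by a user is never interpreted this way, so a backslash that arrives at run time stays a backslash.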

How to replace regex match into a char value (in the middle of a string)

Hi uncle_alice and other great regex gurus
One of my friends has a peculiar problem and I cant give him a solution.
Using String#replaceAll(), i.e. NOT a Matcher loop, how could we convert matched digit string such as "65" into a char of its numeric value. That is, "65" should be converted into letter 'A'.
Here's the failing code:
public class GetChar {
    public static void main(String[] args) {
        String orig = "this is an LF<#10#> and this is an 'A'<#65#>";
        String regx = "(<#)(\\d+)#>";
        // expected result: "this is an LF\n and this is an 'A'A"
        String result = orig.replaceAll(regx, "\\u00$2");
        // String result = orig.replaceAll(regx, "\\\\u00$2"); // this also doesn't work
        System.out.println(result);
    }
}

I don't know that we have lost anything substantial.
I think it's just that the kind of task this is especially useful for is kind of a blind spot in the range of things Java is a good fit for (?). For certain tasks (e.g. process output munging) an experienced Perl programmer could knock up, in Perl, using built-in language features, a couple of lines which in Java could take pages to do. If the cost is readability/maintainability/expandability etc., then this might be a problem, but for a number of day-to-day tasks it isn't. I'm trying to learn Perl at the moment for this exact reason :)
Yes. And when a Java source-code processor (a.k.a. compiler) sees code like:
line = line.replaceAll(regexp, new String(new char[] {(char)(Integer.parseInt("$1"))}));
or,
line = line.replaceAll(regexp, doMyProcessOn("$1")); // doMyProcessOn returns a String
common sense should have told it that "$1" isn't a literal string "$1" in this regular expression context.
By the way, I abhor Perl code because of its incomprehensibility. It can't be read with average common sense. Java code can be, sort of ...
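String#replaceAll cannot do this, because its replacement argument is evaluated before any match exists. The closest single-call form I know of is Matcher.replaceAll with a function argument, available from Java 9 onward. The sketch below assumes that Java version; the class and method names are hypothetical. On older versions the same logic needs an appendReplacement/appendTail loop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharRefDecoder {
    private static final Pattern REF = Pattern.compile("<#(\\d+)#>");

    /** Replaces each <#NN#> marker with the character whose code is NN. */
    static String decode(String input) {
        return REF.matcher(input).replaceAll(
                // quoteReplacement keeps a resulting '$' or '\' from being treated specially.
                mr -> Matcher.quoteReplacement(
                        String.valueOf((char) Integer.parseInt(mr.group(1)))));
    }

    public static void main(String[] args) {
        String orig = "this is an LF<#10#> and this is an 'A'<#65#>";
        System.out.println(decode(orig));
    }
}
```

Integer.parseInt throws NumberFormatException for digit runs too long for an int, and values above 65535 are truncated by the cast, so real input may need a range check.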

Java.util.regex error

Hello,
I checked JavaDoc multiple times but do not see what is wrong with
myString.replaceAll("D:\\web\\mars","")
which results in
java.util.regex.PatternSyntaxException: Illegal/unsupported escape squence near index 7
D:\web\mars
       ^
     at java.util.regex.Pattern.error(Unknown Source)
     at java.util.regex.Pattern.escape(Unknown Source)
     at java.util.regex.Pattern.atom(Unknown Source)
     at java.util.regex.Pattern.sequence(Unknown Source)
     at java.util.regex.Pattern.expr(Unknown Source)
     at java.util.regex.Pattern.compile(Unknown Source)
     at java.util.regex.Pattern.<init>(Unknown Source)
     at java.util.regex.Pattern.compile(Unknown Source)
     at java.lang.String.replaceAll(Unknown Source)
     at ArticleImageImportProcessor.main(ArticleImageImportProcessor.java:40)
Exception in thread "main"
Please, every suggestion/hint is most appreciated.

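You have to escape each backslash twice: once for the Java string literal, and once more because '\' has a special meaning in regular expressions. With only the string-level escaping, the regex engine sees D:\web\mars; \w happens to be a valid escape (a word character), but \m is not, hence the error at index 7.
It should look like
myString.replaceAll("D:\\\\web\\\\mars","")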
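If the text to remove is a fixed string rather than a pattern, the regex escaping can be avoided altogether. The sketch below shows three equivalent options; the class and method names are made up, and the sample path is only an illustration.

```java
import java.util.regex.Pattern;

final class LiteralReplace {
    /** String.replace treats its first argument as literal text, not a regex. */
    static String viaReplace(String s) {
        return s.replace("D:\\web\\mars", "");
    }

    /** Pattern.quote turns literal text into a regex that matches it exactly. */
    static String viaQuote(String s) {
        return s.replaceAll(Pattern.quote("D:\\web\\mars"), "");
    }

    /** The hand-escaped regex from the reply above. */
    static String viaEscapedRegex(String s) {
        return s.replaceAll("D:\\\\web\\\\mars", "");
    }

    public static void main(String[] args) {
        String path = "D:\\web\\mars\\img\\a.png";
        System.out.println(viaReplace(path));
        System.out.println(viaQuote(path));
        System.out.println(viaEscapedRegex(path));
    }
}
```

All three print the path with the D:\web\mars prefix removed.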

Simple Java regex question

I have a file with set of Name:Value pairs
e.g
Action1:fail
Action2:pass
Action3:fred
Using regex package I Want to get value of Name "Action1"
I have tried diff things but I cannot figure out how I can do it. I can find Action1: is present or not but dont know how I can get value associated with it.
I have tried:
Pattern pattern = Pattern.compile("Action1");
CharSequence charSequence = CharSequenceFromFile(fileName); // method retuning charsq from a file
Matcher matcher = pattern.matcher(charSequence);
if(matcher.find()){
    int start = matcher.end(0);
    System.out.println("matcher.group(0)"+ matcher.group(0));
}
How can I get the value associated with a specific tag?
thanks
anmol

Read the data from the text file line by line, and you can do:
String line = "Action1:fail"; // get this from the file somehow
String[] keyPair = line.split(":");
System.out.println(keyPair[0]); // your name
System.out.println(keyPair[1]); // your value
Or, if you've got the text file in one big string:
String pattern = "(\\w+):(\\w+)$"; // {word}:{word} at end of line; compile with Pattern.MULTILINE
//then
//do some things with match objects
//look in the API at java.util.regex
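To spell out the "do some things with match objects" part: a capturing group around the value, plus MULTILINE mode so that ^ and $ work per line, is enough. This is a minimal sketch under the assumption that each pair sits on its own line; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameValueLookup {
    /** Returns the value for the given name, or null when the name is absent. */
    static String lookup(String name, CharSequence text) {
        // Pattern.quote guards against names that contain regex metacharacters.
        Pattern p = Pattern.compile("^" + Pattern.quote(name) + ":(.*)$", Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String file = "Action1:fail\nAction2:pass\nAction3:fred";
        System.out.println(lookup("Action1", file)); // fail
        System.out.println(lookup("Action9", file)); // null
    }
}
```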

RegEx: How to find out which part of the pattern failed?

Hi there,
I was wondering: is there any way to find out where the pattern matching failed?
Say I got the string "John Paul Yoko Ringo", and I want to match it against the pattern /John Paul George Ringo/.
I would like to know something like "pattern failed at index 11", or if I had groups something like "matching group 3 failed".
Is there any way to do this? Thanks in advance!
Best regards,
- Torben

jschell wrote:
I would like to know something like "pattern failed at index 11", or if I had groups something like "matching group 3 failed".
Is there any way to do this? Thanks in advance!
I wonder if that is reasonable. It means that the parse tree for the regex would need to keep mapping information.
At a minimum it is going to require an array, not a single result, because a regex can 'fail' in many ways.
Consider the following regex with the following input
/(a|b)d/
abababababx
Where does it 'fail'?
Right. If you just want the character position at which it failed, those tools might tell you that as part of a bigger picture. But by itself, without any context, that number's not necessarily meaningful. A given character can be examined many times due to backtracking. Part of the expression could succeed for part of the input, then the expression might fail for the rest, so we backtrack, and may get several more failures, then more partial successes, all at different points, then ultimately it may fail anywhere within the input.
So just knowing where isn't enough. You need to know what steps were taken to get there. I do think these tools provide that, though I haven't looked closely.
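For simple sequential patterns like the John/Paul/George/Ringo example, a rough answer can be had without special tools by building the pattern up piece by piece and testing each prefix with lookingAt(). The sketch below is only that, a sketch: it assumes the pattern can be cut into parts that are each valid regex on their own, and it ignores the backtracking subtleties described above. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FailureLocator {
    /** Index of the first part whose addition makes the prefix match fail, or -1. */
    static int firstFailingPart(String[] parts, String input) {
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            prefix.append(parts[i]);
            if (!Pattern.compile(prefix.toString()).matcher(input).lookingAt()) {
                return i;
            }
        }
        return -1;
    }

    /** Number of input characters consumed by the longest prefix pattern that still matches. */
    static int matchedLength(String[] parts, String input) {
        StringBuilder prefix = new StringBuilder();
        int end = 0;
        for (String part : parts) {
            prefix.append(part);
            Matcher m = Pattern.compile(prefix.toString()).matcher(input);
            if (!m.lookingAt()) {
                break;
            }
            end = m.end();
        }
        return end;
    }

    public static void main(String[] args) {
        String[] parts = {"John", " Paul", " George", " Ringo"};
        String input = "John Paul Yoko Ringo";
        System.out.println(firstFailingPart(parts, input)); // 2, i.e. " George"
        System.out.println(matchedLength(parts, input));    // 9, the length of "John Paul"
    }
}
```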

Regex with xml for italicize or node creation

Okay
Guess it's a complex situation to explain.
I am working on the text content of xml documents again. made quite a lot of progress with some of my other regex requirements.
I am looking for a specific set of words to italicize say for example 'In Vitro'
String regex = "In Vitro";
// here I get the text of a particular xml Node which is a text node
String paragraph = nl.item(i).getNodeValue();
//Value of paragraph before replace is "and lipids and In Vitro poorlysoluble(in water"
Matcher m = Pattern.compile(regex).matcher(paragraph); // m was not shown in the post; assumed to be built like this
String replace = "<Italic>In Vitro<Italic/>";
String paragRepl = m.replaceFirst(replace);
//Value of paragRepl after regex replace is "and lipids,?;:!and <Italic>In Vitro<Italic/> poorlysoluble(in water"
//then I update the content of the node again
nl.item(i).setNodeValue(paragRepl);
// save the xml document
The Italic tag is interpreted by our custom stylesheet to display "In Vitro" in italics. The reason it cannot do that is that the character entities for < and > have been put in the text content of the node, i.e. &lt; and &gt;. On closer examination of the text of the node after the document was saved, it appeared this way: " &lt;Italic>In Vitro&lt;Italic/> ". For some reason the greater-than sign came out okay, but that is still no use: it didn't actually create a new node. I am not sure how you can automatically put tags around specific text you find in XML documents using regex, or if I have to create a new node at that point.
it's xml so these entities come into picture.
any help is greatly appreciated, in short I need to just add a set of tags to a particular regex I find in an xml document,
thanks in advance
Jeevan
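Setting a node's value always stores plain text, so any markup in the string is escaped when the document is saved. One way around that, sketched here under the assumption that the document is handled through the standard org.w3c.dom API, is to split the text node around the phrase and move the middle piece into a newly created element. The class name ItalicWrapper and the sample paragraph element are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class ItalicWrapper {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Wraps every occurrence of a non-empty phrase below node in an Italic element. */
    static void wrap(Node node, String phrase) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            Text before = (Text) node;
            int at = before.getData().indexOf(phrase);
            if (at < 0) {
                return;
            }
            Text match = before.splitText(at);            // match now starts with the phrase
            Text rest = match.splitText(phrase.length()); // match is now exactly the phrase
            Element italic = node.getOwnerDocument().createElement("Italic");
            match.getParentNode().replaceChild(italic, match);
            italic.appendChild(match);
            wrap(rest, phrase);                           // later occurrences in the same text
            return;
        }
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling();           // remember before the tree changes
            wrap(child, phrase);
            child = next;
        }
    }

    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>and lipids and In Vitro poorlysoluble(in water</p>");
        wrap(doc.getDocumentElement(), "In Vitro");
        System.out.println(serialize(doc));
    }
}
```

Because the Italic element is a real node in the tree, the serializer writes it as markup instead of escaping it.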

okay i am getting closer to the solution as there is an api call from another proprietary language that would do this
but as I loop through the xml document, it keeps selecting the text "In Vitro" even after it has been italicized.
So I guess my next challenge is getting a regex which looks for "In Vitro" but not italicized
For regex so far I have seen case insensitive handling, I have seen for italics
basically if I can get my hands on a regex, for example
String regex = "In Vitro && Not Italic"
any help is appreciated
Jeevan
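A regex only sees the characters of a text node, not the element around it, so "not italic" is easier to test on the tree than in the pattern. The sketch below assumes the standard org.w3c.dom API and an element named Italic as in the posts above. The class name ItalicCheck and its helper methods are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class ItalicCheck {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** True if any ancestor of node is an Italic element. */
    static boolean insideItalic(Node node) {
        for (Node p = node.getParentNode(); p != null; p = p.getParentNode()) {
            if (p.getNodeType() == Node.ELEMENT_NODE && "Italic".equals(p.getNodeName())) {
                return true;
            }
        }
        return false;
    }

    /** Collects text nodes that contain the phrase and are not already italic. */
    static List<Text> plainMatches(Node root, String phrase) {
        List<Text> hits = new ArrayList<>();
        collect(root, phrase, hits);
        return hits;
    }

    private static void collect(Node node, String phrase, List<Text> hits) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            Text text = (Text) node;
            if (text.getData().contains(phrase) && !insideItalic(text)) {
                hits.add(text);
            }
            return;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, phrase, hits);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>In Vitro and <Italic>In Vitro</Italic></p>");
        for (Text t : plainMatches(doc.getDocumentElement(), "In Vitro")) {
            System.out.println("still plain: \"" + t.getData() + "\"");
        }
    }
}
```

Only the nodes returned by plainMatches would then be passed on to whatever call does the italicizing, so text that is already wrapped is skipped on later passes.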

Issue with regexes in http health probes on ACE 4710

Folks,
We're currently experiencing fairly bizarre behavior when attempting to set up http probes that expect a regexp. Namely, if we specify a regexp, the probe *always* passes, regardless of status code and regardless of whether or not the message actually matches the pattern. Doing 'no expect regexp' fixes this behavior (by which I mean that the 'expect status' rules work again).
We haven't noticed until now because this is the first time we've tried to set up a probe that does this. Are we missing something? Is this a known issue with our current firmware version?
Sincerely,
Patrick T. Ramsey
# show run probe | begin HTTP-nfscheck | end regex
Generating configuration....
probe http HTTP-nfscheck
description Simple HTTP probe to check nfs mount health
port 80
interval 15
passdetect interval 20
request method head url /nfs-health-check/
open 1
expect regex "^ureytgraeuikghfdjg$"
# sh ver
Cisco Application Control Software (ACSW)
TAC support: http://www.cisco.com/tac
Copyright (c) 1985-2009 by Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained herein are owned by
other third parties and are used and distributed under license.
Some parts of this software are covered under the GNU Public
License. A copy of the license is available at
http://www.gnu.org/licenses/gpl.html.
Software
loader:    Version 0.95.1
system:    Version A3(2.4) [build 3.0(0)A3(2.4) adbuild_11:46:02-2009/09/27_/auto/adbu-rel2/rel_a3_2_3_throttle/REL_3_0_0_A3_2_4]
system image file: (hd0,1)/c4710ace-mz.A3_2_4.bin
Device Manager version 1.2 (0) 20090925:1550
installed license: no feature license is installed
Hardware
cpu info:
    Motherboard:
        number of cpu(s): 2
    Daughtercard:
        number of cpu(s): 16
memory info:
    total: 6226388 kB, free: 3972668 kB
    shared: 0 kB, buffers: 22020 kB, cached 0 kB
cf info:
    filesystem: /dev/hdb2
    total: 861668 kB, used: 728656 kB, available: 89240 kB
last boot reason: Unknown
configuration register: 0x1
ldbottom kernel uptime is 325 days 3 hours 46 minute(s) 43 second(s)

I also went through a similar issue in which we needed to probe the real server PESERVER01: if the real server replies with the keyword "PE Server" in the HTTP content, the probe should pass.
In my case the real server was listening on port 32776 for HTTP service so we configured the serverfarm as below,
serverfarm host SF-TEST-32776
description SF-TEST-32776
failaction purge
probe PE-SERVER-STRING
rserver PESERVER01 32776
    inservice
And the TCP probe as below,
probe tcp PE-SERVER-STRING
port 32776
send-data GET /IOR/ping HTTP/1.1      <<== command should not be in inverted commas
expect regex "PE Server"
The above probe worked really well, and when we checked the probe status it was marked as success. I also tried changing the regex from "PE Server" to "Vishal12345" and it failed as expected because there was no such keyword in the HTTP content.
==================================================================================
T2-LB02# sh probe PE-SERVER-STRING
probe       : PE-SERVER-STRING
type        : TCP
state       : ACTIVE
   port      : 32776   address     : 0.0.0.0         addr type : -
   interval : 15      pass intvl : 60              pass count : 3
   fail count: 3       recv timeout: 10
                ------------------ probe results ------------------
   associations ip-address      port porttype probes   failed   passed   health
   ------------ ---------------+-----+--------+--------+--------+--------+------
   serverfarm : SF-TEST-32776
     real      : PESERVER01[32776]
                10.10.10.1    32776 PROBE    105      0        105      SUCCESS
==================================================================================
I had been struggling with this issue for a long time, and even raised a couple of Cisco TAC cases with no luck. The most important thing here is to identify the exact command to be sent to the real server, like the GET /IOR/ping HTTP/1.1 that we used here.
To collect this command I ran a packet capture on one of the client machines and then opened the URL on the real server that returns the string "PE Server". I then analyzed the capture in Wireshark and checked the HTTP data with the Follow TCP Stream option, where I saw the data below, which gives both the command to send in the probe and the string we should expect.
==================================================================================
GET /IOR/ping HTTP/1.1
User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
Host: 10.144.70.85:32776
Accept: */*
HTTP/1.0 200 OK
Content-type: text/html
Ping
PE Server
WRVFKO11 [Win32 Server Production (3 silos) (Oracle Blob 512 MB) -- {dap451.007.028 dap451.004.002 pe451.003.010x pui451.003.010 pui451.001.004} Mar 9 2012 15:07:53 en ]
===================================================================================
Please try this and see if it helps you.
Thanks,
Vishal Babrekar

A problem with regex and special characters

Hello,
I am using regex in my application but i have a problem with special characters. Here is the explanation of what i am doing:
I have a certain piece of text that i want to parse and replace every occurrence of a given word with some sort of a tag which have the word found inside it.
so that: go Going Go to gOschool by bus and to learn and to play GO Go
and i need to replace the word "go" (case insensitive and only at word boundaries) should be:
*<start>go<end> Going <start>Go<end> to gOschool by bus and to learn and to play <start>GO<end> <start>Go<end>*
Consider the following code and call the method with the parameter"go?"
The Matcher finds a weird match at the word "G?oing" with only the letter G !!!
It also ignores the "?" in the pattern completely.
Any clue of what is happening i would be very grateful...
private static String replaceMatches(String strToFind)
{
        String resultArticle="";
        String article = " "+"go? G?oing Go? to gOschool by bus and to learn and to play GO? Go?*"+" ";
        strToFind = "\\b"+ strToFind +"\\b";
        String linkPart1= "<start>";
        String linkPart2 = "<end>";
        Pattern p = null;
        try{
            p=Pattern.compile(strToFind, Pattern.CASE_INSENSITIVE);
            Matcher m = p.matcher(article);
            String[] res = p.split(article);
            int i=0;
            //System.out.println("result of split: "+res.length );
            while(m.find())
            {
                resultArticle+=(res[i]+" ");
                resultArticle+=linkPart1;
                resultArticle+=m.group().trim();
                resultArticle+=(linkPart2+" ");
                i++;
            }
            if(i<res.length)
                resultArticle+=res[i];
            //System.out.println("result of match: " + i);
            System.out.println(article);
            //System.out.println(resultArticle.trim()+scripts);
        }
        catch(PatternSyntaxException ex){}
        return resultArticle.trim();
}
Thanks

tarek.mamdouh wrote:
> because split will not work when trying to replace the first word if i don't append a space at the beginning.
Split doesn't work anyway. And my question wasn't why do you add spaces (which you really don't need to do), but why do you do them with " " + "go" rather than just " go".
> replaceAll will replace all the occurrences in the text with only one word. without taking into consideration the case of the word i need to replace.
No.
> If i use replaceAll(article, strToFind) the output will be:
> <start>go?<end> G?oing <start>go?<end> to gOschool by bus and to learn and to play <start>go?<end> <start>go?<end>
No. I showed you the actual output of an actual replaceAll.
> which is not what i want as i need to keep the case of the words i am replacing
The replaceAll I showed you does that.
Please study the examples given and read the docs carefully rather than making claims based on inaccurate guesses.
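The replaceAll example the reply refers to is not shown in this thread, so the following is only a sketch of what such a call might look like. In the pattern \bgo?\b the '?' is a quantifier that makes the 'o' optional, which is why a lone "G" matches in "G?oing" and why the literal '?' appears to be ignored. Pattern.quote makes the search term literal. Since \b only works next to word characters, the sketch uses lookarounds instead. The class and method names (Highlighter, highlight) are made up for illustration.

```java
import java.util.regex.Pattern;

class Highlighter {
    /** Wraps each whole-word, case-insensitive occurrence of term in <start>...<end>. */
    static String highlight(String article, String term) {
        // Pattern.quote turns characters such as '?' into literals.
        // (?<!\w) and (?!\w) stand in for \b, which fails next to a non-word character.
        Pattern p = Pattern.compile(
                "(?<!\\w)" + Pattern.quote(term) + "(?!\\w)",
                Pattern.CASE_INSENSITIVE);
        // $0 re-inserts the text that actually matched, so its original case is kept.
        return p.matcher(article).replaceAll("<start>$0<end>");
    }

    public static void main(String[] args) {
        System.out.println(highlight(
                "go Going Go to gOschool by bus and to learn and to play GO Go", "go"));
        // prints: <start>go<end> Going <start>Go<end> to gOschool by bus and to learn and to play <start>GO<end> <start>Go<end>

        System.out.println(highlight(
                "go? G?oing Go? to gOschool by bus and to learn and to play GO? Go?*", "go?"));
        // prints: <start>go?<end> G?oing <start>Go?<end> to gOschool by bus and to learn and to play <start>GO?<end> <start>Go?<end>*
    }
}
```

No split, padding spaces, or manual string concatenation is needed, because replaceAll handles every match in one pass.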

Using a literal "." in sed regex

I recently picked up O'Reilly's _Classic Shell Scripting_.
Two of the examples have me stuck.
(1) Both the man pages on 10.4.5 and various references say that to get a literal period into the regex part of a s/regex/str/, use "\.". This command, however, ls -a | sed s/\./[hidden]/ replaces the first character of every line of input, as if the backslash is having no effect. In fact, the same one-liner without the backslash produces the same output. ???
(2) Using ampersand ("&") in the replacement text yields the sed error "unterminated substitution in regular expression", instead of macro-ing in the matched text of the regex. I.e., the command ls -a | sed s/file/Mark's &/ causes an error. ???
Bonus Question : Is MySQL really preloaded with OS X and, if so, where do I find it?
Thanks. I realize that these are pretty basic questions.

Hi cholla pete,
   Thanks for providing the exact command that you used. You would be surprised at the number of people that would simply claim that it doesn't work. However, it's easy to see what's happening with your command. The problem is that sed never sees your backslash. It is used for quoting in the shell as well as in sed. Thus, the shell "consumes" the backslash before the argument is passed to sed. The following should do what you want:
ls -a | sed 's/^\./[hidden]/'
Note also that I've added the beginning of the line anchor, '^', to your regular expression as I doubt that you want to replace the period that separates filenames from filename extensions.
   Regular expressions share many meta-characters with shells. It is common to use single quotes to "protect" regular expressions in arguments to utilities that consume regular expressions.
   The problem in your second sed command is the apostrophe in the name. The shell sees that as a single quote. I'm surprised that your shell executed the command when you pressed the <Return> key. However, you will probably also need to quote that sed argument as well because an ampersand preceded by a space also has meaning to most shells. Something that should do what you want is the following:
ls -a | sed "s/tst/Mark's &/"
Double quotes will protect both the single quote and the ampersand. However, double quotes don't protect all meta-characters from the shell. Single quotes are often necessary, even though they aren't required here.
   Unfortunately, preceding a single quote with a backslash within single quotes doesn't quote it because the single quotes quote the backslash so it doesn't do anything. Therefore, you must take single quotes out of single-quoted strings to quote them. Hence, the following will also work:
ls -a | sed 's/tst/Mark'\''s &/'
In this command the argument to sed is actually the concatenation of three strings, the second of which is simply a quoted single-quote.
   Tired of hearing the word "quote" yet? It can get a bit messy. However, once you learn the rules of precedence, the above becomes second nature with practice.
   Bonus Answer: No, mysql doesn't come with the client version of OS X. I don't think it comes with the server version either but I don't have one so I can't really say.
Gary
~~~~
   Sin boldly.
         -- Martin Luther

B1i Truncating space characters in elements from RegEx flat file SLD

Task:
I'm importing from a fixed-width file that's specifically 94 characters wide consisting of multiple columns. My only concern at this point is the number of lines and their width. I have to pad the block out to a specific number of lines, and all existing lines in my source file already fill all 94 characters. Multiple headers and footers exist in the same flat file, so defining the format explicitly based on column width in RegEx will add significant time to the project (though it's not necessarily impossible).
This file ends lines with a single LF character (no CR), which apparently is too much for TXT and CSV formats to handle (the entire file comes in as one line). So I turned to Regex, where I can split the document on arbitrary characters like LFs.
I've recorded a message and the lines look something like this:
"ABB CCCCCCCCC DDDDDDDDDEEEEEEFFFFGHHHIIJKKKKKK KKKK K KKKKK KKKKKK KKKK K KKKKK LLLLLLLL"
and column L is occasionally blank (just spaces). This is one of the row formats present in the file. Most of the columns use spaces as fill characters and are left justified.
Tag definition looks like this:
<tagDefinition xmlns="urn:com.sap.b1i.bizprocessor:pltdefinition" regex="\n" schemaName="" tagName="line" matchSplit="S" stackSize="10000" DOTALL="true" MULTILINE="false">
</tagDefinition>
Problems:
With a recorded test message, the lines build properly:
<Payload Role="S" intype="rgx_ruledoc">
<bfa:io xmlns:bfa="urn:com.sap.b1i.bizprocessor:bizatoms" pltype="rgx" schemaName="">
<line xmlns="">ABB CCCCCCCCC DDDDDDDDDEEEEEEFFFFGHHHIIJKKKKKK KKKK K KKKKK KKKKKK KKKK K KKKKK </line>
</bfa:io>
</Payload>
But by the time it reaches my first XSL Transformation atom, I receive the data in this form:
<Payload Role="S" intype="rgx_ruledoc">
<bfa:io xmlns:bfa="urn:com.sap.b1i.bizprocessor:bizatoms" pltype="rgx" schemaName="">
<line xmlns="">ABB CCCCCCCCC DDDDDDDDDEEEEEEFFFFGHHHIIJKKKKKK KKKK K KKKKK KKKKKK KKKK K KKKKK </line>
</bfa:io>
</Payload>
In other words, I'm missing 3 spaces from within the middle of K and 11 from the end of the example line. Multiple lines are affected by this. I've built the Regex import portion properly, judging by the test message. But the data I build is not the data I'm given. And the truncation seems to be happening somewhere outside my control.
Questions:
1. Is there a way to specify that all spaces are non-breaking in the input/output format descriptors?
2. Is there a way I can specify that B1i doesn't modify the data before handing it to me?
3. Is there an easier/more correct way to do this?
Also possibly 4. Can you specify line endings in TXT/CSV imports?

As far as I've found, there's no way to specify that B1i doesn't normalize spaces for you in an input Msg. I ended up splitting the incoming message line to its individual columns using the Regex format descriptor. That way, I was able to track the integrity of each component of the message.
However, when writing out to the same file format using the TXT file type, B1i is inserting commas between the columns, despite:
1. Being TXT format, which shouldn't use delimiters.
2. Having the delimiter field empty. It correctly ignores this field, even if it does insert a delimiter.
3. Specifying no delimiter explicitly using <FileOut type="file_full">/<Control><deli>.
It seems odd that the included means for writing raw text would mangle the output message so.
Has anyone successfully written data to a TXT file with no delimiter?
