Regex question: replace

Hi,
I'm getting into java.util.regex lately. Having used Perl for regex I'm trying to get familiar with Java's regex "spirit".
Concerning replacement, we can use replaceAll or replaceFirst. However:
- what if I want to replace only the third or fourth element?
- what if I want to replace the second to fourth elements?
In Perl we use "regex_expression_here for 2..4;", for instance.
If you know of any interesting websites or tutorials on Java regex, that would be great.
Thanks for your help.
Rgds,
SR

Yep,
here is a sample of replacement in Perl
$Line =~ s/\]/|/ for 2..4; #Replace 2nd 'til 4th delimiter (]) with pipe (|)
....Based on the reference I gave earlier
import java.util.regex.*;

/**
 * A rewriter does a global substitution in the strings passed to its
 * 'rewrite' method. It uses the pattern supplied to its constructor,
 * and is like 'String.replaceAll' except for the fact that its
 * replacement strings are generated by invoking a method you write,
 * rather than from another string.
 * This class is supposed to be equivalent to Ruby's 'gsub' when given
 * a block. This is the nicest syntax I've managed to come up with in
 * Java so far. It's not too bad, and might actually be preferable if
 * you want to do the same rewriting to a number of strings in the same
 * method or class.
 * See the example 'main' for a sample of how to use this class.
 * @author Elliott Hughes
 */
public abstract class Rewriter_1
{
    private Pattern pattern;
    private Matcher matcher;

    /**
     * Constructs a rewriter using the given regular expression;
     * the syntax is the same as for 'Pattern.compile'.
     */
    public Rewriter_1(String regularExpression)
    {
        this.pattern = Pattern.compile(regularExpression);
    }

    /**
     * Returns the input subsequence captured by the given group
     * during the previous match operation.
     */
    public String group(int i)
    {
        return matcher.group(i);
    }

    /**
     * Overridden to compute a replacement for each match. Use
     * the method 'group' to access the captured groups.
     * 'index' is the 1-based number of the current match.
     */
    public abstract String replacement(int index);

    /**
     * Returns the result of rewriting 'original' by invoking
     * the method 'replacement' for each match of the regular
     * expression supplied to the constructor.
     */
    public String rewrite(CharSequence original)
    {
        this.matcher = pattern.matcher(original);
        StringBuffer result = new StringBuffer(original.length());
        int index = 0;
        while (matcher.find())
        {
            // quoteReplacement keeps any '$' or '\' in the returned text literal
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement(++index)));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] arguments)
    {
        // Replace the 3rd through 5th '|' with 'y'; leave the others unchanged.
        String result = new Rewriter_1("\\|")
        {
            public String replacement(int index)
            {
                if ((index >= 3) && (index <= 5))
                    return "y";
                else
                    return group(0);
            }
        }.rewrite("| | | | | |");
        System.out.println(result);
    }
}

Similar Messages

RegEx To Replace Non-Static URL Variable

I have the need to strip a query string of one or more variables, which unfortunately are not static.
I've been trying to wrap my head around RegEx for this, but haven't had much luck.
A sample of my URL would be:
http://www.mydomain.com/locations.cfm?page=2&sort_method=name
Obviously the page number will change, and the sort method could be any one of a number of values.
Can anyone point me in the right direction for using RegEx to "replace/rereplace" the variable and its value?
Thank you in advance.

<cfset myUrl="http://www.mydomain.com/locations.cfm?pAgE=25&Sort_Method=name06">


<cfset dynamicPageNumber = 9>
<cfset newURLPage = REReplaceNoCase(myUrl,"page=[0-9]{1,3}","page=#dynamicPageNumber#")>


<cfset dynamicSortMethod = 'bkbk_method'>
<cfset newURLSortMethod = REReplaceNoCase(myUrl,"sort_method=[A-Za-z0-9]{1,8}","sort_method=#dynamicSortMethod#")>
<cfoutput>
newURLPage: #newURLPage# <br>
newURLSortMethod: #newURLSortMethod#
</cfoutput>

Regex question

Hi,
I have a question regarding the regular expressions in java.
Let's say I have the following regex: "(one)|(two)|(three)" and the following string: "two". The string obviously matches the regex, because of the "\2" group. Is there any way to determine the group number that matched the string, without having to use something like:
for (int i = 1; i <= matcher.groupCount(); i++) {
    // check whether matcher.group(i) is non-null
}

It's not top secret, the time difference is the problem.
It's for a school project. We have to make Pascal Compiler and the first step is the Lexical Analyzer. This means that I have some regular expressions for identifiers, numeric constants, string constants and so on...
For example the regex for the identifiers (variable name) looks like: "[a-zA-Z_][a-zA-Z0-9_]*", but the one for the key words is basically an array, like the one in my first post.
The regular expressions work fine, but for the next part of the project I need to know the index of the key words, within the key word array (which in my case is a regular expression). So this is why I was wondering if there is any way to get the group number, without having to iterate through the whole regex.
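One way to avoid walking through the groups, sketched here under the assumption that the key words are plain literal words: match any identifier-shaped token with the identifier pattern from the post, then look the token up in a hash map built once from the key word array. The class name `KeywordIndex`, the `KEYWORDS` table, and `indexOf` are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeywordIndex {
    // Hypothetical key word table; a real Pascal lexer would list every key word.
    static final String[] KEYWORDS = { "begin", "end", "var", "procedure" };

    // Maps each key word to its position in KEYWORDS, built once at class load.
    private static final Map<String, Integer> INDEX = new HashMap<String, Integer>();
    static {
        for (int i = 0; i < KEYWORDS.length; i++) {
            INDEX.put(KEYWORDS[i], i);
        }
    }

    // Same identifier pattern as in the post above.
    private static final Pattern WORD = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");

    /** Returns the key word's array index, or -1 if the token is not a key word. */
    static int indexOf(String token) {
        Matcher m = WORD.matcher(token);
        if (!m.matches()) {
            return -1;
        }
        // Pascal key words are case-insensitive, so normalise before the lookup.
        Integer i = INDEX.get(token.toLowerCase(Locale.ROOT));
        return i == null ? -1 : i;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("var"));
        System.out.println(indexOf("counter"));
    }
}
```

The lookup cost no longer depends on how many key words there are, and the array index comes back directly instead of being inferred from a group number.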

OT: Regex Question

I'm doing a series of search and replace operations with Dreamweaver and wondered if anyone can suggest a regular expression for a particular situation.
The following URL is fine as it is:
<td><a href="http://www.geoworld.org/Brazil" title="Brazil">Brazil</a></td>
However, I need to replace the spaces in this URL with underscores...
<td><a href="http://www.geoworld.org/Central African Republic" title="Central African Republic">Central African Republic</a></td>
The finished URL should look like this:
<td><a href="http://www.geoworld.org/Central_African_Republic" title="Central African Republic">Central African Republic</a></td>
In other words, I want to replace ALL spaces in the URL proper with underscores, but I want to leave the spaces in the title attributes and visible text alone. Does anyone know a regular expression that will do this?
Thanks.

Find:
(href="[^"]+)\s([^"]+")
Replace:
$1_$2
This will replace one space with an underscore in each href attribute. Run the same regex several times until no more instances are found.
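Outside Dreamweaver, the repeated passes can be avoided by locating each href value once and rewriting only that value. The following is a minimal Java sketch of that idea, assuming double-quoted href attributes; `HrefSpaces` and `underscoreHrefSpaces` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HrefSpaces {
    // Captures the value of a double-quoted href attribute in group 1.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    /** Replaces spaces with underscores inside href values only. */
    static String underscoreHrefSpaces(String html) {
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fixed = "href=\"" + m.group(1).replace(' ', '_') + "\"";
            // quoteReplacement stops '$' or '\' in the URL being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String row = "<td><a href=\"http://www.geoworld.org/Central African Republic\" "
                + "title=\"Central African Republic\">Central African Republic</a></td>";
        System.out.println(underscoreHrefSpaces(row));
    }
}
```

The title attribute and the visible text are never touched because only the captured href value goes through `replace`.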

Regex question: How do I insert commas between meta data?

Current search engine is being replaced with Google Search Appliance (GSA). It requires meta data to be separated by a comma + space, whereas the previous search engine required only a space. For example:
<meta name="C_NAME" content="Screen1 Screen2">
must become
<meta name="C_NAME" content="Screen1, Screen2">
There are 17 unique screen names and each of 2500 html files may have one or more screen names identified in that meta tag field.
I am hoping for some regular expression magic to help me with that global search/replace effort. Suggestions are greatly appreciated.
Thanks,
Rick
================================
Nevermind... figured it out. Just needed to study regex syntax a bit. Here's the answer:
Find: <meta name="C_NAME" content="(\w+)\s(\w+)\s
Replace: <meta name="C_NAME" content="$1, $2,


IMPDP question -- Replace all objects

Question
I have a situation where I would like to use Data Pump to refresh a test database schema weekly from our production instance schema. I would like to set up a job to do this with Data Pump. I can use Data Pump (impdp) successfully with the network_link option. My question is: does anyone know of a way (or can anyone recommend a way) to update all schema objects in this refresh job? For example, I want to bring over all tables, data, sequences, PL/SQL, etc. The impdp command allows TABLE_EXISTS_ACTION=REPLACE to replace existing tables/data; is there a similar command so I can do this specifically for sequences, etc.? Is there a replace command for other object types when using impdp? If not, does anyone have any ideas on how I can easily do this?
I know I could write a sql script to drop and recreate all, but then I run into the problem of having to regrant privileges to sequences, synonyms, plsql, etc.
Looking for ideas.
Thanks.

- Is TEST always the exact same schema (DDL) as PROD?
YES, or should I say it had better be, as I do not typically allow direct changes to be made to production without running them through test. Although if we are refreshing test from production, any DDL changes would eventually be updated in test from their values in prod anyway.
- Have you considered using replication or CTAS instead?
Replication, YES, but there is a lot of overhead for something I don't think warrants it. CTAS would work, but dropping and recreating is slower than IMPDP.
- Just the data & PL/SQL, no DDL?
I suppose DDL would also need to be refreshed.
- There is IGNORE=N, which forces you to pre-drop the tables in TEST, thereby guaranteeing freshness.
Does this option drop the objects, or does it just error out and require that you pre-drop them by some other means? I suppose it is the latter.
Because IMPDP is good at bringing privileges over, I am considering the following.
1. Connect to TEST, drop user XYZ cascade;
2. impdp command using network_link option.
Does doing this miss something?
I agree that impdp log file will have to be interrogated for errors.
Let me know.
Thanks.

Regex question; $1, $2, etc

Hi,
If I have the following regex Pattern set up:
Pattern title = Pattern.compile("<title>([^<]+)</title>");
and I want what's within the parentheses to be stored as a variable, the way it would be in Perl:
my $title = $1;
or whatever, how do I do that in Java? I couldn't find it in any of the regex tutorials I was looking at.
thanks,
bp

The JDK regex package doesn't store captured groups in local variables like Perl does. Instead, you have to retrieve them from the Matcher using the group(int) methods. However, you can use $1, $2, etc. in the replacement string when you do a replaceAll or replaceFirst, and the Matcher will replace them with the appropriate captured groups.
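A minimal sketch of both points, using the pattern from the question; the class name `TitleGroup` and the helper `extractTitle` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TitleGroup {
    private static final Pattern TITLE = Pattern.compile("<title>([^<]+)</title>");

    /** Returns the text captured by group 1, or null if the pattern does not match. */
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        // find() must succeed before group(1) may be read.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Regex notes</title></head></html>";
        System.out.println(extractTitle(html));
        // In a replacement string, $1 refers to the same captured group.
        System.out.println(html.replaceAll("<title>([^<]+)</title>", "<h1>$1</h1>"));
    }
}
```

Calling `group(1)` without a successful `find()` or `matches()` throws an IllegalStateException, which is why the sketch checks the result of `find()` first.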

A regex question

Hi all,
I'm trying to write a regular expression in Java that replaces commas with '#' only when they are inside parentheses in a given string.
eg:
original string:"aaa,bbb,to_char(p.sss,'999,999,999.9999') sss,ddd,to_char(eee,'999,999'),fff"
desired output:"aaa,bbb,to_char(p.sss#'999#999#999.9999') sss,ddd,to_char(eee#'999#999'),fff"
After some research, I got this: "(?<=$[^$]{1,50}),(?=[^$]{1,100}$)". This one works fine in regex tools such as RegexBuddy, etc.
However, the Java program (JDK 1.5.0) does not seem to work correctly:
     public static void main(String[] args) {
          String str="aaa,bbb,to_char(p.sss,'999,999,999.9999') sss,ddd,to_char(eee,'999,999'),fff";
          System.out.println(str);
          System.out.println("-------------");
          str=str.replaceAll("(?<=\$[^\$]{1,100}),(?=[^\$]{1,100}\$)", "#");
          System.out.println(str);
     }
The output is still "aaa,bbb,to_char(p.sss,'999,999,999.9999') sss,ddd,to_char(eee,'999,999'),fff".
It seems there is something wrong with the "positive lookahead", but as far as I know, Java supports this kind of regex: (?<=\$[^\$]{1,100}?)
Any ideas?
Thanks!

Right: we consume some of the text with one part of the regex to prevent the other part from seeing it. Here's the breakdown I promised:
With the lookaround approach, we were essentially locating a comma first, then looking backward and forward to figure out whether we should replace it. Since the lookbehind turned out to be unreliable, we need to start matching at some point before the comma, in such a way that all of the ineligible commas either get ignored, or get matched within a capturing group so we can plug them back into the replacement string.
The first thing we need to do is match everything up to the first open-parenthesis, because we know we can ignore any commas before that point. As a standalone regex, that part would look like this:
"[^(]+\\("
Once we're inside the parens, we can go ahead and match everything up to the next comma. In case we find a set of parentheses with no commas in it, we also add the close-paren to the negated character class:
"[^),]+,"
That works fine for the first match, but it will break down after that because the first part will match everything up to the next open-paren, including the rest of the contents of the first set of parens. That part was meant to fail within parens; that's why it's optional. Since it's required to match an open-paren, and the only way it can reach the next one of those is to match the intervening close-paren, we can fix it by adding the close-paren to the open-paren in the character class:
"[^()]+\\("
And that's all we really need. Once the last comma inside the parens is matched, the first part of the regex takes us up to the next open-paren, where the second part takes over again. The lookbehind turns out not to be necessary once the rest of the regex is properly tuned; it was left over from my earlier attempts to create a working regex. The open-paren in the second character class isn't really needed either, but it doesn't hurt anything and it helps express our intentions. And, as I said earlier, the possessive quantifiers just make the regex a little more efficient.
str = str.replaceAll("((?:[^()]++\\()?+[^(),]++),", "$1#");
Although we developed this regex as a replacement for a lookaround-based one, I would encourage everyone to look for non-lookaround solutions first. Despite all the enhancements that have been made to regexes over the years, they still work best when used in a forward-looking, positive-matching style like what we ended up with here.
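To see the finished regex at work, here is a small self-contained sketch that applies it to the string from the question. The class and method names are hypothetical, and the sketch assumes the parentheses are never nested, as in the sample input.

```java
final class ParenCommas {
    /**
     * Replaces commas that appear inside a (non-nested) pair of parentheses with '#'.
     * Group 1 holds everything matched before the comma so it can be put back.
     */
    static String hashCommasInParens(String s) {
        return s.replaceAll("((?:[^()]++\\()?+[^(),]++),", "$1#");
    }

    public static void main(String[] args) {
        String str = "aaa,bbb,to_char(p.sss,'999,999,999.9999') sss,ddd,to_char(eee,'999,999'),fff";
        System.out.println(str);
        System.out.println("-------------");
        System.out.println(hashCommasInParens(str));
    }
}
```

The commas outside the parentheses survive because they are either swallowed into group 1 on the way to the next open-paren or never matched at all.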

Regex Search & Replace

I'm having trouble making my regex work. Here's the code
public static void main(final String[] args) {
     final String input = "Have 5 of each: there are dogs and there are cats and there are horses";
     final String output = input.replaceAll("(Have 5.+are )(\\w+?)", "$1five $2");
     System.out.printf("Input : %s\n", input );
     System.out.printf("Output: %s\n", output);
}
My desired output was
Have 5 of each: there are five dogs and there are five cats and there are five horses
but instead I get
Have 5 of each: there are dogs and there are cats and there are five horses.
In other words, it's making only a single replacement of "there are xxx" instead of multiple replacements. Anyone see what I'm doing wrong?
Additional note: I can't just replace "there are" with "there are 5," because I have to see the "Have *nn* of each " to know what replacement to use, and nn will vary each time I call this. This is NOT an academic assignment. I'm trying to understand the regex as a prototype for inclusion in a larger app.
Any suggestions or insight would be greatly appreciated!

I guess part of the problem is the documentation (or maybe I'm not good enough at reading it). In the Javadocs for java.util.regex.Pattern, it lists the greedy, reluctant and possessive quantifiers, but doesn't explain what they mean or how they affect the result (i.e., when you want to use one type over the other). I have to admit I'm baffled.
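In short: a greedy quantifier such as ".+" takes as much as it can and gives characters back only if the rest of the pattern fails; a reluctant one such as ".+?" takes as little as it can and grows only when forced; a possessive one such as ".++" takes as much as it can and never gives anything back. In the pattern above, the greedy ".+" runs all the way to the last "are ", and because every match has to begin with "Have 5", only one match is possible. One way around this, sketched below, is to read the number first and then run a second, simpler replacement. `CountInserter`, `insertCount`, and the `WORDS` table are hypothetical names, and the sketch assumes single-digit counts are the ones to spell out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CountInserter {
    // Hypothetical lookup table for spelling out small numbers.
    private static final String[] WORDS = {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
    };

    // Step 1 pattern: capture the count (at most 9 digits so it fits in an int).
    private static final Pattern HEAD = Pattern.compile("Have (\\d{1,9}) of each");

    static String insertCount(String input) {
        Matcher head = HEAD.matcher(input);
        if (!head.find()) {
            return input; // no "Have nn of each" prefix, nothing to do
        }
        int n = Integer.parseInt(head.group(1));
        // Spell out small numbers; fall back to the digits otherwise.
        String word = (n < WORDS.length) ? WORDS[n] : head.group(1);
        // Step 2: every "there are X" is now an independent match.
        return input.replaceAll("(there are )(\\w+)",
                "$1" + Matcher.quoteReplacement(word) + " $2");
    }

    public static void main(String[] args) {
        String input = "Have 5 of each: there are dogs and there are cats and there are horses";
        System.out.printf("Input : %s%n", input);
        System.out.printf("Output: %s%n", insertCount(input));
    }
}
```

Splitting the work in two keeps each pattern short, and neither one depends on how far a greedy or reluctant quantifier happens to reach.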

Simple Java regex question

I have a file with set of Name:Value pairs
e.g
Action1:fail
Action2:pass
Action3:fred
Using regex package I Want to get value of Name "Action1"
I have tried diff things but I cannot figure out how I can do it. I can find Action1: is present or not but dont know how I can get value associated with it.
I have tried:
Pattern pattern = Pattern.compile("Action1");
CharSequence charSequence = CharSequenceFromFile(fileName); // method retuning charsq from a file
Matcher matcher = pattern.matcher(charSequence);
if(matcher.find()){
int start = matcher.end(0);
System.out.println("matcher.group(0)"+ matcher.group(0));
how I can get value associated with specific tag?
thanks
anmol

Read the data from the text file line by line and you can do:
String line = "Action1:fail"; // or read it from the file
String[] keyPair = line.split(":");
System.out.println(keyPair[0]); // your name
System.out.println(keyPair[1]); // your value
Or, if you've got the text file in one big string:
String pattern = "(\\w*):(\\w*)$"; // {word chars}:{word chars} at end of line; compile with Pattern.MULTILINE
//then
//do some things with match objects
//look in the API at java.util.regex
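For the "one big string" case, a minimal sketch could look like the following. It assumes one Name:Value pair per line; `NameValueLookup` and `valueOf` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NameValueLookup {
    /** Returns the value that follows "name:" on its line, or null if the name is absent. */
    static String valueOf(CharSequence text, String name) {
        // MULTILINE lets ^ and $ match at every line boundary;
        // Pattern.quote treats the name as literal text, not as a regex.
        Pattern p = Pattern.compile("^" + Pattern.quote(name) + ":(.*)$", Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String file = "Action1:fail\nAction2:pass\nAction3:fred\n";
        System.out.println(valueOf(file, "Action1"));
        System.out.println(valueOf(file, "Action3"));
    }
}
```

The value is whatever group 1 captured, so there is no need to compute offsets from `matcher.end(0)` by hand.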

Java Regex Question (HTML Tokenizing)

Hello
I would like to tokenize a HTML Page into its html tags and could not find any working expression. I tried it with:
<[.]*>
and for all input fields:
<(INPUT.*)>
But it either doesn't find anything or it finds everything.
Can somebody help me?

</?\S+?[\s\S+]*?>
"/?" means: "/" can be there but doesn't have to be.
"\S" means: any character that isn't whitespace.
"+" means: the previous element must occur at least once.
The "?" after the "+" means: match as few of the previous element as needed to satisfy the regex (a reluctant quantifier).
That's why <adf>sdf> isn't matched as a whole: <adf> is the shortest string that satisfies the regex.
"[]" means: a character class, which matches any single one of the characters listed inside the brackets. Here [\s\S+] lists whitespace, non-whitespace and a literal "+", so it matches any character at all.
"\s" means: a whitespace character.
"*" means: the previous element (here, the character class) can occur any number of times, even zero.
"*?" is the reluctant form of "*", just as "+?" is the reluctant form of "+".

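As for why the two attempts in the question misbehave: inside brackets a dot is a literal dot, so "<[.]*>" only matches things like "<...>", while the greedy ".*" in "<(INPUT.*)>" runs on to the last ">" of the line. A minimal Java sketch of a tag tokenizer that avoids both problems is shown below. It assumes no ">" occurs inside attribute values and is not a substitute for a real HTML parser; `TagTokenizer` and `tags` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagTokenizer {
    // '<', optional '/', a letter, then anything up to the first '>'.
    private static final Pattern TAG = Pattern.compile("</?[A-Za-z][^>]*>");

    /** Returns every tag found in the input, in document order. */
    static List<String> tags(String html) {
        List<String> result = new ArrayList<String>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<p>Hi <b>there</b><INPUT type=\"text\" name=\"q\"></p>";
        for (String tag : tags(html)) {
            System.out.println(tag);
        }
    }
}
```

The negated class "[^>]*" stops at the first ">" by construction, so no reluctant quantifier is needed.
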
Java.util.regex and replacing patterns with function calls

Hi everyone,
I'm in terrible need for help.
Any advice is much appreciated.
I have the following sentence in a file. The sequence of numbers is actually a
date in seconds since 1970. I need to identify each sequence of 9 or 10 digits and
send it to a method that will translate the number into a date in string format,
then replace it in the file.
Here is an example:
"My name is Peter. Please 1020421277 help me figure this out 108062327. "
using the following block I can convert it to this
"My name is Peter. Please | help me figure this out |. "
[block]
Pattern p = Pattern.compile("[0-9]{9,10}+");
Matcher m = p.matcher("");
String aLine = null;
while ((aLine = in.readLine()) != null) {
    m.reset(aLine);
    String result = m.replaceAll("|");
    out.write(result);
    out.newLine();
}
[end block]
So I need to change the above block so that my sentence looks like this.
"My name is Peter. Please 05/03/2002 10:21:17 help me figure this out 06/04/1973 17:18:47. "
The method that converts the numbers into a date is not of a concern because I already have
the code to do that.
If anyone has suggestion, please let me know.
Thanks in advance.
Peter

Never mind, I was able to figure it out....
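Common2 cm = new Common2();
Pattern p = Pattern.compile("[0-9]{9,10}+");
StringBuffer buf = new StringBuffer();
Matcher m = p.matcher(s1); // s1 holds the current input line
while (m.find()) {
    String replaceStr = m.group();
    System.out.print("\tReplaceString is: " + replaceStr);
    replaceStr = cm.get_date(replaceStr);
    System.out.println("\t\tConverted to: " + replaceStr);
    // quoteReplacement guards against '$' or '\' in the formatted date
    m.appendReplacement(buf, Matcher.quoteReplacement(replaceStr));
}
// appendTail runs once, after the loop, to copy the text following the last match
m.appendTail(buf);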
----------------- The get_date method in Common2 looked like this ------------------------------
public String get_date(String myString)
{
    String s;
    long lsecs = Long.parseLong(myString);
    lsecs = lsecs * 1000; // it's now milliseconds.
    // Determine default calendar and then the offset to GMT.
    Calendar localCalendar = Calendar.getInstance();
    int offset = localCalendar.get(Calendar.DST_OFFSET)
               + localCalendar.get(Calendar.ZONE_OFFSET);
    // Take the number of milliseconds and subtract the GMT offset
    Date GMTdate = new Date(lsecs - offset);
    // Format date
    Format myformat;
    myformat = new SimpleDateFormat("MM/dd/yyyy HH:mm:ss a z ");
    s = myformat.format(GMTdate);
    return s;
}
Peter
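For comparison, here is a self-contained sketch of the same find-and-append loop that uses java.time (Java 8 or later) instead of Calendar arithmetic. It assumes the dates should be shown in UTC, which reproduces the sample output in the question; `EpochRewriter` and `rewrite` are hypothetical names, not the poster's code.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EpochRewriter {
    // A run of 9 or 10 digits standing on its own.
    private static final Pattern EPOCH = Pattern.compile("\\b[0-9]{9,10}\\b");

    // Assumption: format in UTC rather than the machine's local zone.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss").withZone(ZoneOffset.UTC);

    /** Replaces each epoch-seconds number in the line with a formatted date. */
    static String rewrite(String line) {
        Matcher m = EPOCH.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            long seconds = Long.parseLong(m.group());
            String date = FORMAT.format(Instant.ofEpochSecond(seconds));
            // Copies the text before the match, then the replacement, into 'out'.
            m.appendReplacement(out, Matcher.quoteReplacement(date));
        }
        m.appendTail(out); // copies whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "My name is Peter. Please 1020421277 help me figure this out 108062327. ";
        System.out.println(rewrite(line));
    }
}
```

Fixing the zone in the formatter means the result does not depend on where the program runs.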

Java Regex Question

I wanted to do some regex to see if a string has a subdomain.
I want to pass string then check if there is a xxx.example.com or if it's just example.com. Anyone have a clue?
Thanks,
Brian

I worked around it by using the split method to check. I'm posting my code in case someone else has this problem and is limited to the 1.4 JDK.
String[] split = domain.split("[.]");
if (split.length > 2)
    domain = split[split.length - 2] + "." + split[split.length - 1];
Basically, what I wanted to do was see whether it was a subdomain and, if so, strip the leading part to get the actual domain.
Thanks for the replies

Regex question (does not contain)

Can anyone tell me what regular expression I could use with Dreamweaver to search for files that do NOT contain the word "physiology"? Ideally, I'd like to find pages that don't contain any variation - physiology, Physiology or PHYSIOLOGY. However, if you have time to show me a couple regex's, including one that's case-sensitive, that would be great.
I've tried the following two "negative lookaround" regex's without success:
^(?:(?!Physiology).)*$
^(?!.*Physiology).*
I think they're both designed to work with strings, not with entire files.
Thanks.

Not sure how to do this in DW but I suggest try using Windows FindStr function as explained here:
http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/findstr.mspx?mfr=true
Adapt the method to suit your needs.
Good luck.
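On the regex side, the two lookahead attempts are line-oriented because "." does not match line breaks by default, so they test one line at a time rather than the whole file. The Java sketch below shows two ways to test an entire text case-insensitively; it says nothing about what Dreamweaver's search dialog supports, and the names `WordAbsent`, `lacksWord`, and `lacksWordSingleRegex` are hypothetical.

```java
import java.util.regex.Pattern;

final class WordAbsent {
    // Simplest approach: search for the word and negate the result.
    private static final Pattern WORD =
            Pattern.compile("physiology", Pattern.CASE_INSENSITIVE);

    // Single-regex approach: (?i) ignores case, (?s) lets '.' cross line breaks,
    // so the negative lookahead scans the whole text from the start.
    private static final Pattern LACKS =
            Pattern.compile("(?is)\\A(?!.*physiology).*");

    static boolean lacksWord(CharSequence text) {
        return !WORD.matcher(text).find();
    }

    static boolean lacksWordSingleRegex(CharSequence text) {
        return LACKS.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(lacksWord("Cell biology\nnotes"));
        System.out.println(lacksWordSingleRegex("Intro\nto PHYSIOLOGY\n"));
    }
}
```

In engines that have no dot-all flag, writing "[\s\S]" in place of "." is the usual way to let a pattern cross line breaks.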

Regex question. Please help!

I'm trying to capture instances like
&_l_t_;something&_g_t_;
&_l_t_;blahblah&_g_t_;
I tried the following regular expressions:
"&_l_t_;.+?&_g_t_;"
"\\&_l_t_;.+?\\&_g_t_;"
but neither worked.
In the code above, the underscore characters should not be there; I was not able to post my message correctly unless I used underscores to connect the characters '&', 'l', 't', ';'.
Please help!

import java.util.regex.*;

public class TagCheck {
  public static void main(String[] args) {
    String[] codes = {
      "&_lt;html&_gt;", "a&_lt;b", "abc", "&_lt;head&_gt;", "c&_gt;d"
    };
    // Anchored pattern: the string must start with the opening entity and end with the closing one
    String code = "^(&_lt;).*(&_gt;)$";
    Pattern codePattern = Pattern.compile(code);
    Matcher match;
    for (int j = 0; j < codes.length; j++) {
      match = codePattern.matcher(codes[j]);
      System.out.println("codes[" + j + "] = " + codes[j]);
      if (match.find()) {
        for (int k = 0; k <= match.groupCount(); k++) {
          System.out.println("\t\t\tgroup " + k + " = " + match.group(k));
        }
      }
    }
  }
}
