Regex and backslashes

Hi,
I need to exclude cache folders from backup and am using java.util.regex for this. I am reading a regex like D:\test\b\cache.*? from a simple text file using BufferedReader. I thought the problem was related to the backslash escapes, but I am not sure. After all the debugging I am stuck...
All kinds of comments welcome.
Additional data:
The variables given to the Matcher.reset() method look like string= "D:\\test\\b\\cache"; the pattern in the Matcher is pattern= "D:\\test\\b\\cache.*?"

Hmmm, I had tried with double ones already - guess that was before I had the correct end part of the pattern. After you pointed me to it I tried again - works...
Now the only question is whether our users will accept entering \\ in the files. Is there a quick way to have single ones in the text file read correctly?
Thx a lot
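
On the last question: a line read with BufferedReader is not a Java string literal, so a single backslash in the file arrives as a single backslash and is then interpreted by the regex engine (\t as a tab, \b as a word boundary). A minimal sketch of two possible workarounds follows. The class and method names are hypothetical, and the first variant assumes users never need a backslash as a regex escape in their lines.

```java
import java.util.regex.Pattern;

public class ExcludePattern {
    // Variant 1: treat every backslash typed by the user as a literal
    // path separator by doubling it before compiling. The rest of the
    // line (for example ".*?") keeps its regex meaning.
    static Pattern fromUserLine(String line) {
        return Pattern.compile(line.replace("\\", "\\\\"));
    }

    // Variant 2: the user enters a plain folder path only; it is quoted
    // as a literal and a ".*" suffix is appended by the program.
    static Pattern fromLiteralPrefix(String folder) {
        return Pattern.compile(Pattern.quote(folder) + ".*");
    }
}
```

With variant 1 the text file can contain D:\test\b\cache.*? exactly as written in the question; with variant 2 it would contain only D:\test\b\cache.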

Similar Messages

  • A problem with regex and special characters

    Hello,
    I am using regex in my application but I have a problem with special characters. Here is what I am doing:
    I have a piece of text that I want to parse, replacing every occurrence of a given word with a tag that wraps the word found.
    So given: go Going Go to gOschool by bus and to learn and to play GO Go
    and the word "go" (case insensitive and only at word boundaries), the result should be:
    *<start>go<end> Going <start>Go<end> to gOschool by bus and to learn and to play <start>GO<end> <start>Go<end>*
    Consider the following code and call the method with the parameter "go?"
    The Matcher finds a weird match at the word "G?oing" with only the letter G !!!
    It also ignores the "?" in the pattern completely.
    If anyone has a clue about what is happening, I would be very grateful...
    private static String replaceMatches(String strToFind) {
        String resultArticle = "";
        String article = " " + "go? G?oing Go? to gOschool by bus and to learn and to play GO? Go?*" + " ";
        strToFind = "\\b" + strToFind + "\\b";
        String linkPart1 = "<start>";
        String linkPart2 = "<end>";
        Pattern p = null;
        try {
            p = Pattern.compile(strToFind, Pattern.CASE_INSENSITIVE);
            Matcher m = p.matcher(article);
            String[] res = p.split(article);
            int i = 0;
            //System.out.println("result of split: "+res.length );
            while (m.find()) {
                resultArticle += (res[i] + " ");
                resultArticle += linkPart1;
                resultArticle += m.group().trim();
                resultArticle += (linkPart2 + " ");
                i++;
            }
            if (i < res.length) {
                resultArticle += res[i];
            }
            //System.out.println("result of match: " + i);
            System.out.println(article);
            //System.out.println(resultArticle.trim()+scripts);
        } catch (PatternSyntaxException ex) {}
        return resultArticle.trim();
    }
    Thanks

    tarek.mamdouh wrote:
    > because split will not work when trying to replace the first word if i don't append a space at the beginning.
    Split doesn't work anyway. And my question wasn't why you add spaces (which you really don't need to do), but why you add them with " " + "go" rather than just " go".
    > replaceAll will replace all the occurrences in the text with only one word. without taking into consideration the case of the word i need to replace.
    No.
    > If i use replaceAll(article, strToFind) the output will be:
    > <start>go?<end> G?oing <start>go?<end> to gOschool by bus and to learn and to play <start>go?<end> <start>go?<end>
    No. I showed you the actual output of an actual replaceAll.
    > which is not what i want as i need to keep the case of the words i am replacing
    The replaceAll I showed you does that.
    Please study the examples given and read the docs carefully rather than making claims based on inaccurate guesses.
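
The replaceAll mentioned in the reply is not reproduced in this thread, so the following is only a sketch of one way to do it, not necessarily what was originally shown. It assumes the search word should be taken literally (hence Pattern.quote, so that "?" is no longer a quantifier) and uses lookarounds instead of \b, because \b cannot match after a trailing "?" that is followed by a space. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class TagWords {
    // Wrap every whole-word, case-insensitive occurrence of `word`
    // in <start>...<end>, keeping the original case via $0.
    static String tagMatches(String text, String word) {
        Pattern p = Pattern.compile(
                "(?<!\\w)" + Pattern.quote(word) + "(?!\\w)",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(text).replaceAll("<start>$0<end>");
    }
}
```

In the replacement string, $0 refers to the text that was actually matched, which is why the original case of each occurrence is kept.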

  • REGEXP_REPLACE and backslashes ... possible bug in 10g?

    I have had a problem with backslashes in the replacement text of the REGEXP_REPLACE function.
    You see, I'm using REGEXP_REPLACE to implement a "replace whole words only" function. The implementation consists of four find-replace pairs constructed as follows (with the <find> and <replace> taking the place of the actual strings):
    update X set Y = REGEXP_REPLACE (REGEXP_REPLACE (REGEXP_REPLACE (REGEXP_REPLACE (Y, '([^[:alnum:]])<find>([^[:alnum:]])', '\1<replace>\2'), '^<find>([^[:alnum:]])', '<replace>\1'), '([^[:alnum:]])<find>$', '\1<replace>'), '^<find>$', '<replace>')
    The problem comes in when <replace> contains backslashes. The documentation (the Oracle Database SQL Reference, section 7) suggests that \ must appear as \\ in the replacement string.
    However, I'm finding that you use \\ to represent \ in the replacement only if the find string does not contain groups (the parenthesized expressions). Is that how it's supposed to work? We're using 10g.
    Here is a small SQL script to illustrate the behaviour I'm seeing. It is supposed to replace "is" with "\". I expected the expression for OUTPUT1 to be correct, but it seems that the expression for OUTPUT2 works (notice that the second expression is different in only ONE CHARACTER from the first).
    create table REGEXP_TEST ( INPUT VARCHAR2(15), OUTPUT1 VARCHAR2(15), OUTPUT2 VARCHAR2(15), EXPECTED VARCHAR2(15), INCORRECT VARCHAR2(15)) ;
    insert into REGEXP_TEST (INPUT, EXPECTED, INCORRECT) values ('This is a test', 'This \ a test', 'This \\ a test');
    insert into REGEXP_TEST (INPUT, EXPECTED, INCORRECT) values ('This is', 'This \', 'This \\');
    insert into REGEXP_TEST (INPUT, EXPECTED, INCORRECT) values ('is a test', '\ a test', '\\ a test');
    insert into REGEXP_TEST (INPUT, EXPECTED, INCORRECT) values ('is', '\', '\\');
    update REGEXP_TEST set OUTPUT1 = REGEXP_REPLACE (REGEXP_REPLACE (REGEXP_REPLACE (REGEXP_REPLACE (INPUT, '([^[:alnum:]])is([^[:alnum:]])', '\1\\\2'), '^is([^[:alnum:]])', '\\\1'), '([^[:alnum:]])is$', '\1\\'), '^is$', '\\');
    update REGEXP_TEST set OUTPUT2 = REGEXP_REPLACE (REGEXP_REPLACE (REGEXP_REPLACE (REGEXP_REPLACE (INPUT, '([^[:alnum:]])is([^[:alnum:]])', '\1\\\2'), '^is([^[:alnum:]])', '\\\1'), '([^[:alnum:]])is$', '\1\\'), '^is$', '\');
    select * from REGEXP_TEST;

    Hello Darren,
    You're right, the behaviour does seem a little odd:
    SQL> select regexp_replace('X', 'X', '\\') from dual;
    SQL> select regexp_replace('X', '(X)', '\\') from dual;
    What I suggest is you rewrite your query as a single regular expression thus removing the 3 extra calls to REGEXP_REPLACE:
    update REGEXP_TEST
    set OUTPUT1 = REGEXP_REPLACE (INPUT, '(^|[^[:alnum:]])is([^[:alnum:]]|$)', '\1\\\2');
    This should solve your problem for now.
    Regards.

  • Regex and implementing FilenameFilter problem

    Hello,
    So what I'm trying to do is to create a program that takes a certain set of files, pulls the first line of each file and uses it to name the file. Right now, I'm at the point of getting a listing of files based on a pattern. So when I run the program on the command line (of a Windows machine), it should spit out the files that I'm looking for. Something like:
    java FileRenamer *.txt
    Above should produce a listing of only files that have .txt on them (I want to have the capability to choose *.txt or whatever other combination of pattern match).
    To do the above, I want to use the FilenameFilter interface to figure out what files match. The problem that I'm running into is that when I run a unit test against the getFilesListBasedOnPattern method, I get:
    java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
    *.txt
    The problem is that the *.txt has a regex character (the *) and I'm not sure how to make it behave like the wildcard on the DOS command line, where *.txt means everything that has .txt at the end.
    The code listing is below. Does anyone have any suggestions on how to best approach this?
    mapsmaps
    =======> Code below <=======
    // unit test snippet that causes blow out:
    FileRenamer fr = new FileRenamer();
    String [] strArrFilesBasePattern = fr.getFilesListBasedOnPattern(dirTestFiles,"*.txt");
    ====
    //main program
    package com.foo.filerenamer;
    import java.io.File;
    import java.io.FilenameFilter;
    import java.util.Vector;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    /**
     * TODO Use regexp to filter out input to *.txt type of thing or nothing else
     */
    public class FileRenamer {
        // Valid file patterns are *.*, ?
        public static final String strVALIDINPUTCHARS = "[_.a-zA-Z0-9\\s\\*\\?-]+";
        private static Pattern regexPattern = Pattern.compile(strVALIDINPUTCHARS);
        private static Matcher regexMatcher;

        /**
         * @param args
         * @throws InterruptedException
         */
        public static void main(String[] args) throws InterruptedException {
            int intMillis = 0;
            if (args.length > 0) {
                try {
                    intMillis = Integer.parseInt(args[0]);
                    System.out.println("Sleep set to " + intMillis + " milliseconds");
                } catch (NumberFormatException e) {
                    intMillis = 5000;
                    System.out.println("Sleep set to default of " + intMillis + " since first parameter was non-int");
                }
                for (int i = 0; i < args.length; i++) {
                    System.out.println("hello there - args[" + i + "] = " + args[i]);
                }
            }
            Thread.sleep(intMillis);
            // TODO Auto-generated method stub
        }

        public boolean checkArgs(String[] p_strAr) {
            boolean bRet = false;
            if (p_strAr.length != 1) {
                return false;
            } else {
                regexMatcher = regexPattern.matcher(p_strAr[0]);
                bRet = regexMatcher.matches();
            }
            return bRet;
        }

        public String[] getFilesListBasedOnPattern(File p_dirFilesLoc, String p_strValidPattern) {
            String[] strArrFilteredFileNames = p_dirFilesLoc.list(new RegExpFileFilter(p_strValidPattern));
            return strArrFilteredFileNames;
        }
    }

    class RegExpFileFilter implements FilenameFilter {
        private String m_strPattern = null;
        private Pattern m_regexPattern;

        public RegExpFileFilter(String p_strPattern) {
            m_strPattern = p_strPattern;
            // Compiles the argument as a regex; a raw "*.txt" fails here
            m_regexPattern = Pattern.compile(m_strPattern);
        }

        public boolean accept(File m_directory, String m_filename) {
            if (m_regexPattern.matcher(m_filename).matches()) {
                return true;
            }
            return false;
        }
    }
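
The exception arises because "*.txt" is handed to Pattern.compile as if it were a regex, and a regex cannot start with a bare "*". One possible approach, sketched below under the assumption that only the wildcards * and ? need to be supported, is to translate the wildcard pattern into a regex before compiling it. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class GlobToRegex {
    // Translate a DOS-style wildcard pattern into a compiled regex:
    // '*' becomes ".*", '?' becomes ".", everything else is quoted
    // so that characters such as '.' are matched literally.
    static Pattern globToPattern(String glob) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }
}
```

The filter's constructor could then call globToPattern(p_strPattern) in place of Pattern.compile(m_strPattern), leaving accept() unchanged.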

    I am doing something similar but have a problem with Java automatically converting wildcards in path-arguments to the first match (!).
    It seems the JVM is applying some intelligence here: it checks whether a path is passed to main() and, if so, automatically resolves wildcards (quotes are also escaped/resolved). This is pretty annoying and not what I want, since I never see the original parameters this way :(
    Is there a way to get the original parameters without the JVM intervening / "helping"?
    Any help would be appreciated, as I want my utility to act just like any other shell-program...

  • About regex and 'like' query

    Dear all,
    Can I do a regex query in SAP B1?
    And how do I use a 'like' query with a table field? I mean:
    SELECT * FROM test T0 INNER JOIN test2 T1
    WHERE T0 LIKE '%T1.testfield%'
    thanks for your help

    wait - something came out funny in the previous posting - the system highlighted the name "field" with blue - that is NOT what I typed...
    instead I typed...
    "field" between two brackets ( bracket = [ and the other bracket - I cannot type them in because they come out as a different character)
    what the heck is going on with the forum here???  I am seeing this highlighting in other postings as well...
    Hope THIS one comes out correctly
    Edited by: Zal Parchem on Dec 29, 2007 2:47 PM

  • Question related to regex and whitespaces  \s

    Hello, i have a problem related to regex.
    I have a text area where someone types text. I noticed that when I press the Enter key (so I have a new line) the string is not being recognised.
    String regex = "[A-Za-z0123456789_./-]*";
    I tried to add \s, but \s includes other whitespace characters.
    I would like to include in my regex the \n character (the Enter key) or, more generally, the \s characters.
    How am i supposed to do that?
    Thanks, in advance!

    g_p_java wrote:
    > prometheuzz wrote:
    > > Note that on Windows, a line break is "\r\n".
    > > Also, A-Za-z0123456789_ can be written as \w:
    > > String regex = "[\r\n\\w./-]*";
    > If we are using Linux, Unix is that different?
    The OS line break is just \n. I'm not sure what Swing puts into a GUI element, whether it's OS dependent or not. It won't hurt you to leave the \r in there though. If there's no \r in the string, it won't stop your regex from working, just like it won't stop it from working when you have A-Z and they don't happen to enter a Z.
    The only way it would cause a problem to leave the \r in the regex is if \r were somehow part of the input and you didn't want it treated as end-of-line. I don't see that happening though.
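
As a small illustration of the point above, the character class can list \r and \n explicitly instead of using \s, so that line breaks are accepted while spaces and tabs are still rejected. This is a sketch only; the class and method names are hypothetical.

```java
public class TextAreaCheck {
    // Letters, digits, underscore, '.', '/', '-' plus CR and LF.
    // Listing \r and \n explicitly avoids accepting spaces or tabs,
    // which \s would also allow.
    static final String REGEX = "[\\r\\n\\w./-]*";

    static boolean isValid(String input) {
        return input.matches(REGEX);
    }
}
```

Both "\r\n" (Windows) and "\n" (Unix) line breaks pass this check, since each character is tested individually against the class.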

  • ACE Probe regex and escaping Parenthesis

    I'm trying to set up an ACE probe that expects a return of
    (server.domain.com) EXISTS=TRUE,AVAILABLE=TRUE,ACTIVE=TRUE
    But it doesn't appear that I can use parentheses inside a regex.  I've tried escaping as well.
    expect \(server\.domain\.com\) EXISTS=TRUE,AVAILABLE=TRUE,ACTIVE=TRUE
    % invalid command detected at '^' marker.   Pointing at the (
    But this doesn't work either.  Any ideas?

    Hi,
    If it has taken it, it should match the response from server.  Is it still not matching?
    If you look at the regex builder below, the regex matches the response which is expected from the server. So ACE should be able to match it.
    Also, you can try putting \ before the dots, but I'm not sure. In my opinion it should work fine with what we have put in already. If it doesn't, we will have to use trial and error. Let me know if you need this regex builder. You can download it from Google though. In any case I just attached it.

  • RegEx and capturing groups

    hi.
    i'm trying to use the capturing groups to extract substrings.
    this is the data format: MTWTFSS@2005-03-19
    and this my regex: Pattern.compile("((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])");
    i want to extract the substring before the @. i have set the capturing group, but i always get an error:
    java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:353)
    at vdrremotecontrol.VDRTimer.<init>(VDRTimer.java:93)
    at vdrremotecontrol.VDRRemoteControl.getTimersFromVDR(VDRRemoteControl.java:628)
    at vdrremotecontrol.VDRRemoteControl.loadSettings(VDRRemoteControl.java:855)
    at tvbrowser.core.plugin.JavaPluginProxy.doLoadSettings(JavaPluginProxy.java:191)
    ... 5 more
    what's the right way to set the capturing group?
    greetings, henrik

    me again :-)
    // MTWTFSS@2005-03-19
    Pattern repeatingAt = Pattern.compile("((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])");
    if(dayPattern.matcher(day).matches()) {
        repeating = false;
        this.day_of_month = Integer.parseInt(day);
    } else if(datePattern.matcher(day).matches()) {
        repeating = false;
        this.day_of_month = Integer.parseInt(datePattern.matcher(day).group(3));
    } else if(simpleRepeating.matcher(day).matches()) {
        repeating = true;
        repeating_days = determineDays(day);
    } else if(repeatingAtShort.matcher(day).matches()) {
        repeating = true;
        String days = day.substring(0, day.indexOf("@"));
        repeating_days = determineDays(days);
    } else if(repeatingAt.matcher(day).matches()) {
        repeating = true;
        Matcher matcher = repeatingAt.matcher(day);
        System.out.println("#"+day+"#");
        System.out.println(matcher.groupCount());
        System.out.println(matcher.group(1));
        String days = day.substring(0, day.indexOf("@"));
        repeating_days = determineDays(days);
    }
    output is:
    [java] #MDMDFSS@2005-06-10#
    [java] 5
    [java] SCHWERWIEGEND: Die Einstellungen des Plugins "Video Disc Recorder" konnten nicht geladen werden.
    [java] (/home/henni/.tvbrowser/java.vdrremotecontrol.VDRRemoteControl.prop)
    [java] util.exc.TvBrowserException: Die Einstellungen des Plugins "Video Disc Recorder" konnten nicht geladen werden.
    [java] (/home/henni/.tvbrowser/java.vdrremotecontrol.VDRRemoteControl.prop)
    [java] at tvbrowser.core.plugin.JavaPluginProxy.doLoadSettings(JavaPluginProxy.java:197)
    [java] at tvbrowser.core.plugin.AbstractPluginProxy.loadSettings(AbstractPluginProxy.java:114)
    [java] at tvbrowser.core.plugin.PluginProxyManager.activatePlugin(PluginProxyManager.java:505)
    [java] at tvbrowser.core.plugin.PluginProxyManager.activateAllPluginsExcept(PluginProxyManager.java:459)
    [java] at tvbrowser.core.plugin.PluginProxyManager.init(PluginProxyManager.java:220)
    [java] at tvbrowser.TVBrowser.main(TVBrowser.java:307)
    [java] Caused by: java.lang.IllegalStateException: No match found
    [java] at java.util.regex.Matcher.group(Matcher.java:353)
    [java] at vdrremotecontrol.VDRTimer.<init>(VDRTimer.java:94)
    [java] at vdrremotecontrol.VDRRemoteControl.getTimersFromVDR(VDRRemoteControl.java:628)
    [java] at vdrremotecontrol.VDRRemoteControl.loadSettings(VDRRemoteControl.java:855)
    [java] at tvbrowser.core.plugin.JavaPluginProxy.doLoadSettings(JavaPluginProxy.java:191)
    [java] ... 5 more

    This example executes fine.
            String expression = "((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])";
            String data = "MTWTFSS@2005-03-19";
            Pattern p = Pattern.compile(expression);
            Matcher m = p.matcher(data);
            while (m.find()) {
                System.out.println(m.group(1));
            }
    /Kaj
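
A likely reason the working example differs from the failing code: Matcher.group() may only be called on a Matcher instance on which matches(), find() or lookingAt() has already succeeded. In the failing snippet, matches() is called on one Matcher and group(1) on a freshly created second one, which has not attempted a match yet. A minimal sketch of the one-matcher approach follows; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatingAt {
    static final Pattern REPEATING_AT = Pattern.compile(
            "((\\p{Upper}|-){7})@(19|20)\\d\\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])");

    // Returns the part before the '@', or null if the input does not
    // have the expected format. The same Matcher is used for
    // matches() and group().
    static String daysPart(String day) {
        Matcher m = REPEATING_AT.matcher(day);
        if (m.matches()) {
            return m.group(1);
        }
        return null;
    }
}
```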

  • Help with RegEx and Textinput

    I have the following method:
    public function handleKeyPress(event:KeyboardEvent):void {
        //trace("Key pressed: "+event.keyCode+","+event.charCode);
        var testTest:String = cComponent.cInput.text + String.fromCharCode(event.keyCode);
        trace("Text tested: \""+testTest+"\"");
        // See if the user typed '/' followed by a number and a space
        if(testTest.match(new RegExp("\/[0-9]*\s"))) {
            // User typed '/### '
            // TODO: Get bpname and do something here...
            cComponent.cInput.text="";
            event.stopImmediatePropagation();
        }
    }
    Basically, I need to match on the pattern "/### ", where ###
    is any series of numbers, do something, then clear the TextInput,
    but I can't seem to get the RegEx right. I have tried several
    different ones, including '\/\d*\s' and '\/[0-9]* ', but nothing
    seems to work.
    On top of this, the call to cComponent.cInput.text="" doesn't
    wipe the value of the TextInput field either.
    Can someone point me in the right direction?

    This got me mostly working. It's still matching on all
    strings that start with / then some number, but I still cannot get
    that TextInput to clear. From the code I call:
    cComponent.cInput.text="";
    It doesn't throw an error, but it doesn't clear the text
    either.

  • Regex and complex numbers

    Hi community ,
    I made a small program using simple regex.
    I can't figure out why I cannot input a complex number
    in a format like (     Re{z},Im{z}     ) i.e. I don't know why
    the brackets can't be separated with the white spaces?
    Thnx in advance,
    Nikola
    Here is the code :
    /**
     * @author Nikola Radakovic
     */
    import java.util.*;
    import java.util.regex.Pattern;
    // complex number
    class ComplexNo {
         private float x,y;
         public float getX() {
              return x;
         }
         public void setX(float x) {
              this.x = x;
         }
         public float getY() {
              return y;
         }
         public void setY(float y) {
              this.y = y;
         }
         public ComplexNo sumarize(ComplexNo drugi){
              this.x = this.x + drugi.getX();
              this.y = this.y + drugi.getY();
              return this;
         }
    }
    public class Complex {
         public static void main(String [] args){
              ComplexNo no1;
              ComplexNo result = new ComplexNo();
              Scanner sc = new Scanner(System.in);
              boolean flag = true;
              Pattern pattern = Pattern.compile("\\(\\s*\\d*\\.?\\d*,\\d*\\.?\\d*\\s*\\)");
              while(flag){
                   System.out.println("Input a complex number in format (Re{z},Im{z}) : ");
                   while(sc.hasNext(pattern)) {
                        String str = sc.next().replaceAll("\\(|\\)","");
                        Scanner float_scan = new Scanner(str);
                        float_scan.useDelimiter(",");
                        // get a float number
                        while(float_scan.hasNextFloat()){
                             no1 = new ComplexNo();
                             no1.setX(float_scan.nextFloat());
                             no1.setY(float_scan.nextFloat());
                             System.out.println("Inputed number "+no1.getX()+","+no1.getY());
                             result.sumarize(no1);
                        }
                   }
                   flag = false;
              }
              System.out.println("The result is  "+result.getX()+","+result.getY());
         }
    }
    Edited by: EqAfrica on Jan 8, 2008 7:29 PM
    Edited by: EqAfrica on Jan 8, 2008 7:30 PM

    EqAfrica wrote:
    > It works perfect :)
    > even if I don't know what's the difference
    > between sc.findWithinHorizon(pattern,0)
    > and sc.hasNext(pattern)
    > The pattern remains the same.
    > thnx a lot
    I believe the key is (from the API) "This method searches through the input up to the specified search horizon, *ignoring delimiters*. "
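
To make the difference concrete: with the default whitespace delimiter, hasNext(pattern) tests only the next token, so an input such as "( 1.5,2.5 )" is seen as the tokens "(", "1.5,2.5" and ")", none of which matches the whole pattern. findWithinHorizon(pattern, 0) searches the raw input instead. The sketch below reuses the pattern from the question; the class and method names are hypothetical.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

public class HorizonDemo {
    static final Pattern COMPLEX =
            Pattern.compile("\\(\\s*\\d*\\.?\\d*,\\d*\\.?\\d*\\s*\\)");

    // True only if the next whitespace-delimited token matches.
    static boolean nextTokenMatches(String input) {
        return new Scanner(input).hasNext(COMPLEX);
    }

    // Searches the input without regard to delimiters; a horizon of 0
    // means "no limit". Returns the matched text or null.
    static String findIgnoringDelimiters(String input) {
        return new Scanner(input).findWithinHorizon(COMPLEX, 0);
    }
}
```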

  • Regex and StringBuffer

    Hi,
    I have this weird issue.
    I build up the regex pattern using a StringBuffer. I then do -
    matcher = Pattern.compile(regexPatternBuffer.toString()).matcher(fileContentBuffer.toString());
    When I print out regexPatternBuffer.toString(), it prints out fine. I went ahead and copied the sop'ed pattern and pasted it in "Pattern.compile". I get the desired result. But the earlier case doesn't seem to work. Hope I am clear.

    My bad. Here's the example.
    int paramCount = paramClasses.length;
    System.out.println(methodName +","+paramCount);
    patternStringBuilder.setLength(0);
    patternStringBuilder.append("(?m)" + methodName + "\\\\(");
    int loopCount = 0;
    for(Class pc : paramClasses){
         patternStringBuilder.append("\\\\s*");
         patternStringBuilder.append(pc.getSimpleName());
         patternStringBuilder.append("\\\\s*");
         patternStringBuilder.append("[A-Za-z]+");
         patternStringBuilder.append("\\\\s*");
         if(loopCount < paramCount - 1){
              loopCount++;
              patternStringBuilder.append("[,]");
         }
    }
    patternStringBuilder.append("\\\\)");
    //matcher.reset();
    if(paramCount == 0){
         matcher = Pattern.compile("(?m)[^\\s]+\\([^)]*\\)").matcher(fileContentBuffer.toString());
    } else{
         String voo = patternStringBuilder.toString();
         System.out.println("---->"+voo.trim());
         matcher = Pattern.compile(voo).matcher(fileContentBuffer.toString());
    }
    str = "";
    while(matcher.find()) {
      str = matcher.group().replace("\n", "");
      umlStringBuilder.append(removeDuplicateWhitespace(str));
    }
    I don't seem to get into the while loop. matcher.find() turns out as false.
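
A probable cause, judging only from the snippet: the appended literals use four backslashes. In Java source, "\\\\(" is the three-character string \\( and, as a regex, that means "a literal backslash followed by an opening group". The printed pattern then looks right when pasted back into a Java string literal, because the paste halves the backslashes again. The sketch below builds the pattern with two backslashes per escape; the class and method names and the String-based parameter list are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class SignaturePattern {
    // Builds a regex such as  add\(\s*int\s*[A-Za-z]+\s*,\s*String\s*[A-Za-z]+\s*\)
    // In Java source each regex backslash is written as "\\".
    static String build(String methodName, String... paramTypes) {
        StringBuilder sb = new StringBuilder("(?m)");
        sb.append(Pattern.quote(methodName)).append("\\(");
        for (int i = 0; i < paramTypes.length; i++) {
            sb.append("\\s*").append(Pattern.quote(paramTypes[i]));
            sb.append("\\s*[A-Za-z]+\\s*");
            if (i < paramTypes.length - 1) {
                sb.append(",");
            }
        }
        return sb.append("\\)").toString();
    }
}
```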

  • Regex and ReplaceFirst(). It actually replaces last...

    Hi everyone.
    I'm a little new to regex - so I am having trouble with the regex expression using method ReplaceFirst() in Class String.
    I'm starting off with a string that contains the word isbn followed by the isbn number and other stuff. i.e. "It's a book ISBN 978-90-481-2410-7 which we couldn't find". My goal is to get just the first isbn number. When there is only one ISBN in the string it works fine - not so with 2.
    My first step is to remove everything before the first isbn using ->
    line= line.replaceFirst("(.*)[iI] *[sS] *[bB] *[nN]", "");
    However, if there are 2 isbns in the string it removes everything before the second ISBN... Here are a few tests that I did on a website that lets you try regex expressions. (I also tested it in java.)
    Test  Target String                                     replaceFirst()     replaceAll()
    1     hISBN 978-90-481-2409-1 e-ISBN 978-90-481-2410-7  978-90-481-2410-7  978-90-481-2410-7
    2     hISBN 978-90-481-2409-1                           978-90-481-2409-1  978-90-481-2409-1
    3     e-ISBN 978-90-481-2410-7 hISBN 978-90-481-2409-1  978-90-481-2409-1  978-90-481-2409-1
    It seems that the regex goes from right to left, while I thought it went from left to right. Or did I miss something? How would I make sure it only removes everything "before" the "first" ISBN?
    Thanks for any help,
    Matt
    I hope this shows up normally because in the preview window the table always gets messed up.....
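
The behaviour in the table is consistent with greedy matching rather than right-to-left scanning: ".*" first consumes as much as it can and then backtracks only as far as needed, so it stops at the last "ISBN". A reluctant quantifier ".*?" stops at the first one instead. A minimal sketch with hypothetical class and method names:

```java
public class IsbnPrefix {
    // Greedy: removes everything up to and including the LAST "isbn".
    static String stripGreedy(String line) {
        return line.replaceFirst("(.*)[iI] *[sS] *[bB] *[nN]", "");
    }

    // Reluctant: removes everything up to and including the FIRST "isbn".
    static String stripReluctant(String line) {
        return line.replaceFirst("(.*?)[iI] *[sS] *[bB] *[nN]", "");
    }
}
```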

    You could always use "ISBN " in a lookbehind construct, something like so:
    String regex = "(?<=ISBN )[\\d]{3}-[\\d]{2}-[\\d]{3}-[\\d]{4}-[\\d]";
    for instance, if the isbn.txt is in the same directory as the class files,
    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class IsbnTest {
      private static final String ISBN_TXT = "isbn.txt";
      public static void main(String[] args) {
        String regex = "(?<=ISBN )[\\d]{3}-[\\d]{2}-[\\d]{3}-[\\d]{4}-[\\d]";
        Pattern p = Pattern.compile(regex);
        Scanner scanner = new Scanner(IsbnTest.class.getResourceAsStream(ISBN_TXT));
        while (scanner.hasNextLine()) {
          String line = scanner.nextLine();
          Matcher m = p.matcher(line);
          // find() returns the first (leftmost) ISBN on the line
          if (m.find()) {
            System.out.println(m.group());
          }
        }
      }
    }

  • RegEx and Database

    My problem is actually to take the name fields (lastname, firstname, middlename) in the database (all in Uppercase) and correctly display them. For instance:
    O'IRISHNAME --> as O'Irishname and
    LONGHYPHENATED-NAME as Longhyphenated-Name and
    LASTNAME/PARTTWO as Lastname/Parttwo and lastly
    LNAME PARTTWO as Lname Parttwo.
    Using the java.util.regex API seems to be the way to go.
    1) How do I make the corrections IN the database rather than say in a file? as a function?
    2) The apostrophe and the dash seem to be easy enough but the space (&160; or  ) wont return in a query. No luck with the / either.
    Point me in the right direction, please

    908644 wrote:
    My problem is actually to take the name fields (lastname, firstname, middlename) in the database (all in Uppercase) and correctly display them. For instance:
    Using the java.util.regex API seems to be the way to go.
    1) How do I make the corrections IN the database rather than say in a file? as a function?
    Then you would have done better to ask here:
    PL/SQL
    since Oracle has a regular expression implementation for SQL.
    2) The apostrophe and the dash seem to be easy enough, but the space (&#160; or &nbsp;) won't return in a query. No luck with the / either.
    Do it the other way around and look for "non-letter" characters: "([^[:alpha:]]*)([[:alpha:]])([[:alpha:]]*)"
    bye
    TPD
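
    If the fix ends up on the Java side rather than in SQL, something along these lines might do. It is only a sketch: NameCase and toDisplayCase are hypothetical names, and it assumes that any non-letter (apostrophe, dash, slash, space, non-breaking space) separates two name parts. Each run of letters gets an upper-case first letter and a lower-case rest, and the separators are copied through untouched.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameCase {
    // group(1): first letter of a run of letters, group(2): the rest of the run
    private static final Pattern WORD = Pattern.compile("(\\p{L})(\\p{L}*)");

    static String toDisplayCase(String name) {
        Matcher m = WORD.matcher(name);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String fixed = m.group(1).toUpperCase(Locale.ROOT)
                         + m.group(2).toLowerCase(Locale.ROOT);
            // quoteReplacement guards against '$' or '\' in the replacement text
            m.appendReplacement(sb, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(sb); // copy whatever follows the last run of letters
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDisplayCase("O'IRISHNAME"));
        System.out.println(toDisplayCase("LONGHYPHENATED-NAME"));
        System.out.println(toDisplayCase("LASTNAME/PARTTWO"));
        System.out.println(toDisplayCase("LNAME PARTTWO"));
    }
}
```

    Names with internal capitals (MCDONALD, say) would still come out as Mcdonald; a rule like this cannot know about those.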

  • Regex and Extended Chars

    Hey guys,
    I am looking for a regex that will allow only ascii chars and not extended chars. Can someone please help me out with that? I have written this bit of code that will remove extended ascii chars but I was looking for a regex. Any ideas?
      // This is what I have to remove extended ASCII chars.
      import java.text.CharacterIterator;
      import java.text.StringCharacterIterator;

      // Wrapper class added so the snippet compiles; its name is arbitrary.
      public class ExtendedCharsTest {
        public static void main(String[] args){
          String str = "123asd.32#$%^&*()_+={}|\\;";
          str +=  (char)128;
          str +=  (char)129;
          str +=  (char)130;
          str +=  (char)131;
          str +=  (char)132;
          System.out.println(str);
          str = removeExtenedChars(str);
          System.out.println(str);
        }

        public static String removeExtenedChars(String str){
          StringBuffer sb = new StringBuffer();
          StringCharacterIterator it = new StringCharacterIterator(str);
          for(char c = it.first(); c != CharacterIterator.DONE; c = it.next()){
            int asciiVal = (int)c;
            // Keep only codes 1..127 (plain ASCII, NUL excluded).
            if(asciiVal > 0 && asciiVal < 128){
              sb.append(c);
            }
          }
          return sb.toString();
        }
      }

    String asciiOnly = "your string".replaceAll("[^\u0000-\u007f]+","");
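
    A slightly more readable variant of that one-liner, as a sketch (AsciiOnly, stripNonAscii and isAscii are hypothetical names). \p{ASCII} is Java's predefined class for \x00-\x7F, so it can also be used to check a string instead of stripping it. Note that, unlike the loop in the question, both versions treat the NUL character (code 0) as plain ASCII.

```java
class AsciiOnly {
    // Removes every character outside the 7-bit ASCII range.
    static String stripNonAscii(String s) {
        return s.replaceAll("[^\\x00-\\x7F]+", "");
    }

    // True only if the whole string consists of 7-bit ASCII characters.
    static boolean isAscii(String s) {
        return s.matches("\\p{ASCII}*");
    }

    public static void main(String[] args) {
        String str = "123asd.32#$%^&*()_+={}|\\;" + (char) 128 + (char) 129;
        System.out.println(isAscii(str));
        System.out.println(stripNonAscii(str));
    }
}
```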

  • Powershell Regex and multiple matches

    Hi all, 
    I have been messing with powershell for a while now, but this regex challenge has got me stumped.
    I have a block of text, which is the output from a previous command; the variable is called $data.
    The contents look like:
    Host : 10.0.0.1
    Output : Listening on eth1
    # Host name (port/service if enabled) last 2s last 10s last 40s cumulative
    1 LOCALPC.internaldomain.net => 79.1Kb 79.1Kb 79.1Kb 19.8KB
    www.awebsite.com.au <= 3.99Mb 3.99Mb 3.99Mb 1.00MB
    Total send rate: 83.3Kb 83.3Kb 83.3Kb
    Total receive rate: 3.99Mb 3.99Mb 3.99Mb
    Total send and receive rate: 4.08Mb 4.08Mb 4.08Mb
    Peak rate (sent/received/total): 83.3Kb 3.99Mb 4.08Mb
    Cumulative (sent/received/total): 20.8KB 1.00MB 1.02MB
    ============================================================================================
    ExitStatus : 0
    I am trying to extract the two host names from that block of text, one will always be a hostname ending in internaldomain.net the other one can either be a host name or IP address. They may appear in alternate orders as well.
    I can get a single match without an issue:
    if ($data -match "(?<=\s).*?(?=.internaldomain)") {
        $pcname = $matches[0].Trim()
    }
    or
    if ($data -match "\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b") {
        $externalip = $matches[0].Trim()
    }
    if ($data -match "([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)*?") {
        $externalhost = $matches[0].Trim()
    }
    But I can't for the life of me extract the second match.
    Anyone got some pointers in how I can extract both matches into strings?
    Cheers,
    Pazu

    Thank you very much, mjolinor. Greatly appreciated, and it is all working now. One last question: the regex seems to match the number 1 just before the first proper match, and I'm not good enough with regex yet to understand why.
    An example of data where it matches the 1:
    Host : 10.0.0.1
    Output : Listening on eth1
    # Host name (port/service if enabled) last 2s last 10s last 40s cumulative
    1 Work-iPhone.internal.net => 3.57Kb 3.57Kb 3.57Kb 915B
    11.111.239.45 <= 2.81Kb 2.81Kb 2.81Kb 719B
    Total send rate: 3.57Kb 3.57Kb 3.57Kb
    Total receive rate: 2.96Kb 2.96Kb 2.96Kb
    Total send and receive rate: 6.54Kb 6.54Kb 6.54Kb
    Peak rate (sent/received/total): 3.57Kb 2.96Kb 6.54Kb
    Cumulative (sent/received/total): 915B 759B 1.63KB
    ============================================================================================
    ExitStatus : 0
    In this sample data, the second line (the one ending in cloudfront) doesn't get matched, but 'Total' does?
    Host : 10.0.0.1
    Output : Listening on eth1
    # Host name (port/service if enabled) last 2s last 10s last 40s cumulative
    1 TARDIS.mashdinternal.net => 2.50Kb 2.50Kb 2.50Kb 640B
    server-54-240-177-103.syd1.r.cloudfront <= 169Kb 169Kb 169Kb 42.2KB
    Total send rate: 2.66Kb 2.66Kb 2.66Kb
    Total receive rate: 169Kb 169Kb 169Kb
    Total send and receive rate: 172Kb 172Kb 172Kb
    Peak rate (sent/received/total): 2.66Kb 169Kb 172Kb
    Cumulative (sent/received/total): 681B 42.2KB 42.9KB
    ============================================================================================
    ExitStatus : 0
    Any ideas?
    Cheers,
    Pazu

Maybe you are looking for