Working w/ regular expressions

Hi all,
I'm having trouble figuring out what regular expression to use to parse my output. I want to capture a block of text that starts with a tab (\t) and ends with a line containing the word "error code."
For example, if my output is the following: ( [\tab] denotes ASCII tab character)
[\tab]Some command
Anytext
Anytext
blah blah error code is 0
[\tab]Some other command
[\tab]Yet another command
Anytext
blah blah error code is 1
I would like subsequent calls to matcher.find() to find these two blocks:
[\tab]Some command
Anytext
Anytext
blah blah error code is 0
and
[\tab]Yet another command
Anytext
blah blah error code is 1
I thought the regular expression should be something like
"[\t](.*\n)+.*error code.*"
but the above regular expression returns the entire text instead of the two "blocks" that I want. I know that Java returns the longest match for the expression, but I don't know how to exclude "error code" in the middle...
"[\t](.*\n)+(.*error code.*){1}" ???
Any help is greatly appreciated.
Thanks,
KK

..but I'm not sure what is the purpose of the second part (?:\\n|\\Z).
Would someone care to explain this to me?
This non-capturing group is the final delimiter. It means "the last character is a line feed, or the end of input has been reached", which prevents a failed match when the last line is an error line without a trailing line feed.
A good reference about groups is the JRegex documentation/examples:
http://jregex.sourceforge.net/.
Good news for you: the minimal regular expression that solves your problem.
import java.io.*;
import java.util.*;
import java.util.regex.*;
public class ParseTest {
public static void main(String[] args) throws Exception {
String Output =
"\t\tgcc -someoption file0\n"
+ "\t\tgcc -someoption file1\n"
+ "\t\tgcc -someoption file2\n"
+ "\t\tgcc -someoption file3\n"
+ "some error message\n"
+ "some more error message\n"
+ "make: 1254-004 The error code from the last command is 1.\n"
+ "make: 1254-005 Ignored error code 1 from last command.\n"
+ "\t\tgcc -someoption file4\n"
+ "\t\tgcc -someoption file5\n"
+ "error message\n"
+ "more error message\n"
+ "make: 1254-004 The error code from the last command is 1.\n"
+ "make: 1254-005 Ignored error code 1 from last command.\n"
+ "\t\tgcc -someoption file6\n"
+ "\t\tgcc -someoption file7\n"
+ "last error message\n"
+ "make: 1254-005 Ignored error code 1 from last command.";
System.out.println(Output+"\n");
final String
// Regular Expression pattern to find only commands with error messages.
// $1 = command
// $2 = error message
re = "\\t+(.+)\\n([^\\t]+[^\\t\\n])(?:\\n|$)";
Pattern p = Pattern.compile(re);
System.out.println("Pattern:\n"+p.pattern());
Matcher m = p.matcher(Output);
for (int j=1; m.find(); j++) {
System.out.println("\nMatching "+j+":\n");
System.out.println("--------------------------------");
System.out.println(m.group(1));
System.out.println("--------------------------------");
System.out.println(m.group(2));
System.out.println("--------------------------------");
}
}
}
Regards.
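For the example in the original question (blocks that start with a tab and end on the "error code" line), here is a minimal sketch of one possible pattern. It assumes that only command lines contain tabs and that the command line itself never contains the words "error code"; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ErrorBlockFinder {
    // A tab, then the shortest run of non-tab characters (newlines included)
    // that reaches "error code", then the rest of that line.
    // Because [^\t] cannot cross a tab, a block never swallows a later command.
    static final Pattern BLOCK = Pattern.compile("\\t[^\\t]*?error code.*");

    static List<String> findBlocks(String output) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(output);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String output = "\tSome command\nAnytext\nAnytext\nblah blah error code is 0\n"
                + "\tSome other command\n"
                + "\tYet another command\nAnytext\nblah blah error code is 1\n";
        for (String block : findBlocks(output)) {
            System.out.println(block);
            System.out.println("----");
        }
    }
}
```

"Some other command" is skipped because the next tab is reached before any "error code" text, so the match attempt that starts there fails.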

Similar Messages

  • About regular expressions

    This question was posted in response to the following article: http://help.adobe.com/en_US/ColdFusion/10.0/Developing/WSc3ff6d0ea77859461172e0811cbec0a38f-7fff.html

    "ColdFusion supplies four functions that work with regular expressions" should be "ColdFusion supplies six functions that work with regular expressions,"

  • Are regular expressions limited

    hey
    I tried this
    public class Main {
         public static void main(String[] args) {
              String string = "ksjdfjk_kjsdfjk_jkfsdjkf";
              boolean match = string.contains("_.+_");
              if(match == true)
                   System.out.println("worked");
              else
                   System.out.println("not worked");
         }
    }
    but I keep getting "not worked"
    Are regular expressions limited or is there a way
    to set dot "." to any character but "_";
    I'm trying to get the program
    to detect anycharacters
    out of anycharacters_anycharacters_anycharacters

    Here's a solution for anyone
    searching the forums.
    public class Main {
         public static void main(String[] args) {
              String string = "ksjdfjk_sdfsdfdsf_jkfsdjkf";
              boolean match = string.matches(".*_.+_.*");
              if(match == true)
                   System.out.println("worked");
              else
                   System.out.println("not worked");
         }
    }
    Thanks for the quick replies. :)
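Some background on why the first attempt printed "not worked": String.contains() searches for the literal text "_.+_" and does not treat it as a regular expression, while String.matches() requires the whole string to match. The sketch below also shows one way to say "any character but _" with a negated character class. The class and helper names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnderscoreMatch {
    // Three underscore-separated parts; [^_]+ means
    // "one or more characters other than '_'".
    static final Pattern PARTS = Pattern.compile("([^_]+)_([^_]+)_([^_]+)");

    // Returns the middle part, or null when the text does not have this shape.
    static String middlePart(String text) {
        Matcher m = PARTS.matcher(text);
        return m.matches() ? m.group(2) : null;
    }

    // find() looks for the pattern anywhere in the text,
    // which is the regex-aware counterpart of contains().
    static boolean containsPattern(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String s = "ksjdfjk_sdfsdfdsf_jkfsdjkf";
        System.out.println(s.contains("_.+_"));         // literal search
        System.out.println(containsPattern(s, "_.+_")); // regex search
        System.out.println(middlePart(s));
    }
}
```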

  • Oracle regular expression help

    I have worked with regular expressions a little bit in unix scripting.
    I need to diff two schemas to look for missing objects.
    Problem: I changed the naming conventions.
    My objects used to end with '_T' and now end with '_MV'
    I can't use a regular instr because I can have
    HELLO_TYPES_T or I could have HELLO_T
    I want to trim off the last T and MV and then do a minus to see if I am missing objects.
    I think I need to use regexp_instr with an end-of-line regular expression, but I can't get the syntax correct. Can someone give me a hand?

    Well, how about this:
    SQL> with schema1
      2   as ( select  'HELLO_TYPES_T' obj from dual union all
      3        select  'HELLO_T' from dual union all
      4        select  'TYPES_T' obj from dual union all
      5        select  'HELLO_TYPES_MV' obj from dual union all
      6        select  'HELLO_MV' from dual union all
      7        select  'TYPES_MV' obj from dual union all    
      8        select  'OBJECTS_T' obj from dual )
      9  ,    schema2
    10   as ( select  'HELLO_TYPES_T' obj from dual union all
    11        select  'HELLO_T' from dual union all
    12        select  'TYPES_T' obj from dual union all
    13        select  'HELLO_TYPES_MV' obj from dual union all
    14        select  'HELLO_MV' from dual union all
    15        select  'TYPES_MV' obj from dual)
             ---actual query
    16    select regexp_replace(obj, '^*(_T|_MV)$', '') regexp
    17    from   schema1
    18    minus
    19    select regexp_replace(obj, '^*(_T|_MV)$', '') regexp
    20    from   schema2;
    REGEXP
    OBJECTS
    1 row selected.
    And vice versa (schema2 minus schema1)
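The same idea (strip a trailing _T or _MV, then take the set difference) can be sketched outside the database. This is only an illustration in Java, not part of the SQL answer; the class and method names are assumptions, and the pattern used here is the simpler (_T|_MV)$ anchored at the end of the name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SchemaDiff {
    // Removes a trailing _T or _MV; names without either suffix are unchanged.
    static String baseName(String objectName) {
        return objectName.replaceAll("(_T|_MV)$", "");
    }

    // Base names present in schema1 but absent from schema2 (like SQL MINUS).
    static Set<String> missingFrom(List<String> schema1, List<String> schema2) {
        Set<String> result = new TreeSet<>();
        for (String name : schema1) {
            result.add(baseName(name));
        }
        for (String name : schema2) {
            result.remove(baseName(name));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> schema1 = Arrays.asList("HELLO_TYPES_T", "HELLO_T", "TYPES_T",
                "HELLO_TYPES_MV", "HELLO_MV", "TYPES_MV", "OBJECTS_T");
        List<String> schema2 = Arrays.asList("HELLO_TYPES_T", "HELLO_T", "TYPES_T",
                "HELLO_TYPES_MV", "HELLO_MV", "TYPES_MV");
        System.out.println(missingFrom(schema1, schema2));
    }
}
```

Note that HELLO_TYPES_T keeps its inner _T because the pattern is anchored with $.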

  • Regular expression on words with % wildcard

    Hi,
    I've got some processing working using regular expression where I need to process words e.g.
    regexp_replace('word1 word2','(\w+)','myprefix{\1}') - results in - 'myprefixword1 myprefixword2'
    However, if I'm presented with this; '%word0 word1% wo%d2 word3', then I need to treat % as special case and leave the word as is, so result here would be; - '%word0 word1% wo%d2 myprefixword3', is this achievable using regexp ?

    And for those who don't know, I guess we should explain why we're having to expand single spaces to double spaces...
    (I'll use the "¬" character to represent spaces to make it clearer to see)
    If we have a string such as
    word1¬word2¬word3
    and we want to identify the words in the string (without using any special regexp word identifier) then we are going to use the spaces to identify the start and end of words. To make life easy, we manually put a space at the start and end of the string so we can say that each word in the string will have a space before and after it regardless of where it is in the string...
    ¬word1¬word2¬word3¬
    However, when we specify what we want to search for we are going to say we want a space, followed by a number of characters (not spaces), followed by a space...
    ¬[^¬]*¬
    So, ideally, you'd expect it to look through the string and say
    ¬word1¬word2¬word3¬
    \_____/... found word1
    ¬word1¬word2¬word3¬
          \_____/... found word2
    ¬word1¬word2¬word3¬
                \_____/... found word3
    Unfortunately, there is a problem. Once the first word has been found the pointer for searching the rest of the string is located on the next character after the match i.e.
    ¬word1¬word2¬word3¬
           ^
    So it won't be able to pick out word2 and will only get to word3. Let's see it in action...
    SQL> ed
    Wrote file afiedt.buf
      1  with t as (select ' word1 word2 word3 ' as txt from dual)
      2  --
      3  select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
      4* from t
    SQL> /
    TXT
    xxxxxword2xxxxx
    SQL>
    In order to deal with this, if we replace the single spaces with double spaces (not required at the start and end) our string looks like...
    ¬word1¬¬word2¬¬word3¬
    So as it searches it finds word1 as a match and then the pointer in the string is located...
    ¬word1¬¬word2¬¬word3¬
           ^
    ... so the next match for the pattern of space-characters-space is word2 and then the pointer is located...
    ¬word1¬¬word2¬¬word3¬
                  ^
    ... ready to find word 3. Example...
    SQL> ed
    Wrote file afiedt.buf
      1  with t as (select ' word1  word2  word3 ' as txt from dual)
      2  --
      3  select regexp_replace(txt, ' [^ ]* ', 'xxxxx') as txt
      4* from t
    SQL> /
    TXT
    xxxxxxxxxxxxxxx
    SQL>
    Hopefully that's a little clearer. You just have to remember the "pointer" principle and the fact that once a match is found it is located on the character after the match.
    ;)
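The "pointer" behaviour is not specific to Oracle. Here is a small Java sketch that reproduces both results from the SQL sessions and then shows an alternative that avoids doubling the spaces: lookarounds test the neighbouring characters without consuming them. The lookaround pattern is a suggestion of this sketch, not something from the thread, and the class and method names are made up.

```java
class WordPrefixer {
    // Consuming the delimiters: matches cannot overlap, so the space that
    // ends one match is no longer available to start the next one.
    static String maskConsumingSpaces(String text) {
        return text.replaceAll(" [^ ]* ", "xxxxx");
    }

    // Lookarounds check the neighbours without consuming them. A word gets
    // the prefix only if it consists of word characters alone and is bounded
    // by whitespace or the ends of the string, so any word touching '%'
    // is left as is.
    static String prefixPlainWords(String text) {
        return text.replaceAll("(?<!\\S)\\w+(?!\\S)", "myprefix$0");
    }

    public static void main(String[] args) {
        System.out.println(maskConsumingSpaces(" word1 word2 word3 "));
        System.out.println(maskConsumingSpaces(" word1  word2  word3 "));
        System.out.println(prefixPlainWords("%word0 word1% wo%d2 word3"));
    }
}
```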

  • Regular Expression Class in RoboHelp

    Is there any class in RoboHelp object model for working with regular expressions? Or maybe a class for find and change option?
    In Javascript there is RegExp class. Does RoboHelp have the same?

    I suspect you have a development background which I do not have. Regular expressions work just fine from the tools I have used (as above plus dnGrep) but I think you want to create a script of some sort that takes in regular expressions, effectively replacing the tools I have used to run the expressions. I am sorry I cannot help you on that Charlotte.
    Curiosity makes me ask why you have to do this with your own script rather than the tools I have used.
    See www.grainge.org for RoboHelp and Authoring tips
    @petergrainge

  • Url pattern tag and regular expression

    I am trying to set up my web.xml in a Tomcat 4.1.3 container to go to one page if letters are entered in the last part of the URL, or to another page if numbers are entered in the last part of the URL.
    For example, here is how the URL would be set up so that it goes to either the all-numbers location or the non-numbers location:
    Any number entry for the last part of the URL, which could be something like 343
    http://127.0.0.1:8080/theapp/pack/weburl/343
    Any non-number entry for the last part of the URL, which could be something like abec
    http://127.0.0.1:8080/theapp/pack/weburl/abec
    My attempt below is not working because it doesn't seem to take the regular expressions. But if I manually put in letters such as <url-pattern>/pack/weburl/ab</url-pattern>, it takes me to the correct page. How does web.xml work with regular expressions in the url-pattern tag?
    <servlet>
    <servlet-name>Number</servlet-name>
    <servlet-class>pack.Number</servlet-class>
    </servlet>
    <servlet-mapping>
    <servlet-name>Number</servlet-name>
    <url-pattern>/pack/weburl/\d*</url-pattern>
    </servlet-mapping>
    <servlet>
    <servlet-name>NotNumber</servlet-name>
    <servlet-class>pack.NotNumber</servlet-class>
    </servlet>
    <servlet-mapping>
    <servlet-name>NotNumber</servlet-name>
    <url-pattern>/pack/weburl/[A-Za-z]</url-pattern>
    </servlet-mapping>

    Sorry, this pattern can't take regular expressions.
    Refer to section 11.2 of the servlet spec, which defines these mappings:
    In the web application deployment descriptor, the following syntax is used to define
    mappings:
    - A string beginning with a '/' character and ending with a '/*' postfix is used
    for path mapping.
    - A string beginning with a '*.' prefix is used as an extension mapping.
    - A string containing only the '/' character indicates the "default" servlet of the
    application. In this case the servlet path is the request URI minus the context
    path and the path info is null.
    - All other strings are used for exact matches only.
    As an alternative, I would suggest that you match the request to a filter, and then use some logic based on request.getRequestURI() to determine which resource to forward to from there.
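The decision logic such a filter would need can be sketched in plain Java, without the servlet API, so that it runs on its own. The forward targets "/Number" and "/NotNumber" and the class name are assumptions for illustration; in a real filter the path would come from the request URI with the context path removed, and the result would be passed to a request dispatcher.

```java
import java.util.regex.Pattern;

class WebUrlRouter {
    // Last path segment made only of digits.
    static final Pattern NUMERIC = Pattern.compile("/pack/weburl/\\d+");
    // Last path segment made only of ASCII letters.
    static final Pattern ALPHA = Pattern.compile("/pack/weburl/[A-Za-z]+");

    // Returns the (hypothetical) resource to forward to, or null when the
    // path fits neither rule and should continue down the chain unchanged.
    static String targetFor(String path) {
        if (NUMERIC.matcher(path).matches()) {
            return "/Number";
        }
        if (ALPHA.matcher(path).matches()) {
            return "/NotNumber";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(targetFor("/pack/weburl/343"));
        System.out.println(targetFor("/pack/weburl/abec"));
        System.out.println(targetFor("/pack/weburl/ab12"));
    }
}
```

The filter itself could then be mapped with an ordinary path mapping such as /pack/weburl/*, which the descriptor does support.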

  • Regular expression does not work in IE

    Hi,
    I have a regular expression which should check the content of an inputText field: it should contain a number and a letter, and its length should be at least 6. It works when I test it in Firefox, but it always fails in IE:
              <af:inputText>
                <af:validateRegExp pattern="^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{6,24}$"
                                   noMatchMessageDetail="Does not match."/>
              </af:inputText>
    How come it does not work in IE (6.0)?
    JDeveloper 10.1.3.3
    Thanks in advance,
    Koen Verhulst

    hi Koen
    The web page you refer to ...
    http://www.fileformat.info/tool/regex.htm
    ... seems to be doing server side Java regular expressions (and as such will behave the same in both FF and IE).
    The af:validateRegExp component you want to use does client side evaluation of the regular expression, using scripting, hence the apparent difference between FF and IE.
    There is also this web page ...
    http://www.regular-expressions.info/javascriptexample.html
    ... that seems to behave similar to the markup and scripting resulting from the af:validateRegExp component.
    Besides the value "2abcdef" there seem to be others that are accepted by IE for the regular expression "^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{6,24}$", values like these:
    "a2abcdef", "a2abcde", "abcdef2abcde", "2a3cdef", "2a34567"
    Although these values are not accepted by IE (but are accepted in FF):
    "2abcde", "23bcdef"
    success
    Jan
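For reference, this is how the pattern behaves in Java (the server-side behaviour mentioned above), next to an equivalent check that avoids lookahead altogether. Splitting the rule into three simple tests is a suggestion of this sketch, on the assumption that simpler patterns are less likely to differ between regex engines; the class and method names are made up.

```java
import java.util.regex.Pattern;

class PasswordRule {
    // The pattern from the question, unchanged.
    static final Pattern WITH_LOOKAHEAD =
            Pattern.compile("^(?=.*[a-zA-Z])(?=.*[0-9])[a-zA-Z0-9]{6,24}$");

    static boolean matchesOriginal(String value) {
        return WITH_LOOKAHEAD.matcher(value).matches();
    }

    // Same rule without lookahead: allowed characters and length,
    // at least one letter, at least one digit.
    static boolean isValid(String value) {
        return value.matches("[a-zA-Z0-9]{6,24}")
                && value.matches(".*[a-zA-Z].*")
                && value.matches(".*[0-9].*");
    }

    public static void main(String[] args) {
        String[] samples = { "2abcde", "23bcdef", "2abcdef", "abcdef", "12345" };
        for (String s : samples) {
            System.out.println(s + " -> " + matchesOriginal(s) + " / " + isValid(s));
        }
    }
}
```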

  • Regular expression not working for adobe forms

    Hi,
    I am using QTP for Adobe forms, and for some reason, if I put in a regular expression for the apid value, it doesn't recognise the object. There is nothing wrong with the regular expression, as it evaluates correctly in the regular expression evaluator in QTP 11.0. Any ideas?
    I have all the add-ins installed. When I use a regular expression for the top window it works, but for any other object it doesn't.

    Please try the code and see the problem. The regular expression is fine.
    I can replace the string with these and got results like this:
    import java.util.regex.Pattern;
    public class HtmlFilter implements TextFilter {
        private static String strTagPattern = "<\\s?(.|\n)*?\\s?>";
        private static int patternMode = Pattern.MULTILINE | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.CANON_EQ;
        private static Pattern tagPattern = Pattern.compile(strTagPattern, patternMode);
        public String filter(String t) {
            if(t==null || t.length()==0) return "";
            // Remove everything that looks like a tag.
            return tagPattern.matcher(t).replaceAll("");
        }
        public static void main(String[] args) {
            System.out.println(new HtmlFilter().filter(null));
            System.out.println(new HtmlFilter().filter(""));
            System.out.println(new HtmlFilter().filter("<P>abc def</P>"));
            System.out.println(new HtmlFilter().filter("<P>我国石油供应安全系统影响</P>"));
        }
    }
    The results are
    abc def
    ????????????

  • The Regular Expression anchors, "^" and "$" do not work for me in Dreamweaver

    I am using the Find and Replace feature of Dreamweaver 8, and especially the "Use Regular Expressions" setting.  But I have never been able to make the Regular Expression anchors "^" and "$" work at all.  These are supposed to fix a match at the beginning or end of a line or input.  But if I search for even something as elementary as "^ " or " $", it can't seem to find them.
    Am I missing something?  Can somebody give me an example of this working?
    Any and all tips or clues would be appreciated.

    Welcome to Apple Discussions and Mac Computing
    Definitely an E-bay issue. The coding on the page is not Safari-friendly. Firefox, as you discovered, in this instance is the better choice.
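On the anchor question itself: in many regex engines "^" and "$" match only at the very start and end of the whole input unless a multiline mode is switched on. Whether that is what Dreamweaver 8 does is an assumption here; the Java sketch below only illustrates the general behaviour, and the class and helper names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {
    // Counts how many times the pattern is found in the text.
    static int count(String regex, int flags, String text) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Three lines, each starting and ending with a space.
        String text = " one \n two \n three ";
        System.out.println(count("^ ", 0, text));                 // default mode
        System.out.println(count("^ ", Pattern.MULTILINE, text)); // per-line anchors
        System.out.println(count(" $", 0, text));
        System.out.println(count(" $", Pattern.MULTILINE, text));
    }
}
```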

  • Regular expression.vi doesn't work correctly

    I try to parse the output from "Flatten To XML" using "Find Regular Expression" but I get unexpected results.
    Input: "<LvVariant><Name>myName</Name><Cluster>...</Cluster></LvVariant>"
    Regular expression: "<(.+)><Name>(\w+)</Name>(.+)</\1>"
    Expected result: match1 = "LvVariant", match2 = "myName", match3 = "<Cluster>...</Cluster>", total match = input, before match = empty, after match = empty.
    LabVIEW's result: before match = input, all other output strings are empty.
    I checked the expression with other programming languages like PHP and Delphi. There it works fine, but not in LabVIEW. I think, there is a bug at the "Find regular expression.vi".

    ralfc wrote:
    I try to parse the output from "Flatten To XML" using "Find Regular Expression" but I get unexpected results.
    Input: "<LvVariant><Name>myName</Name><Cluster>...</Cluster></LvVariant>"
    Regular expression: "<(.+)><Name>(\w+)</Name>(.+)</\1>"
    Expected result: match1 = "LvVariant", match2 = "myName", match3 = "<Cluster>...</Cluster>", total match = input, before match = empty, after match = empty.
    LabVIEW's result: before match = input, all other output strings are empty.
    I checked the expression with other programming languages like PHP and Delphi. There it works fine, but not in LabVIEW. I think, there is a bug at the "Find regular expression.vi".
    You are not using the Match Regular Expression correctly, try this:
    You need to expand the bottom of the vi to get the captured groups.
    Ben64
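To confirm what the expression is expected to capture, here is the same pattern and input run through Java's regex engine. This is only a cross-check in another language, not a LabVIEW example; the class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackrefDemo {
    // \1 must repeat whatever group 1 captured (the outer tag name).
    static final Pattern XML =
            Pattern.compile("<(.+)><Name>(\\w+)</Name>(.+)</\\1>");

    // Returns the three captured groups, or null when nothing matches.
    static String[] groups(String input) {
        Matcher m = XML.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String input = "<LvVariant><Name>myName</Name><Cluster>...</Cluster></LvVariant>";
        for (String g : groups(input)) {
            System.out.println(g);
        }
    }
}
```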

  • Regular expression with \uNNNN not working

    Hello,
    The following code should return a match for dddd and ĒĒĒĒĒĒ however this is not working. If I execute the regular expression in another language everything is working fine.
    According to the documentation http://livedocs.adobe.com/flex/3/html/help.html?content=12_Using_Regular_Expressions_03.html this should be possible in ActionScript
    var regExp:RegExp = new RegExp("([A-Za-z\u00C0-\u024F\u1E00-\u1EFF]+)", "gi");
    var text:String = "ddddd ĒĒĒĒĒĒ";
    var match:Array = text.match(regExp);
    Best regards
    Tom

  • CFFORM (Flash) Validation with Regular Expressions Not Working

    I am having trouble getting regular expression validation to
    work in a CFFORM. The code below is an extract of a much larger
    form; the first name and last name have a regular expression
    validation...and it doesn't work!
    I'd appreciate any comments/info for help on this. I have
    searched high and low for information to get this working...but no
    joy.
    The code is:
    <cffunction name="checkFieldSet" output="false"
    returnType="string">
    <cfargument name="fields" type="string" required="true"
    hint="Fields to search">
    <cfargument name="form" type="string" required="true"
    hint="Name of the form">
    <cfargument name="ascode" type="string" required="true"
    hint="Code to fire if all is good.">
    <cfset var vcode = "">
    <cfset var f = "">
    <cfsavecontent variable="vcode">
    var ok = true;
    var msg = "";
    <cfloop index="f" list="#arguments.fields#">
    <cfoutput>
    if(!mx.validators.Validator.isValid(this,
    '#arguments.form#.#f#')) { msg = msg + #f#.errorString + '\n';
    ok=false; }
    </cfoutput>
    </cfloop>
    </cfsavecontent>
    <cfset vcode = vcode & "if(!ok)
    mx.controls.Alert.show(msg,'Validation Error'); ">
    <cfset vcode = vcode & "if(ok) #ascode#">
    <cfset vcode =
    replaceList(vcode,"#chr(10)#,#chr(13)#,#chr(9)#",",,")>
    <cfreturn vcode>
    </cffunction>
    <cfform name="new_form" format="flash" width="600"
    height="600" skin="halosilver" action="new_data.cfc">
    <cfformgroup type="panel" label="New Form"
    style="background-color:##CCCCCC;">
    <cfformgroup type="tabnavigator" id="tabs">
    <cfformgroup type="page" label="Step 1">
    <cfformgroup type="hbox">
    <cfformgroup type="panel" label="Requestor Information"
    style="headerHeight: 13;">
    <cfformgroup type="vbox">
    <cfinput type="text" name="reqName" width="300"
    label="First Name:" validate="regular_expression" pattern="[^0-9]"
    validateat="onblur" required="yes" message="You must supply your
    First Name.">
    <cfinput type="text" name="reqLname" width="300"
    label="Last Name:" validate="regular_expression" pattern="[^0-9]"
    validateat="onblur" required="yes" message="You must supply your
    Last Name.">
    <cfinput type="text" name="reqEmail" width="300"
    label="Email:" validate="email" required="yes" message="You must
    supply your email or the address given is in the wrong format.">
    <cfinput type="text" name="reqPhone" width="300"
    label="Phone Extension:" validate="integer" required="yes"
    maxlength="4" message="You must supply your phone number.">
    </cfformgroup>
    </cfformgroup>
    </cfformgroup>
    <cfformgroup type="horizontal"
    style="horizontalAlign:'right';">
    <cfinput type="button" width="100" name="cnt_step2"
    label="next" value="Next"
    onClick="#checkFieldSet("reqName,reqLname,reqEmail,reqPhone","new_form","tabs.selectedIndex=tabs.selectedIndex+1")#"
    align="right">
    </cfformgroup>
    </cfformgroup>
    </cfformgroup>
    </cfformgroup>
    </cfform>

    quote:
    Originally posted by:
    Luckbox72
    The problem is not the Regex. I have tested 3 or 4 different
    versions that all work on the many different test sites. The
    problem is that the validation does not seem to work. I have
    changed the pattern to only allow NA and I can still type anything
    into the text box. Is there some issue with using Regex as your
    validation?
    Bear in mind that by default validation does not occur until
    the user attempts to submit the form. If you are trying to control
    the characters that the user can enter into the textbox, as opposed
    to validating what they have entered, you will need to provide your
    own javascript validation.

  • How this regular expression work in detail

    Can someone tell me how the following regular expression works, step by step? Thanks!
    string input = "PRIVATE11xxPRIVATE22xx123PRIVATE33";
    string pattern = @"(?<Pvt>PRIVATE)?(?(Pvt)(\d+)|([^\d]+))";
    string publicDocument = null, privateDocument = null;
    foreach (Match match in Regex.Matches(input, pattern))
    if (match.Groups[1].Success)
    privateDocument += match.Groups[1].Value + "\n";
    else
    publicDocument += match.Groups[2].Value + "\n";
    Console.WriteLine("Private Document:");
    Console.WriteLine(privateDocument);
    Console.WriteLine("Public Document:");
    Console.WriteLine(publicDocument);

    Hi Sincos,
    sure:
    Your regular expression has two main blocks:
    1. (?<Pvt>PRIVATE)?
    That means find a string PRIVATE with zero or one repetition. This is a named capture group with the name "Pvt" to access it later
    2. (?(Pvt)(\d+)|([^\d]+))
    That's a conditional Expression with a yes and no clause. So let's split this one up:
    The first part (Pvt) means whether the string "PRIVATE" was found or not.
    When it was found, this gets active: (\d+)
    When it was not found, this gets active: ([^\d]+)
    So let's look at those two:
    (\d+) is a numbered capture group that matches one or more digits
    ([^\d]+) is a numbered capture group that matches one or more characters that are not digits
    Ok. So your regex ends up with three groups:
    1 Group: Numbered Group for (\d+)
    2 Group: Numbered Group for ([^\d]+)
    3 Group: Named Group for the string "PRIVATE"
    Now let's look at your input string "PRIVATE11xxPRIVATE22xx123PRIVATE33"
    With that one you'll have four matches:
    1st match "PRIVATE11"
      1 Group: "11"
      2 Group: -
      3 Group (Pvt): "PRIVATE"
    2nd match "xxPRIVATE" (the string "PRIVATE" was not found at this position, so ([^\d]+) is used (any character that is not a digit))
      1 Group: -
      2 Group: "xxPRIVATE"
      3 Group (Pvt): -
    3rd match "xx" (the string "PRIVATE" was not found at this position, so ([^\d]+) is used (any character that is not a digit))
      1 Group: -
      2 Group: "xx"
      3 Group (Pvt): -
    4th match "PRIVATE33"
      1 Group: 33
      2 Group: -
      3 Group (Pvt): "PRIVATE"
    Now let's look at your code:
    foreach (Match match in Regex.Matches(input, pattern))
    if (match.Groups[1].Success)
    privateDocument += match.Groups[1].Value + "\n";
    else
    publicDocument += match.Groups[2].Value + "\n";
    The Groups property of the Match contains 4 values. At index 0 there's the match itself. Indexes 1, 2 and 3 contain the
    three groups I've described above. So first you check match.Groups[1].Success. That is the first group, which stands for (\d+). That group succeeds when the string PRIVATE was found and is followed by a number. With your input string,
    the first and fourth matches succeed for that one. Then you're adding that number to your privateDocument string variable.
    In the else block you're adding Groups[2].Value to the publicDocument string variable. That's the second group, which stands for ([^\d]+). So it is used for matches 2 and 3, with the two strings "xxPRIVATE" and "xx".
    That's it.
    I would recommend you to install a tool to play around with Regex. My favourite is Expresso. You can download it here:
    http://www.ultrapico.com/expressodownload.htm
    Thomas Claudius Huber
    "If you can't make your app run faster, make it at least look & feel extremly fast"
    My latest Pluralsight-courses:
    XAML Layout in Depth
    Windows Store Apps - Data Binding in Depth
    twitter: @thomasclaudiush
    homepage: www.thomasclaudiushuber.com
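The four matches described above can be reproduced in Java as a cross-check. Java's java.util.regex has no conditional construct like (?(Pvt)...), so this sketch swaps in a plain alternation that gives the same matches for this input; it is an approximation of the .NET pattern, not a translation of it, and the class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrivateSplitter {
    // Either "PRIVATE" followed by digits (group 1),
    // or a run of non-digit characters (group 2).
    static final Pattern TOKEN = Pattern.compile("PRIVATE(\\d+)|(\\D+)");

    // Returns { privateDocument, publicDocument }, one value per line.
    static String[] split(String input) {
        StringBuilder priv = new StringBuilder();
        StringBuilder pub = new StringBuilder();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                priv.append(m.group(1)).append('\n');
            } else {
                pub.append(m.group(2)).append('\n');
            }
        }
        return new String[] { priv.toString(), pub.toString() };
    }

    public static void main(String[] args) {
        String[] docs = split("PRIVATE11xxPRIVATE22xx123PRIVATE33");
        System.out.println("Private Document:");
        System.out.println(docs[0]);
        System.out.println("Public Document:");
        System.out.println(docs[1]);
    }
}
```

As in the walkthrough, "22" and "123" end up in neither document because no match can start on a digit.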

  • Regular Expression replacement not working

    I am trying to use a regular expression to replace non-ascii characters on a file, and I'm afraid I've reached the end of my regex knowledge. 
    Here is the specific code
    'Set the Regular Expression parameters
    Set RegEx = CreateObject("VBScript.Regexp")
    RegEx.Global = True
    RegEx.Pattern = "[^\u0000-\u007F]"
    RegEx.IgnoreCase = True
    'Replace the UTF-8 characters
    ReplacedText = RegEx.Replace(FileText, "\u0020")
    If I understand regular expressions correctly, the pattern "[^\u0000-\u007F]" should match any character that is not an ASCII character, and the Replace call should then replace it with a space (which I understand is "\u0020").  What am I doing wrong?

    Simply use
    ReplacedText = RegEx.Replace(FileText, " ")
    Regards, Hans Vogelaar (http://www.eileenslounge.com)
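The same replacement looks like this in Java, shown only for comparison with the VBScript above. As in the answer, the pattern side uses the \u escapes and the replacement side is a plain space. The class and method names are made up.

```java
class AsciiCleaner {
    // Replaces every character outside U+0000..U+007F with a single space.
    static String replaceNonAscii(String text) {
        return text.replaceAll("[^\\u0000-\\u007F]", " ");
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute, \u00ef is i-diaeresis.
        System.out.println(replaceNonAscii("caf\u00e9 na\u00efve"));
    }
}
```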
