Maximum length of a regular expression

Hi,
I'm making a program for decoding traffic measurement data in text files. It is based upon regular expressions. Some of the patterns are very big. The biggest pattern is 22390 characters long. And here's were the problem is: the program just freezes when it needs to match a string with that huge pattern. Another of my longer patterns is 17634 characters long and that matches strings without any problem!
Now what does this mean?
Is there a maximum length on regular expressions and do I succeed it with my 22390 characters long pattern?
I don't find anything on the internet about a maximum length on regular expressions.
Somebody, please help!
Kind regards, Sweepee.

I understand that this topic sounds funny to some people :-), but I didn't type the pattern by my self. I use an XML file to describe a text file. In that XML file you define groups of fields within the file with measurement data. Look to this example:
<block name="data record">
<date length="8">date</date>
<time length="5">time</time>
<integer length="8">location_area_code_target_cell</integer>
<integer length="8">location_area_code_origin_cell</integer>
</block>
This fragment of XML describes a data record in the text file. With the definition of those types within a block, a pattern is generated for that block. The example above is a small data record. The bigger a data record gets, the bigger the pattern becomes. I have one measurement file where a data record contains 386 fields, which results in a pattern of 20000+ characters. This is an exception from the other files, which contain much smaller data records. In the other types of text files decoding works great with these generated patterns, but with this one text file with large data records en a large pattern it fails somehow and I still don't know how it comes.

Similar Messages

  • Is there a maximum length for a custom expression?

    Is there a documented maximum length of a custom expression in a sync rule? I have been unable to find one and as of yet, have not hit one. But for curiosity's sake, I was wondering if there is a limit?
    Thanks

    Hello FIM-EN,
    Custom Expression of Synchronization rules are saved to two attributes: Initial Flow and Persistent Flow.
    Both are "Indexed string", so theoretically, the length is unlimited.
    Regards,
    Sylvain

  • Hi Friends, i have to validate the min.length attribute using Regular expression, The min.length accepts single value digit only which accepts =5 & min.length accepts =9 please guide me as soon as possible

    dfdsfds

    User,  we need to know your jdev version!
    It would be helpful if your question would not only the header of the post.
    Can you please rephrase your question? Somehow i don't understand xour question.
    Timo

  • Regular expressions string length

    Hi
    New to regular expressions and trying to create an expressions that will test if a string does not contain any number and is less than 20 characters long! This works for determining if it has any numbers in it, but how do I specify that it can be between 1 and 20 characters in length? The + means >1 but how do I cap it?
    Pattern p = Pattern.compile("[a-zA-Z]+");
             Matcher m = p.matcher(value);
             if (m.matches()) System.out.println("yes");I have tried [a-zA-Z]+{5} but that doesn't work. Any handy hints... can't find much stuff that is applicable via google, frustrating.
    Thanks.

    Looce wrote:
    Pattern p = Pattern.compile("[a-zA-Z]{1,20}"); // repetition totally replaces +, doesn't coexist with it
    Matcher m = p.matcher(value);
    if (m.matches()) System.out.println("yes");
    Note that the following code can be shortened by doing:
    if(value.matches("[a-zA-Z]{1,20}")) { /* ... */ }

  • EL expression for maximum length of column of underlying EO

    Hi all,
    I am using ADF Faces with ADF BC as the model layer. I have a simple inputText component on the screen that shows 32 characters (columns property = 32). The underlying database table can hold 128 characters. Is there any EL expression that I can use for the maximum length? I have tried the EL expression builder, but I don't see any property that looks promising (#{bindings.fieldname.<something>} is what I'm looking for)
    Thanks,
    John

    Chris,
    In 10.1.3.2, we already automatically drop an EL expression for the Columns property that is bound to the Display Width hint.
    It will look like:
                <af:inputText value="#{bindings.Dname.inputValue}"
                              label="#{bindings.Dname.label}"
                              required="#{bindings.Dname.mandatory}"
                              columns="#{bindings.Dname.displayWidth}">
                  <af:validator binding="#{bindings.Dname.validator}"/>
                </af:inputText>If the developer has not supplied a custom value of Display Width, then it defaults to the maximum length of the attribute.
    If you have provided a custom Display Width hint that is shorter than the maximum length, then you can reference the maximum length by using the precision property of the attribute definition, as shown in the EL expression for maximumLength below.
                <af:inputText value="#{bindings.Dname.inputValue}"
                              label="#{bindings.Dname.label}"
                              required="#{bindings.Dname.mandatory}"
                              columns="#{bindings.Dname.displayWidth}"
                              maximumLength="#{bindings.Dname.attributeDef.precision}">
                  <af:validator binding="#{bindings.Dname.validator}"/>
                </af:inputText>

  • Introduction to regular expressions ... continued.

    After some very positive feedback from Introduction to regular expressions ... I'm now continuing on this topic for the interested audience. As always, if you have questions or problems that you think could be solved through regular expression, please post them.
    Having fun with regular expressions - Part 2
    Finishing my example with decimal numbers, I thought about a method to test regular expressions. A question from another user who was looking for a way to show all possible combinations inspired me in writing a small package.
    CREATE OR REPLACE PACKAGE regex_utils AS
      -- Regular Expression Utilities
      -- Version 0.1
      TYPE t_outrec IS RECORD(
        data VARCHAR2(255)
      TYPE t_outtab IS TABLE OF t_outrec;
      FUNCTION gen_data(
        p_charset IN VARCHAR2 -- character set that is used for generation
      , p_length  IN NUMBER   -- length of the generated
      ) RETURN t_outtab PIPELINED;
    END regex_utils;
    CREATE OR REPLACE PACKAGE BODY regex_utils AS
    -- FUNCTION gen_data returns a collection of generated varchar2 elements
      FUNCTION gen_data(
        p_charset IN VARCHAR2 -- character set that is used for generation
      , p_length  IN NUMBER   -- length of the generated
      ) RETURN t_outtab PIPELINED
      IS
        TYPE t_counter IS TABLE OF PLS_INTEGER INDEX BY PLS_INTEGER;
        v_counter t_counter;
        v_exit    BOOLEAN;
        v_string  VARCHAR2(255);
        v_outrec  t_outrec;
      BEGIN
        FOR max_length IN 1..p_length 
        LOOP
          -- init counter loop
          FOR i IN 1..max_length
          LOOP
            v_counter(i) := 1;
          END LOOP;
          -- start data generation loop
          v_exit := FALSE;
          WHILE NOT v_exit
          LOOP
            -- start generation
            v_string := '';
            FOR i IN 1..max_length
            LOOP
              v_string := v_string || SUBSTR(p_charset, v_counter(i), 1);
            END LOOP;
            -- set outgoing record
            v_outrec.data := v_string;
            -- now pipe the result
            PIPE ROW(v_outrec);
            -- increment loop
            <<inc_loop>>
            FOR i IN REVERSE 1..max_length
            LOOP
              v_counter(i) := v_counter(i) + 1;     
              IF v_counter(i) > LENGTH(p_charset) THEN
                 IF i > 1 THEN
                    v_counter(i) := 1;
                 ELSE
                    v_exit := TRUE;  
                 END IF;
              ELSE
                 -- no further processing required
                 EXIT inc_loop;  
              END IF;  
            END LOOP;        
          END LOOP; 
        END LOOP; 
      END gen_data;
    END regex_utils;
    /This package is a brute force string generator using all possible combinations of a characters in a string up to a maximum length. Together with the regular expressions, I can now show what combinations my solution would allow to pass. But see for yourself:
    SELECT *
      FROM (SELECT data col1
              FROM TABLE(regex_utils.gen_data('+-.0', 5))
           ) t
    WHERE REGEXP_LIKE(NVL(REGEXP_SUBSTR(t.col1,
                                         '^([+-]?[^+-]+|[^+-]+[+-]?)$'
                       '^[+-]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)[+-]?$'
    ;You will see some results, which are perfectly valid for my definition of decimal numbers but haven't been mentioned, like '000' or '+.00'. From now on I will also use this package to verify the solutions I'll present to you and hopefully reduce my share of typos.
    Counting and finding certain characters or words in a string can be a tedious task. I'll show you how it's done with regular expressions. I'll start with an easy example, count all spaces in the string "Having fun with regular expressions.":
    SELECT NVL(LENGTH(REGEXP_REPLACE('Having fun with regular expressions', '[^ ]')), 0)
      FROM dual
      ;No surprise there. I'm replacing all characters except spaces with a null string. Since REGEXP_REPLACE assumes a NULL string as replacement argument, I can save on adding a third argument, which would look like this:
    REGEXP_REPLACE('Having fun with regular expressions', '[^ ]', '')So REPLACE will return all the spaces which we can count with the LENGTH function. If there aren't any, I will get a NULL string, which is checked by the NVL function. If you want you can play around by changing the space character to somethin else.
    A variation of this theme could be counting the number of words. Counting spaces and adding 1 to this result could be misleading if there are duplicate spaces. Thanks to regular expressions, I can of course eliminate duplicates.
    Using the old method on the string "Having fun with regular expressions" would return anything but the right number. This is, where Backreferences come into play. REGEXP_REPLACE uses them in the replacement argument, a backslash plus a single digit, like this: '\1'. To reference a string in a search pattern, I have to use subexpressions (remember the round brackets?).
    SELECT NVL(LENGTH(REGEXP_REPLACE('Having  fun  with  regular  expressions', '( )\1*|.', '\1')))
      FROM dual
      ;You may have noticed that I changed from using the "^" as a NOT operator to using the "|" OR operator and the "." any character placeholder. This neat little trick allows to filter all other characters except the one we're looking in the first place. "\1" as backreference is outside of our subexpression since I don't want to count the trailing spaces and is used both in the search pattern and the replacement argument.
    Still I'm not satisfied with this: What about leading/trailing blanks, what if there are any special characters, numbers, etc.? Finally, it's time to only count words. For the purpose of this demonstration, I define a word as one or more consecutive letters. If by now you're already thinking in regular expressions, the solution is not far away. One hint: you may want to check on the "i" match parameter which allows for case insensitive search. Another one: You won't need a back reference in the search pattern this time.
    Let's compare our solutions than, shall we?
    SELECT NVL(LENGTH(REGEXP_REPLACE('Having  fun  with  regular  expressions.  !',
                                     '([a-z])+|.', '\1', 1, 0, 'i')), 0)
      FROM dual;This time I don't use a backreference, the "+" operator (remember? 1 or more) will suffice. And since I want to count the occurences, not the letters, I moved the "+" meta character outside of the subexpression. The "|." trick again proved to be useful.
    Case insensitive search does have its merits. It will only search but not transform the any found substring. If I want, for example, extract any occurence of the word fun, I'll just use the "i" match parameter and get this substring, whether it's written as "Fun", "FUN" or "fun". Can be very useful if you're looking for example for names of customers, streets, etc.
    Enough about counting, how about finding? What if I want to know the last occurence of a certain character or string, for example the postition of the last space in this string "Where is the last space?"?
    Addendum: Thanks to another forum member, I should mention that using the INSTR function can do a reverse search by itself.[i]
    WITH t AS (SELECT 'Where is the last space?' col1
                 FROM dual)
    SELECT INSTR(col1, ' ', -1)
      FROM DUAL;Now regular expressions are powerful, but there is no parameter that allows us to reverse the search direction. However, remembering that we have the "$" meta character that means (until the) end of string, all I have to do is use a search pattern that looks for a combination of space and non-space characters including the end of a string. Now compare the REGEXP_INSTR function to the previous solution:
    SELECT REGEXP_INSTR(t.col1, ' [^ ]*$')                       
      FROM t;So in this case, it'll remain a matter of taste what you want to use. If the search pattern has to look for the last occurrence of another regular expression, this is the way to solve such a requirement.
    One more thing about backreferences. They can be used for a sort of primitive "string swapping". If for example you have to transform column values like swapping first and last name, backreferenc is your friend. Here's an example:
    SELECT REGEXP_REPLACE('John Doe', '^(.*) (.*)$', '\2, \1')
      FROM dual
      ;What about middle names, for example 'John J. Doe'? Look for yourself, it still works.
    You can even use that for strings with delimiters, for example reversing delimited "fields" like in this string '10~20~30~40~50' into '50~40~30~20~10'. Using REVERSE, I would get '05~04~03~02~01', so there has to be another way. Using backreferences however is limited to 9 subexpressions, which limits the following solution a bit, if you need to process strings with more than 9 fields. If you want, you can think this example through and see if your solution matches mine.
    SELECT REGEXP_REPLACE('10~20~30~40~50',
                          '^(.*)~(.*)~(.*)~(.*)~(.*)$',
                          '\5~\4~\3~\2~\1'
      FROM dual;After what you've learned so far, that wasn't too hard, was it? Enough for now ...
    Continued in Introduction to regular expressions ... last part..
    C.
    Fixed some typos and a flawed example ...
    cd

    Thank you very much C. Awaiting other parts.... keep going.
    One german typo :-)
    I'm replacing all characters except spaces mit anull string.I received a functional spec from my Dutch analyst in which it is written
    tnsnames voor EDWH:
    PCESCRD1 = (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)
                                                   (HOST=blah.blah.blah.com)
                                                   (PORT=5227)))
               (CONNECT_DATA=(SID=pcescrd1)))
    db user: BW_I2_VIEWER  / BW_I2_VIEWER_SCRD1Had to look for translators.
    Cheers
    Sarma.

  • Exception while using Regular Expressions

    Hi, im using the following regular expression in my code for pattern matching.
    REGEXP_LIKE(LLT_NAME,''[^[:digit:]^[:alpha:]]'||l_temp_token||'[^[:digit:]^[:alpha:]]|^'||l_temp_token||'[^[:digit:]^[:alpha:]]|[^[:digit:]^[:alpha:]]'||l_temp_token||'$|^'||l_temp_token||'$'',''i'')
    here, l_temp_token is the string to be matched.
    The problem is when length(l_temp_token) is > 100, it is throwing the following Exception
    ORA-12733 regular expression too long
    But my requirement is, the length of input string can be upto 250
    Do anyone know what is the maximum size for regular expressions or is there any way to increase it?
    Thanks in Advance

    Could explain what rule you're trying to verify with this regular expression? Maybe there's an alternative.
    C.

  • Help in query using regular expression

    HI,
    I need a help to get the below output using regular expression query. Please help me.
    SELECT REGEXP_SUBSTR ('PWRPKG(P/W+P/L+CC)', '[^+]+', 1, lvl) val, lvl
    FROM DUAL,(SELECT LEVEL lvl FROM DUAL
    CONNECT BY LEVEL <=(SELECT MAX ( LENGTH ('PWRPKG(P/W+P/L+CC)') - LENGTH (REPLACE ('PWRPKG(P/W+P/L+CC)','+',NULL))+ 1) FROM DUAL));
    I need the output as
    correct result:
    ==============
    val lvl
    P/W 1
    P/L 2
    CC 3
    But i tried the above it is not coming the above result. Please help me where i did a mistake.
    Thanks in advance

    Frank gave you a solution in your other thread. You could simplify it if you are on 11g:
    SQL> select * from table_x
      2  /
    TXT
    TECHPKG(INTELLI CC+FRT SONAR)
    PWRPKG(P/W+P/L+CC)
    select  txt,
            regexp_substr(
                          txt,
                          '(.*\()*([^+)]+)',
                          1,
                          column_value,
                          null,
                          2
                         ) element,
            column_value element_number
      from  table_x,
            table(
                  cast(
                       multiset(
                                select  level
                                  from  dual
                                  connect by level <= regexp_count(txt,'\+') + 1
                       as sys.OdciNumberList
      order by rowid,
               column_value
    TXT                                      ELEMENT    ELEMENT_NUMBER
    TECHPKG(INTELLI CC+FRT SONAR)            INTELLI CC              1
    TECHPKG(INTELLI CC+FRT SONAR)            FRT SONAR               2
    PWRPKG(P/W+P/L+CC)                       P/W                     1
    PWRPKG(P/W+P/L+CC)                       P/L                     2
    PWRPKG(P/W+P/L+CC)                       CC                      3
    SQL>  SY.

  • Help regarding regular expression

    HI All ,
    Please see the following string
    String s = "IF ((NOT NUM4 IS ALPHABETIC ) AND NUM3 IS ALPHABETIC-UPPER AND (NUM5 IS GREATER OR EQUAL TO 3) AND (NUM5 IS NOT GREATER THAN 3) AND (NUM3 GREATER THAN 46) AND (NUM5 GREATER THAN NUM3) OR NUM3 LESS THAN 78) .";
    My problem is: i want to capture the part of this line which contains "ALPHABETIC ,ALPHABETIC-UPPER for ex :NOT NUM4 IS ALPHABETIC , NUM3 IS ALPHABETIC-UPPER.from that I have to capture the word num4 , num3 which are in these phrases only ;from the whole string whereever it exists along with the phrase,Can any one help me out by suggesting something.num4 and num3 are variable names

    I suspect you're right, Sabre, but I can't resist...
    import java.util.regex.*;
    * A rewriter does a global substitution in the strings passed to its
    * 'rewrite' method. It uses the pattern supplied to its constructor, and is
    * like 'String.replaceAll' except for the fact that its replacement strings
    * are generated by invoking a method you write, rather than from another
    * string. This class is supposed to be equivalent to Ruby's 'gsub' when given
    * a block. This is the nicest syntax I've managed to come up with in Java so
    * far. It's not too bad, and might actually be preferable if you want to do
    * the same rewriting to a number of strings in the same method or class. See
    * the example 'main' for a sample of how to use this class.
    * @author Elliott Hughes
    public abstract class Rewriter
      private Pattern pattern;
      private Matcher matcher;
       * Constructs a rewriter using the given regular expression; the syntax is
       * the same as for 'Pattern.compile'.
      public Rewriter(String regularExpression)
        this.pattern = Pattern.compile(regularExpression);
       * Returns the input subsequence captured by the given group during the
       * previous match operation.
      public String group(int i)
        return matcher.group(i);
       * Overridden to compute a replacement for each match. Use the method
       * 'group' to access the captured groups.
      public abstract String replacement();
       * Returns the result of rewriting 'original' by invoking the method
       * 'replacement' for each match of the regular expression supplied to the
       * constructor.
      public String rewrite(CharSequence original)
        this.matcher = pattern.matcher(original);
        StringBuffer result = new StringBuffer(original.length());
        while (matcher.find())
          matcher.appendReplacement(result, "");
          result.append(replacement());
        matcher.appendTail(result);
        return result.toString();
      public static void main(String[] args)
        String s = "IF ((NOT NUM4 IS ALPHABETIC ) " +
                    "AND NUM3 IS ALPHABETIC-UPPER " +
                    "AND (NUM5 IS GREATER  OR EQUAL TO 3) " +
                    "AND (NUM5 IS NOT GREATER THAN 3) " +
                    "AND (NUM3 GREATER THAN 46) " +
                    "AND NUM645 IS ALPHABETIC " +
                    "AND (NUM5 GREATER THAN NUM3) " +
                    "OR NUM3 LESS THAN 78 " +
                    "AND NUM34 IS ALPHABETIC-UPPER " +
                    "AND NUM92 IS ALPHABETIC-LOWER " +
                    "AND NUM0987 IS ALPHABETIC-LOWER) .";
        String result =
          new Rewriter("(NUM\\d+) +IS +(ALPHABETIC(?:-(?:UPPER|LOWER))?)")
            public String replacement()
              String type = group(2);
              if (type.endsWith("UPPER"))
                return "Character.isUpper(" + group(1) + ")";
              else if (type.endsWith("LOWER"))
                return "Character.isLower(" + group(1) + ")";
              else
                return "Character.isLetter(" + group(1) + ")";
          }.rewrite(s);
        System.out.println(result);
    }

  • Help with Regular Expression for field validation

    I'm fairly new to using regular expressions and using Acrobat. This is probably a simple question, but I've been unable to figure it out.
    I have a text field on a PDF that I would like to be 9 characters in length. The first 2 characters can only be alphanumeric, the last 7 characters can only be numeric.
    At first I was using the following, which allows all the characters to be alphanumeric:
    var re = /^[A-Za-z0-9 :\\_]$/;
    if (event.change.length >0) {
    if (event.willCommit == false) {
        if (!re.test(event.change)) {
            event.rc = false
    That works fine, but it's not quite what I needed. With some assistance I changed it (see below) to fit what I was looking for. However, this didn't work; it prevents anything from being entered in the field:
    var re = /^[A-Za-z0-9]{2}\d{7}$/;
    if (event.change.length >0) {
    if (event.willCommit == false) {
        if (!re.test(event.change)) {
            event.rc = false
    Any help would be greatly appreciated.
    Thanks...

    Here's a function you can call form the field's custom Format script. It should be placed in a document-level JavaScript:
    function custom_ks1() {
        // Define non-commited regular expression
        var re = /^[A-Za-z0-9]{0,2}([0-9]{0,7})?$/;
        // Get all of the characters the user has entered
        var value = AFMergeChange(event);
        // Allow field to be cleared
        if(!value) return;
        if (event.willCommit) {
            // Define commited regular expression
            var re = /^[A-Za-z0-9]{2}[0-9]{7}$/;
            if (!re.test(value)) {  // If final value doesn't match, alert user
                app.alert("Your error message goes here.");
                // event.rc = false
        } else {  // not commited
            // Only allow characters that match the regular expression
            event.rc = re.test(value);
    Call it like this:
    // Custom Keystroke script
    custom1_ks();

  • Use regular expressions to extract .llb filename from path

    I am trying to be clever (always a dangerous thing) and use a regular expression to extract the name of a library from a filepath converted to a string.   Whilst I appreciate there are other ways to do this, regex would appear to be a very powerful neat way, should I be able to get it to work.
    i.e if I have a string of the type, C:\applications\versions\library.llb\toplevel.vi, I want to be able to extract library.llb from the string, given that it will be of variable length, may include numbers & spaces and may be within a file hierarchy of variable depth.   In other words, I want to extract the portion of the string between the last \ that ends with .llb
    The best I have managed so far, is \\+.*llb which returned everything minus the drive letter and the toplevel.vi
    Can anyone help me achieve this, or am i better using an alternative method (e.g filepath to array string, search for .llb)
    Thanks
    Matt
    Solved!
    Go to Solution.

    Hi Matt,
    attached you'll find two other options.
    Mike
    Message Edited by MikeS81 on 04-13-2010 01:30 PM
    Attachments:
    Path.PNG ‏7 KB

  • Regular expressions and input streams

    Hello,
    I am trying to find a way to read characters from a stream and find matches in them with regular expressions. The problem is that the stream may contain a big amount of characters, so I can't read all of them, keep them in a big string, and then perform the searches. Is there a way to perform searches while I read the characters? I scanned the documentation of Pattern and Matcher and found nothing that can help.
    Any ideas?
    Thanks.

    Looking at the method
    public Matcher Pattern.matcher(CharSequence input);
    one could be excused for assuming that one could turn an InputStream into a CharSequence object since a 'stream' implies sequencial access and CharSequence sounds like it provides sequential access.
    This would just require the implementation of 4 method of which two (toString() and subSequence()) could probably be ignored BUT the method you wouild have to implement are charAt() and length() which imply random access rather than sequencial access!
    Using the built in regex your problem looks difficult. If you are just looking for 'equality' matches (simple searching) then you could use Knuth-Morris-Pratt algorithm or the Boyer-Moore algorithm.

  • Regular Expressions and comments

    Hello,
    I've got a problem with regular expressions. I Want to find special words like "todo" in java comments, but wasn't able to manage that satisfactory. I hope someone of you can give me an advice! :-)
    example of a comment:
    * comment text. todo: what is still todo.
    First of all, I tried this:
    /\*(.*?)todo(.*?)\*/
    but this version seems to ignore the borders of the comment and shows me also parts of the code.
    Then I tried this:
    /\*([^/]*?)todo(.*?)\*/
    this version returns good results, but doesn't notice html formatting like
    * <code> .... </code> text. todo: text
    So my idea now is to check whether a star precedes the slash, but I don't really know how to combine it.
    Or is there another simplier solution? Thanks for your hints!
    Greetz Jan

    Thanks for your answer, but when I take this expression the programm seems to hang-up. An operation that usually finishs in 3 secs didn't even in 10 minutes. :-( Do you have any idea what could be wrong?
    btw. can I take this one:
    pattern = Pattern.compile("/\\*(?:[^\\*]+|\\*(?!/))*todo.*?\\*/", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
    matcher = pattern.matcher(text);
    text = matcher.replaceAll("");or do I have to take a this instead of replaceAll():
    while (matcher.find()) {
        text = text.substring(0, matcher.start(1)) + text.substring(matcher.end(1), text.length());
        matcher = pattern.matcher(text);
    }any hints?
    Greetz Jan

  • Trying to use regular expressions to convert names to Title Case

    I'm trying to change names to their proper case for most common names in North America (esp. the U.S.).
    Some examples are in the comments of the included code below.
    My problem is that *retName = retName.replaceAll("( [^ ])([^ ]+)", "$1".toUpperCase() + "$2");* does not work as I expect. It seems that the toUpperCase method call does not actually do anything to the identified group.
    Everything else works as I expect.
    I'm hoping that I do not have to iterate through each character of the string, upshifting the characters that follow spaces.
    Any help from you RegEx experts will be appreciated.
    {code}
    * Converts names in some random case into proper Name Case. This method does not have the
    * extra processing that would be necessary to convert street addresses.
    * This method does not add or remove punctuation.
    * Examples:
    * DAN MARINO --> Dan Marino
    * old macdonald --> Old Macdonald &lt;-- Can't capitalize the 'D" because of Ernst Mach
    * ROY BLOUNT, JR. --> Roy Blount, Jr.
    * CAROL mosely-BrAuN --> Carol Mosely-Braun
    * Tom Jones --> Tom Jones
    * ST.LOUIS --> St. Louis
    * ST.LOUIS, MO --> St. Louis, Mo &lt;-- Avoid City Names plus State Codes
    * This is a work in progress that will need to be updated as new exceptions are found.
    public static String toNameCase(String name) {
    * Basic plan:
    * 1. Strategically create double spaces in front of characters to be capitalized
    * 2. Capitalize characters with preceding spaces
    * 3. Remove double spaces.
    // Make the string all lower case
    String retName = name.trim().toLowerCase();
    // Collapse strings of spaces to single spaces
    retName = retName.replaceAll("[ ]+", " ");
    // "mc" names
    retName = retName.replaceAll("( mc)", " $1");
    // Ensure there is one space after periods and commas
    retName = retName.replaceAll("(\\.|,)([^ ])", "$1 $2");
    // Add 2 spaces after periods, commas, hyphens and apostrophes
    retName = retName.replaceAll("(\\.|,|-|')", "$1 ");
    // Add a double space to the front of the string
    retName = " " + retName;
    // Upshift each character that is preceded by a space
    // For some reason this doesn't work
    retName = retName.replaceAll("( [^ ])([^ ]+)", "$1".toUpperCase() + "$2");
    // Remove double spaces
    retName = retName.replaceAll(" ", "");
    return retName;
    Edited by: FuzzyBunnyFeet on Jan 17, 2011 10:56 AM
    Edited by: FuzzyBunnyFeet on Jan 17, 2011 10:57 AM                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               

    Hopefully someone will still be able to provide a RegEx solution, but until that time here is a working method.
    Also, if people have suggestions of other rules for letter capitalization in names, I am interested in those too.
    * Converts names in some random case into proper Name Case.  This method does not have the
    * extra processing that would be necessary to convert street addresses.
    * This method does not add or remove punctuation.
    * Examples:
    * CAROL mosely-BrAuN --&gt; Carol Mosely-Braun
    * carol o'connor --&gt; Carol O'Connor
    * DAN MARINO --&gt; Dan Marino
    * eD mCmAHON --&gt; Ed McMahon
    * joe amcode --&gt; Joe Amcode         &lt;-- Embedded "mc"
    * mr.t --&gt; Mr. T                    &lt;-- Inserted space
    * OLD MACDONALD --&gt; Old Macdonald   &lt;-- Can't capitalize the 'D" because of Ernst Mach
    * old mac donald --&gt; Old Mac Donald
    * ROY BLOUNT,JR. --&gt; Roy Blount, Jr.
    * ST.LOUIS --&gt; St. Louis
    * ST.LOUIS,MO --&gt; St. Louis, Mo     &lt;-- Avoid City Names plus State Codes
    * Tom Jones --&gt; Tom Jones
    * This is a work in progress that will need to be updated as new exceptions are found.
    public static String toNameCase(String name) {
         * Basic plan:
         * 1.  Strategically create double spaces in front of characters to be capitalized
         * 2.  Capitalize characters with preceding spaces
         * 3.  Remove double spaces.
        // Make the string all lower case
        String workStr = name.trim().toLowerCase();
        // Collapse strings of spaces to single spaces
        workStr = workStr.replaceAll("[ ]+", " ");
        // "mc" names
        workStr = workStr.replaceAll("( mc)", "  $1  ");
        // Ensure there is one space after periods and commas
        workStr = workStr.replaceAll("(\\.|,)([^ ])", "$1 $2");
        // Add 2 spaces after periods, commas, hyphens and apostrophes
        workStr = workStr.replaceAll("(\\.|,|-|')", "$1  ");
        // Add a double space to the front of the string
        workStr = "  " + workStr;
        // Upshift each character that is preceded by a space and remove double spaces
        // Can't upshift using regular expressions and String methods
        // workStr = workStr.replaceAll("( [^ ])([^ ]+)", "$1"toUpperCase() + "$2");
        StringBuilder titleCase = new StringBuilder();
        for (int i = 0; i < workStr.length(); i++) {
            if (workStr.charAt(i) == ' ') {
                if (workStr.charAt(i+1) == ' ') {
                    i += 2;
                while (i < workStr.length() && workStr.charAt(i) == ' ') {
                    titleCase.append(workStr.charAt(i++));
                if (i < workStr.length()) {
                    titleCase.append(workStr.substring(i, i+1).toUpperCase());
            } else {
                titleCase.append(workStr.charAt(i));
        return titleCase.toString();
    {code}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       

  • What is maximum length .mov file that can be used in iDVD?

    Hello Appleheads
    I created an HD .mov file in iMovie which I had added as a movie in iDVD - I find this works best for me, as exporting directly to iDVD from iMovie always results in stuttering when played back on a regular DVD player.
    The .mov file is 2 hours and 14 minutes long, and 10.5GB
    Am I right in thinking that the maximum length is 1 hour and 39 minutes? And does the size (GBs) have any bearing.
    Many thanks - Brion

    Hi Brion
    Gbs or Mbs - doesn't play any role what so ever.
    Time does - Time is = Movie duration + MENU DURATION
    (a complex menu can easily take 15 minutes or even more)
    iDVD 08, 09 & 11 has three levels of qualities. (vers 7.0.1, 7,0.4 & 7.1.1)
       iDVD 6 has the two last ones
    • Professional Quality (movies + menus up to 120 min.) - BEST
    • Best Performances (movies + menus  less than 60 min.) - High quality on final DVD
    • High Quality (in iDVD08 or 09) / Best Quality (in iDVD6) (movies + menus up to 120 min.) - slightly lower quality than above
    About double on DL DVDs.
    Yours Bengt W

Maybe you are looking for