Extracting Content with Regular Expressions

Hi all,
I am trying to extract some content from a text file using regular expressions, but I cannot get the exact match I expect.
Here is the sample text:
The passage begins like that (something in text form (some
more embedded text) remaining outer text).
I expect the following match:
(something in text form (some
more embedded text) remaining outer text)
but my regular expression returns only a truncated section of the text:
(something in text form (some
more embedded text)
How can I fix this, i.e. how can I match up to the closing parenthesis that balances the first opening parenthesis?
Here is my broken regular expression: (?:\((?:\\.|[^\)\\])*\))
Thanks for any help!

Cross-posted. See duplicate http://forum.java.sun.com/thread.jspa?threadID=653042&tstart=0
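
The pattern above stops early because [^\)\\] also accepts "(", so the match ends at the first ")". java.util.regex has no recursion construct, so one common workaround is to count nesting depth by hand. The following is only a minimal sketch under that assumption; the class and method names (BalancedParens, firstBalanced) are made up for illustration and are not from the thread.

```java
class BalancedParens {
    /**
     * Returns the first balanced parenthesized span in text, including the
     * outer parentheses, or null if there is none or it is never closed.
     * Simplification: an escaped "(" before the first real one is not handled.
     */
    static String firstBalanced(String text) {
        int start = text.indexOf('(');
        if (start < 0) {
            return null;
        }
        int depth = 0;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character, like \\. in the original regex
            } else if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {
                    return text.substring(start, i + 1);
                }
            }
        }
        return null; // opening parenthesis was never balanced
    }

    public static void main(String[] args) {
        String sample = "The passage begins like that (something in text form (some\n"
                + "more embedded text) remaining outer text).";
        System.out.println(firstBalanced(sample));
    }
}
```

Because the scan works on characters rather than lines, the line break inside the sample text needs no special handling.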

Similar Messages

  • Extract values with Regular Expression

    How can I extract the values inside [ ] using a regular expression?
    With data As (
    Select 'AAAAAA[10] AAA: 19C' Txt From Dual Union all
    Select 'XX[450]-10A' Txt From Dual Union all
    Select '[5]AVC19C' Txt From Dual Union all
    Select 'FVD[120]D2AC' Txt From Dual
    )
    I hope this returns:
    10
    450
    5
    120
    Thanks in advance

    user11118871 wrote:
    Thanks for all.
    Hi BluShadow, can you explain what this is doing!?
    Thanks

    Sure.
    Search for ----------------|           /------- Replace the found pattern in the search string with backreference 1
                         /------------\   /\       (the first thing backreferenced in the search string)
    regexp_replace(txt, '^.*\[(.*)\].*$','\1')
                         |\/\/\--/\/\/|
                         | \ \  /  \ \|
                         | | |  |  | |\- End of string
                         | | |  |  | |
                         | | |  |  | \- Any number of characters
                         | | |  |  |
                         | | |  |  \- Right square bracket (escaped)
                         | | |  |
                         | | |  \- any characters (backreferenced by round brackets)
                         | | |
                         | | \- Left square bracket (escaped)
                         | |
                         | \- Any number of characters
                         |
                         \- Start of string

    Edited by: BluShadow on Jun 29, 2009 2:19 PM
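
For comparison, the same pattern can be tried outside Oracle. This is a rough Java sketch, not part of the original answer; the class name BracketValue is hypothetical. The only syntax difference is that Java writes backreference 1 in the replacement as $1 instead of \1.

```java
class BracketValue {
    // Same pattern as the regexp_replace call: keep only what is inside [ ].
    static String extract(String txt) {
        return txt.replaceAll("^.*\\[(.*)\\].*$", "$1");
    }

    public static void main(String[] args) {
        String[] rows = {"AAAAAA[10] AAA: 19C", "XX[450]-10A", "[5]AVC19C", "FVD[120]D2AC"};
        for (String row : rows) {
            System.out.println(extract(row));
        }
    }
}
```

Note that a row with no [ ] at all is returned unchanged, because the pattern does not match and nothing is replaced.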

  • Grouping & Back-references with regular expressions on Replace Text window

    I really appreciate the inclusion of the Regular Expressions in the search & replace feature. One thing I am missing is back-references in the replacement expression. For instance, in the unix tools vi or sed, I might do something like this:
    s/\(firstPart\) \(secondPart\) \(oldThirdPart\)/\2 \1 newThirdPart/g
    which would allow me to switch the places of firstPart and secondPart, and totally replace thirdPart. If grouping and back-references are already present in the Replace Text window, how does one correctly invoke them?

    duplicate of Grouping & Back-references with regular expressions on Replace Text window
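
Whether the Replace Text window supports this is not answered here. Purely as an illustration of the grouping idea, this is how the same swap looks with java.util.regex, where groups are plain ( ) and the replacement refers to them as $1, $2. The class name SwapGroups is made up, and the assumption that the tool follows Java's replacement syntax is unverified.

```java
class SwapGroups {
    // Swap the first two groups and replace the third, like the sed example.
    static String swap(String line) {
        return line.replaceAll("(firstPart) (secondPart) (oldThirdPart)",
                               "$2 $1 newThirdPart");
    }

    public static void main(String[] args) {
        System.out.println(swap("firstPart secondPart oldThirdPart"));
    }
}
```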

  • Assistance with Regular Expression and Tcl

    Hello Everyone,
      I recently began learning Tcl to develop scripts for automating network switch deployments. 
    In my script, I want to name the device with a location and the last three octets of the base mac address.
    I can get the Base MAC address by : 
    show version | include Base
     Base ethernet MAC Address       : 00:00:00:DB:CE:00
    And I can get the last three octets of the MAC address using the following regular expression. 
    ([0-9a-f]{2}[:-]){2}([0-9a-f]{2}$)
    But I have not been able to figure out how to call the regular expression in the tcl script.
    I have checked several resources but have not been able to figure it out.  Suggestions?
    Ultimately, I want to set the last three octets to a variable (something like below) and then call the variable when I name the switch.
    set mac [exec "sh version | i Base"] (include the regular expression)
    ios_config "hostname location$mac"
    Thanks for any assistance in advance.
    Chris

    This worked for me.
    Switch_1(tcl)#set result [exec show ver | inc Base]   
    Base ethernet MAC Address       : 00:1B:D4:F8:B1:80
    Switch_1(tcl)#regexp {([0-9A-F:]{8}\r)} $result -> mac
    1
    Switch_1(tcl)#puts $mac                               
    F8:B1:80
    Switch_1(tcl)#ios_config "hostname location$mac"      
    %Warning! Hostname should contain at least one alphabet or '-' or '_' character
    locationF8:B1:80(tcl)#
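
One detail worth noting: the expression in the question uses [0-9a-f] only, while the switch prints the MAC address in upper case, so it needs either A-F in the class or a case-insensitive match. Below is a small Java sketch of the extraction logic for testing the pattern off the switch. It is not Tcl and not from the thread; the names MacSuffix and lastThreeOctets are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MacSuffix {
    // Last three octets of a colon- or hyphen-separated MAC address at the
    // end of the line; \s* tolerates a trailing carriage return.
    private static final Pattern LAST_THREE = Pattern.compile(
            "((?:[0-9a-f]{2}[:-]){2}[0-9a-f]{2})\\s*$", Pattern.CASE_INSENSITIVE);

    static String lastThreeOctets(String line) {
        Matcher m = LAST_THREE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "Base ethernet MAC Address       : 00:1B:D4:F8:B1:80";
        System.out.println("location" + lastThreeOctets(line));
    }
}
```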

  • How to search with regular expression

    I make pdx files so that I can search text quickly. But Acrobat doesn't provide a way to search with regular expression. I'm wondering if there is a way that I don't know to search for regular expression in Acrobat Pro 9?

    First, Acrobat must "mount" the PDX.
    As "Find" does not use the cataloged index, use Shift+Ctrl+F to open the advanced search dialog.
    It may be helpful to first enter Acrobat Preferences and for the Search category tick "Always use advanced search options".
    Back to the Search dialog - use the drop down menu for "Look In" to pick "Select Index" then, if no PDXs show, click the Add button.
    In the Open Index File dialog, browse to the location of the desired PDX and select it.
    OK out and use "Return results containing" to pick a "Match ..." requirement or Boolean.
    To become familiar with query syntax, for Acrobat, it is good to review Acrobat Help.
    http://help.adobe.com/en_US/Acrobat/9.0/Professional/WS58a04a822e3e50102bd615109794195ff-7c4b.w.html
    Be well...

  • Problem with Regular Expression

    Hi There!!
    I have a problem with a regular expression. I want to validate that the first word and the second word are the same. For that I have written this regex:
    Pattern p=Pattern.compile("([a-z][a-zA-Z]*)\\s\1");
    Matcher m=p.matcher("nikhil nikhil");
    boolean t=m.matches();
    if (t)
        System.out.println("There is a match");
    else
        System.out.println("There is no match");
    The result I am getting is always "There is no match".
    Your timely help will be much appreciated.
    Regards

    Ram wrote:
    ErasP wrote:
    You are missing a backslash in the regex
    Pattern p = Pattern.compile("([a-z][a-zA-Z]*)\\s\\1");
    But this will fail in this case.
    Matcher m = p.matcher("Nikhil Nikhil");
    The reason for that is [a-z].

    The OP had [a-z][a-zA-Z]* in his code, so presumably he knows what that means and wants that String not to match.
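
To make the escaping point concrete: in a Java string literal "\1" is an octal escape for the character U+0001, so the regex engine never sees a backreference, while "\\1" reaches it as \1. A minimal sketch follows; the class name RepeatedWord is only illustrative.

```java
import java.util.regex.Pattern;

class RepeatedWord {
    // "\\1" in the literal becomes the regex backreference \1.
    private static final Pattern SAME_WORD_TWICE =
            Pattern.compile("([a-z][a-zA-Z]*)\\s\\1");

    static boolean isRepeated(String s) {
        return SAME_WORD_TWICE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isRepeated("nikhil nikhil")); // lower-case first letter
        System.out.println(isRepeated("Nikhil Nikhil")); // rejected by the leading [a-z]
    }
}
```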

  • Get the string between li tags, with regular expression

    I have an unordered list, and I want to store all the strings between the li tags (<li>.?</li>) in an array:
    <ul>
    <li>This is String One</li>
    <li>This is String Two</li>
    <li>This is String Three</li>
    </ul>
    This is what I have so far:
    <li>(.*?)</li>
    but it is not correct, I only want the string without the li tags.
    Thanks.

    No one?
    Anyone here experienced with regular expressions?
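
The pattern <li>(.*?)</li> already isolates the text; what is usually missing is reading capture group 1 instead of the whole match. The thread does not say which language is in use, so this Java sketch is only one possible reading, and ListItems and texts are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListItems {
    // DOTALL lets an item span several lines; the reluctant .*? stops at the
    // first closing tag.
    private static final Pattern LI = Pattern.compile("<li>(.*?)</li>", Pattern.DOTALL);

    static List<String> texts(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = LI.matcher(html);
        while (m.find()) {
            out.add(m.group(1)); // group 1 = text without the li tags
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<ul>\n<li>This is String One</li>\n<li>This is String Two</li>\n"
                + "<li>This is String Three</li>\n</ul>";
        System.out.println(texts(html));
    }
}
```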

  • Need help with regular expression

    I'm trying to use the java.util.regex package to extract URLs from html files.
    The URLs that I am interested in extracting from the HTML look like the following:
    <font color="#008000">http://forum.java.sun.com -
    So, the URL is always preceded by:
    <font color="#008000">
    and then followed by a space character and then a hyphen character. I want to be able to put all these URLs in a Vector object. This doesn't seem like it should be too difficult but for some reason I can't get anywhere with it. Any help would be greatly appreciated. Thanks!

    Hi gupta, I am not sure of the Java syntax, but I can tell you about the regular expression. Try this:
    <font color="#008000">(http:\/\/[a-zA-Z0-9.]+) [-]
    I don't know the Java methods to call, just the regular expression.
    Sanjay Acharya
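
For the Java side that the reply leaves open, a sketch along these lines might work. It assumes the URL contains no whitespace and uses \S+? instead of [a-zA-Z0-9.]+ so that paths and query strings are kept; it also collects into a List rather than a Vector. FontUrls and extract are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FontUrls {
    // URL preceded by the green font tag and followed by " -".
    private static final Pattern URL_IN_FONT =
            Pattern.compile("<font color=\"#008000\">(http://\\S+?) -");

    static List<String> extract(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_IN_FONT.matcher(html);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(extract("<font color=\"#008000\">http://forum.java.sun.com - "));
    }
}
```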

  • How to find substring with regular expression?

    How can I find a substring in a string with a regular expression?
    Example:
    I have an original string "<tr><th>RecordId: </th><td valign=middle>A4711</td></tr>"
    Now I want to extract the value "A4711" from this string with a regular expression. Everything except "A4711" is fixed; the id "A4711" itself is dynamic. How can I get the substring "A4711" of the original string with a regular expression?

    I wrote a little method with the info above to get such results:
        // Requires java.util.Vector, java.util.regex.Matcher and java.util.regex.Pattern.
        /**
         * Get all substrings of a string that match a regular expression.
         * @param original String to inspect.
         * @param regExp Regular expression as search criteria.
         * @return All matches of <i>regExp</i> or null if one input parameter is null.
         */
        public static String[] getSubstrings(String original, String regExp) {
            String[] result = null;
            if (original != null && regExp != null) {
                Pattern pattern = Pattern.compile(regExp);
                Matcher matcher = pattern.matcher(original);
                boolean matchFound = matcher.find();
                Vector matches = new Vector();
                while (matchFound) {
                    String match = matcher.group();        
                    matches.addElement(match);
                    matchFound = matcher.find();
                }//next match
                int count = matches.size();
                result = new String[count];
                for (int i = 0; i < count; i++) {
                    result[i] = (String) matches.elementAt(i);
                }//next match
            }//else: input unavailable
            return result;
        }//getSubstrings()
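
getSubstrings() returns whole matches via matcher.group(), so a pattern that includes the fixed markup would return the markup too. To get only the id, one option is a capture group around the dynamic part. This is a sketch assuming the surrounding markup is exactly as in the question; RecordId and extract are made-up names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordId {
    // The fixed markup is matched literally; only the id is captured.
    private static final Pattern ID = Pattern.compile(
            "<th>RecordId: </th><td valign=middle>([^<]*)</td>");

    static String extract(String row) {
        Matcher m = ID.matcher(row);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract(
                "<tr><th>RecordId: </th><td valign=middle>A4711</td></tr>"));
    }
}
```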

  • CFFORM (Flash) Validation with Regular Expressions Not Working

    I am having troubles getting regular expression validation to
    work in a CFFORM. The below code is an extract of a much larger
    form, the first name and last name have a regular expression
    validation...and it doesn't work!
    I'd appreciate any comments/info for help on this, have
    searched high and low on information to get this working...but no
    joy.
    The code is:
    <cffunction name="checkFieldSet" output="false"
    returnType="string">
    <cfargument name="fields" type="string" required="true"
    hint="Fields to search">
    <cfargument name="form" type="string" required="true"
    hint="Name of the form">
    <cfargument name="ascode" type="string" required="true"
    hint="Code to fire if all is good.">
    <cfset var vcode = "">
    <cfset var f = "">
    <cfsavecontent variable="vcode">
    var ok = true;
    var msg = "";
    <cfloop index="f" list="#arguments.fields#">
    <cfoutput>
    if(!mx.validators.Validator.isValid(this,
    '#arguments.form#.#f#')) { msg = msg + #f#.errorString + '\n';
    ok=false; }
    </cfoutput>
    </cfloop>
    </cfsavecontent>
    <cfset vcode = vcode & "if(!ok)
    mx.controls.Alert.show(msg,'Validation Error'); ">
    <cfset vcode = vcode & "if(ok) #ascode#">
    <cfset vcode =
    replaceList(vcode,"#chr(10)#,#chr(13)#,#chr(9)#",",,")>
    <cfreturn vcode>
    </cffunction>
    <cfform name="new_form" format="flash" width="600"
    height="600" skin="halosilver" action="new_data.cfc">
    <cfformgroup type="panel" label="New Form"
    style="background-color:##CCCCCC;">
    <cfformgroup type="tabnavigator" id="tabs">
    <cfformgroup type="page" label="Step 1">
    <cfformgroup type="hbox">
    <cfformgroup type="panel" label="Requestor Information"
    style="headerHeight: 13;">
    <cfformgroup type="vbox">
    <cfinput type="text" name="reqName" width="300"
    label="First Name:" validate="regular_expression" pattern="[^0-9]"
    validateat="onblur" required="yes" message="You must supply your
    First Name.">
    <cfinput type="text" name="reqLname" width="300"
    label="Last Name:" validate="regular_expression" pattern="[^0-9]"
    validateat="onblur" required="yes" message="You must supply your
    Last Name.">
    <cfinput type="text" name="reqEmail" width="300"
    label="Email:" validate="email" required="yes" message="You must
    supply your email or the address given is in the wrong format.">
    <cfinput type="text" name="reqPhone" width="300"
    label="Phone Extension:" validate="integer" required="yes"
    maxlength="4" message="You must supply your phone number.">
    </cfformgroup>
    </cfformgroup>
    </cfformgroup>
    <cfformgroup type="horizontal"
    style="horizontalAlign:'right';">
    <cfinput type="button" width="100" name="cnt_step2"
    label="next" value="Next"
    onClick="#checkFieldSet("reqName,reqLname,reqEmail,reqPhone","new_form","tabs.selectedIndex=tabs.selectedIndex+1")#"
    align="right">
    </cfformgroup>
    </cfformgroup>
    </cfformgroup>
    </cfformgroup>
    </cfform>

    quote:
    Originally posted by: Luckbox72
    The problem is not the Regex. I have tested 3 or 4 different
    versions that all work on the many different test sites. The
    problem is that the validation does not seem to work. I have
    changed the pattern to only allow NA and I can still type anything
    into the text box. Is there some issue with using Regex as your
    validation?

    Bear in mind that by default validation does not occur until
    the user attempts to submit the form. If you are trying to control
    the characters that the user can enter into the textbox, as opposed
    to validating what they have entered, you will need to provide your
    own javascript validation.

  • DOM Parser fails with regular expression using anchor (caret, dollar)

    I'm using version "Oracle XDK Java 9.0.4.0.0 Production"
    In trying to parse XML against schema: a regular expression fails to parse the data "8:00" with the following simple regular expression: "^.*$" (used to narrow the error)
    The error message is
    <Line 14, Column 25>: XSD-2025: (Error) Invalid text '8:00' in element: 'XYZ'
    If I remove the anchors and just have ".*", the data is parsed successfully.
    I don't understand why the parse fails when I use anchors in the regular expression, while the Java Pattern/Matcher classes succeed with the anchors.

    That "ns670" string is an xml namespace prefix. it should have a corresponding xml namespace declaration somewhere in the xml document (i'm guessing you have not shown the whole document). the actual value of an xml namespace prefix is meaningless. if you parse the xml with a namespace aware DOM parser, it should generate Nodes with the correct namespace. the namespace is the value you care about when extracting data from the document, not the namespace prefix.
    alternately, if you parse the document using a namespace aware DOM parser, you can just look for nodes based on their "local" name (the part after the ":" separator) and ignore the namespace/prefix.
    whatever you do, please do not parse the xml with a regex, see this http://stackoverflow.com/a/1732454/552759 for details (applies to xml as well).
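
On the anchor question itself: as far as I know, the XML Schema regular expression dialect always matches the whole value and does not treat ^ and $ as anchors, so "^.*$" would demand a literal ^ at the start and a literal $ at the end. The following Java sketch only emulates that reading with java.util.regex; it is not what the Oracle XDK does internally, it ignores ^ inside character classes, and XsdAnchors and xsdLikeMatches are invented names.

```java
import java.util.regex.Pattern;

class XsdAnchors {
    // Naive emulation: escape ^ and $ so they become literal characters,
    // then require the whole value to match. Patterns with [^...] classes
    // are not handled by this sketch.
    static boolean xsdLikeMatches(String xsdPattern, String value) {
        String javaPattern = xsdPattern.replace("^", "\\^").replace("$", "\\$");
        return Pattern.matches(javaPattern, value);
    }

    public static void main(String[] args) {
        System.out.println(xsdLikeMatches("^.*$", "8:00"));
        System.out.println(xsdLikeMatches(".*", "8:00"));
        System.out.println(Pattern.matches("^.*$", "8:00")); // plain Java semantics
    }
}
```

Under this reading, dropping the anchors loses nothing, since the schema pattern is implicitly anchored at both ends.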

  • Problem with Regular Expressions

    Hi Everyone:
    I'm having a problem finding the easiest way to retrieve the replacement text that has been edited to insert back-references from previous matches. For instance,
    Let's say I want to use the below regular expression
    foo_bar([1-9])_fun
    on the search text
    foo_bar1_fun
    foo_bar2_fun
    foo_bar3_fun
    and then the replacement text would be
    foobar_fun$1
    so in the end I would get
    foobar_fun1
    foobar_fun2
    foobar_fun3
    What I would like to do is be able to extract the replacement text that has been modified with the back reference matches after I use the Matcher.appendReplacement(stringbuffer, string) method.
    So to clarify further, after I find the first match and use the appendReplacement Method, I would like to extract just the replacement that was made to the text, in this case foobar_fun1
    Thanks for any help!

    Alright, thanks for the reply. I'll try and make this a little more clear
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    public class LeClassName {
       public static void main(String[] args) {
          String input = "12341234foo_bar1_fun24342345foo_bar2_fun3522545foo_bar3_fun3423432";
          Pattern pattern = Pattern.compile("foo_bar([1-9])_fun");
          StringBuffer sb = new StringBuffer();
          Matcher matcher = pattern.matcher(input);
          while (matcher.find()) {
             matcher.appendReplacement(sb, "foobar_fun$1");
             //after the first pass through, I would like to extract, foobar_fun1
             // but what is in sb is 12341234foobar_fun1
          }
          System.out.println(sb.toString());
       }
    }

    I did find a solution myself after a bit of thinking, but if anyone can come up with a better solution, do tell and I'll award that person the answer. Thanks again!
    My Solution:
    private String fixBackReplacements() throws IndexOutOfBoundsException {
                String currentMatch = this.findMatch.group();
                Pattern temporaryPattern = Pattern.compile(findText);
                Matcher temporaryMatcher = temporaryPattern.matcher(currentMatch);
                return temporaryMatcher.replaceFirst(replaceText);
    }
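
Another way to get at each expanded replacement, without compiling a second pattern, is to note how much of the StringBuffer appendReplacement fills with skipped input and take only what comes after it. This is a sketch under the assumption that every match goes through appendReplacement in order; ReplacementCapture and replacements are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementCapture {
    static List<String> replacements(String input, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        List<String> made = new ArrayList<>();
        int lastEnd = 0; // where the previous appendReplacement stopped reading input
        while (matcher.find()) {
            int before = sb.length();
            int skipped = matcher.start() - lastEnd; // unmatched text copied first
            matcher.appendReplacement(sb, replacement);
            made.add(sb.substring(before + skipped)); // only the expanded replacement
            lastEnd = matcher.end();
        }
        matcher.appendTail(sb);
        return made;
    }

    public static void main(String[] args) {
        String input = "12341234foo_bar1_fun24342345foo_bar2_fun3522545foo_bar3_fun3423432";
        System.out.println(replacements(input, "foo_bar([1-9])_fun", "foobar_fun$1"));
    }
}
```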

  • Help with Regular Expression for field validation

    I'm fairly new to using regular expressions and using Acrobat. This is probably a simple question, but I've been unable to figure it out.
    I have a text field on a PDF that I would like to be 9 characters in length. The first 2 characters can only be alphanumeric, the last 7 characters can only be numeric.
    At first I was using the following, which allows all the characters to be alphanumeric:
    var re = /^[A-Za-z0-9 :\\_]$/;
    if (event.change.length >0) {
        if (event.willCommit == false) {
            if (!re.test(event.change)) {
                event.rc = false;
            }
        }
    }
    That works fine, but it's not quite what I needed. With some assistance I changed it (see below) to fit what I was looking for. However, this didn't work; it prevents anything from being entered in the field:
    var re = /^[A-Za-z0-9]{2}\d{7}$/;
    if (event.change.length >0) {
        if (event.willCommit == false) {
            if (!re.test(event.change)) {
                event.rc = false;
            }
        }
    }
    Any help would be greatly appreciated.
    Thanks...

    Here's a function you can call from the field's custom Keystroke script. It should be placed in a document-level JavaScript:
    function custom_ks1() {
        // Regular expression for partial (not yet committed) input
        var re = /^[A-Za-z0-9]{0,2}([0-9]{0,7})?$/;
        // Get all of the characters the user has entered
        var value = AFMergeChange(event);
        // Allow field to be cleared
        if (!value) return;
        if (event.willCommit) {
            // Regular expression for the final (committed) value
            re = /^[A-Za-z0-9]{2}[0-9]{7}$/;
            if (!re.test(value)) {  // If final value doesn't match, alert user
                app.alert("Your error message goes here.");
                // event.rc = false
            }
        } else {  // not committed
            // Only allow characters that match the regular expression
            event.rc = re.test(value);
        }
    }
    Call it like this:
    // Custom Keystroke script
    custom_ks1();
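
The reason the second attempt in the question blocks all input seems to be that event.change holds only the newly typed characters, and a single character can never match a 9-character pattern. The two-pattern idea (one for the text typed so far, one for the finished value) can be checked outside Acrobat. This Java sketch is only for testing the expressions and assumes nothing about the Acrobat API; FieldFormat and its method names are hypothetical.

```java
import java.util.regex.Pattern;

class FieldFormat {
    // Finished value: 2 alphanumerics followed by 7 digits.
    private static final Pattern COMPLETE = Pattern.compile("[A-Za-z0-9]{2}[0-9]{7}");
    // Text typed so far: any prefix of a finished value.
    private static final Pattern PARTIAL = Pattern.compile("[A-Za-z0-9]{0,2}[0-9]{0,7}");

    static boolean isComplete(String value) {
        return COMPLETE.matcher(value).matches();
    }

    static boolean isValidPrefix(String value) {
        return PARTIAL.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isComplete("A"));          // one keystroke fails the full pattern
        System.out.println(isValidPrefix("A"));       // but is acceptable while typing
        System.out.println(isComplete("AB1234567"));
    }
}
```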

  • Help with regular expression needed

    Hi,
    Perhaps someone here can help me with my regular expression I'm trying to build in my Java code.
    The regular expression that I'm looking to build consists of any non-whitespace character up until it finds one or two <>= symbols and then any character thereafter. So both these Strings would match the expression:
    City 1==London
    Age>=18
    The regular expression that I'm using is as follows:
    (\\S+)([><=]){1,2}(.+)
    However, group 1 always retrieves the first <>= symbol as in "City 1=". How can I make the <>= part greedy so that it retrieves both operator symbols?
    Thanks.

    Make the first group, the non-spaces, reluctant:
    "(\\S+?)([<>=]{1,2})(.+)"

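A quick way to see the difference between the two patterns is to print the groups side by side. This is a small sketch with made-up names (OperatorSplit, split); it uses find(), so for "City 1==London" group 1 holds only "1", because \S cannot cross the space.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorSplit {
    // Original pattern: greedy \S+ swallows the first operator symbol.
    static final Pattern GREEDY = Pattern.compile("(\\S+)([><=]){1,2}(.+)");
    // Suggested fix: reluctant \S+? and the quantifier moved inside group 2.
    static final Pattern RELUCTANT = Pattern.compile("(\\S+?)([<>=]{1,2})(.+)");

    static String[] split(Pattern pattern, String s) {
        Matcher m = pattern.matcher(s);
        if (!m.find()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", split(GREEDY, "Age>=18")));
        System.out.println(String.join(" | ", split(RELUCTANT, "Age>=18")));
    }
}
```

Moving {1,2} inside the parentheses matters as well: with ([><=]){1,2} the group only remembers the last repetition, so it could never hold both symbols.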
  • Help with regular expression to find a pattern in clob

    Can someone help me write a regular expression to query a CLOB that contains XML-type data?
    The query should find multiple occurrences of a variable string (i.e. <EMPID-XX>, where XX can be any number). If <EMPID-01> appears twice in the CLOB I want the result EMPID-01,2, and if EMPID-02 appears 4 times I want the result EMPID-02,4.

    with
    ofx_clob as
    (select q'~
    <EMPID>1
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>2
    < UNQID>123457
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>1
    < UNQID>123458
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    ~' ofx from dual
    )
    select '<EMPID>' || to_char(ids) || '(' || to_char(count(*)) || ')' multi_empid
      from (select replace(regexp_substr(ofx,'<EMPID>\d*',1,level),'<EMPID>') ids
              from ofx_clob
            connect by level <= regexp_count(ofx,'<EMPID>')
           )
    group by ids having count(*) > 1

    MULTI_EMPID
    <EMPID>1(2)

    with
    ofx_clob as
    (select q'~
    <EMPID>1
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>2
    < UNQID>123457
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>1
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>2
    < UNQID>123456
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    <EMPID>1
    < UNQID>123458
    < TIMESTAMP>...
    < ADDRINFO>
    < TITLE>^@~*
    < FIRST>ABCD
    < MI>
    < LAST>EFGH
    < ADDR1>ADDR1
    < ADDR2>^@~*
    < CITY>CITY
    ~' ofx from dual
    )
    select '<EMPID>' || listagg(to_char(ids) || '(' || to_char(count(*)) || ')',',') within group (order by ids) multi_empid
      from (select replace(regexp_substr(ofx,'<EMPID>\d*',1,level),'<EMPID>') ids
              from ofx_clob
            connect by level <= regexp_count(ofx,'<EMPID>')
           )
    group by ids having count(*) > 1

    MULTI_EMPID
    <EMPID>1(3),2(2)

    Regards
    Etbin
    Message was edited by: Etbin
    used listagg to report more than one multiple <EMPID>
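
If the CLOB is read into an application instead of being queried in SQL, the same counting can be done client-side. This Java sketch is not part of Etbin's answer and makes the same assumption as the queries above, namely that ids appear as <EMPID> followed by digits; EmpIdCounts and repeated are made-up names.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmpIdCounts {
    private static final Pattern EMPID = Pattern.compile("<EMPID>(\\d+)");

    // Returns id -> number of occurrences, keeping only ids seen more than once.
    static Map<String, Integer> repeated(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = EMPID.matcher(text);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        counts.values().removeIf(n -> n < 2); // same filter as "having count(*) > 1"
        return counts;
    }

    public static void main(String[] args) {
        String ofx = "<EMPID>1\n< UNQID>123456\n<EMPID>2\n< UNQID>123457\n<EMPID>1\n< UNQID>123458\n";
        System.out.println(repeated(ofx));
    }
}
```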
