Regular expressions over objects?

Hello everybody,
following problem: I'm working on a project related to natural language processing and I need to match patterns against sentences. The patterns would be like the following example: "<NOUN> is just <VERB> with <NOUN> (and <CONNECTIVE> <NOUN>)*". Thus, each token in the regexp represents either a word or a class of words. Each token (word) would be represented as an object and it is possible to define equality relation on these objects. Now, does anybody here know about a regexp library, which would allow this? All the regexp libraries I've seen sofar are only working on a text. But I need a library, which would work on sequencies of objects, and the programmer would provide a method saying whether two objects are equal or not. Is there any such library? I don't have time to write it for myself.

Actually, I think you need EBNF not regex.
EBNF - Extended Backus-Naur Form - deals with definitions which include synbols which then have their own definitions. A 'rule' is a mapping from a 'symbol' to an 'expression'. A set of rules is known as a 'grammar'. Some of the expressions could contain regex if it respresents a String.
regex - Regular Expressions - are just for text patterns.
I haven't seen any EBNF (or BNF) programs, but I haven't searched too much for them either. Hopefully this will help point you toward more successfull searches.
I have written some EBNF representations in Java, but they were for specialized uses with other code as part of the classes and so they could be applied generally to your needs.

Similar Messages

  • Regular Expression Validator - want to edit an object acct field

    The field is a character field. But the field must be editted as follows:
    Field must either be 4 numeric numbers with the first position always an 8 so it could be a range from 8000 to 8999
    But I must also allow them to enter "8%" or "8n%" or "8nn%" where n represents a number. They want me to allow wild cards.
    How would I define this edit on the same field.
    1) First Validation: If it has a "%" then the first digit must be an 8 then I can have any of the wild card combinations given above.
    2) Second Validation: If no wild card then the field must be in the range from 8000-8999
    Is this validation a candidate for Regular Expression Validator or for my own custom method.
    If I can code this as a regular expression validator how would I code it. I spent the last day looking on the web for Regular Expression Validator examples.

    I get the following errors that occur. I can key 8% or 8ddd but when I try these combinations 8d% or 8dd% then the following errors pop up. It gives multiple error but I am just keying on a single line. I can add multiple object lines but just one at a time.
    java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    - JBO-ObjectCode_Rule_1: Object must be in range 8000-8999 or object must be in form of 8%, 8n%, 8nn% where n represents a number.
    - JBO-27008: Attribute set for ObjectCode in view object AccountsecuritygroupobjView2 failed
    - JBO-ObjectCode_Rule_1: Object must be in range 8000-8999 or object must be in form of 8%, 8n%, 8nn% where n represents a number.
    - JBO-ObjectCode_Rule_1: Object must be in range 8000-8999 or object must be in form of 8%, 8n%, 8nn% where n represents a number.
    - JBO-27008: Attribute set for ObjectCode in view object AccountsecuritygroupobjView2 failed
    - JBO-ObjectCode_Rule_1: Object must be in range 8000-8999 or object must be in form of 8%, 8n%, 8nn% where n represents a number.

  • Request some help, over procedure's performance uses regular expressions for its functinality

    Hi All,
            Below is the procedure, having functionalities of populating two tables. For first table, its a simple insertion process but for second table, we need to break the soruce record as per business requirement and then insert into the table. [Have used regular expressions for that]
            Procedure works fine but it takes around 23 mins for processing 1mm of rows.
            Since this procedure would be used, parallely by different ETL processes, so append hint is not recommended.
            Is there any ways to improve its performance, or any suggestion if my approach is not optimized?  Thanks for all help in advance.
    CREATE OR REPLACE PROCEDURE SONARDBO.PRC_PROCESS_EXCEPTIONS_LOGS_TT
         P_PROCESS_ID       IN        NUMBER, 
         P_FEED_ID          IN        NUMBER,
         P_TABLE_NAME       IN        VARCHAR2,
         P_FEED_RECORD      IN        VARCHAR2,
         P_EXCEPTION_RECORD IN        VARCHAR2
        IS
        PRAGMA AUTONOMOUS_TRANSACTION;
        V_EXCEPTION_LOG_ID     EXCEPTION_LOG.EXCEPTION_LOG_ID%TYPE;
        BEGIN
        V_EXCEPTION_LOG_ID :=EXCEPTION_LOG_SEQ.NEXTVAL;
             INSERT INTO SONARDBO.EXCEPTION_LOG
                 EXCEPTION_LOG_ID, PROCESS_DATE, PROCESS_ID,EXCEPTION_CODE,FEED_ID,SP_NAME
                ,ATTRIBUTE_NAME,TABLE_NAME,EXCEPTION_RECORD
                ,DATA_STRUCTURE
                ,CREATED_BY,CREATED_TS
             VALUES           
             (   V_EXCEPTION_LOG_ID
                ,TRUNC(SYSDATE)
                ,P_PROCESS_ID
                ,'N/A'
                ,P_FEED_ID
                ,NULL 
                ,NULL
                ,P_TABLE_NAME
                ,P_FEED_RECORD
                ,NULL
                ,USER
                ,SYSDATE  
            INSERT INTO EXCEPTION_ATTR_LOG
                EXCEPTION_ATTR_ID,EXCEPTION_LOG_ID,EXCEPTION_CODE,ATTRIBUTE_NAME,SP_NAME,TABLE_NAME,CREATED_BY,CREATED_TS,ATTRIBUTE_VALUE
            SELECT
                EXCEPTION_ATTR_LOG_SEQ.NEXTVAL          EXCEPTION_ATTR_ID
                ,V_EXCEPTION_LOG_ID                     EXCEPTION_LOG_ID
                ,REGEXP_SUBSTR(str,'[^|]*',1,1)         EXCEPTION_CODE
                ,REGEXP_SUBSTR(str,'[^|]+',1,2)         ATTRIBUTE_NAME
                ,'N/A'                                  SP_NAME    
                ,p_table_name
                ,USER
                ,SYSDATE
                ,REGEXP_SUBSTR(str,'[^|]+',1,3)         ATTRIBUTE_VALUE
            FROM
            SELECT
                 REGEXP_SUBSTR(P_EXCEPTION_RECORD, '([^^])+', 1,t2.COLUMN_VALUE) str
            FROM
                DUAL t1 CROSS JOIN
                        TABLE
                            CAST
                                MULTISET
                                    SELECT LEVEL
                                    FROM DUAL
                                    CONNECT BY LEVEL <= REGEXP_COUNT(P_EXCEPTION_RECORD, '([^^])+')
                                AS SYS.odciNumberList
                        ) t2
            WHERE REGEXP_SUBSTR(str,'[^|]*',1,1) IS NOT NULL
            COMMIT;
           EXCEPTION
             WHEN OTHERS THEN
             ROLLBACK;
             RAISE;
        END;
    Many Thanks,
    Arpit

    Regex's are known to be CPU intensive specially when dealing with large number of rows.
    If you have to reduce the processing time, you need to tune the Select statements.
    One suggested change could be to change the following query
    SELECT
                 REGEXP_SUBSTR(P_EXCEPTION_RECORD, '([^^])+', 1,t2.COLUMN_VALUE) str
            FROM
                DUAL t1 CROSS JOIN
                        TABLE
                            CAST
                                MULTISET
                                    SELECT LEVEL
                                    FROM DUAL
                                    CONNECT BY LEVEL <= REGEXP_COUNT(P_EXCEPTION_RECORD, '([^^])+')
                                AS SYS.odciNumberList
                        ) t2
    to
    SELECT REGEXP_SUBSTR(P_EXCEPTION_RECORD, '([^^])+', 1,level) str
    FROM DUAL
    CONNECT BY LEVEL <= REGEXP_COUNT(P_EXCEPTION_RECORD, '([^^])+')
    Before looking for any performance benefit, you need to ensure that this does not change your output.
    How many substrings are you expecting in the P_EXCEPTION_RECORD? If less than 5, it will be better to opt for SUBSTR and INSTR combination as it might work well with the number of records you are working with. Only trouble is, you will have to write different SUBSTR and INSTR statements for each column to be fetched.
    How are you calling this procedure? Is it not possible to work with Collections? Delimited strings are not a very good option as it requires splitting of the data every time you need to refer to.

  • Using regular expressions for validating time fields

    Similar to my problem with converting a big chunk of validation into smaller chunks of functions I am trying to use Regular Expressions to handle the validation of many, many time fields in a flexible working time sheet.
    I have a set of FormCalc scripts to calculate the various values for days, hours and the gain/loss of hours over a four week period. For these scripts to work the time format must be in HH:MM.
    Accessibility guidelines nix any use of message box pop ups so I wanted to get around this by having a hidden/visible field with warning text but can't get it to work.
    So far I have:
    var r = new RegExp(); // Create a new Regular Expression Object
    r.compile ("^[00-99]:\\] + [00-59]");
    var result = r.test(this.rawValue);
    if (result == true){
    true;
    form1.flow.page.parent.part2.part2body.errorMessage.presence = "visible";
    else (result == false){
    false;
    form1.flow.page.parent.part2.part2body.errorMessage.presence = "hidden";
    Any help would be appreciated!

    Date and time fields are tricky because you have to consider the formattedValue versus the rawValue. If I am going to use regular expressions to do validation I find it easier to make them text fields and ignore the time patterns (formattedValue). Something like this works (as far as my very brief testing goes) for 24 hour time where time format is HH:MM.
    // form1.page1.subform1.time_::exit - (JavaScript, client)
    var error = false;
    form1.page1.subform1.errorMsg.rawValue = "";
    if (!(this.isNull)) {
      var time_ = this.rawValue;
      if (time_.length != 5) {
        error = true;
      else {
        var regExp = /^([01]?[0-9]|2[0-3]):[0-5][0-9]$/;
        if (!(regExp.test(time_))) {
          error = true;
    if (error == true) {
      form1.page1.subform1.errorMsg.rawValue = "The time must be in the format HH:MM where HH is 00-23 and MM is 00-59.";
      form1.page1.subform1.errorMsg.presence = "visible";
    Steve

  • Regular Expression Map

    Hi,
    I need some kind of a "RegExpMap" - a map that has the capability
    to bind value objects to a pattern (a regular expression) and to retreive that
    objects back by giving a sample that matches the pattern.
    void put(pattern, Object)
    Object get(sample)so that
    map.put("foo(.*)", o);
    assertTrue(map.get("foobar") == o);Does anyone of you know if there's such an implementation (freely) available?
    Thanks for your help in advance!
    Frank

    Regular expressions are normally computed by first converting the expression to a state machine (of one kind or another, depending on the type of regexp you are using). These state
    machines are then run on your string (character at a time). Now if you can create a state
    machine for each expression then you can potentially merge this set of state machines into a single new one.
    For example the union of two state machines provides a state machine that can match strings that either of the origininal SM's could. The intersection of two state manchines produces a
    state machine that can match string that both of the original SM's would have matched. The
    concatenation of two SM's will allow the a string composed of something that matches the first
    SM and something that matches the second SM to match etc.
    The lib asif pointed out does not do exactly what the OP wants but does provide a lot of the
    functionality that may be necessary to produce thier own combination method.
    Whether it is even possible (or if it would be more efficient then a linear search over all
    regexps) I don't know. Interesting though.
    matfud

  • Need help with regular expression

    I'm trying to use the java.util.regex package to extract URLs from html files.
    The URLs that I am interested in extracting from the HTML look like the following:
    <font color="#008000">http://forum.java.sun.com -
    So, the URL is always preceeded by:
    <font color="#008000">
    and then followed by a space character and then a hyphen character. I want to be able to put all these URLs in a Vector object. This doesn't seem like it should be too difficult but for some reason I can't get anywhere with it. Any help would be greatly appreciated. Thanks!

    hi gupta am not sure of the java syntax but i can tell u about the regular expression...try this....
    <font color="#008000">(http:\/\/[a-zA-Z0-9.]+) [-]
    i dont know the java methods to call...just the reg exp...
    Sanjay Acharya

  • Re: [iPlanet-JATO] Re: Use Of models in utility classes - Pease don't forget about the regular expression potential

    Namburi,
    When you said you used the Reg Exp tool, did you use it only as
    preconfigured by the iMT migrate application wizard?
    Because the default configuration of the regular expression tool will only
    target the files in your ND project directories. If you wish to target
    classes outside of the normal directory scope, you have to either modify the
    "Source Directory" property OR create another instance of the regular
    expression tool. See the "Tool" menu in the iMT to create additional tool
    instances which can each be configured to target different sets of files
    using different sets of rules.
    Usually, I utilize 3 different sets of rules files on a given migration:
    spider2jato.xml
    these are the generic conversion rules (but includes the optimized rules for
    ViewBean and Model based code, i.e. these rules do not utilize the
    RequestManager since it is not needed for code running inside the ViewBean
    or Model classes)
    I run these rules against all files.
    See the file download section of this forum for periodic updates to these
    rules.
    nonProjectFileRules.xml
    these include rules that add the necessary
    RequestManager.getRequestContext(). etc prefixes to many of the common
    calls.
    I run these rules against user module and any other classes that do not are
    not ModuleServlet, ContainerView, or Model classes.
    appXRules.xml
    these rules include application specific changes that I discover while
    working on the project. A common thing here is changing import statements
    (since the migration tool moves ND project code into different jato
    packaging structure, you sometime need to adjust imports in non-project
    classes that previously imported ND project specific packages)
    So you see, you are not limited to one set of rules at all. Just be careful
    to keep track of your backups (the regexp tool provides several options in
    its Expert Properties related to back up strategies).
    ----- Original Message -----
    From: <vnamboori@y...>
    Sent: Wednesday, August 08, 2001 6:08 AM
    Subject: [iPlanet-JATO] Re: Use Of models in utility classes - Pease don't
    forget about the regular expression potential
    Thanks Matt, Mike, Todd
    This is a great input for our migration. Though we used the existing
    Regular Expression Mapping tool, we did not change this to meet our
    own needs as mentioned by Mike.
    We would certainly incorporate this to ease our migration.
    Namburi
    --- In iPlanet-JATO@y..., "Todd Fast" <toddwork@c...> wrote:
    All--
    Great response. By the way, the Regular Expression Tool uses thePerl5 RE
    syntax as implemented by Apache OROMatcher. If you're doing lotsof these
    sorts of migration changes manually, you should definitely buy theO'Reilly
    book "Mastering Regular Expressions" and generate some rules toautomate the
    conversion. Although they are definitely confusing at first,regular
    expressions are fairly easy to understand with some documentation,and are
    superbly effective at tackling this kind of migration task.
    Todd
    ----- Original Message -----
    From: "Mike Frisino" <Michael.Frisino@S...>
    Sent: Tuesday, August 07, 2001 5:20 PM
    Subject: Re: [iPlanet-JATO] Use Of models in utility classes -Pease don't
    forget about the regular expression potential
    Also, (and Matt's document may mention this)
    Please bear in mind that this statement is not totally correct:
    Since the migration tool does not do much of conversion for
    these
    utilities we have to do manually.Remember, the iMT is a SUITE of tools. There is the extractiontool, and
    the translation tool, and the regular expression tool, and severalother
    smaller tools (like the jar and compilation tools). It is correctto state
    that the extraction and translation tools only significantlyconvert the
    primary ND project objects (the pages, the data objects, and theproject
    classes). The extraction and translation tools do minimumtranslation of the
    User Module objects (i.e. they repackage the user module classes inthe new
    jato module packages). It is correct that for all other utilityclasses
    which are not formally part of the ND project, the extraction and
    translation tools do not perform any migration.
    However, the regular expression tool can "migrate" any arbitrary
    file
    (utility classes etc) to the degree that the regular expressionrules
    correlate to the code present in the arbitrary file. So first andforemost,
    if you have alot of spider code in your non-project classes youshould
    consider using the regular expression tool and if warranted adding
    additional rules to reduce the amount of manual adjustments thatneed to be
    made. I can stress this enough. We can even help you write theregular
    expression rules if you simply identify the code pattern you wish to
    convert. Just because there is not already a regular expressionrule to
    match your need does not mean it can't be written. We have notnearly
    exhausted the possibilities.
    For example if you say, we need to convert
    CSpider.getDataObject("X");
    To
    RequestManager.getRequestContext().getModelManager().getModel(XModel.class);
    Maybe we or somebody else in the list can help write that regularexpression if it has not already been written. For instance in thelast
    updated spider2jato.xml file there is already aCSpider.getCommonPage("X")
    rule:
    <!--getPage to getViewBean-->
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[CSpider[.\s]*getPage[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[CSpider[.\s]*getPage[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[getViewBean($1ViewBean.class]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    Following this example a getDataObject to getModel would look
    like this:
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[CSpider[.\s]*getDataObject[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[CSpider[.\s]*getDataObject[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[getModel($1Model.class]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    In fact, one migration developer already wrote that rule andsubmitted it
    for inclusion in the basic set. I will post another upgrade to thebasic
    regular expression rule set, look for a "file uploaded" posting.Also,
    please consider contributing any additional generic rules that youhave
    written for inclusion in the basic set.
    Please not, that in some cases (Utility classes in particular)
    the rule
    application may be more effective as TWO sequention rules ratherthan one
    monolithic rule. Again using the example above, it will convert
    CSpider.getDataObject("Foo");
    To
    getModel(FooModel.class);
    Now that is the most effective conversion for that code if that
    code is in
    a page or data object class file. But if that code is in a Utilityclass you
    really want:
    >
    RequestManager.getRequestContext().getModelManager().getModel(FooModel.class
    So to go from
    getModel(FooModel.class);
    To
    RequestManager.getRequestContext().getModelManager().getModel(FooModel.class
    You would apply a second rule AND you would ONLY run this rule
    against
    your utility classes so that you would not otherwise affect yourViewBean
    and Model classes which are completely fine with the simplegetModel call.
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[getModel\(]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[getModel\(]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[RequestManager.getRequestContext().getModelManager().getModel(]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    A similer rule can be applied to getSession and other CSpider APIcalls.
    For instance here is the rule for converting getSession calls toleverage
    the RequestManager.
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[getSession\(\)\.]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[getSession\(\)\.]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[RequestManager.getSession().]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    ----- Original Message -----
    From: "Matthew Stevens" <matthew.stevens@e...>
    Sent: Tuesday, August 07, 2001 12:56 PM
    Subject: RE: [iPlanet-JATO] Use Of models in utility classes
    Namburi,
    I will post a document to the group site this evening which has
    the
    details
    on various tactics of migrating these type of utilities.
    Essentially,
    you
    either need to convert these utilities to Models themselves or
    keep the
    utilities as is and simply use the
    RequestManager.getRequestContext.getModelManager().getModel()
    to statically access Models.
    For CSpSelect.executeImmediate() I have an example of customhelper
    method
    as a replacement whicch uses JDBC results instead of
    CSpDBResult.
    matt
    -----Original Message-----
    From: vnamboori@y... [mailto:<a href="/group/SunONE-JATO/post?protectID=081071113213093190112061186248100208071048">vnamboori@y...</a>]
    Sent: Tuesday, August 07, 2001 3:24 PM
    Subject: [iPlanet-JATO] Use Of models in utility classes
    Hi All,
    In the present ND project we have lots of utility classes.
    These
    classes in diffrent directory. Not part of nd pages.
    In these classes we access the dataobjects and do themanipulations.
    So we access dataobjects directly like
    CSpider.getDataObject("do....");
    and then execute it.
    Since the migration tool does not do much of conversion forthese
    utilities we have to do manually.
    My question is Can we access the the models in the postmigration
    sameway or do we need requestContext?
    We have lots of utility classes which are DataObjectintensive. Can
    someone suggest a better way to migrate this kind of code.
    Thanks
    Namburi
    [email protected]
    [email protected]
    [Non-text portions of this message have been removed]
    [email protected]
    [email protected]

    Namburi,
    When you said you used the Reg Exp tool, did you use it only as
    preconfigured by the iMT migrate application wizard?
    Because the default configuration of the regular expression tool will only
    target the files in your ND project directories. If you wish to target
    classes outside of the normal directory scope, you have to either modify the
    "Source Directory" property OR create another instance of the regular
    expression tool. See the "Tool" menu in the iMT to create additional tool
    instances which can each be configured to target different sets of files
    using different sets of rules.
    Usually, I utilize 3 different sets of rules files on a given migration:
    spider2jato.xml
    these are the generic conversion rules (but includes the optimized rules for
    ViewBean and Model based code, i.e. these rules do not utilize the
    RequestManager since it is not needed for code running inside the ViewBean
    or Model classes)
    I run these rules against all files.
    See the file download section of this forum for periodic updates to these
    rules.
    nonProjectFileRules.xml
    these include rules that add the necessary
    RequestManager.getRequestContext(). etc prefixes to many of the common
    calls.
    I run these rules against user module and any other classes that do not are
    not ModuleServlet, ContainerView, or Model classes.
    appXRules.xml
    these rules include application specific changes that I discover while
    working on the project. A common thing here is changing import statements
    (since the migration tool moves ND project code into different jato
    packaging structure, you sometime need to adjust imports in non-project
    classes that previously imported ND project specific packages)
    So you see, you are not limited to one set of rules at all. Just be careful
    to keep track of your backups (the regexp tool provides several options in
    its Expert Properties related to back up strategies).
    ----- Original Message -----
    From: <vnamboori@y...>
    Sent: Wednesday, August 08, 2001 6:08 AM
    Subject: [iPlanet-JATO] Re: Use Of models in utility classes - Pease don't
    forget about the regular expression potential
    Thanks Matt, Mike, Todd
    This is a great input for our migration. Though we used the existing
    Regular Expression Mapping tool, we did not change this to meet our
    own needs as mentioned by Mike.
    We would certainly incorporate this to ease our migration.
    Namburi
    --- In iPlanet-JATO@y..., "Todd Fast" <toddwork@c...> wrote:
    All--
    Great response. By the way, the Regular Expression Tool uses thePerl5 RE
    syntax as implemented by Apache OROMatcher. If you're doing lotsof these
    sorts of migration changes manually, you should definitely buy theO'Reilly
    book "Mastering Regular Expressions" and generate some rules toautomate the
    conversion. Although they are definitely confusing at first,regular
    expressions are fairly easy to understand with some documentation,and are
    superbly effective at tackling this kind of migration task.
    Todd
    ----- Original Message -----
    From: "Mike Frisino" <Michael.Frisino@S...>
    Sent: Tuesday, August 07, 2001 5:20 PM
    Subject: Re: [iPlanet-JATO] Use Of models in utility classes -Pease don't
    forget about the regular expression potential
    Also, (and Matt's document may mention this)
    Please bear in mind that this statement is not totally correct:
    Since the migration tool does not do much of conversion for
    these
    utilities we have to do manually.Remember, the iMT is a SUITE of tools. There is the extractiontool, and
    the translation tool, and the regular expression tool, and severalother
    smaller tools (like the jar and compilation tools). It is correctto state
    that the extraction and translation tools only significantlyconvert the
    primary ND project objects (the pages, the data objects, and theproject
    classes). The extraction and translation tools do minimumtranslation of the
    User Module objects (i.e. they repackage the user module classes inthe new
    jato module packages). It is correct that for all other utilityclasses
    which are not formally part of the ND project, the extraction and
    translation tools do not perform any migration.
    However, the regular expression tool can "migrate" any arbitrary
    file
    (utility classes etc) to the degree that the regular expressionrules
    correlate to the code present in the arbitrary file. So first andforemost,
    if you have alot of spider code in your non-project classes youshould
    consider using the regular expression tool and if warranted adding
    additional rules to reduce the amount of manual adjustments thatneed to be
    made. I can stress this enough. We can even help you write theregular
    expression rules if you simply identify the code pattern you wish to
    convert. Just because there is not already a regular expressionrule to
    match your need does not mean it can't be written. We have notnearly
    exhausted the possibilities.
    For example if you say, we need to convert
    CSpider.getDataObject("X");
    To
    RequestManager.getRequestContext().getModelManager().getModel(XModel.class);
    Maybe we or somebody else in the list can help write that regularexpression if it has not already been written. For instance in thelast
    updated spider2jato.xml file there is already aCSpider.getCommonPage("X")
    rule:
    <!--getPage to getViewBean-->
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[CSpider[.\s]*getPage[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[CSpider[.\s]*getPage[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[getViewBean($1ViewBean.class]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    Following this example a getDataObject to getModel would look
    like this:
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[CSpider[.\s]*getDataObject[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[CSpider[.\s]*getDataObject[\s]*\(\"([^"]*)\"]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[getModel($1Model.class]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    In fact, one migration developer already wrote that rule andsubmitted it
    for inclusion in the basic set. I will post another upgrade to thebasic
    regular expression rule set, look for a "file uploaded" posting.Also,
    please consider contributing any additional generic rules that youhave
    written for inclusion in the basic set.
    Please not, that in some cases (Utility classes in particular)
    the rule
    application may be more effective as TWO sequention rules ratherthan one
    monolithic rule. Again using the example above, it will convert
    CSpider.getDataObject("Foo");
    To
    getModel(FooModel.class);
    Now that is the most effective conversion for that code if that
    code is in
    a page or data object class file. But if that code is in a Utilityclass you
    really want:
    >
    RequestManager.getRequestContext().getModelManager().getModel(FooModel.class
    So to go from
    getModel(FooModel.class);
    To
    RequestManager.getRequestContext().getModelManager().getModel(FooModel.class
    You would apply a second rule AND you would ONLY run this rule
    against
    your utility classes so that you would not otherwise affect yourViewBean
    and Model classes which are completely fine with the simplegetModel call.
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[getModel\(]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[getModel\(]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[RequestManager.getRequestContext().getModelManager().getModel(]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    A similer rule can be applied to getSession and other CSpider APIcalls.
    For instance here is the rule for converting getSession calls toleverage
    the RequestManager.
    <mapping-rule>
    <mapping-rule-primarymatch>
    <![CDATA[getSession\(\)\.]]>
    </mapping-rule-primarymatch>
    <mapping-rule-replacement>
    <mapping-rule-match>
    <![CDATA[getSession\(\)\.]]>
    </mapping-rule-match>
    <mapping-rule-substitute>
    <![CDATA[RequestManager.getSession().]]>
    </mapping-rule-substitute>
    </mapping-rule-replacement>
    </mapping-rule>
    ----- Original Message -----
    From: "Matthew Stevens" <matthew.stevens@e...>
    Sent: Tuesday, August 07, 2001 12:56 PM
    Subject: RE: [iPlanet-JATO] Use Of models in utility classes
    Namburi,
    I will post a document to the group site this evening which has
    the
    details
    on various tactics of migrating these type of utilities.
    Essentially,
    you
    either need to convert these utilities to Models themselves or
    keep the
    utilities as is and simply use the
    RequestManager.getRequestContext.getModelManager().getModel()
    to statically access Models.
    For CSpSelect.executeImmediate() I have an example of customhelper
    method
    as a replacement whicch uses JDBC results instead of
    CSpDBResult.
    matt
    -----Original Message-----
    From: vnamboori@y... [mailto:<a href="/group/SunONE-JATO/post?protectID=081071113213093190112061186248100208071048">vnamboori@y...</a>]
    Sent: Tuesday, August 07, 2001 3:24 PM
    Subject: [iPlanet-JATO] Use Of models in utility classes
    Hi All,
    In the present ND project we have lots of utility classes.
    These
    classes in diffrent directory. Not part of nd pages.
    In these classes we access the dataobjects and do themanipulations.
    So we access dataobjects directly like
    CSpider.getDataObject("do....");
    and then execute it.
    Since the migration tool does not do much of conversion forthese
    utilities we have to do manually.
    My question is Can we access the the models in the postmigration
    sameway or do we need requestContext?
    We have lots of utility classes which are DataObjectintensive. Can
    someone suggest a better way to migrate this kind of code.
    Thanks
    Namburi
    [email protected]
    [email protected]
    [Non-text portions of this message have been removed]
    [email protected]
    [email protected]

  • What's wrong with the regular expression?

    Hi all,
    For the life of me I can not figure out what is wrong with this regular expression
    .*\bA specific phrase\.\b.*
    This is just an example the actual phrase can be an specific phrase. My problem comes when the specific phrase ends in a period. I've escaped the period but it still gives me an error. The only time I don't get an error is when I take off the end boundry character which will not suffice as a solution. I need to be able to capture all the text before and after said phrase. If the phrase doesn't have a period it would look like this...
    .*\bA specific phrase\b.*
    which works fine. So what is it about the \.\b combination that is not matching?
    I've been banging my head on this for a while and I'm getting nowhere.
    The application highlights text that comes from a server. The user builds custom highlights that have some options. Highlight entire line, match partial word, and ignore case. The code that builds my pattern is here
    String strHighlight = _strHighlight;
            strHighlight = strHighlight.replaceAll("\\*", "\\\\*");
            strHighlight = strHighlight.replaceAll("\\.", "\\\\.");
            String strPattern = strHighlight;
            if(_bEntireParagraph)
                if(_bPartialWord)
                    strPattern = ".*" + strHighlight + ".*";
                else               
                    strPattern = ".*\\b" + strHighlight + "\\b.*";           
            else
                if(_bPartialWord)
                    strPattern = strHighlight;
                else               
                    strPattern = "\\b" + strHighlight + "\\b";  
            if(_bIgnoreCase)
                _patHighlight = Pattern.compile(strPattern, Pattern.CASE_INSENSITIVE);
            else
                _patHighlight = Pattern.compile(strPattern);So for example I matching the phrase: The dog ate the cat. And that phrase came over in the following text: Look there's a dog. The dog ate the cat. "Oh my!"
    And my user has the entire line and ignore case options selected then my regex woud look like this: .*\bThe dog ate the cat\b.*
    That should get highlighted, but for some reason it doesn't. Correct me if I'm wrong but doesn't the regex read as follows:
    any characters
    word boundry
    The dog ate the cat[period]
    word boundry
    any characters until newline.
    Any help will be much appreciated

    A word boundary (in the context of regexes) is a position that is either followed by a word character and not preceded by one (start of word) or preceded by a word character and not followed by one (end of word). A word character is defined as a letter, a digit, or an underscore. Since a period is not a word character, the only way the position following it could be a word boundary is if the next character is a letter, digit or underscore. But a sentence-ending period is always followed by whitespace, if anything, so it makes no sense to look for a word boundary there. I think, instead of \b, you should use negative lookarounds, like so:   strPattern = ".*(?<!\\w)" + strHighlight + "(?!\\w).*";

  • Java – Regular Expressions – Finding any non digit byte in a multiple byte

    Hello,
    I’m new to JAVA and Regular Expressions; I’m trying to write a regular expression that will find any records that contain a non digit byte in a multiple byte field.
    I thought the following was the correct expression but it is only finding records that contain “all” non digit bytes.
    \D{1,}
    \D = Non Digit
    {1,} = at least 1 or more
    Below is my sample data. I would like the regular expression to find all of the records that are not all numeric. However when I use the regular expression \D{1,} it is only finding the 2 records that all bytes are non digits. (i.e. “ “ and “A “)
    “ 111229”
    “2 111229”
    “20091229”
    “200912c9”
    “201#1229”
    “20101229”
    “20110229”
    “20111*29”
    “20111029”
    “20111229”
    “20B11229”
    “A “
    “A0111229”
    Please note I have also tried \D{1,}+ and \D{1,}? And they also do not return my desired results
    Any assistance someone can provide would be greatly appreciated.

    You don't show the code you are using but I surmise you are using String.matches() which requires that the whole target must match the regular expression not just part of it. Instead you should create a Pattern and then a Matcher and use the Matcher.find() method. Check the Javadoc for Pattern and Matcher and look at the Java regex tutorial - http://docs.oracle.com/javase/tutorial/essential/regex/ .
    P.S. You can re-use the Pattern object - you don't have to create it every time you need one.
    P.P.S. Java regular expressions work with characters not bytes and characters are not not not bytes.

  • Regular expressions and capture groups

    Hi everyone :)
    Is there a way to override the default behaviour of capture groups in regular expressions? More specifically I want to override this:
    "The captured input associated with a group is always the subsequence that the group most recently matched."
    For example, if I have a string that is this:
    * <comment one>
    * <comment two>
    <some text>
    I have a pattern of the form "(.*)(/\\*.*\\*/)(.*)" which will match multi-line comments. I have also specified the flag DOTALL so that the predefined character class '.' matches over line-breaks.
    If I apply this pattern to the above string I get comment two being captured, not comment one. This is because of the stipulation that I cited above.
    I need to be able to capture only the first match, and prevent the capture group from being overwritten by more recent matches.
    Is this possible? Any ideas?
    Thanks in advance.
    Kind regards,
    Ben Deany

    Is there a way to override the default behaviour of
    capture groups in regular expressions? More
    specifically I want to override this:No, but you don't need to.
    I have a pattern of the form "(.*)(/\\*.*\\*/)(.*)"
    which will match multi-line comments.Comment two is captured by the second group because comment one is eaten by the first group. Use the reluctant quantifier "*?" on the . in the first group instead of the greedy quantifier "*" to get what is apparently the behavior you want. Then the first group will contain nothing, the second group will contain comments one and two, and the third group will contain the following text.
    .* is a very powerful thing to use. It will match everything in its path, guzzling text like moonshine at Mardi Gras. The only reason it doesn't match comment two as well is because then the expression as a whole would not match.
    The parentheses surrounding the first and third groups are not needed (unless you want to use group methods on them too).

  • Regular expressions and input streams

    Hello,
    I am trying to find a way to read characters from a stream and find matches in them with regular expressions. The problem is that the stream may contain a big amount of characters, so I can't read all of them, keep them in a big string, and then perform the searches. Is there a way to perform searches while I read the characters? I scanned the documentation of Pattern and Matcher and found nothing that can help.
    Any ideas?
    Thanks.

    Looking at the method
    public Matcher Pattern.matcher(CharSequence input);
    one could be excused for assuming that one could turn an InputStream into a CharSequence object since a 'stream' implies sequencial access and CharSequence sounds like it provides sequential access.
    This would just require the implementation of 4 method of which two (toString() and subSequence()) could probably be ignored BUT the method you wouild have to implement are charAt() and length() which imply random access rather than sequencial access!
    Using the built in regex your problem looks difficult. If you are just looking for 'equality' matches (simple searching) then you could use Knuth-Morris-Pratt algorithm or the Boyer-Moore algorithm.

  • Regular Expressions in CS5.5 - something is wrong

    Hello Everybody,
    Please correct me, but I think, I found a serious problem with regular Expressions in Indesign CS5.5 (and possibly in other apps from CS5.5).
    Let's start with simple example:
    var range = "a-a,a,a-a,a";
    var regEx = /(a+-a+|a+)(,(a+-a+|a+))*/;
    alert( "Match:" +regEx.test(range)+"\nLeftContext: "+RegExp.leftContext+"\nRightContext: "+RegExp.rightContext );
    What I expected was true match and the left  and the right context should be empty. In Indesign CS3 that is correct BUT NOT in CS5.5.
    In CS 5.5 it seems that the only first "a-a" is matched and the rest is return as the rightContext - looks like big change (if not parsing error in RegExp engine).
    Please correct me if I am wrong.
    The second example - how to freeze ID CS5.5:
    var range = "a-a,a,a-a,a";
    var regEx = /(a+-a+|a+)(,(a+-a+|a+)){8,}/;
    alert( "Match:" +regEx.test(range)+"\nLeftContext: "+RegExp.leftContext+"\nRightContext: "+RegExp.rightContext );
    As you can see it differs only with the {8,} part instead of *
    Run it in CS5.5 and you will see that the ID hangs (in CS3 of course it runs flawlessly}.
    The third example - how to freeze ID 5.5 in one line (I posted it earlier in Photoshop forum because similiar problem was called earlier):
    alert((/(n|s)? /gmi).test('s') );
    As you can guess - it freezes the CS5.5 (CS3 passes the test).
    Please correct me if I am doing something wrong or it's the problem of Adobe.
    Best regards,
    Daniel Brylak

    Hi Daniel,
    Thanks for sharing. Really annoying indeed.
    Just to complete your diagnosis, what you describe about CS.5 is the same in CS5, while CS4 behaves as CS3.
    var range = "aaaaa";
    var regEx = /(a+-a+|a+)(,(a+-a+|a+))*/;
    alert([
        "Match:" +regEx.test(range),
        "LeftContext: "+RegExp.leftContext.toSource(),        // => CS3/4: EMPTY -- CS5+: EMPTY
        "RightContext: "+RegExp.rightContext.toSource()        // => CS3/4: EMPTY -- CS5+: ",a,a-a,a"
        ].join('\r'));
    So there is a serious implementation problem of the RegExp object from ExtendScript CS5.
    I don't think it's related to the greedy modes. By default, JS RegExp quantifiers are greedy, and /a*/ still entirely captures "aaaaaa" in CS5+.
    By the way, you can make any quantifier non-greedy by adding ? after the quantifier, e.g.: /a*?/, /a+?/, etc.
    I guess that Adobe ExtendScript has a generic issue in updating the RegExp.lastIndex property in certain contexts—see http://forums.adobe.com/message/3719879#3719879 —which could explain several bugs such as the Negative Class bug —see http://forums.adobe.com/message/3510078#3510078 — or the problems you are mentioning today.
    @+
    Marc

  • Java Built-in regular expressions versus Jakarta

    Are there any advantages to using the Jakarta regular expression package, over the built-in Java regular expressions?
    I wasn't able to find much information on the Jakarta regexp package, except for the Javadoc and I didn't find that very informative.
    Thanks!
    Jeff

    Well, the String.replaceAll(String, String) method uses the Java regexp internally, and I'm not sure there's a way to change that. So that at least is one place that will use it. I don't know for sure, but Jakarta's ORO is supposed to be fully compatible with Perl 5, and also supports other regexp types as well, so if you need that aspect of it, then go with Jakarta. I heard some of Java's are a little limited in some capabilities. I can't find any particular pages that refer to any comparisons of them at to compare runtime performance.

  • Regular Expressions and Numbers in Strings

    Looking for someone who has had experience in using Regular Expressions in the Classification Rule Builder.
    We have an eVar that is collecting the number of search results in this fashion:
    <Total Results>_<# of Item 1>_<# of Item 2>_<# of Item 3>_<# of Item 4>
    Example output would look like this:
    150_50_0_25_75
    What we've done is initially create a Regular Expression that looks like this:
    ^(.+)\_(.+)\_(.+)\_(.+)\_(.+)$
    The problem is, it appears in situations where the output contains a zero in one of the slots, the value is ignored and it receives the value in the next place over.  Using the example output shown above, I would end up with values like this:
    $0 150_50_0_25_75
    $1 150
    $2 50
    $3 25
    $4 75
    $5 {null}
    Here's the weird part.  When I perform a test of a single record, it appears like it will work just fine, but when it actually runs in Omniture, it's not working as expected.  Here's something else I'd like to know if it's possible to address.  The five-place string is only the newest iteration of this approach.  In the past, we started out with a two-place version, then three-place and then four.  Any recommendations for handling all scenarios?
    Any and all advice is welcome.  Thanks in advance!

    Doing some playing around on rubular.com and thinking the Regular Expression should be build this way instead:
    ^(\d+)\_(\d+)\_(\d+)\_(\d+)\_(\d+)$
    Again, still looking for any additional guidance from more experienced individuals.  Thanks!

  • Need help in unix regular expressions

    Hi All,
    I'm new to shell scripting. Please help me in achieving this
    I am trying to a find regular expression that need to pick a file with begin with the below format and mask variable is called in xml file.
    currently the script accepts:
    mask="CLIENT_ID+'_ADHSUITE_IN_'+date2str(now,'MMddyy','US/Eastern')+'.txt'"
    But it should accept in the below format
    2595_ADHSUITE_IN_ANNWEL_030309_2009-02-10_15-12-46-000_648.TXT715.outpgp_out
    where CLIENT_ID=2595. How to place wild card character '*' in the below to accept file in the above format. here is what i made changes.
    mask="CLIENT_ID+'_ADHSUITE_IN_'*+date2str(now,'MMddyy','US/Eastern')*+'.TXT'*+'.outpgp_out'"
    Please help.
    Thanks

    I believe your statement is being passed over twice:
    First Pass: (This is done by something like javascript)
    CLIENT_ID+'_ADHSUITE_IN_'+'.*'+date2str(now,'MMddyy','US/Eastern')+'.*'+'.TXT'+'.*'+'.outpgp_out'In this pass the variables and functions that are enclosed in literals are processed:
    (1) CLIENT_ID is replaced by 2595 or whatever is current value is:
    (2) date2str(now,'MMddyy','US/Eastern') gets replaced by 040609 (if the current time now is 4th april 2009).
    So at the end of this first pass we have a string:
    2595_ADHSUITE_IN_.\*040609.\*.TXT.*.outpgp_outThis string at the end of the first pass is a Posix basic regular expression. (ref: [http://en.wikipedia.org/wiki/Regular_expression] ) accessed at time of post).
    This is the string I put in the Regular Expression text box on [http://www.fileformat.info/tool/regex.htm]
    and it matches "2595_ADHSUITE_IN_ANNWEL_040609_2009-01-27_17-02-28-000_631.TXT715.outpgp_out" for me (though I prefer my egrep test).
    I hope this is somewhat clearer. Remember I have very little information about your system/application and I make big guesses.
    NB: (I should thank Frits earlier for pointing my sloppiness between wildcards (for eg unix shell filename expansion) and regular expressions).
    For the second pass this used to compared other strings to see

Maybe you are looking for

  • Stable ati driver installed, but still issues

    Hi folks I followed this guide: https://wiki.archlinux.org/index.php/AMD_Catalyst and everything went smoothly, everything installed fine and the accc is showing my graphics card and lets me change the settings. Now i only installed it because i coul

  • How to make iTunes forget the CDDB info and reread from Gracenote ?

    I have some CD which have 2 entries in Gracenote. (These are Chinese CDs, and Gracenote has 2 set of info, one in Chinese and another in English). I picked the English one, and it populated the tags, but now I want to use the Chinese information inst

  • Error in RFC Message

    Hi All, I am doing a RFC-File Schenario. I had done all the configuration. I am triggering the RFC thru a ABAP Report. The first time i ran the report i could see the message generated in XI(SXI_MONITOR). But in the payload i can see nothing. Do i ne

  • Another unsatisfied customer

    On Sunday 1st December I finally took the decision tocancel all my BT subscriptions. Over the past three months I actually decided to keep a log of how many times either my BT vision or BT internet went wrong resulting in the re booting of the home h

  • BAdi RSR_OLAP_BADI and RSDRI_INFOPROV_READ

    Hi folks, I try to use the RSR_OLAP_BADI to compute virtual key figures. These key figures are based on a lookup in an Info Cube which I do in the COMPUTE method of the implementation class. In this method I call the function module RSDRI_INFOPROV_RE