Regular expression and XML

Hello,
I have an XML file containing regular expressions. I parse the file, extract each pattern from it, and search for it using the java.util.regex package. This works fine when the patterns are plain words, but when the pattern is something like
write \\d+ (write followed by a space followed by one or more digits) it doesn't work.
I wrote the same code with the pattern embedded in it, i.e. without using XML, and it worked. But when the pattern is extracted from the XML it fails.
Also, if the pattern is write[0-9], only write[0-9 is extracted, and I get an error about a missing closing bracket.
Could anyone please tell me what I am missing?
Thank you

Thank you for your replies. I still have not got past the problem, so I am posting my code here in the hope that it can be solved:
import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;
import java.util.regex.*;

class textextractor extends DefaultHandler {
    boolean regex = false;

    public void startElement(String namespaceURI, String localName, String qn, Attributes attr) {
        // Remember that the next text event belongs to a REGEX element
        if (localName.equals("REGEX")) {
            regex = true;
        }
    }

    public void characters(char[] text, int start, int length) throws SAXException {
        String t = new String(text, start, length);
        boolean flag = false;
        if (regex == true) {
            // Compile whatever text was delivered and search the sample line
            Pattern pattern = Pattern.compile(t);
            Matcher matcher = pattern.matcher("there is a bat   read  write 13    error at line ");
            while (matcher.find()) {
                flag = true;
                System.out.println("I found the text \"" + matcher.group() + "\" starting at index "
                        + matcher.start() + " and ending at index " + matcher.end() + ".");
            }
            if (!flag) {
                System.out.println("not found");
            }
            regex = false;
        }
    }
}

public class saxt2 {
    public static void main(String args[]) {
        try {
            XMLReader parser = XMLReaderFactory.createXMLReader();
            ContentHandler handler = new textextractor();
            parser.setContentHandler(handler);
            parser.parse("d:\\regex.xml");
        } catch (Exception e) {
            System.err.println(e);
        }
    }
}
The XML file is:
                  <RegularExpression>
                  <REGEX>write</REGEX>
                  <REGEX>write \\d+</REGEX>
                  <REGEX>read[0-9]</REGEX>
                  </RegularExpression>
By running the code you can see that write is found, write \\d+ doesn't match write 13 in the string, and read[0-9] gives an error.
Any help will be greatly appreciated
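
A minimal sketch of one likely explanation, not a confirmed diagnosis: a SAX parser is allowed to deliver the text of a single element through several characters() calls, so compiling each chunk on its own can hand Pattern.compile a fragment such as read[0-9. Buffering the text and using it in endElement avoids that. Separately, XML does not treat the backslash as an escape character, so the file would need write \d+ with a single backslash; the doubled form is only needed inside a Java string literal. The class name RegexCollector and its helper methods are illustrative, and the sketch uses qName with a default (non-namespace-aware) SAXParserFactory rather than the XMLReaderFactory setup above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class RegexCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private boolean inRegex = false;
    final List<String> patterns = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (qName.equals("REGEX")) {
            inRegex = true;
            buffer.setLength(0); // start collecting a fresh pattern
        }
    }

    @Override
    public void characters(char[] text, int start, int length) {
        // May be called more than once per element, so only append here
        if (inRegex) {
            buffer.append(text, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("REGEX")) {
            patterns.add(buffer.toString()); // the complete element text
            inRegex = false;
        }
    }

    // Hypothetical helper: parses an XML string and returns every REGEX element's text
    static List<String> readPatterns(String xml) {
        try {
            RegexCollector handler = new RegexCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.patterns;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: returns every match of regex in input
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // "\\d" is a Java literal here; the XML text itself is: write \d+
        String xml = "<RegularExpression>"
                + "<REGEX>write</REGEX>"
                + "<REGEX>write \\d+</REGEX>"
                + "<REGEX>read[0-9]</REGEX>"
                + "</RegularExpression>";
        String input = "there is a bat   read  write 13    error at line ";
        for (String p : readPatterns(xml)) {
            System.out.println(p + " -> " + findAll(p, input));
        }
    }
}
```

Note that read[0-9] still finds nothing in the sample line, because read is followed there by spaces rather than a digit; it should simply no longer fail to compile.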

Similar Messages

  • Regular Expressions and XML Schemas

    The standard regular expression syntax in JDK1.4 doesn't support some of the constructs used by pattern facets in XML schemas.
    For example the "Name" type is specified by the pattern "\i\c*" but the java.util.regex regular expression handler throws an exception when it encounters this with the following message:
    Illegal/unsupported escape squence near index 1
    Is anyone aware of any regular expression java libraries or tools that will cope with the XML syntax?

    Thanks, but perhaps I should clarify. I realise that \i\c* is invalid syntax in the java regexp handler. My point is that in XML Schema regular expressions \i is meaningful - in fact it means
    "the set of initial name characters, those matched by Letter | '_' | ':' "
    I was wondering if anyone knew of a regular expression library that understood and correctly interpreted that.
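
    One possible workaround, sketched under heavy assumptions: rewrite the two XML Schema escapes before handing the pattern to java.util.regex. The class XsdRegexShim and the character classes it substitutes are illustrative approximations (Unicode letters, digits and marks rather than the exact XML name-character tables), and the plain text replacement would also rewrite an escaped backslash followed by i or c, so this is not a full XML Schema regex translator.

```java
import java.util.regex.Pattern;

class XsdRegexShim {
    // Approximation of \i: a letter, '_' or ':'
    static final String INITIAL = "[\\p{L}_:]";
    // Approximation of \c: letters, digits, combining marks, '.', '_', ':' and '-'
    static final String NAME_CHAR = "[\\p{L}\\p{N}\\p{M}._:\\-]";

    // Naive textual rewrite of the two escapes that java.util.regex rejects
    static String toJavaRegex(String xsdPattern) {
        return xsdPattern.replace("\\i", INITIAL).replace("\\c", NAME_CHAR);
    }

    // XML Schema patterns match the whole value, so use Pattern.matches
    static boolean matches(String xsdPattern, String value) {
        return Pattern.matches(toJavaRegex(xsdPattern), value);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\i\\c*", "xs:element"));
        System.out.println(matches("\\i\\c*", "1abc"));
    }
}
```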

  • Regular Expression and XML character help

    Hi All,
    I am working on a system that needs to convert some input from a web browser into XML format and transform the data to some other format using XSLT. While doing this, I encountered some special characters that cannot be used in XML. To resolve this, I wrote a class using java.util.regex.Pattern to filter the input. Most of the special characters were successfully filtered except the following (as found so far):
    Θ θ (theta), Π (pi) and ′ (prime)
    Does anybody know how to filter them?
    Thanks!!!

    UTF-8 will work fine for encoding XML. Of course you actually have to use UTF-8 to encode the data. Just saying you did in the prolog doesn't work, if you actually used some other encoding.
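
    As a rough illustration of the point: theta, pi and prime are all legal XML characters, so a filter only needs to drop code points outside the XML 1.0 Char production (mostly control characters), and the output then has to be written in the encoding the prolog declares. XmlCharFilter and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

class XmlCharFilter {
    // Matches anything that is NOT in the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static final Pattern ILLEGAL = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    // Removes characters that no XML 1.0 document may contain
    static String strip(String input) {
        return ILLEGAL.matcher(input).replaceAll("");
    }

    // Encodes explicitly as UTF-8 instead of relying on the platform default
    static byte[] toUtf8(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Theta, theta and Pi survive; the BEL control character is dropped
        String raw = "angle \u0398 \u03B8, ratio \u03A0, bell \u0007";
        System.out.println(strip(raw));
    }
}
```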

  • Regular expressions and sql

    I have working regular expressions and a working SQL connection, but I don't know how to stop the info from getting into the database when the input doesn't match the regular expression.
    For instance, you put in an e-mail address without an "@" and my program writes an error message. But the info still gets into the database.
    Any help would be much appreciated as I don't know where to start. If you have links or code examples, that would be great too.
    Thanks.

    Well, the obvious answer is "only write the data to the database if the input matches the regular expression."
    Presumably you're really asking how to do that - but it depends upon how your application is structured in the first place, and you haven't told us anything at all about that.
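
    Without knowing the application's structure, the general shape can only be sketched: check first, and return before the write when the check fails. In this sketch the "database" is just a List standing in for the real INSERT, and SignupValidator, saveIfValid and the deliberately loose e-mail pattern are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SignupValidator {
    // Deliberately loose check: something@something with no spaces
    private static final Pattern EMAIL = Pattern.compile("[^@\\s]+@[^@\\s]+");

    // Stores the value only when it matches; returns whether it was stored
    static boolean saveIfValid(String email, List<String> store) {
        if (!EMAIL.matcher(email).matches()) {
            System.out.println("Invalid e-mail address: " + email);
            return false; // leave before anything is written
        }
        store.add(email); // stand-in for the real database INSERT
        return true;
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        saveIfValid("user.example.com", store);
        saveIfValid("user@example.com", store);
        System.out.println(store);
    }
}
```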

  • Regular Expression and PL/SQL help

    I am using Oracle 9i, does 9i support regular expression? What functions are there?
    My problem is the birth_date column in my database comes from teleform ( a scan program that reads what people wrote on paper), so the format is all jacked up.... 50% of them are 01/01/1981, 10% are 5/14/1995, 10% are 12/5/1993, 10% are 1/1/1983, 10% are 24-JUL-98. I have never really used regular expression and pl/sql, can anybody help me convert all of them to 01/01/1998?
    Does Oracle 9i support regular expressions? What can I do if Oracle 9i does not support them? Thank you very much in advance.

    9i doesn't support regular expressions, at least not in the 10g regular expressions sense. (There is an OWA_PATTERN package, with functions such as OWA_PATTERN.MATCH, that has some facilities for regular expressions.) But it doesn't look like this is a regular expressions problem.
    Instead, this is probably a case where you need to
    - enumerate the format masks you want to try
    - determine the order you want to try them
    - write a small function that tries each format mask in succession until one matches.
    Of course, there is no guarantee that you'll ever be able to convert the data to the date that the user intended because some values will be ambiguous. For example, 01/02/03 could mean Feb 1, 2003 or Jan 2, 2003 or Feb 3, 2001 depending on the person who entered the data.
    Assuming you can define the order, your function would just try each format mask in turn until one generated a valid date, i.e.
    BEGIN
      BEGIN
        l_date := TO_DATE( p_string_value, format_mask_1 );
        RETURN l_date;
      EXCEPTION
        WHEN OTHERS THEN
          NULL;
      END;
      BEGIN
        l_date := TO_DATE( p_string_value, format_mask_2 );
        RETURN l_date;
      EXCEPTION
        WHEN OTHERS THEN
          NULL;
      END;
      BEGIN
        l_date := TO_DATE( p_string_value, format_mask_3 );
        RETURN l_date;
      EXCEPTION
        WHEN OTHERS THEN
          NULL;
      END;
      BEGIN
        l_date := TO_DATE( p_string_value, format_mask_N );
        RETURN l_date;
      EXCEPTION
        WHEN OTHERS THEN
          NULL;
      END;
      RETURN NULL;
    END;
    Justin

  • Regular expressions and backreference

    Hello!
    I am trying to use backreferences in REGEXP in the Perl style, where I want to match my regular expression and later refer to the grouped values. I have read that those are referenced with \1 .. \9, but I simply can't get it to work. Here is an example in PL/SQL:
    SELECT REGEXP_SUBSTR(l_users.adresse,'([A-Z]+)\s+(\d+)')
    INTO l_dummy_varchar2
    FROM dual;
    OR I could do things like:
    l_dummy_varchar2 := REGEXP_SUBSTR(l_users.adresse,'([A-Z]+)\s+(\d+)');
    It seems to work, but I can't figure out how to get the backreferenced value.
    I would love to do things like:
    dbms_output.put_line('my value ='||\1)
    but this doesn't work.
    Help is very much appreciated.
    Best regards
    Dannie

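    "Likewise you can extract things using the REGEXP_SUBSTR, but you don't need back referencing..."
    Backreferencing is better than using an additional function (ltrim), and BTW be careful with these "ltrims": LTRIM treats its second argument as a set of characters to strip rather than as a prefix, so in the first example it also removes the leading A and S of ASCII: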
    SQL> set serveroutput on
    SQL>
    SQL> DECLARE
      2       v_txt VARCHAR2(100);
      3     BEGIN
      4       v_txt := ltrim(regexp_substr('HERE IS AN ASCII CHARACTER', 'IS AN [[:alnum:]]*'),'IS AN ');
      5       DBMS_OUTPUT.PUT_LINE('Word after IS AN: '||v_txt);
      6  END;
      7  /
    Word after IS AN: CII
    PL/SQL procedure successfully completed
    SQL>
    SQL> DECLARE
      2       v_txt VARCHAR2(100);
      3     BEGIN
      4       v_txt := regexp_replace('HERE IS AN ASCII CHARACTER', 'IS AN ([[:alnum:]]*)|.','\1');
      5       DBMS_OUTPUT.PUT_LINE('Word after IS AN: '||v_txt);
      6  END;
      7  /
    Word after IS AN: ASCII
    PL/SQL procedure successfully completed
    SQL> -----------
    VB
    http://volder-notes.blogspot.com/
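
    For readers working in Java rather than PL/SQL, the same idea (match once, then read the grouped values) can be sketched as follows. The pattern is the one from the question; the class name AddressGroups and the sample address are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AddressGroups {
    // Group 1: run of capital letters; group 2: run of digits
    private static final Pattern ADDRESS = Pattern.compile("([A-Z]+)\\s+(\\d+)");

    // Returns {group 1, group 2} for the first match, or null when nothing matches
    static String[] split(String address) {
        Matcher m = ADDRESS.matcher(address);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("MAINSTREET 42");
        System.out.println("my value = " + parts[0] + " / " + parts[1]);
    }
}
```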

  • Juniper MX Regular expressions and user permissions ACS 5.4

    Hi everyone!
    I'm having some trouble with regular expressions and permissions on our Juniper MX routers through ACS 5.4, and I would like some insight, help or pointers.
    We have a team of engineers that should have read-only permissions (important: show configuration) and also be able to change only the description on interfaces.
    So far, with the following regular expressions set for the shell profile they go through, I have managed the above. However, the problem is that when an engineer enters "show configuration", only the interface description configuration is shown. The rest of the configuration is not printed.
    deny-commands1=.*.
    allow-commands1=configure
    deny-configuration1=.*.
    allow-commands2=interfaces .*. description .*$
    allow-configuration1=interfaces .*. description .*$
    allow-commands2=show configuration.*
    allow-commands3=show configuration
    (I know that some of these regexes are not needed; I was just playing around to check everything before posting.)
    Any pointers as to why or how to resolve this?
    example output with the above:
    show configuration
    ## Last commit: 2014-01-09 09:34:44 EET by someone
    interfaces {
        xe-0/0/0 {
        xe-0/0/1 {
            description xxxx;
        xe-0/1/0 {
            description xxxx;
        xe-0/1/1 {
            description xxxx;
        xe-0/2/0 {
            disable;
        xe-0/2/1 {
            description xxxx;
        xe-0/3/0 {
            description xxxx;
        xe-0/3/1 {
            description xxxx;
        ae0 {
            description "xxxx";
        ae1 {
            description xxxx;
        demux0 {
        lo0 {
    {master}
    Thanks in advance!
    Spyros

    You are absolutely right!!  I was doing research online after posting the above.  The correct RADIUS attribute to use is actually CVPN3000/ASA/PIX7.x-Group-Based-Address-Pools.  Then create the pool in ASA, and call that pool's name in ACS under that RADIUS attribute.  Someone explained this perfectly in this community before.  Much appreciate your answer!
    Here's from another post last year:
    ACS  5 does not have the feature of IP pools. Logically its always good to  setup pools locally on vpn server and if you want user to pick ip from  specific local pool you can configure acs to push that attribute.
    On ACS Go to > Policy Elements  -> Network Access ->   Authorization Profiles -> Create ->
    Name of the Policy ->Dictionary Type: Radius-Cisco VPN 3000/ASA/PIX7.x
    Attribute Type : CVPN3000/ASA/PIX7.x-Group-Based-Address-Pools
    Attribute Type: String
    Attribute Value : Static MYPOOL (Name of the Pool which is defined on the ASA)
    Access Policies ->Default Network Access -> Authorization ->  Create -> Under result section call the Authorization p

  • Can somebody help me in getting some good material for Regular Expressions and IP Community list

    can somebody help me in getting some good material for Regular Expressions and IP Community list

    I'm not sure what you mean by "IP Community list", but here are 3 reference sites for Regular Expressions:
    Regular Expression Tutorial - Learn How to Use Regular Expressions
    http://www.regular-expressions.info/tutorial.html
    Regular Expressions Cheat Sheet by DaveChild
    http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/
    Regular Expressions Quick Reference
    http://www.autohotkey.com/docs/misc/RegEx-QuickRef.htm

  • Find text using regular expression and add highlight annotation

    Hi Friends,
    Is it possible to find text using a regular expression and add a highlight annotation using a plugin?

    A plugin can use the PDWordFinder to get a list of the words on a page, and their location. That's all that the API offers for searching. Of course, you can use a regular expression library to work with that word list.

  • Regular expression and output format

    hi all,
    I have the following scenario:
    regular expression: [0-9]{3}-[0-9]{3}-[0-9]{4}
    value generated by the above regular expression: 123-234-6789
    output format in which to display the generated value: xxx-xxx-$1
    Now I need to display the generated value (123-234-6789) in the specified output format (xxx-xxx-$1), so the final output will be xxx-xxx-6789.
    How is this possible?
    Note: the regular expression and the output format can both vary.
    br,
    bashar

    Hi, Bashar
    You can solve this problem by using the Data Masking Technique.
    Masking data means replacing certain fields with a Mask character (such as an X). This effectively disguises the data content while preserving the same formatting on front end screens and reports. For example, a column of credit card numbers might look like:
    4346 6454 0020 5379
    4493 9238 7315 5787
    4297 8296 7496 8724
    and after the masking operation the information would appear as:
    4346 XXXX XXXX 5379
    4493 XXXX XXXX 5787
    4297 XXXX XXXX 8724
    The masking characters effectively remove much of the sensitive content from the record while still preserving the look and feel. Take care to ensure that enough of the data is masked to preserve security.
    It would not be hard to regenerate the original credit card number from a masking operation such as: 4297 8296 7496 87XX since the numbers are generated with a specific and well known checksum algorithm.
    Best Regards,
    Mahfuz Khan
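
    In Java, one way to get the requested result is to put a capturing group around the part to keep and pass the output format as the replacement string, since the regex replacement syntax already understands $1. This is only a sketch with hypothetical names (OutputMask, format); it assumes the configured regular expression can be edited to include the group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OutputMask {
    // Applies outputFormat (which may reference $1, $2, ...) to a value
    // that must match regex in full
    static String format(String regex, String outputFormat, String value) {
        Matcher m = Pattern.compile(regex).matcher(value);
        if (!m.matches()) {
            // Refuse to return an unmasked value
            throw new IllegalArgumentException("value does not match " + regex);
        }
        return m.replaceAll(outputFormat);
    }

    public static void main(String[] args) {
        // Same expression as in the question, with the last block captured as group 1
        String regex = "[0-9]{3}-[0-9]{3}-([0-9]{4})";
        System.out.println(format(regex, "xxx-xxx-$1", "123-234-6789")); // prints xxx-xxx-6789
    }
}
```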

  • "Match Regular Expression" and "Match Pattern" vi's behave differently

    Hi,
    I have a simple string matching need and by experimenting found that the "Match Regular Expression" and "Match Pattern" vi's behave somewhat differently. I'd assume that the regular expression inputs on both would behave the same. A difference I've discovered is that the "|" character (the "vertical bar" character, commonly used as an "or" operator) is recognized as such in the Match Regular Expression vi, but not in the Match Pattern vi (where it is taken literally). Furthermore, I cannot find any documentation in Help (on-line or in LabVIEW) about the "|" character usage in regular expressions. Is this documented anywhere?
    For example, suppose I want to match any of the following 4 words: "The" or "quick" or "brown" or "fox". The regular expression "The|quick|brown|fox" (without the quotes) works for the Match Regular Expression vi but not the Match Pattern vi. Below is a picture of the block diagram and the front panel results:
    The Help says that the Match Regular Expression vi performs somewhat slower than the Match Pattern vi, so I started with the latter. But since it doesn't work for me, I'll use the former. But does anyone have any idea of the speed difference? I'd assume it is negligible in such a simple example.
    Thanks!

    Yep-
    You hit a point that's frustrated me a time or two as well (and incidentally, caused some hair-pulling that I can ill afford)
    The hint is in the help file:
    For Match Regular Expression: "The Match Regular Expression function gives you more options for matching strings but performs more slowly than the Match Pattern function.... Use regular expressions in this function to refine searches...."
    Characters to Find: Regular Expression
    VOLTS: VOLTS
    A plus sign or a minus sign: [+-]
    A sequence of one or more digits: [0-9]+
    Zero or more spaces: \s* or " *" (that is, a space followed by an asterisk)
    One or more spaces, tabs, new lines, or carriage returns: [\t \r \n \s]+
    One or more characters other than digits: [^0-9]+
    The word Level only if it appears at the beginning of the string: ^Level
    The word Volts only if it appears at the end of the string: Volts$
    The longest string within parentheses: \(.*\)
    The first string within parentheses but not containing any parentheses within it: \([^()]*\)
    A left bracket: \[
    A right bracket: \]
    cat, cag, cot, cog, dat, dag, dot, and dog: [cd][ao][tg]
    cat or dog: cat|dog
    dog, cat dog, cat cat dog, and so on: ((cat )*dog)
    One or more of the letter a followed by a space and the same number of the letter a, that is, a a, aa aa, aaa aaa, and so on: (a+) \1

    For Match Pattern: "This function is similar to the Search and Replace Pattern VI. The Match Pattern function gives you fewer options for matching strings but performs more quickly than the Match Regular Expression function. For example, the Match Pattern function does not support the parenthesis or vertical bar (|) characters."
    Characters to Find: Regular Expression
    VOLTS: VOLTS
    All uppercase and lowercase versions of volts, that is, VOLTS, Volts, volts, and so on: [Vv][Oo][Ll][Tt][Ss]
    A space, a plus sign, or a minus sign: [ +-]
    A sequence of one or more digits: [0-9]+
    Zero or more spaces: \s* or " *" (that is, a space followed by an asterisk)
    One or more spaces, tabs, new lines, or carriage returns: [\t \r \n \s]+
    One or more characters other than digits: [~0-9]+
    The word Level only if it begins at the offset position in the string: ^Level
    The word Volts only if it appears at the end of the string: Volts$
    The longest string within parentheses: (.*)
    The longest string within parentheses but not containing any parentheses within it: ([~()]*)
    A left bracket: \[
    A right bracket: \]
    cat, dog, cot, dot, cog, and so on: [cd][ao][tg]
    Frustrating, but still manageable.
    Jeff

  • Assistance with Regular Expression and Tcl

    Hello Everyone,
      I recently began learning Tcl to develop scripts for automating network switch deployments. 
    In my script, I want to name the device with a location and the last three octets of the base mac address.
    I can get the Base MAC address by : 
    show version | include Base
     Base ethernet MAC Address       : 00:00:00:DB:CE:00
    And I can get the last three octets of the MAC address using the following regular expression. 
    ([0-9a-f]{2}[:-]){2}([0-9a-f]{2}$)
    But I have not been able to figure out how to call the regular expression in the tcl script.
    I have checked several resources but have not been able to figure it out.  Suggestions?
    Ultimately, I want to set the last three octets to a variable (something like below) and then call the variable when I name the switch.
    set mac [exec "sh version | i Base"] (include the regular expression)
    ios_config "hostname location$mac"
    Thanks for any assistance in advance.
    Chris

    This worked for me.
    Switch_1(tcl)#set result [exec show ver | inc Base]   
    Base ethernet MAC Address       : 00:1B:D4:F8:B1:80
    Switch_1(tcl)#regexp {([0-9A-F:]{8}\r)} $result -> mac
    1
    Switch_1(tcl)#puts $mac                               
    F8:B1:80
    Switch_1(tcl)#ios_config "hostname location$mac"      
    %Warning! Hostname should contain at least one alphabet or '-' or '_' character
    locationF8:B1:80(tcl)#

  • Regular Expressions and Full-text Requests.

    Hi,
    I have just read that Berkeley DB XML doesn't support regular expressions in XQuery (what a pity).
    Do you know how to emulate regular expressions in a query?
    For example, I'd like to perform a full-text request for all tags that contain text matching "Be.+ley" (it would return tags that contain text like "Berkeley" or "Beverley"). How can I do that?
    Thx.

    Hi,
    A performant way to do something like you want is with a query that mixes use of contains() (index optimized) and matches() (not index optimized). Something like this:
    collection()//tag[contains(., "Be") and contains(., "ley") and matches(., "Be.+ley")]
    John

  • Regular expressions for xml parsing

    I have an XML parsing problem that I have to solve using regular expressions; it is not possible for me to use any other method. But there is a problem that I cannot seem to wrap my head around. I want to extract the contents of a tag, but this tag occurs several times in the XML file and I only want the contents of one particular occurrence. Basically the problem is as follows.
    I want to extract
    <bp:NAME ***stufff***>(I want this part)</bp:NAME>
    This tag can occur in several places. For example here:
    <bp:ORGANISM>
    ***bunch of tags***
    <bp:NAME ***stufff***>***stufff***</bp:NAME>
    ***bunch of tags***
    </bp:ORGANISM>
    or here:
    <bp:DATABASE>
    ***bunch of tags***
    <bp:NAME ***stufff***>***stufff***</bp:NAME>
    ***bunch of tags***
    </bp:DATABASE>
    I do not want the content of those tags. I want the content of the <NAME> tag that is not between either the <ORGANISM> tags or the <DATABASE> tags. These tags can be in any order. I cannot for the life of me figure this problem out. I tried several different approaches. For example, I tried using the following regex:
    (?:<bp:NAME [^>]*>([^<]*).*?<bp:ORGANISM>.*?</bp:ORGANISM>|
    <bp:ORGANISM>.*?</bp:ORGANISM>.*?<bp:NAME [^>]*>([^<]*))
    This kind of works: the information I want is either in the first captured group or in the second one, so I just check which group is not empty and that is the one I want. But this only works if there is only one other tag containing the name tag (in this particular regular expression, the organism tag). Since there is another tag (the database tag) I have to work around, and these tags can be in any order, the regular expression becomes three times as large and there are then six different groups in which the information I want can occur. This does not seem like a good idea to me. There has to be another way to do this. So I tried using the following regex:
    (?:</bp:ORGANISM>)?.*?(?:</bp:DATABASE>)?.*?<bp:NAME [^>]*>([^<]*)
    I thought this would get rid of any occurrences of the other tags in front of the name tag, but it doesn't work either. It seems like it is not greedy enough. Well, I think you get the point. I don't know what to try next, so I really need some help.
    Here is an example of the type of data I will run into. The tags can be in any order and they do not always have to occur. In the example below the <DATABASE> tag is not part of the data, and the name tag I want just happens to be in front of the organism tag, but this is not always the case. The name tag I want is the first name tag in the file, namely:
    <bp:NAME rdf:datatype="xsd:string">Progesterone receptor</bp:NAME>
    So I don't want the name tag that is in between the organism tags.
    <bp:protein rdf:ID="CPATH-27885">
    <bp:COMMENT rdf:datatype="xsd:string">
    Belongs to the nuclear hormone receptor family. NR3 subfamily. SIMILARITY: Contains 1 nuclear receptor DNA-binding domain. WEB RESOURCE: Name=NIEHS-SNPs; URL="http://egp.gs.washington.edu/data/pgr/"; WEB RESOURCE: Name=Wikipedia; Note=Progesterone receptor entry; URL="http://en.wikipedia.org/wiki/Progesterone_receptor"; GENE SYNONYMS: NR3C3. COPYRIGHT:  Protein annotation is derived from the UniProt Consortium (http://www.uniprot.org/).  Distributed under the Creative Commons Attribution-NoDerivs License.
    </bp:COMMENT>
    <bp:SYNONYMS rdf:datatype="xsd:string">Nuclear receptor subfamily 3 group C member 3</bp:SYNONYMS>
    <bp:SYNONYMS rdf:datatype="xsd:string">PR</bp:SYNONYMS>
    <bp:NAME rdf:datatype="xsd:string">Progesterone receptor</bp:NAME>
    <bp:ORGANISM>
    <bp:bioSource rdf:ID="CPATH-LOCAL-112384">
    <bp:NAME rdf:datatype="xsd:string">Homo sapiens</bp:NAME>
    <bp:TAXON-XREF>
    <bp:unificationXref rdf:ID="CPATH-LOCAL-112385">
    <bp:DB rdf:datatype="xsd:string">NCBI_TAXONOMY</bp:DB>
    <bp:ID rdf:datatype="xsd:string">9606</bp:ID>
    </bp:unificationXref>
    </bp:TAXON-XREF>
    </bp:bioSource>
    </bp:ORGANISM>
    <bp:SHORT-NAME rdf:datatype="xsd:string">PRGR_HUMAN</bp:SHORT-NAME>
    <bp:XREF>
    <bp:relationshipXref rdf:ID="CPATH-LOCAL-112386">
    <bp:DB rdf:datatype="xsd:string">ENTREZ_GENE</bp:DB>
    <bp:ID rdf:datatype="xsd:string">5241</bp:ID>
    </bp:relationshipXref>
    </bp:XREF>
    <bp:XREF>
    <bp:unificationXref rdf:ID="CPATH-LOCAL-112387">
    <bp:DB rdf:datatype="xsd:string">UNIPROT</bp:DB>
    <bp:ID rdf:datatype="xsd:string">P06401</bp:ID>
    </bp:unificationXref>
    </bp:XREF>
    <bp:XREF>
    <bp:unificationXref rdf:ID="CPATH-LOCAL-112388">
    <bp:DB rdf:datatype="xsd:string">UNIPROT</bp:DB>
    <bp:ID rdf:datatype="xsd:string">A7X8B0</bp:ID>
    </bp:unificationXref>
    </bp:XREF>
    <bp:XREF>
    <bp:relationshipXref rdf:ID="CPATH-LOCAL-112389">
    <bp:DB rdf:datatype="xsd:string">GENE_SYMBOL</bp:DB>
    <bp:ID rdf:datatype="xsd:string">PGR</bp:ID>
    </bp:relationshipXref>
    </bp:XREF>
    <bp:XREF>
    <bp:relationshipXref rdf:ID="CPATH-LOCAL-112390">
    <bp:DB rdf:datatype="xsd:string">REF_SEQ</bp:DB>
    <bp:ID rdf:datatype="xsd:string">NP_000917</bp:ID>
    </bp:relationshipXref>
    </bp:XREF>
    <bp:XREF>
    <bp:unificationXref rdf:ID="CPATH-LOCAL-112391">
    <bp:DB rdf:datatype="xsd:string">UNIPROT</bp:DB>
    <bp:ID rdf:datatype="xsd:string">Q9UPF7</bp:ID>
    </bp:unificationXref>
    </bp:XREF>
    <bp:XREF>
    <bp:unificationXref rdf:ID="CPATH-LOCAL-113580">
    <bp:DB rdf:datatype="http://www.w3.org/2001/XMLSchema#string">CPATH</bp:DB>
    <bp:ID rdf:datatype="http://www.w3.org/2001/XMLSchema#string">27885</bp:ID>
    </bp:unificationXref>
    </bp:XREF>
    </bp:protein>
    Edited by: Dani3ll3 on Nov 19, 2009 2:51 AM

    Dani3ll3 wrote:
    "Thanks a lot after I did that the regular expression worked. :)"
    Good. But remember that in real life, you would then have to apply the XML rules to get the actual contents of the text node. For example it might be a CDATA section or it might include characters like ampersands which have been escaped and which you need to unescape. That's why it's better to use a proper parser, as already suggested.
    It seems to me this forum is full of posts where people are doing homework questions which teach them to do things the wrong way. But of course there's nothing the student can do about that.
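
    For comparison, a rough sketch of what the parser-based approach could look like in Java: pick the NAME element that is a direct child of the root, so that names nested inside ORGANISM or DATABASE are never candidates, and let the parser deal with CDATA and escaped characters. The class and method names are illustrative, and the sketch assumes a default (non-namespace-aware) DocumentBuilderFactory so that bp:NAME can be compared as a plain node name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DirectChildName {
    // Returns the text of the first direct child of the root element whose
    // name equals tagName, or null if the root has no such child
    static String directChildText(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && child.getNodeName().equals(tagName)) {
                    // getTextContent already resolves CDATA and entities
                    return child.getTextContent();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<bp:protein>"
                + "<bp:ORGANISM><bp:bioSource><bp:NAME>Homo sapiens</bp:NAME></bp:bioSource></bp:ORGANISM>"
                + "<bp:NAME>Progesterone receptor</bp:NAME>"
                + "</bp:protein>";
        System.out.println(directChildText(xml, "bp:NAME"));
    }
}
```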

  • Reprasenting regular expression in XML

    Hi,
    I have the following regular expression to represent a 13-digit number enclosed in a pair of brackets:
    String r = "\\([0-9]{13,}+\\)";
    However, when I represent it in an XML tag as
    <my>
    <rs>\\([0-9]{13,}+\\)</rs>
    </my>
    and read it using a DOM parser, the sequence does not match at all.
    Will anybody please tell me how to represent String r in XML?
    Thanks

    If the content in the XML file is
    \\([0-9]{13,}+\\)
    to represent this in a Java String, you need to code
    String target = "\\\\([0-9]{13,}+\\\\)";
    The backslash (\) is an escape prefix in Java. So to get a single backslash, you need to specify 2 of them. Since you want two backslashes in the string, you need to specify 4 of them.
    Dave Patterson
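
    Read the other way round, the same rule suggests a fix, sketched here as an assumption rather than a confirmed answer: the doubling belongs only to Java source code, so the XML file would carry single backslashes, i.e. <rs>\([0-9]{13,}+\)</rs>. The constants below simulate what the regex engine receives in each case; the class name EscapeLayers is made up.

```java
import java.util.regex.Pattern;

class EscapeLayers {
    // What the regex engine sees when the pattern is a Java literal
    // (or when the XML text contains single backslashes): \([0-9]{13,}+\)
    static final String FROM_JAVA_LITERAL = "\\([0-9]{13,}+\\)";

    // What the engine sees when the XML text contains doubled backslashes:
    // \\([0-9]{13,}+\\) which means a literal backslash, then a group, then a backslash
    static final String FROM_XML_DOUBLED = "\\\\([0-9]{13,}+\\\\)";

    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String input = "(1234567890123)";
        System.out.println(matches(FROM_JAVA_LITERAL, input)); // prints true
        System.out.println(matches(FROM_XML_DOUBLED, input));  // prints false
    }
}
```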
