Efficiency of Java String operations

Hi,
For an Information Retrieval project, I need to make extensive use of String operations on a vast number of documents, in particular lots of substring() operations.
I'd like to get a feeling for how efficiently the substring() method of java.lang.String is implemented, just to understand whether trying to optimize it would be a reasonable option (I was thinking of an algorithm for efficient string pattern matching such as the Knuth-Morris-Pratt algorithm, but if java.lang.String already applies similarly efficient algorithms I would not bother).
Can someone help?
J

Thanks for your comment. Yes, of course you're right, I
mean indexOf(). If so (thanks DrClap), let me enter the discussion.
indexOf() implements a so-called "brute force" algorithm.
Its worst-case performance is O(n*m), where n is the length of the text and
m is the length of the pattern, but it is close to O(n) on average.
KMP is O(n), so the performance gain should be hardly noticeable.
To get a real performance gain you should look at the BM (Boyer-Moore,
O(n/m) in the best case) algorithm or some of its descendants.
As for the java.util.regex package, as far as I understand it should be
several times slower than indexOf(), because it reads EACH character through an interface method (as opposed to direct array access in indexOf()).
That is still to be proved experimentally, though.
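
To make the comparison concrete, here is a minimal sketch of Boyer-Moore-Horspool, one of the simpler descendants of Boyer-Moore. The class and method names are made up for illustration, and the shift table is bucketed by the low byte of each char, which is an assumption chosen to keep the table small rather than anything java.lang.String does.

```java
import java.util.Arrays;

class HorspoolSearch {
    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // Bad-character shift table, bucketed by the low byte of each char.
        // Chars sharing a bucket end up with the smallest (safest) shift,
        // because later pattern positions overwrite earlier ones.
        int[] shift = new int[256];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            shift[pattern.charAt(i) & 0xFF] = m - 1 - i;
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare right to left at the current alignment.
            int j = m - 1;
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) return pos;
            // Skip ahead based on the text char under the pattern's last position.
            pos += shift[text.charAt(pos + m - 1) & 0xFF];
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumps over the lazy dog";
        System.out.println(indexOf(text, "lazy"));
        System.out.println(text.indexOf("lazy")); // should print the same index
    }
}
```

Whether this actually beats indexOf() for a given workload would still have to be measured; for short patterns the table setup can easily outweigh the skipped comparisons.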

Similar Messages

  • String operations on internal table text....

    The original table consists of 2 columns:
    E     RFC error(SM_DHTCLNT010_READ): Error when opening connection
    E     RFC error(SM_DHLCLNT010_READ): Error when opening connection
    E     RFC error(SM_DHKCLNT010_READ): Error when opening connection
    E     RFC error(SM_E10CLNT000_READ): 'tdhtci00.emea.gdc:sapgw02'
    E     No read RFC FOR SM_B72CLNT003_READ
    E     No read RFC FOR SM_B71CLNT003_READ
    S     Clients for system 'E21' found in RFC  'SM_E21CLNT001_READ'
    S     Clients for system 'E22' found in RFC  'SM_E22CLNT001_READ'
    S     Clients for system 'E23' found in RFC  'SM_E22CLNT001_READ'
    Now we need to apply string operations such that the result table has 3 columns plus the new refined message text:
    status  sid   Message        NEW_TEXT
    E       DHT   RFC error      Error when opening connection
    E       DHL   RFC error      Error when opening connection
    E       DHK   RFC error      Error when opening connection
    E       E10   RFC error      tdhtci00.emea.gdc:sapgw02
    E       B72   No RFC LINK
    E       B71   No RFC LINK
    S       E21   DATA READ
    S       E22   DATA READ
    S       E23   DATA READ
    The string conditions to arrive at the new table are:
    1) To get the SID column, the conditions are:
    •     If the status is “RFC Error”, then the next 3 characters after the “_” must be extracted as the SID
    •     Else the SID is between the first and the second inverted comma ‘
    Example:  Clients for system 'E21' found in RFC  'SM_E21CLNT001_READ' extracts “E21” as SID
    2) For the message column:
    •     message “RFC Error” if the message text starts with “RFC Error”
    •     message “no RFC Link” if the message text starts with “No read RFC*”
    •     message “Data Read” if the substring “found in RFC” was found in the message
    3) For the NEW_TEXT column:
    •     If the status is “RFC Error”, then the whole text string behind the “: ” must be extracted
    For example, if the message is RFC error(SM_DHLCLNT010_READ): Error when opening connection, NEW_TEXT will be Error when opening connection
    Need your inputs on these.
    Best regards,
    Subba

    Hi,
    You can achieve this simply using offset and length:
    var_name+off(len).
    e.g. wa_message-fld+0(3) = first three characters
    wa_message-fld(3) same as above, first three characters
    wa_message-fld+2(2) " third and fourth characters of wa_message-fld (the offset is zero-based)
    You can use this to set conditions like:
    if wa_message-fld(9) = 'RFC error'. " the comparison is case-sensitive
    "process here
    endif.
    Hope this will help you...
    Jogdand M B
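
For comparison, the three extraction rules from the question can be sketched in Java rather than ABAP. This is only an illustration under assumptions: the class and method names are invented, the "No read RFC" rows are assumed to take their SID from the three characters after the first "_" (as the expected table suggests), and surrounding single quotes are stripped from NEW_TEXT to match the E10 row.

```java
class RfcMessageParser {
    /** Returns {sid, message, newText} for one raw message text. */
    static String[] parse(String text) {
        String sid;
        String message;
        String newText = "";
        if (text.startsWith("RFC error")) {
            sid = afterUnderscore(text);
            message = "RFC error";
            int colon = text.indexOf(": ");
            if (colon >= 0) {
                newText = text.substring(colon + 2).trim();
                // Assumption: quotes around the remaining text are not wanted.
                if (newText.length() >= 2 && newText.startsWith("'") && newText.endsWith("'")) {
                    newText = newText.substring(1, newText.length() - 1);
                }
            }
        } else if (text.startsWith("No read RFC")) {
            sid = afterUnderscore(text);
            message = "No RFC LINK";
        } else if (text.contains("found in RFC")) {
            // SID sits between the first and second single quote.
            int first = text.indexOf('\'');
            int second = text.indexOf('\'', first + 1);
            sid = (first >= 0 && second > first) ? text.substring(first + 1, second) : "";
            message = "DATA READ";
        } else {
            sid = "";
            message = "";
        }
        return new String[] { sid, message, newText };
    }

    /** Three characters following the first underscore, or "" if absent. */
    private static String afterUnderscore(String text) {
        int us = text.indexOf('_');
        return (us >= 0 && us + 4 <= text.length()) ? text.substring(us + 1, us + 4) : "";
    }

    public static void main(String[] args) {
        String[] row = parse("RFC error(SM_DHLCLNT010_READ): Error when opening connection");
        System.out.println(row[0] + " | " + row[1] + " | " + row[2]);
    }
}
```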

  • Setting compile date & time to a Java String variable

    I was wondering whether the NetBeans IDE has a capability to grab the current compile date & time and set it to a Java String, updating it at each compile. I know you can set it in a template, but isn't that fixed once and forever when the file is created?
    Just wondering.

    If you're using Ant you can have it maintain a file with the last build date in it, and that can be a .java file if you so desire.
    Applications should be completely recompiled before shipping or deployment so it doesn't really make sense to have a compile date per Java source file.

  • Maximum size of a Java String

    Hello
    Can anyone please tell me what would be the maximum size for a java string....
    Thanks
    Tapan

    You are asking the wrong questions. If you are going to have a really, really big string then the chances are you shouldn't be using strings.
    If you are trying to parse a file then you want to be reading it in chunks at a time.
    Tell us what you are trying to do and maybe we can suggest a better solution.
    See, I am a nice person. I just don't like people who are too lazy to write
    public static void main(String args[]) {
      System.out.println(Integer.MAX_VALUE);
    }
    That took me around 10 seconds.
    Ted.
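
To spell out both points of the reply: String.length() returns an int, so a String cannot hold more than Integer.MAX_VALUE (2147483647) chars, and the practical limit is usually lower and depends on the JVM and the available heap. The sketch below shows the "read it in chunks" alternative; the class name, method name and chunk size are placeholders chosen for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ChunkedReading {
    /** Reads everything from in through a fixed-size buffer and counts the chars. */
    static long countChars(Reader in, int chunkSize) throws IOException {
        char[] buf = new char[chunkSize];
        long total = 0;
        int read;
        while ((read = in.read(buf)) != -1) {
            // A real parser would process buf[0..read) here instead of counting.
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Integer.MAX_VALUE);
        // A StringReader stands in for a FileReader on a large file.
        System.out.println(countChars(new StringReader("hello world"), 4));
    }
}
```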

  • Converting Oracle XML Query Result in Java String by using XSU

    Hi,
    I have a problem by converting Oracle XML Query Result in Java
    String by using XSU. I use XSU for Java.
    For example:
    String datum=new OracleXMLQuery(conn,"Select max(ps.datum) from
    preise ps where match='"+args[0]+"'");
    String datum1=datum;
    I get the following error:
    Prototyp.java:47: Incompatible type for declaration. Can't
    convert oracle.xml.sql.query.OracleXMLQuery to java.lang.String.
    Can somebody tell me a method for the conversion to solve my
    problem?
    Thanks

    Hmmm.. Pretty basic just look at the example:
    OracleXMLQuery qry = new OracleXMLQuery(conn,"Select max(ps.datum) from preise ps where match='"+args[0]+"'");
    String xmlString = qry.getXMLString();

  • String operations in internal table

    Dear friends,
    Good morning.
    I wish to know how I can split the field of a database table into two different fields of an internal table. For example:
    I have a DB table tab1 which has the field number
    tab1 -> number
    and I have an internal table itab1 which has two fields, numa and numb
    itab1 -> numa
          -> numb
    The values in tab1->number are 001 and 0001.
    I wish to separate these two values in the internal table:
    if the value is 001 then it should go into numa (001 -> numa)
    if the value is 0001 then it should go into numb (0001 -> numb)
    I don't know how to perform string operations on an internal table. Could you tell me how to fix this problem? Any suggestion, article or code would be a great help.
    Thanking you,
    Regards
    Naim

    Hi,
    What you can do is check the length:
    " lit_data_tab is the source internal table (with header line and a field VALUE)
    data: lit_data_3 type standard table of char3,
          lit_data_4 type standard table of char4,
          lv_char3   type char3,
          lv_char4   type char4,
          lv_length  type i.
    loop at lit_data_tab.
      lv_length = strlen( lit_data_tab-value ).
      if lv_length = 3.
        lv_char3 = lit_data_tab-value.
        append lv_char3 to lit_data_3.
      else.
        lv_char4 = lit_data_tab-value.
        append lv_char4 to lit_data_4.
      endif.
    endloop.
    If you want
    numa  numb
    003   0003
    then you have to loop over one table and modify the other.
    That is, one of the tables should contain both fields;
    read that table using one field.
    Mark helpful answers.
    Regards
    Message was edited by: Manoj Gupta

  • To use Character string operator in ABAP

    Hi,
    I have a problem with joining two fields with different data lengths, i.e.
    OBJKY has length (30).
    tknum has length (10).
    The read table i_nast below works as long as both have records no longer than 10 characters, but I do have some records longer than 10 characters for OBJKY in the database, and my read fails in that scenario. I need to use a character string operator. As I am new to ABAP, can anyone suggest how to do this?
    ...SQL..
    select OBJKY DATVR from nast
    into corresponding fields of table i_nast
    where KSCHL = 'ZBOL'.
    sort i_nast by OBJKY.
    LOOP at i_ship_data.
    read table i_nast with key
    OBJKY = i_ship_data-tknum binary search.
    if sy-subrc = 0.
    move: i_nast-datvr to i_ship_data-datvr.
    endif.
    modify i_ship_data.
    ENDLOOP.

    Hi,
    Since OBJKY and TKNUM have different lengths, the READ statement works only
    when OBJKY has a 10-character value identical to TKNUM.
    But if we can assume that only the first 10 characters of OBJKY are to be compared with TKNUM, then we can try the approach below:
    Create a new field in the internal table I_NAST with a length of 10 characters (I_NAST-OBJKY_TEMP).
    Now assign the first 10 characters of OBJKY to this new field:
    LOOP AT I_NAST.
    MOVE I_NAST-OBJKY+0(10) TO I_NAST-OBJKY_TEMP.
    MODIFY I_NAST.
    ENDLOOP.
    SORT I_NAST BY OBJKY_TEMP. " BINARY SEARCH needs the table sorted by the search key
    Now you can modify your READ statement:
    LOOP at i_ship_data.
    read table i_nast with key
    OBJKY_TEMP = i_ship_data-tknum binary search.
    if sy-subrc = 0.
    move: i_nast-datvr to i_ship_data-datvr.
    endif.
    modify i_ship_data.
    ENDLOOP.
    Hope this will help.
    Note: You can pick up any 10 characters starting from the 1st to the 20th character of the field
    I_NAST-OBJKY.
    Reward points if found helpful.
    Cheers,
    Chandra Sekhar.

  • Slow to convert Oracle 11g XMLType into Java String or Document

    After retrieving a result set from an Oracle 11g database, it takes roughly 75 seconds to convert the XMLType (this is a structured XML Storage, registered with an xsd) into either a java String or Document. I'm using Java 1.6, have the xdb.jar and xmlparserv2.jar
    This xsd is <100 lines and the xml document is also <100 lines.
    Sample code:
    oracle.xdb.XMLType xml = oracle.xdb.XMLType.createXML((oracle.sql.OPAQUE)rset.getObject("XMLDATA"));
    the other way, but still took just as long:
    XMLType xml = (XMLType)rset.getObject("XMLDATA");
    xml.getStringVal();
    or
    org.w3c.dom.Document doc = xml.getDocument();
    Either of the above ways takes just as long.

    If I put this value into the database table, I can
    see only the date. Time part is missing. Is this the
    problem with java.sql.Date or Oracle datatype Date?
    This is not a problem, this is the defined behaviour of the Date type. RTFAPI and have a look at Timestamp, while you're at it.

  • Java or operator statement and expressions

    So I tried to simplify some code this morning using the java or operator and it doesn't seem to work. It went something like this:
    if ((time < 4) || (time > 10))}
    val = aVariable}
    else val=0;
    What I end up with is zero until 4 but after 10 the val does not return to 0.
    I hope this makes sense to somebody out there. I just went back to a stack of if else statements to solve the problem.
    Thanks

    I'm not sure what the problem is (except that your curly brackets are wrong), but this opacity expression works:
    if ((time < 4) || (time > 10)){
    0
    }else{
    100
    }
    Dan

  • How many java String objects are created in string literal pool by executin

    How many java String objects are created in string literal pool by executing following five lines of code.
    String str = "Java";
    str = str.concat(" Beans ");
    str = str.trim();
    String str1 = "abc";
    String str2 = new String("abc").intern();
    Kindly explain thanks in advance
    Senthil

    virtuoso. wrote:
    jverd wrote:
    In Java all instances are kept on the heap. The "String literal pool" is no exception. It doesn't hold instances. It holds references to String objects on the heap.
    Um, no.
    The literal pool is part of the heap, and it holds String instances.
    [http://java.sun.com/docs/books/jvms/second_edition/html/Overview.doc.html#22972]
    [http://java.sun.com/docs/books/jvms/second_edition/html/ConstantPool.doc.html#67960]
    You're referring to the JVM. That's not Java.
    It's part of Java.
    There is nowhere in Java where it is correct to say "The string literal pool holds references, not String objects."

  • Perl versus other "scripting" languages when doing string operations

    I've been told that perl is a "scripting" language like the other languages mentioned in this forum.
    If that's true, can these other languages handle the following spec as well as perl can?  (See spec at end of this post.)
    Or is perl stronger in string operations than the other scripting languages mentioned here?
    Here's the spec:
    1. I give your program  a twenty-letter alphabet (any twenty letter alphabet)
    For example:
    ABCDEFGHIJKLMNOPQRST
    2.  I also give your program four groups (any four groups) of letters in this alphabet:
    For example:
    s:  A,B,C,D,E
    p:  F,G,H,I,J
    d:  K,L,M,N,O
    e:  P,Q,R,S,T
    3.  I also give your program a sequence over the twenty-letter alphabet that I gave you in Step (1) above:
    For example:
    ABCDEFGHIJKLMNOPQRSTSRQPONMLKJIHGFEDCBA
    4.  Given this sequence, you search for pairs of adjacent letters (x,y) where x and y are from different groups (the groups defined in Step (2) above).
    Also, you return the results of this search by giving me back the following two strings:
    ABCD(EF)GHI(JK)LMN(OP)QRSTSRQ(PO)NML(KJ)IHG(FE)DCBA  
    ABCD(sp)GHI(pd)LMN(de)QRSTSRQ(ed)NML(dp)IHG(ps)DCBA
    5.  Note: if I give you a sequence that contains "overlapping" ordered pairs like:
    ...EFK...
    then you ignore the second ordered pair.  That is, you return:
    ...(EF)K

    OK - here is the final stuff on the "C" side.
    To execute the program, the command line is:
    20let.exe file1.txt file2.txt file3.txt > fileout.txt
    Below, I've provided:
    a) source code 20let.c
    b) sample input file1.txt
    c) sample input file2.txt
    d) sample input file3.txt
    e) output fileout.txt generated from these input files.
    As soon as Bill finishes the perl version of the source code, I'll post that also.
    source code of 20let.c
    // 20let.c5
    #include <stdio.h>
    #include <stdlib.h>
    int T[333],A[99999],G[333],B[99999],C[99999],N[299999],P[99999];
    int n1,n2,f,p,x1,x2,n,m,a,b,c,i,j,k,x,y,z;
    int E[233][233];
    FILE *file;
    int substrings(int x1,int x2);
    int main(int argc, char*argv[]) {
        if(argc<4){   // three input files are required: argv[1]..argv[3]
            printf("\nusage:20let protein-file nucleotide-file pairs-include-file\n\n");
            printf("marks amino-acid-pairs from different groups in protein-file\n");
            printf("iff they are in the include-file\n");
            exit(1);
        }
    //----------------define the groups        G['I'] = 's', e.g.
        x='s'; G['I']=x;G['M']=x;G['V']=x;G['A']=x;G['G']=x;
        x='p'; G['F']=x;G['L']=x;G['P']=x;G['W']=x;G['W']=x;
        x='d'; G['H']=x;G['Q']=x;G['D']=x;G['E']=x;G['E']=x;
        x='t'; G['S']=x;G['T']=x;G['Y']=x;G['N']=x;G['C']=x;G['K']=x;G['R']=x;
    //----------------the 4 bases              T['a'] = 0 thru 3
        for(x=0;x<222;x++)
            T[x]=-999;
        T['a']=0;T['c']=1;T['g']=2;T['t']=3;
        T['A']=0;T['C']=1;T['G']=2;T['T']=3;
      for(i=65;i<70;i++)G[i]='s';
      for(i=70;i<75;i++)G[i]='p';
      for(i=75;i<80;i++)G[i]='d';
      for(i=80;i<85;i++)G[i]='t';
    //---------------- read include-file   file3 xxxyyy pairs E[x][y] of interest
        f=0;
        for(x=0;x<222;x++)
            for(y=0;y<222;y++)
                E[x][y]=0;
        if((file=fopen(argv[3],"rb"))==NULL){
            printf("\ncan't open include-file %s\n",argv[3]);exit(1);
        }
    mq1: if(feof(file))
            goto mq3;
        x=fgetc(file);y=fgetc(file);x=fgetc(file);
        x=T[fgetc(file)]*16+T[fgetc(file)]*4+T[fgetc(file)];
        y=T[fgetc(file)]*16+T[fgetc(file)]*4+T[fgetc(file)];
        if(x<64 && x>=0 && y<64 && y>=0){
            E[x][y]=1;
            f++;
        }
    mq2: if(feof(file))
            goto mq3;
        a=fgetc(file);
        if(a!=10)
            goto mq2;
        goto mq1;
    mq3: fclose(file);
    //------------------read amino-acid file    file1 == P array
        if((file=fopen(argv[1],"rb"))==NULL){
            printf("\ncan't open file %s\n",argv[1]);exit(1);}
        p=0;
    m1p: if(feof(file))
            goto m2p;
        p++;
        P[p]=fgetc(file);
        if(G[P[p]]==0)
            p--;
        goto m1p;
    m2p:;
        fclose(file);
    //------------------read nucleotide file    file2 == N array
        if((file=fopen(argv[2],"rb"))==NULL){
            printf("\ncan't open file %s\n",argv[2]);exit(1);
        }
        n=0;
    m1n: if(feof(file))
            goto m2n;
        n++;
        N[n]=fgetc(file);
        if(N[n]!='a' && N[n]!='c' && N[n]!='g' && N[n]!='t')
            n--;
        goto m1n;
    m2n:;
        fclose(file);
    //for(i=1;i<=p;i++)printf("%c",P[i]);printf("\n");
    //for(i=1;i<=n;i++)printf("%c",N[i]);printf("\n");
    //printf("%i include-pairs  %i nucleotides  %i proteins\n",f,n,p);
    //------------1st line------------------       B[i] = result
        m=0;
        for(i=1;i<=p;i++){
            n1=T[N[i*3-2]]*16+T[N[i*3-1]]*4+T[N[i*3]];
            n2=T[N[i*3+1]]*16+T[N[i*3+2]]*4+T[N[i*3+3]];
    //printf("\ni=%i p=%i n1=%i n2=%i\n",i,p,n1,n2);
            if(E[n1][n2]<1 || G[P[i]]==G[P[i+1]] /* || i==n */){
                printf("%c",P[i]);
                m++;
                B[m]=P[i];
                goto m3;
            }
            printf("(%c%c)",P[i],P[i+1]);
            i++;
            m++;
            B[m]='(';
            m++;
            B[m]=P[i-1];
            m++;
            B[m]=P[i];
            m++;
            B[m]=')';
    //printf("(%c)%c",G[A[i]],G[A[i+1]]);i++;
        m3:;
        }
        printf("\n");
    //------------2nd line------------------       C[i] = result
        m=0;
        for(i=1;i<=p;i++){
            n1=T[N[i*3-2]]*16+T[N[i*3-1]]*4+T[N[i*3]];
            n2=T[N[i*3+1]]*16+T[N[i*3+2]]*4+T[N[i*3+3]];
            if(E[n1][n2]<1 || G[P[i]]==G[P[i+1]] /* || i==n */){
                printf("%c",P[i]);
                m++;
                C[m]=P[i];
                goto m4;
            }
            printf("(%c%c)",G[P[i]],G[P[i+1]]);
            i++;
            m++;
            C[m]='(';
            m++;
            C[m]=G[P[i-1]];
            m++;
            C[m]=G[P[i]];
            m++;
            C[m]=')';
    //printf("(%c)%c",G[A[i]],G[A[i+1]]);i++;
        m4:;
        }
        printf("\n");
    //for(i=1;i<=m;i++)printf("%c",B[i]);printf("\n");
    //------------3rd line------------------         printf only
        m=0;
        for(i=1;i<=p;i++){
            n1=T[N[i*3-2]]*16+T[N[i*3-1]]*4+T[N[i*3]];
            n2=T[N[i*3+1]]*16+T[N[i*3+2]]*4+T[N[i*3+3]];
            if(E[n1][n2]<1 || G[P[i]]==G[P[i+1]] /* || i==n */){
                printf("%c%c%c",N[i*3-2],N[i*3-1],N[i*3]);
                goto m33;
            }
            printf("(%c%c%c%c%c%c)",N[i*3-2],N[i*3-1],N[i*3],N[i*3+1],N[i*3+2],N[i*3+3]);
            i++;
        m33:;
        }
        printf("\n");
    //--------------substrings------------
        substrings(20,29);
        substrings(30,39);
        substrings(40,49);
        substrings(50,59);
        substrings(60,69);
        return 0;
    }
    int substrings(int x1,int x2)
    {
        printf("\n");
        printf("lengths %i - %i : \n",x1,x2);
        for(i=1; i<p; i++)
            for (j=i+x1; j<i+x2; j++) {
                if (C[i]>95 && C[j]>95) {   // if lc letter in line2
                    for(x=i;x<=j;x++)
                        printf("%c",C[x]);
                    printf("|");
                    for(x=i;x<=j;x++)
                        if(B[x]>44)         // if not () in line 1
                            printf("%c",B[x]);
                    printf("|");
                    for(x=i;x<=j;x++)
                        if(C[x]>95)         // if lc letter line2
                            printf("%c",C[x]);
                    printf("\n");}
            }
        return 0;
    }
    input file1.txt
    MKKHTDQPIADVQGSPDTRH
    IAIDRVGIKAIRHPVLVADK
    DGGSQHTVAQFNMYVNLPHN
    FKGTHMSRFVEILNSHEREI
    SVESFEEILRSMVSRLESDS
    GHIEMTFPYFVNKSAPISGV
    KSLLDYEVTFIGEIKHGDQY
    GFTMKVIVPVTSLCPCSKKI
    SDYGAHNQRSHVTISVHTNS
    FVWIEDVIRIAEEQASCELF
    GLLKRPDEKYVTEKAYNNPK
    FVEDIVRDVAEILNHDDRID
    AYVVESEBFESIHNHSAYAL
    IERD
    input file2.txt
    atgaaaaaacatactgatcaacctatcgctgatgtgcagggctcaccggataccagacat
    atcgcaattgacagagtcggaatcaaagcgattcgtcacccggttctggtcgccgataag
    gatggtggttcccagcataccgtggcgcaatttaatatgtacgtcaatctgccacataat
    ttcaaagggacgcatatgtcccgttttgtggagatactaaatagccacgaacgtgaaatt
    tcggttgaatcatttgaagaaattttgcgctccatggtcagcaggctggaatcagattcc
    ggccatattgaaatgacttttccctacttcgtcaataaatcagcccctatctcaggtgta
    aaaagcttgctggattatgaggtaacctttatcggcgaaattaaacatggcgatcaatat
    gggtttaccatgaaggtgatcgttcctgttaccagcctgtgcccctgctccaagaaaata
    tccgattacggtgcgcataaccagcgttcacacgtcaccatttctgtacacactaacagc
    ttcgtctggattgaggacgttatcagaattgcggaagaacaggcctcatgcgaactgttc
    ggtctgctgaaacggccggatgaaaaatatgtcacagaaaaggcctataacaatccgaaa
    tttgtcgaagatatcgtccgtgatgtcgccgaaatacttaatcatgatgaccggatagat
    gcctatgttgttgaatcagaaaactttgaatccatacataatcactctgcatacgcactg
    atagagcgcgac
    input file3.txt
    FA tttgcc
    FA ttcgcc
    FA tttgct
    FA ttcgct
    LK ttaaaa
    LK ttgaaa
    LK ttaaag
    LK ttgaag
    LS ctgctc
    LS ctgctt
    LS ctactc
    LS ctactt
    LT ctcacc
    LT ctcact
    LT cttacc
    LT cttact
    LY ctctac
    LY ctctat
    LY ctttac
    LY ctttat
    LG ctcggc
    LG ctcggt
    LG cttggc
    LG cttggt
    IP attccc
    IP attcct
    IP atcccc
    IP atccct
    IP attcca
    IP attccg
    IP atccca
    IP atcccg
    ML atgctc
    ML atgctt
    ML atgctc
    ML atgctt
    VL gtgctg
    VL gtgcta
    VL gtactg
    VL gtacta
    VS gtgtcc
    VS gtatct
    VS gtgtcc
    VS gtatct
    VT gtcacc
    VT gtcact
    VT gttacc
    VT gttact
    VS gtcagc
    VS gtcagt
    VS gttagc
    VS gttagt
    SL tcgctg
    SL tcgcta
    SL tcactg
    SL tcacta
    SP tctcca
    SP tctccg
    SP tcccca
    SP tccccg
    PV ccggtg
    PV ccggta
    PV ccagtg
    PV ccagta
    PG cccggc
    PG cccggt
    PG cctggc
    PG cctggt
    TL acgctg
    TL acgcta
    TL acactg
    TL acacta
    TP acgccg
    TP acgcca
    TP acaccg
    TP acacca
    AL gcttta
    AL gctttg
    AL gcctta
    AL gccttg
    AP gcgccg
    AP gcgcca
    AP gcaccg
    AP gcacca
    AP gctcca
    AP gctccg
    AP gcccca
    AP gccccg
    AN gctaat
    AN gctaac
    AN gccaat
    AN gccaac
    AS gccagc
    AS gccagt
    AS gctagc
    AS gctagt
    YP tatccg
    YP tatcca
    YP tacccg
    YP taccca
    HP catccg
    HP catcca
    HP cacccg
    HP caccca
    QR cagcga
    QR cagcgg
    QR caacga
    QR caacgg
    DL gatttg
    DL gattta
    DL gacttg
    DL gactta
    EN gaaaat
    EN gaaaac
    EN gagaat
    EN gagaac
    EK gaaaaa
    EK gaaaag
    EK gagaaa
    EK gagaag
    ER gagcga
    ER gagcgg
    ER gaacga
    ER gaacgg
    WR tggcga
    WR tggcgg
    RV cgggtg
    RV cgggta
    RV cgagtg
    RV cgagta
    RW cggtgg
    RW cgatgg
    SG agtgga
    SG agtggg
    SG agcgga
    SG agcggg
    GF ggtttt
    GF ggtttc
    GF ggcttt
    GF ggcttc
    GL gggctg
    GL gggcta
    GL ggactg
    GL ggacta
    GY gggtat
    GY gggtac
    GY ggatat
    GY ggatac
    GY ggttat
    GY ggttac
    GY ggctat
    GY ggctac
    GK ggaaaa
    GK ggaaag
    GK gggaaa
    GK gggaag
    GK ggcaag
    GK ggcaaa
    GK ggtaag
    GK ggtaaa
    GW ggctgg
    GW ggttgg
    GR gggcgg
    GR gggcga
    GR ggacgg
    GR ggacga
    GS ggcagc
    GS ggcagt
    GS ggtagc
    GS ggtagt
    output fileout.txt
    MKKHTDQPIADVQGSPDTRHIAIDRVGIKAIR(HP)VLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSM(VS)RLESDSGHIEMTFPYFVNKSAPISGVKSLLDYEVTFIGEIKHGDQYGFTMKVIVP(VT)SLCPCSKKISDYGAHNQRSH(VT)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(EK)YVT(EK)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(EF)ESIHNHSAYALIERD
    MKKHTDQPIADVQGSPDTRHIAIDRVGIKAIR(dp)VLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSM(st)RLESDSGHIEMTFPYFVNKSAPISGVKSLLDYEVTFIGEIKHGDQYGFTMKVIVP(st)SLCPCSKKISDYGAHNQRSH(st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt)YVT(dt)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(dp)ESIHNHSAYALIERD
    atgaaaaaacatactgatcaacctatcgctgatgtgcagggctcaccggataccagacatatcgcaattgacagagtcggaatcaaagcgattcgt(cacccg)gttctggtcgccgataaggatggtggttcccagcataccgtggcgcaatttaatatgtacgtcaatctgccacataatttcaaagggacgcatatgtcccgttttgtggagatactaaatagccacgaacgtgaaatttcggttgaatcatttgaagaaattttgcgctccatg(gtcagc)aggctggaatcagattccggccatattgaaatgacttttccctacttcgtcaataaatcagcccctatctcaggtgtaaaaagcttgctggattatgaggtaacctttatcggcgaaattaaacatggcgatcaatatgggtttaccatgaaggtgatcgttcct(gttacc)agcctgtgcccctgctccaagaaaatatccgattacggtgcgcataaccagcgttcacac(gtcacc)atttctgtacacactaacagcttcgtctggattgaggacgttatcagaattgcggaagaacaggcctcatgcgaactgttcggtctgctgaaacggccggat(gaaaaa)tatgtcaca(gaaaag)gcctataacaatccgaaatttgtcgaagatatcgtccgtgatgtcgccgaaatacttaatcatgatgaccggatagatgcctatgttgttgaatca(gaaaac)tttgaatccatacataatcactctgcatacgcactgatagagcgc
    lengths 20 - 29 :
    st)SLCPCSKKISDYGAHNQRSH(s|VTSLCPCSKKISDYGAHNQRSHV|sts
    st)SLCPCSKKISDYGAHNQRSH(st|VTSLCPCSKKISDYGAHNQRSHVT|stst
    t)SLCPCSKKISDYGAHNQRSH(s|TSLCPCSKKISDYGAHNQRSHV|ts
    t)SLCPCSKKISDYGAHNQRSH(st|TSLCPCSKKISDYGAHNQRSHVT|tst
    lengths 30 - 39 :
    st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(d|VTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDE|std
    t)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(d|TISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDE|td
    t)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt|TISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEK|tdt
    dt)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(d|EKAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESE|dtd
    dt)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(dp|EKAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESEF|dtdp
    t)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(d|KAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESE|td
    t)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(dp|KAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESEF|tdp
    lengths 40 - 49 :
    st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt)YVT(d|VTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEKYVTE|stdtd
    st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt)YVT(dt|VTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEKYVTEK|stdtdt
    t)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt)YVT(d|TISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEKYVTE|tdtd
    t)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt)YVT(dt|TISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEKYVTEK|tdtdt
    dt)YVT(dt)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(d|EKYVTEKAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESE|dtdtd
    dt)YVT(dt)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(dp|EKYVTEKAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESEF|dtdtdp
    t)YVT(dt)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(d|KYVTEKAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESE|tdtd
    t)YVT(dt)AYNNPKFVEDIVRDVAEILNHDDRIDAYVVES(dp|KYVTEKAYNNPKFVEDIVRDVAEILNHDDRIDAYVVESEF|tdtdp
    lengths 50 - 59 :
    t)RLESDSGHIEMTFPYFVNKSAPISGVKSLLDYEVTFIGEIKHGDQYGFTMKVIVP(s|SRLESDSGHIEMTFPYFVNKSAPISGVKSLLDYEVTFIGEIKHGDQYGFTMKVIVPV|ts
    lengths 60 - 69 :
    dp)VLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSM(s|HPVLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSMV|dps
    dp)VLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSM(st|HPVLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSMVS|dpst
    p)VLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSM(s|PVLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSMV|ps
    p)VLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSM(st|PVLVADKDGGSQHTVAQFNMYVNLPHNFKGTHMSRFVEILNSHEREISVESFEEILRSMVS|pst
    st)RLESDSGHIEMTFPYFVNKSAPISGVKSLLDYEVTFIGEIKHGDQYGFTMKVIVP(st|VSRLESDSGHIEMTFPYFVNKSAPISGVKSLLDYEVTFIGEIKHGDQYGFTMKVIVPVT|stst
    st)SLCPCSKKISDYGAHNQRSH(st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(d|VTSLCPCSKKISDYGAHNQRSHVTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDE|ststd
    st)SLCPCSKKISDYGAHNQRSH(st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt|VTSLCPCSKKISDYGAHNQRSHVTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEK|ststdt
    t)SLCPCSKKISDYGAHNQRSH(st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(d|TSLCPCSKKISDYGAHNQRSHVTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDE|tstd
    t)SLCPCSKKISDYGAHNQRSH(st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt|TSLCPCSKKISDYGAHNQRSHVTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEK|tstdt
    t)SLCPCSKKISDYGAHNQRSH(st)ISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPD(dt)YVT(d|TSLCPCSKKISDYGAHNQRSHVTISVHTNSFVWIEDVIRIAEEQASCELFGLLKRPDEKYVTE|tstdtd
    Edited by: David Halitsky on Mar 18, 2008 4:21 AM
    Edited by: David Halitsky on Mar 18, 2008 4:22 AM

  • Regarding string operations

    Hi All,
    i've a string holding the value as given below.
    AA,17,2/19/2003,"9,999.00",USD,00,10,318,"193,275.31"
    by performing some string operations i want the result string in the format as given below:
    AA,17,2/19/2003,"9999.00",USD,00,10,318,"193275.31"
      i.e., i want to remove all the commas(,)that are included in between a pair of " " only.
    can anyone provide me a sample code for the same

    Hi vijay,
    A bit complex but works for sure, check the following logic,
    REPORT zsritest.
    DATA: gs_string TYPE string.
    gs_string = 'AA,17,2/19/2003,"9,999.00",USD,00,10,318,"193,275.31"'.
    WRITE: / gs_string.
    PERFORM string_trim CHANGING gs_string.
    *       FORM string_trim                                              *
    *  -->  LS_STRING                                                     *
    FORM string_trim CHANGING ls_string.
      DATA: lt_string TYPE string OCCURS 0 WITH HEADER LINE,
            lv_tabix TYPE i,
            lv_start.
      SPLIT gs_string AT '"' INTO TABLE lt_string.
      CHECK sy-subrc EQ 0.
      CLEAR gs_string.
      LOOP AT lt_string.
        lv_tabix = sy-tabix MOD 2.
        IF lv_tabix EQ 0.
          TRANSLATE lt_string USING ', '.
          CONDENSE lt_string NO-GAPS.
        ENDIF.
        IF lv_tabix EQ 0 OR lv_start EQ 'X'.
          CONCATENATE gs_string lt_string INTO gs_string SEPARATED BY '"'.
          IF lv_start EQ 'X'.
            CLEAR lv_start.
          ELSE.
            lv_start = 'X'.
          ENDIF.
        ELSE.
          CONCATENATE gs_string lt_string INTO gs_string.
        ENDIF.
      ENDLOOP.
      IF lv_start EQ 'X'.
        CONCATENATE gs_string '"' INTO gs_string.
      ENDIF.
      WRITE: / gs_string.
    ENDFORM.
    Hope this helps..
    Sri
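
The same idea can be expressed without splitting, by walking the string once and tracking whether the current position is inside a quoted field. The sketch below is in Java rather than ABAP and is only an illustration; the class and method names are made up, and it assumes quotes are never escaped inside a field.

```java
class QuotedCommaStripper {
    /** Removes every comma that sits between a pair of double quotes. */
    static String stripCommasInQuotes(String line) {
        StringBuilder out = new StringBuilder(line.length());
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // each quote toggles the state
            }
            if (c == ',' && inQuotes) {
                continue;             // drop commas inside quoted fields only
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String in = "AA,17,2/19/2003,\"9,999.00\",USD,00,10,318,\"193,275.31\"";
        System.out.println(stripCommasInQuotes(in));
    }
}
```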

  • How do I get the proper UTF-8 NVARCHAR2 DB Value into a Java String?

    Hello, I have a mixed char 8.1.7 database with UTF-8 as my NLS
    charset. I used SQL Worksheet to enter test Polish Unicode
    characters 50309,50310... into an NVARCHAR2 column using
    insert...char(nnn using NCHAR_CS) and get the following result
    from DUMP(columnname,1016):
    Typ=1 Len=12 CharacterSet=UTF8:
    c4,85,c4,86,c4,87,c4,88,c4,89,c4,8a.
    Everything looks good.
    Now, how do I get them out into a Java String and verify that I
    have received the correct hex codes?
    I have tried:
    1. CHAR lChr = OracleResultSet.getCHAR(ColumnName, csUTF8); or
    and lChr.characterStreamValue();
    2. InputStream lIStr = arsResultSet.getBinaryStream(ColumnName);
    3. String lStr = (String)arsResultSet.getObject(ColumnName);
    and then:
    lStr.getBytes("UTF-8"); or lStr.toCharArray();
    I always get questions marks and negative byte values or the
    values: 261,262,263,264,265,266.
    I am using the latest 9.0.1. JDBC Thin drivers and the Oracle
    extensions: OracleResultSet, OracleStatement etc...
    Please let me know what class/method I need, to get the Oracle
    NVARCHAR2 unicode string from the result set into a Java string
    and what method to use to look at the underlying hex codes.
    TIA for any pointers.
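
    One observation on the values above: 261 to 266 are the Unicode code points U+0105 to U+010A, and their UTF-8 encoding is exactly the byte sequence shown by DUMP. The negative byte values are expected as well, because Java bytes are signed (0xC4 prints as -60). A minimal sketch for checking this on the Java side, assuming the string has already been read from the result set (the class and method names here are made up for illustration):

    ```java
    import java.nio.charset.StandardCharsets;

    class HexDump {
        // Returns the UTF-8 bytes of s as comma-separated two-digit hex values.
        static String utf8Hex(String s) {
            StringBuilder sb = new StringBuilder();
            for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
                if (sb.length() > 0) {
                    sb.append(',');
                }
                // Mask with 0xff so that signed bytes such as -60 print as c4.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            // Build the string from the code points reported in the question.
            StringBuilder sb = new StringBuilder();
            for (int cp = 261; cp <= 266; cp++) {
                sb.appendCodePoint(cp);
            }
            // prints c4,85,c4,86,c4,87,c4,88,c4,89,c4,8a
            System.out.println(utf8Hex(sb.toString()));
        }
    }
    ```

    If the output matches the DUMP result, the string arrived intact and any question marks come from the console or a later conversion step.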

    If I use in bdInt=sc.nextInt(); a substring cannot be used.
    Does anyone know how I can solve this question without substring? Perhaps something I've mentioned above.
    sincerely h

  • Need help in String operations

    Hi all,
    I need help with String operations. I am getting the file path of an image as
    "c:\test\img\abc.gif"
    and I need to convert it into "c:/test/img/abc.gif".
    Can anyone suggest a solution for this?
    Thanks,
    Durga.

    [email protected] wrote:
    I used String replace method but I am not able to do it because "/" is a special character.

    "/" is not a special character; "\" is a special character, which needs to be escaped by "\" itself.
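
    To make the escaping concrete, here is a small sketch of the conversion asked about in this thread. The class and method names are made up for illustration. The char overload of replace() needs only one level of escaping, while replaceAll() takes a regular expression, so the backslash has to be escaped twice there.

    ```java
    class PathSlashes {
        // Replaces every backslash with a forward slash (no regex involved).
        static String toForwardSlashes(String path) {
            return path.replace('\\', '/');
        }

        // Same result with replaceAll: "\\\\" is the regex for one literal backslash.
        static String toForwardSlashesRegex(String path) {
            return path.replaceAll("\\\\", "/");
        }

        public static void main(String[] args) {
            String path = "c:\\test\\img\\abc.gif";
            System.out.println(toForwardSlashes(path));      // prints c:/test/img/abc.gif
            System.out.println(toForwardSlashesRegex(path)); // prints c:/test/img/abc.gif
        }
    }
    ```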

  • Help:How to get a java String value from a C char array?

    Hi everyone, could you help me?
    The following is the C struct in which I want to receive a short message:
    struct MO_msg{
    unsigned long long msgID;          //Message ID
    char      dest_id[21]; //Destination Mobile Phone Number
    char      service_id[10];     //
    Now I want to put the "dest_id" value of this struct into a Java String variable, but I don't know how to implement it.
    The following is the block of source code with which I tried to implement this. It does not produce a String value, and it throws an exception:
    java.lang.NullPointerException
    at java.lang.StringBuffer.append(StringBuffer.java:389)
    JNIEXPORT jint JNICALL Java_md_EMAP_thread_RubeMOTSSX_getMO
    (JNIEnv * env, jobject obj, jint connId, jobject mo){
         struct MO_msg MO;
         tssx_cmpp_api_debug_flag = 1;
         int result = CMPP_Get_MO((int)connId,&MO);
         if (result == 0){
              jclass cls = (*env)->GetObjectClass(env,mo);
              jfieldID msgId = (*env)->GetFieldID(env,cls,"msgId","J");
              jfieldID dest_Id = (*env)->GetFieldID(env,cls,"dest_Id","Ljava/lang/String;");
              jfieldID serviceId = (*env)->GetFieldID(env,cls,"serviceId","Ljava/lang/String;");
              (*env)->SetLongField(env,mo,msgId,MO.msgID);
              (*env)->SetCharField(env,mo,dest_Id,*destId);
              (*env)->SetCharField(env,mo,serviceId,MO.service_id);
         }
         return result;
    }
    Please help me! Thanks!

    bschauwe: Thank you for your help!
    Yes, just as you say, using NewString or NewStringUTF can import a C char array into a Java String variable. But now I have another question: when I use these two functions, I find that they can't deal with Chinese characters.
    Do you have any experience dealing with other language charsets? If you have, can you tell me how to deal with it?
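
    One common approach, offered here as a sketch rather than a confirmed fix for this code: NewStringUTF expects modified UTF-8, so text in another encoding (GBK or GB2312 would be typical for Chinese, which is an assumption about this message source) is better passed to Java as a byte[] and decoded there with an explicit charset. The Java side could look like this; the class and method names are hypothetical, and the demo uses UTF-8 only because it is available on every JVM.

    ```java
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    class NativeText {
        // Decodes a NUL-terminated C buffer (e.g. char dest_id[21]) with the given charset.
        // Scanning for the first zero byte is safe for UTF-8 and GBK, but not for UTF-16.
        static String decodeCString(byte[] raw, Charset charset) {
            int len = 0;
            while (len < raw.length && raw[len] != 0) {
                len++;
            }
            return new String(raw, 0, len, charset);
        }

        public static void main(String[] args) {
            // Simulate a fixed-size native buffer holding two Chinese characters.
            byte[] buf = new byte[21];
            byte[] text = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
            System.arraycopy(text, 0, buf, 0, text.length);

            String decoded = decodeCString(buf, StandardCharsets.UTF_8);
            System.out.println(decoded.length()); // prints 2
        }
    }
    ```

    For data that really is GBK, the charset argument would be Charset.forName("GBK"), provided the JVM in use ships that charset.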
