Checking for a substring in a string from a very large file?

Hi All,
I have a 20 MB file that I am converting to a String. Now I would like to check whether the following substrings appear in that string:
a) One
b) Two
c) Three
I am currently calling String.indexOf() once for each of them.
Is there a more efficient way of searching than String.indexOf()?
Please give me some suggestions on how to search a very large string.
Thanks,
J.Kathir

> We have to read each line and check for the three strings,
> right?
> Is there any other way to check for the substring in
> the whole string?
> Reading line by line and searching for the three
> strings, or reading the entire file as a string and
> searching for the substring:
> which is better?
Depends on your definition of "better": reading line by line saves memory, but is probably slower. Reading the entire file into memory and then searching it is probably faster but takes more memory.
As a third option, if you're using Java 1.5 or later, you can use the Scanner class to read the file and search for all 3 strings at the same time... something like this:
Scanner scn = new Scanner(new File("myfile.txt"));
String regExp = "a|b|c"; // alternation of the strings to look for
String s;
// findWithinHorizon needs a horizon argument; 0 means search without a bound
while ((s = scn.findWithinHorizon(regExp, 0)) != null) {
  System.out.println("Found: " + s);
}
scn.close();
I'm pretty sure this is more efficient, as it looks for anything matching the regular expression in a single pass through the input... but I haven't done any timings on each approach.
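
The Scanner idea can be sketched as a small helper that reports which of the wanted substrings occur at all. This is only a sketch under a few assumptions: the class and method names (MultiSubstringSearch, findPresent) are made up for illustration, the search strings are non-empty, and none of them is a prefix of another (regex alternation prefers the first alternative that matches at a position). Note also that a horizon of 0 lets the Scanner buffer as much input as it needs, so it is not guaranteed to use less memory than reading the whole file.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;
import java.util.regex.Pattern;

final class MultiSubstringSearch {
    /** Returns the needles that occur at least once in the input (assumes non-empty needles). */
    static Set<String> findPresent(Readable input, List<String> needles) {
        // Quote each needle so regex metacharacters are matched literally.
        StringBuilder alternation = new StringBuilder();
        for (String needle : needles) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(needle));
        }
        Pattern pattern = Pattern.compile(alternation.toString());

        Set<String> found = new LinkedHashSet<>();
        try (Scanner scanner = new Scanner(input)) {
            String match;
            // Stop early once every needle has been seen; horizon 0 = no bound.
            while (found.size() < needles.size()
                    && (match = scanner.findWithinHorizon(pattern, 0)) != null) {
                found.add(match);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a FileReader on the real 20 MB file.
        Set<String> found = findPresent(
                new StringReader("Zero One Three One"),
                Arrays.asList("One", "Two", "Three"));
        System.out.println(found); // [One, Three]
    }
}
```

For a real file, passing a java.io.FileReader (or an InputStreamReader with an explicit charset) in place of the StringReader should work the same way.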

Similar Messages

  • Null String and Empty String problem

    Hello everyone,
    Since I am totally new to JSP, I am having trouble handling strings.
    Suppose I have a variable users = ""; then
    I want to ask when to use:
    if (users.equals(""))
    and
    if (users == "")
    In my code, the variable users has the value "regional" for regional users,
    and I am checking it like this:
    if (users.equals{"regional")) {
    out.print ("I am inside code");
    }
    At that point the code throws an error (a run-time error).
    When I changed the code to:
    if (users == "regional") {
    out.print ("I am inside code");
    }
    this time the code does not generate an error, but the message "I am inside code" is not displayed. Execution never enters the if block.
    I hope you understand my problem. Can anybody help me out with this?

    This has basically nothing to do with JSP, but with basic Java knowledge.
    When you use the '==' operator to compare objects (yes, String is a subclass of Object), it checks whether both operands refer to the same object. When you use '==' to compare primitive data types (int, boolean, char, etc.), it checks whether they have the same value.
    That is why the Object class has the equals() method: it gives you the ability to compare an object with another object. You can only invoke it on a reference that actually points to an instance, that is, one that is not null.
    if (string != null && string.equals("somevalue")) {
    }
    // or
    if ("somevalue".equals(string)) {
    }
    should work.
    Edit rym82: this will not throw a NPE, but an ordinary compilation error ;)
    Message was edited by:
    BalusC
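
A small sketch of the difference between '==' and equals() for strings. The class and method names (StringEqualityDemo, isRegional) are hypothetical; new String(...) is used only to force a distinct object, the way a value read from a request or a database would be.

```java
final class StringEqualityDemo {
    /** Null-safe check: calling equals on the literal never throws. */
    static boolean isRegional(String users) {
        return "regional".equals(users);
    }

    public static void main(String[] args) {
        String users = new String("regional");        // distinct object, same characters
        System.out.println(users == "regional");      // false: different objects
        System.out.println(users.equals("regional")); // true: same characters
        System.out.println(isRegional(null));         // false, no NullPointerException
    }
}
```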

  • Double.parseDouble(String) - problems when string is in scientific notation

    Hello guys,
    I'm doing some numerical calculations and I wonder whether Double.parseDouble(String) can parse a string in scientific notation, e.g. 1.0824234234E-10. Is it the notation itself that causes the NumberFormatException, or is the number just too big/small for a double to hold?
    If it's just the notation how can I fix it ?
    Regards

    I'm not quite sure whether double does not allow it.
    Perhaps consider the API documentation for Double.valueOf() and the screening code it provides, reproduced below:
    To avoid calling this method on an invalid string and having a NumberFormatException be thrown, the regular expression below can be used to screen the input string:
            final String Digits     = "(\\p{Digit}+)";
            final String HexDigits  = "(\\p{XDigit}+)";
            // an exponent is 'e' or 'E' followed by an optionally
            // signed decimal integer.
            final String Exp        = "[eE][+-]?"+Digits;
            final String fpRegex    =
                ("[\\x00-\\x20]*"+  // Optional leading "whitespace"
                 "[+-]?(" + // Optional sign character
                 "NaN|" +           // "NaN" string
                 "Infinity|" +      // "Infinity" string
                 // A decimal floating-point string representing a finite positive
                 // number without a leading sign has at most five basic pieces:
                 // Digits . Digits ExponentPart FloatTypeSuffix
                 // Since this method allows integer-only strings as input
                 // in addition to strings of floating-point literals, the
                 // two sub-patterns below are simplifications of the grammar
                 // productions from the Java Language Specification, 2nd
                 // edition, section 3.10.2.
                 // Digits ._opt Digits_opt ExponentPart_opt FloatTypeSuffix_opt
                 "((("+Digits+"(\\.)?("+Digits+"?)("+Exp+")?)|"+
                 // . Digits ExponentPart_opt FloatTypeSuffix_opt
                 "(\\.("+Digits+")("+Exp+")?)|"+
                 // Hexadecimal strings
                 "((" +
                 // 0[xX] HexDigits ._opt BinaryExponent FloatTypeSuffix_opt
                 "(0[xX]" + HexDigits + "(\\.)?)|" +
                 // 0[xX] HexDigits_opt . HexDigits BinaryExponent FloatTypeSuffix_opt
                 "(0[xX]" + HexDigits + "?(\\.)" + HexDigits + ")" +
                 ")[pP][+-]?" + Digits + "))" +
                 "[fFdD]?))" +
                 "[\\x00-\\x20]*");// Optional trailing "whitespace"
            if (Pattern.matches(fpRegex, myString))
                Double.valueOf(myString); // Will not throw NumberFormatException
            else {
                // Perform suitable alternative action
            }
    http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Double.html
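
For the original question, a quick check suggests the notation itself is not the problem: Double.parseDouble accepts scientific notation such as 1.0824234234E-10. The sketch below uses a hypothetical helper, tryParse, to show this along with one common cause of NumberFormatException (a decimal comma) and what happens to magnitudes outside the double range. It assumes the input is not null.

```java
final class ScientificNotationDemo {
    /** Returns the parsed value, or null if the text is not a valid double (text must not be null). */
    static Double tryParse(String text) {
        try {
            return Double.parseDouble(text.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("1.0824234234E-10")); // parses: scientific notation is accepted
        System.out.println(tryParse("1,0824234234E-10")); // null: a decimal comma is rejected
        System.out.println(tryParse("1e400"));            // Infinity: too large, but no exception
        System.out.println(tryParse("1e-400"));           // 0.0: too small, but no exception
    }
}
```

If the exception still appears, printing the offending string between delimiters (for example with brackets around it) usually reveals a stray character.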

  • Comparing a string to multiple strings

    This is what I need to accomplish in pseudocode:
    if (string != "this" or "that" or "this" or "that")
    I'd do it with 12 different conditions:
    if (!string.equals("this") && !string.equals("that") && ...etc) {}
    but I'm wondering if there is a cleaner way to go about this.

    String bla = "String4";
    String[] checks = { "String1", "String2", "String3" /* , ... */ };
    boolean match = false;
    for (int i = 0; i < checks.length; i++) {
      if (checks[i].equals(bla)) {
        match = true;
        break;
      }
    }
    if (match) {
      // bla is one of the strings in checks
    }
    Message was edited by:
    Simeon
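
An alternative to the loop, sketched here as an assumption about what "cleaner" might mean, is to keep the candidate values in a Set and call contains(). The class name AllowedValues and the three sample values are placeholders.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class AllowedValues {
    // Placeholder values; the real list would hold the 12 strings to compare against.
    private static final Set<String> CHECKS =
            new HashSet<>(Arrays.asList("String1", "String2", "String3"));

    /** True if value equals one of the known strings; null is simply not found. */
    static boolean isKnown(String value) {
        return CHECKS.contains(value);
    }

    public static void main(String[] args) {
        System.out.println(isKnown("String2")); // true
        System.out.println(isKnown("String4")); // false
    }
}
```

The "none of these" test from the question then becomes a single !isKnown(string).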

  • How can I substitute a string with another string in a file

    I have a file. I have to substitute a keyword with another string at every occurrence in the file. How can I do this?

    I'm gonna give you the benefit of the doubt and assume you didn't mean to double post your question.
    As for substituting a keyword with a string, one way is to read in the file and run it through a StreamTokenizer to break it into words. Pull one word at a time, check it, and append it (or its replacement) to a StringBuffer. Once you're done with the file, overwrite it with what is in the StringBuffer.
    Another way might be to use the RandomAccessFile class, but I'm not really familiar with that because I hardly ever use it.
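
A simpler variant of the read, substitute, overwrite approach, sketched under the assumptions that the file fits in memory, is UTF-8 encoded, and that the keyword should be matched literally wherever it appears (not only as a whole word). The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileKeywordReplacer {
    /** Replaces every literal occurrence of keyword in the file and writes the result back. */
    static void replaceInFile(Path file, String keyword, String replacement) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String updated = content.replace(keyword, replacement); // literal, all occurrences
        Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Demonstrate on a temporary file so nothing real is overwritten.
        Path demo = Files.createTempFile("replace-demo", ".txt");
        Files.write(demo, "red fish, red boat".getBytes(StandardCharsets.UTF_8));
        replaceInFile(demo, "red", "blue");
        System.out.println(new String(Files.readAllBytes(demo), StandardCharsets.UTF_8));
        Files.delete(demo);
    }
}
```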

  • Verify a String is a String

    This may appear to be a trivial question. I want to ascertain a String is a String. I thought of using a string method on a variable passed into a function and catch the exception. I have done quite a bit of searching around Java Platform SE 7. This seems like a strange thing to do and I have thought of testing everything else and if they fail it has to be a string but I would like something specific.

    832844 wrote:
    > > Edit: ok, it looks like you are trying to write some kind of parser here. You might be asking if the String looks like a Java String constant (i.e. it starts and ends with a double-quote). Is that what you want?
    > I am asking if I have been provided a String that looks like a Java String constant. So I test if I have an int through parseInt and catch the exception, same with float, and finally check if it is a String. Otherwise, return false.
    Ok, for the rest of this post I'll assume that variable is of type String and has been verified to not be null.
    This means that variable is-a String. There's no way around it. The question is whether the information contained in that String can be interpreted as a floating-point literal, an integer literal or a String literal.
    Since floating-point literals and integer literals are everyday things, Java provides simple methods to interpret them, and you already use them.
    String literals as such are not so common in everyday use. Therefore Java doesn't provide a simple method to interpret them.
    You could use a regex to check if a given String "looks like" a String literal:
    String variable = nextToken();
    boolean isStringLiteral = variable.matches("\\\"([^\"]|\\\\\\\")*\\\"");
    If I didn't mis-type this, it should check if your variable starts and ends with a double-quote and contains no un-quoted double-quotes (there are ugly details that this regex doesn't handle, such as an un-quoted double-quote after a quoted backslash character, but it's just a demonstration).
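
The whole try-int, try-float, then check-for-string-literal sequence could look roughly like the sketch below. The names (TokenKind, classify) are hypothetical, and the regex is a swapped-in variant that treats a backslash plus any following character as one escaped unit. It is still a simplification: it does not validate which escapes are legal in Java, and Double.parseDouble also accepts inputs such as "NaN" or "1f".

```java
import java.util.regex.Pattern;

final class TokenKind {
    // A double quote, then any run of (non-quote, non-backslash) or (backslash + any char), then a double quote.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:[^\"\\\\]|\\\\.)*\"");

    /** Classifies a non-null token as "int", "float", "string" or "unknown". */
    static String classify(String token) {
        try {
            Integer.parseInt(token);
            return "int";
        } catch (NumberFormatException ignored) {
            // not an int, fall through
        }
        try {
            Double.parseDouble(token);
            return "float";
        } catch (NumberFormatException ignored) {
            // not a floating-point value, fall through
        }
        return STRING_LITERAL.matcher(token).matches() ? "string" : "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("42"));              // int
        System.out.println(classify("1.5e3"));           // float
        System.out.println(classify("\"hi \\\" you\""));  // string
        System.out.println(classify("\"bad\"quote\""));  // unknown
    }
}
```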
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                

  • What is difference between Null String and Empty String ?

    Hi
    I am a little confused about the difference between a null String and an empty String.
    Please clear up my doubt.
    Thanks

    > For the same reason I think it's okay to say "null
    > String" and "empty String" as long as you know they
    > really mean "null String reference" and "empty String
    > object" respectively.
    Crap. It's only okay to say that as long as *the one you're talking to* knows what it really means. Whether you know it or not is absolutely irrelevant. And there is hardly any ambiguity about the effects that a statement like "assign an object to a reference" brings. "Null String" differs in that way.
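
In code, the distinction between a "null String reference" and an "empty String object" comes down to whether there is an object to call methods on. A minimal sketch, with a hypothetical helper named describe:

```java
final class NullVersusEmpty {
    /** Describes what a String variable currently holds. */
    static String describe(String s) {
        if (s == null) {
            return "null reference (no String object at all)";
        }
        if (s.isEmpty()) {
            return "empty String object (length 0)";
        }
        return "String object of length " + s.length();
    }

    public static void main(String[] args) {
        String none = null; // calling none.length() would throw NullPointerException
        String empty = "";  // a real object; empty.length() is 0
        System.out.println(describe(none));
        System.out.println(describe(empty));
        System.out.println(describe("abc"));
    }
}
```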

  • Difference between String and final String

    Hi friends,
    This is Ramana. Can you advise me on this question?
    What is the difference between String and final String? That is, between
    String str="hai";
    final String str="hai";
    Regards,
    Ramana.

    *******REPEAT POST***********
    We already answered your question why post in a different section?
    http://forum.java.sun.com/thread.jspa?threadID=5201549

  • Replace String in a String

    How can I replace a String within a String? For example,
    in the String "Hello World", replace "Hello" with "Bye".
    Is there a class or code I can find for that purpose?

    Use a utility method like this
    (I did not find an existing method doing this in java.lang.String or StringBuffer):
    public static String replaceFirstSubstring(
        String stringToSubstitute, String searchString, String replaceString) {
      if (stringToSubstitute == null) {
        return null;
      }
      int prefixEndIndex = stringToSubstitute.indexOf(searchString);
      if (prefixEndIndex == -1) {
        return stringToSubstitute;
      }
      int suffixStartIndex = prefixEndIndex + searchString.length();
      StringBuffer newString = new StringBuffer(stringToSubstitute.substring(0, prefixEndIndex));
      newString.append(replaceString).append(stringToSubstitute.substring(suffixStartIndex));
      return newString.toString();
    }
    Put it in a while loop if you need to replace more than one occurrence.
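
On Java 5 and later the standard library covers both cases, so a hand-written helper may not be needed: String.replace(CharSequence, CharSequence) replaces every literal occurrence, and String.replaceFirst works on a regex, which can be made literal with Pattern.quote and Matcher.quoteReplacement. The wrapper class and method names below are only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BuiltInReplaceDemo {
    /** Replaces every literal occurrence of search. */
    static String replaceAllLiteral(String text, String search, String replacement) {
        return text.replace(search, replacement);
    }

    /** Replaces only the first literal occurrence of search. */
    static String replaceFirstLiteral(String text, String search, String replacement) {
        // Quote both sides so regex and replacement metacharacters are taken literally.
        return text.replaceFirst(Pattern.quote(search), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceAllLiteral("Hello World, Hello", "Hello", "Bye")); // Bye World, Bye
        System.out.println(replaceFirstLiteral("a.b.c", ".", "$"));                  // a$b.c
    }
}
```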

  • Creating String frm new String(charBuffer.array()) Vs charBuffer.toString()

    What's the difference between creating a String from a CharBuffer by using array() and by using toString()?
    Whenever I have some UTF-8 characters in my file ("someFile"), the String created from new String(charBuffer.array()) has some extra null/junk characters appended at the very end.
    However, when I try charBuffer.toString() it works fine.
    For simple ASCII, i.e. the ISO-*** charsets, both methods work fine.
    Please see the code below for reproducing this. Here "someFile" is any text file with some UTF-8 characters.
    public char[] getCharArray()
    throws IOException
    {
      Charset charset = Charset.forName("UTF-8");
      CharsetDecoder decoder = charset.newDecoder();
      FileInputStream fis = new FileInputStream("someFile");
      FileChannel channel = fis.getChannel();
      int size = (int) channel.size();
      MappedByteBuffer mbb = channel.map(FileChannel.MapMode.READ_ONLY, 0 , size);
      CharBuffer cb = decoder.decode(mbb);
      channel.close();
      fis.close();
      return cb.array();
    }
    public String getAsString()
    throws IOException
    {
      Charset charset = Charset.forName("UTF-8");
      CharsetDecoder decoder = charset.newDecoder();
      FileInputStream fis = new FileInputStream("someFile");
      FileChannel channel = fis.getChannel();
      int size = (int) channel.size();
      MappedByteBuffer mbb = channel.map(FileChannel.MapMode.READ_ONLY, 0 , size);
      CharBuffer cb = decoder.decode(mbb);
      channel.close();
      fis.close();
      return cb.toString();
    }
    String fromToString = getAsString();
    String fromCharArray = new String(getCharArray());

    > What's the difference between creating a String from a CharBuffer by using array() and by using toString()?
    array() returns the entire backing array regardless of the buffer's offset, position and limit. toString() takes those into account.
    > Whenever I have some UTF-8 characters in my file ("someFile"), the String created from new String(charBuffer.array()) has some extra null/junk characters appended at the very end.
    More probably you haven't filled the array.
    > However, when I try charBuffer.toString() it works fine.
    So there you go.
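
The effect can be reproduced without a file. The sketch below uses a hand-filled CharBuffer in place of the decoder output (an assumption made to keep the example deterministic) and shows one way to copy only the valid region out of the backing array. The helper name remainingAsString is made up.

```java
import java.nio.CharBuffer;

final class CharBufferToStringDemo {
    /** Copies only the chars between position and limit out of the backing array. */
    static String remainingAsString(CharBuffer cb) {
        return new String(cb.array(), cb.arrayOffset() + cb.position(), cb.remaining());
    }

    public static void main(String[] args) {
        CharBuffer cb = CharBuffer.allocate(8); // backing array of 8 chars
        cb.put("abc");
        cb.flip();                              // position = 0, limit = 3
        System.out.println(cb.toString().length());          // 3
        System.out.println(new String(cb.array()).length()); // 8: includes 5 unused '\0' chars
        System.out.println(remainingAsString(cb));           // abc
    }
}
```

With multi-byte UTF-8 input the decoded text has fewer chars than the file has bytes, which would explain why the padding only shows up for non-ASCII files.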

  • Replacing a special character in a string with another string

    Hi
    I need to replace a special character in a string with another string.
    Say there is a string -  "abc's def's are alphabets"
    and I need to replace every ' (apostrophe) with &apos&, so that the result looks like this:
    "abc&apos&s def&apos&s are alphabets" .
    Kindly let me know how this requirement can be met.
    Regards
    Sukumari

    REPLACE
    Syntax Forms
    Pattern-based replacement
    1. REPLACE [{FIRST OCCURRENCE}|{ALL OCCURRENCES} OF] pattern
              IN [section_of] dobj WITH new
              [IN {BYTE|CHARACTER} MODE]
              [{RESPECTING|IGNORING} CASE]
              [REPLACEMENT COUNT rcnt]
              { {[REPLACEMENT OFFSET roff]
                 [REPLACEMENT LENGTH rlen]}
              | [RESULTS result_tab|result_wa] }.
    Position-based replacement
    2. REPLACE SECTION [OFFSET off] [LENGTH len] OF dobj WITH new
                      [IN {BYTE|CHARACTER} MODE].
    Effect
    This statement replaces characters or bytes of the variable dobj by characters or bytes of the data object new. Here, position-based and pattern-based replacement are possible.
    When the replacement is executed, an interim result without a length limit is implicitly generated and the interim result is transferred to the data object dobj. If the length of the interim result is longer than the length of dobj, the data is cut off on the right in the case of data objects of fixed length. If the length of the interim result is shorter than the length of dobj, data objects of fixed length are filled to the right with blanks or hexadecimal zeroes. Data objects of variable length are adjusted. If data is cut off to the right when the interim result is assigned, sy-subrc is set to 2.
    In the case of character string processing, the closing spaces are taken into account for data objects dobj of fixed length; they are not taken into account in the case of new.
    System fields
    sy-subrc Meaning
    0 The specified section or subsequence was replaced by the content of new and the result is available in full in dobj.
    2 The specified section or subsequence was replaced in dobj by the contents of new and the result of the replacement was cut off to the right.
    4 The subsequence in sub_string was not found in dobj in the pattern-based search.
    8 The data objects sub_string and new contain double-byte characters that cannot be interpreted.
    Note
    These forms of the statement REPLACE replace the following obsolete form:
    REPLACE sub_string WITH
    Syntax
    REPLACE sub_string WITH new INTO dobj
            [IN {BYTE|CHARACTER} MODE]
            [LENGTH len].
    Extras:
    1. ... IN {BYTE|CHARACTER} MODE
    2. ... LENGTH len
    Effect
    This statement searches through a byte string or character string dobj for the subsequence specified in sub_string and replaces the first byte or character string in dobj that matches sub_string with the contents of the data object new.
    The memory areas of sub_string and new must not overlap, otherwise the result is undefined. If sub_string is an empty string, the point before the first character or byte of the search area is found and the content of new is inserted before the first character.
    During character string processing, the closing blank is considered for data objects dobj, sub_string and new of type c, d, n or t.
    System Fields
    sy-subrc Meaning
    0 The subsequence in sub_string was replaced in the target field dobj with the content of new.
    4 The subsequence in sub_string could not be replaced in the target field dobj with the contents of new.
    Note
    This variant of the statement REPLACE will be replaced, beginning with Release 6.10, with a new variant.
    Addition 1
    ... IN {BYTE|CHARACTER} MODE
    Effect
    The optional addition IN {BYTE|CHARACTER} MODE determines whether byte or character string processing will be executed. If the addition is not specified, character string processing is executed. Depending on the processing type, the data objects sub_string, new, and dobj must be byte or character type.
    Addition 2
    ... LENGTH len
    Effect
    If the addition LENGTH is not specified, all the data objects involved are evaluated in their entire length. If the addition LENGTH is specified, only the first len bytes or characters of sub_string are used for the search. For len, a data object of the type i is expected.
    If the length of the interim result is longer than the length of dobj, data objects of fixed length will be cut off to the right. If the length of the interim result is shorter than the length of dobj, data objects of fixed length are filled to the right with blanks or with hexadecimal 0. Data objects of variable length are adapted.
    Example
    After the replacements, text1 contains the complete content "I should know that you know", while text2 has the cut-off content "I should know that".
    DATA:   text1      TYPE string       VALUE 'I know you know',
            text2(18)  TYPE c            VALUE 'I know you know',
            sub_string TYPE string       VALUE 'know',
            new        TYPE string       VALUE 'should know that'.
    REPLACE sub_string WITH new INTO text1.
    REPLACE sub_string WITH new INTO text2.

  • I need to append a string to another string

    I'm working with some inherited code (I'm a ColdFusion
    novice myself) and I'm trying to make this order form display the
    correct data. The problem is that a lot of data in the database is
    missing. Description in the QStockDB query can contain a lot of
    stuff. For our full color work the text "4/0", "4/BLACK", and "4/4"
    are consistent, so I'm changing the newitem (I know, not very
    descriptive, but it's not my code) to CMYK Printing.
    <cfif #QStockDB.description# contains "4/0"><cfset newitem = "CMYK Printing"></cfif>
    <cfif #QStockDB.description# contains "4/4/BLACK"><cfset newitem = "CMYK Printing"></cfif>
    <cfif #QStockDB.description# contains "4/4"><cfset newitem = "CMYK Printing"></cfif>
    I want to then go back through description and compare it
    further to add more description. For instance, with this:
    <cfif #QStockDB.description# contains "12 pt"><cfset newitem = newitem + " - BC"></cfif>
    I know that the job is a business card, so I want to append
    " - BC" to the newitem variable. Likewise:
    <cfif #QStockDB.description# contains "catalog"><cfset newitem = newitem + " - Catalog Sheets"></cfif>
    displays CMYK Printing - Catalog Sheets. Or, it should... or,
    more precisely, I want it to. :)
    How do I append a string to another string?

    + is the addition operator and works with numbers. Because
    your string is a... well, string... you need to use an ampersand.
    Thus, instead of:
    <cfif #QStockDB.description# contains "catalog"><cfset newitem = newitem + " - Catalog Sheets"></cfif>
    Use...
    <cfif #QStockDB.description# contains "catalog"><cfset newitem = newitem & " - Catalog Sheets"></cfif>
    <cfoutput>#newitem#</cfoutput>
    At least I think that will work; I haven't tested it
    though.

  • To count number of occurances of a char in a very large string

    Hi All,
    I would like to count the number of occurrences of a char in a very large string.
    For example:
    char ch - 'c'
    string str - "practical example is always needed"
    c occurs 2 times in the above string.
    Thanks,
    J.Kathir

    > string str - "practical example is always needed"
    Try to finish this:
            String str = "practical example is always needed";
            char search = 'c';
            int occurrence = 0;
            for(int i = 0; i < str.length(); i++) {
                // Use a method from the String-class to get a
                // char from a specific location in the String.
                // See: http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html
            }
            System.out.println("Occurrence of char "+search+" in \""+str+"\" is: "+occurrence);
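
One way to finish the exercise, sketched with charAt (the String method the hint appears to point at). The class and method names are made up. A single pass like this is linear in the length of the string, which should be fine even for very large inputs.

```java
final class CharCounter {
    /** Counts how many times target occurs in text. */
    static int countOccurrences(CharSequence text, char target) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String str = "practical example is always needed";
        System.out.println(countOccurrences(str, 'c')); // 2
    }
}
```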

  • Converting from xsd:string into soapenc:string

    Hi.
    I was successful in invoking an Axis web service with complex type definitions.
    However, when building and deploying the process flow, I get a warning message as follows:
    " [bpelc] [Warning]: Trying to assign incompatible types
    [bpelc] [Description]: in line 53 of "C:\eclipse\workspace\ComplexTypeWSFlow\TimeSheetProcess.bpel", <from> value type "{http://www.w3.org/2001/XMLSchema}string" is not compatible with <to> value type "{http://schemas.xmlsoap.org/soap/encoding/}string".
    [bpelc] [Potential fix]: Please make sure that the return value of from-spec query is compatible with the to-spec query.
    [bpelc]
    [bpelc] [Warning]: Trying to assign incompatible types
    [bpelc] [Description]: in line 58 of "C:\eclipse\workspace\ComplexTypeWSFlow\TimeSheetProcess.bpel", <from> value type "{http://www.w3.org/2001/XMLSchema}string" is not compatible with <to> value type "{http://schemas.xmlsoap.org/soap/encoding/}string".
    [bpelc] [Potential fix]: Please make sure that the return value of from-spec query is compatible with the to-spec query."
    My question is:
    How can I convert between xsd:string and soapenc:string types when using the "assign" activity so I don't keep getting this warning message?
    Regards

    Hi Paulo,
    This warning has been fixed in recent builds. Even though soapenc:string is a different type, it extends xs:string with simple content, so the compiler shouldn't display warnings for this case.

  • Very large XML String parameters

    Hi !
    I'm using AXIS 1.x and WebSphere 5. The problem is: when I call the web service with an XML (String) parameter of about 10 KB to 400 KB, it works fine.
    But my application could generate very large XML, 900 KB to 1000 KB or even more. When this large XML is sent as a String parameter, no reply is received.
    Can somebody throw some light on what is going wrong and which approach should be followed?
    Thanks a lot
    @mit

    Maybe this example on the XDB forum will be helpful...
    XMLType view of Relational Content
    XML type questions are best asked in that forum.
    ;)
