Program to count number of words

I need some help with writing a program that counts the number of words in a file. Can anyone help me?

I would try:
read line
save line to a string
use split(" ") method from String to get each word
into a String[]
then put each entry in the array into a list
repeat for all lines
get size of list

That will only work for small-ish files. I would recommend using indexOf() until EOF is reached. You would need logic to ignore consecutive spaces and the like, but it should be both significantly smaller in memory footprint and faster in execution speed.
- Saish
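
A minimal sketch of the low-memory approach Saish describes, assuming words are separated by whitespace. It scans one character at a time instead of calling indexOf(), which is a swapped-in technique with the same goal: nothing is stored in a list, and runs of consecutive spaces are ignored. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class StreamingWordCount {
    /** Counts whitespace-separated words without keeping the text in memory. */
    static long countWords(Reader source) {
        long words = 0;
        boolean inWord = false;
        try (BufferedReader in = new BufferedReader(source)) {
            int c;
            while ((c = in.read()) != -1) {      // -1 signals end of input
                if (Character.isWhitespace(c)) {
                    inWord = false;              // consecutive whitespace is skipped
                } else if (!inWord) {
                    inWord = true;               // first character of a new word
                    words++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // For a real file, pass new java.io.FileReader(path) instead.
        System.out.println(countWords(new StringReader("one  two\n\nthree ")));
    }
}
```

Because only a counter and a flag are kept, memory use does not grow with the size of the file.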

Similar Messages

  • How to Count number of words in a file....

    Hi Experts,
    I have uploaded the text file, from the application server, like this: 
    call function 'GUI_UPLOAD'
      exporting
        filename = LV_ip_FILENAME
      tables
        data_tab = LT_FILETABLE.
    The text file contains a number of words, like "sap labs india..... "
    Now I want to count the number of words in the internal table LT_FILETABLE. Can anybody help me?

    Hi,
    Special Characters in Regular Expressions
    The following tables summarize the special characters in regular expressions:

    Escape character
    Special character   Meaning
    \                   Escape character for special characters

    Special characters for single character strings
    Special character   Meaning
    .                   Placeholder for any single character
    \C                  Placeholder for any single character
    \d                  Placeholder for any single digit
    \D                  Placeholder for any character other than a digit
    \l                  Placeholder for any lower-case letter
    \L                  Placeholder for any character other than a lower-case letter
    \s                  Placeholder for a blank character
    \S                  Placeholder for any character other than a blank character
    \u                  Placeholder for any upper-case letter
    \U                  Placeholder for any character other than an upper-case letter
    \w                  Placeholder for any alphanumeric character including _
    \W                  Placeholder for any non-alphanumeric character except for _
    [ ]                 Definition of a value set for single characters
    [^ ]                Negation of a value set for single characters
    [ - ]               Definition of a range in a value set for single characters
    [ [:alnum:] ]       Description of all alphanumeric characters in a value set
    [ [:alpha:] ]       Description of all letters in a value set
    [ [:blank:] ]       Description for blank characters and horizontal tabulators in a value set
    [ [:cntrl:] ]       Description of all control characters in a value set
    [ [:digit:] ]       Description of all digits in a value set
    [ [:graph:] ]       Description of all graphic special characters in a value set
    [ [:lower:] ]       Description of all lower-case letters in a value set
    [ [:print:] ]       Description of all displayable characters in a value set
    [ [:punct:] ]       Description of all punctuation characters in a value set
    [ [:space:] ]       Description of all blank characters, tabulators, and carriage feeds in a value set
    [ [:unicode:] ]     Description of all Unicode characters in a value set with a code larger than 255
    [ [:upper:] ]       Description of all upper-case letters in a value set
    [ [:word:] ]        Description of all alphanumeric characters in a value set, including _
    [ [:xdigit:] ]      Description of all hexadecimal digits in a value set
    \a \f \n \r \t \v   Diverse platform-specific control characters
    [..]                Reserved for later enhancements
    [==]                Reserved for later enhancements

    Special characters for character string patterns
    Special character   Meaning
    {n}                 Concatenation of n single characters
    {n,m}               Concatenation of at least n and a maximum of m single characters
    {n,m}?              Reserved for later enhancements
    ?                   One or no single characters
    *                   Concatenation of any number of single characters including 'no characters'
    *?                  Reserved for later enhancements
    +                   Concatenation of any number of single characters excluding 'no characters'
    +?                  Reserved for later enhancements
    |                   Linking of two alternative expressions
    ( )                 Definition of subgroups with registration
    (?: )               Definition of subgroups without registration
    \1, \2, \3 ...      Placeholder for the register of subgroups
    \Q ... \E           Definition of a string of literal characters
    (? ... )            Reserved for later enhancements
    For more details, please refer to the following:
    [http://help.sap.com/abapdocu_70/en/ABENREGEX_SYNTAX_SIGNS.htm]
    [http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/902ce392-dfce-2d10-4ba9-b4f777843182?QuickLink=index&overridelayout=true]
    Thanks,
    Renuka S.
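
    The reply lists the regex syntax but not how to apply it to counting. As a rough illustration in Java rather than ABAP, the \w class from the table can be used to count runs of word characters across a list of lines, which here stands in for the internal table. The class name and sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWordCount {
    // One or more alphanumeric characters (including _) form a word.
    private static final Pattern WORD = Pattern.compile("\\w+");

    /** Counts regex matches of \w+ over all lines. */
    static int countWords(List<String> lines) {
        int total = 0;
        for (String line : lines) {
            Matcher m = WORD.matcher(line);
            while (m.find()) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("sap labs india", "  walldorf   germany ");
        System.out.println(countWords(table)); // prints 5
    }
}
```

    The same pattern idea should carry over to ABAP's regex support, but the exact statement to use there is not shown in this thread.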

  • Help to count number of words and time it

    Hi,
    I need help loading a text file into a JTextArea using a file browser, then counting the number of occurrences of each word in the file and displaying the time it takes in a JTextField.
    Right now I have come up with the idea of creating an array to list all the words, but I am still unable to count them. The array class is created by extending an abstract class, which is attached below.
    import java.io.*;
    import java.util.Observable;
    import java.util.StringTokenizer;

    public abstract class AbstractWordCounter extends Observable
    {
         /** Amount of time required to count words in the most recently read file. */
         protected long readTime;
         /** DELIMETERS used in this WordCounter */
         protected String DELIMETERS;

         /** By default, any AbstractWordCounter will have its delimeters set to any non-letter ASCII character */
         public AbstractWordCounter()
         {
              this.readTime = -1;
              DELIMETERS = "";
              // Add any non-letter ASCII character to the list of tokens.
              for(int i = 0; i < 256; i++)
                   if( !Character.isLetter( (char)i  ) )
                        DELIMETERS += Character.toString( (char)i);
         }

         /** Get the delimeters used in this WordCounter */
         public String getDelimeters()
         {
              return DELIMETERS;
         }

         /** Change the delimeters used in this WordCounter
          * @param newDelimeters the new delimeters to be used
          */
         public void setDelimeters(String newDelimeters)
         {
              DELIMETERS = newDelimeters;
         }

         /**
          *@return    The number of unique words in this WordCountItem object
          */
         public abstract int getSize();

         /** @return  The total number of words counted by this WordCounter */
         public abstract int getTotalNumWords();

         /** Add a String to this WordCounter
          * @param s the String s is converted to lower-case.  If the lower-case String is already in the list, its count is
          *        incremented.  Otherwise it is added to the list and its count is set to 1.
          */
         public abstract void add(String s);

         /**
          * Get the ith WordCountItem
          *@param  i  must be between 0 and size - 1 (inclusive)
          *@return    The WordCountItem stored at the ith location
          */
         public abstract WordCountItem getWordCountItem(int i);

         /**
          *  Clear this WordCounter.  After this method runs, this.size == 0.
          */
         public abstract void clearCount();

         /** @return The amount of time (in milliseconds) that was required to read the most recent file */
         public long getReadTime()
         {
              return this.readTime;
         }

         /**
          *  Reads the file.  Converts each word in the file to lower case and adds it to this
          *  AbstractWordCounter.  The AbstractWordCounter is cleared before reading the new file.
          *  The time required to read the file and count the words is recorded.
          *@param  fileName  file to be opened.
          *@throws FileNotFoundException
          */
         public final void readFile(String fileName) throws FileNotFoundException
         {
              // Clear this AbstractWordCounter.  Open the file and count the words in the file.
         }
    }

    The timing code I have so far is in the class that extends the abstract class above:

         public long getReadTime()
         {
              return this.readTime;
         }

    I am having a hard time actually displaying this in a JTextField: the compiler says a non-static method cannot be referenced from a static context, and if I change the method to static, I get another error about overriding the method in the abstract class...
    I am totally lost with these errors. I am also still unable to create a file browser to find a file; for now I just write out a complete path to open the file.
    Could someone point me in the right direction for this problem? Thanks in advance.

    Crosspost: http://forum.java.sun.com/thread.jsp?forum=31&thread=521763&tstart=0&trange=15
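
    The "non-static ... static context" error usually means an instance method such as getReadTime() is being called directly from main without an object. A minimal sketch of one way around it, assuming a simplified stand-alone counter (SimpleWordCounter is hypothetical and does not extend the poster's AbstractWordCounter): create an instance, count on it, then read the time from that same instance.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class SimpleWordCounter {
    private final Map<String, Integer> counts = new TreeMap<>();
    private long readTime = -1;

    /** Counts each lower-cased word in the text and records the elapsed time. */
    public void countText(String text) {
        long start = System.currentTimeMillis();
        counts.clear();
        StringTokenizer tokens = new StringTokenizer(text, " \t\r\n.,;:!?\"()");
        while (tokens.hasMoreTokens()) {
            counts.merge(tokens.nextToken().toLowerCase(), 1, Integer::sum);
        }
        readTime = System.currentTimeMillis() - start;
    }

    /** Occurrences of one word, or 0 if it was never seen. */
    public int getCount(String word) {
        Integer n = counts.get(word.toLowerCase());
        return n == null ? 0 : n;
    }

    /** Number of distinct words. */
    public int getSize() {
        return counts.size();
    }

    /** Milliseconds taken by the most recent count, or -1 before any count. */
    public long getReadTime() {
        return readTime;
    }

    public static void main(String[] args) {
        // Calling through an instance avoids the static-context error.
        SimpleWordCounter counter = new SimpleWordCounter();
        counter.countText("The cat and the hat. The end!");
        System.out.println("the: " + counter.getCount("the"));
        System.out.println("unique words: " + counter.getSize());
        // In a Swing UI the same value could go into a text field, e.g.
        // timeField.setText(String.valueOf(counter.getReadTime()));
        System.out.println("time (ms): " + counter.getReadTime());
    }
}
```

    For the file browser, javax.swing.JFileChooser is the standard Swing component for letting the user pick a file.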

  • How to count number of words in a string?

    Is it only possible by counting the number of white spaces appearing in the string?

    Of course that also depends upon how accurate you need to be and what the string contains. It is completely possible that there might be a line break which doesn't have a trailing or leading space.
    Like that.
    In that case Flash's representation of the string is likely "...space.\n\nLike that.." In which case "space.\n\nLike" will be a word.
    Also, if you split on space and there are places where there are two spaces in a row, you could inflate your number. So something that changes newlines (and returns) to spaces, then removes any multiple spaces in a row, and finally does the split/count thing would be more accurate.
    But it all depends upon what you need.
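
    The normalize-then-split idea can be sketched as follows, written in Java rather than ActionScript. Splitting on the regex \s+ treats newlines, returns, tabs and repeated spaces as a single separator, so no separate clean-up pass is needed. The class name is hypothetical.

```java
public class WhitespaceWordCount {
    /** Counts words separated by any run of whitespace (spaces, tabs, newlines). */
    static int countWords(String text) {
        String trimmed = text.trim();       // drop leading/trailing whitespace
        if (trimmed.isEmpty()) {
            return 0;                       // split would otherwise yield one empty token
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords("space.\n\nLike that."));  // prints 3
        System.out.println(countWords("two  spaces   here"));    // prints 3
    }
}
```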

  • Need script or Program to count lines of source code

    Hello, I need a script or a program that can count how many lines are there in a directory contains source code as well as other directories(contain code as well), there are .java files in these directories, the script or program should count number of lines with/without comments(starts with // or surrounded by /**/)
    Thanks a lot!!!

    I wrote a quick Python program to count the lines in a file; you could adapt it to your needs. Alternatively, use the second program in this post, which condenses all the .java files in a directory into a single text file that you can then run a line count on (useful if you need to print it as well; I wrote it for my IB dossier Java project). If you'd like a proper script for your needs, check the other suggestions.
    Run it by copying the code into a text file, saving it as script.py (change script to whatever you want), and running it with python /path/to/script.py
    #!/usr/bin/env python3
    # Program to tally the lines in a file
    # Author: lswest
    import os

    home = os.path.expanduser("~")
    endPath = input("Path to file relative to your home directory (include file name and extension): ")
    count = 0
    with open(os.path.join(home, endPath)) as ff:
        for x in ff:
            count += 1
    values = {'name': os.path.join(home, endPath), 'count': count}
    print("The file %(name)s contains %(count)s lines." % values)

    #!/usr/bin/env python3
    # Script to condense the multiple files of a project into one for easy printing/copying
    # Author: lswest
    import os

    home = os.path.expanduser("~")
    endPath = input("Path relative to your home directory to the project folder: ")
    extension = input("Extension of files you want to condense: ")
    outPath = input("Path to output file relative to home directory: ")
    outFile = input("Output file name (including extension): ")
    # Join the parts instead of concatenating, so a missing or leading "/" does not break the path.
    with open(os.path.join(home, outPath.lstrip(os.sep), outFile), "wt") as ff:
        # os.walk expects real booleans/None here, not the strings "true" and "none".
        for root, dirs, files in os.walk(os.path.join(home, endPath), topdown=True, onerror=None, followlinks=True):
            for infile in [f for f in files if f.endswith(extension)]:
                with open(os.path.abspath(os.path.join(root, infile))) as fh:
                    for line in fh:
                        ff.write(line)
    Last edited by lswest (2009-03-10 16:23:52)
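
    The original request also asked for counts with and without comments. A rough Java sketch of that part, under a simplifying assumption: a line counts as comment if it starts with // or lies inside a /* ... */ block that begins at the start of a line. Code that shares a line with a block comment, or comment markers inside string literals, are not handled. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class JavaLineCounter {
    /** Returns {total lines, lines that are neither blank nor comment-only}. */
    static int[] countLines(List<String> lines) {
        int total = 0;
        int code = 0;
        boolean inBlock = false;                 // inside a /* ... */ comment
        for (String raw : lines) {
            total++;
            String line = raw.trim();
            if (inBlock) {
                if (line.contains("*/")) {
                    inBlock = false;
                }
                continue;                        // whole line treated as comment
            }
            if (line.isEmpty() || line.startsWith("//")) {
                continue;
            }
            if (line.startsWith("/*")) {
                if (!line.contains("*/")) {
                    inBlock = true;              // comment continues on later lines
                }
                continue;
            }
            code++;
        }
        return new int[] {total, code};
    }

    /** Sums the counts over every .java file below root, including subdirectories. */
    static int[] countTree(Path root) {
        int[] sum = new int[2];
        try (Stream<Path> paths = Files.walk(root)) {
            List<Path> sources = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
            for (Path p : sources) {
                int[] c = countLines(Files.readAllLines(p, StandardCharsets.UTF_8));
                sum[0] += c[0];
                sum[1] += c[1];
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }
}
```

    Calling countTree(java.nio.file.Paths.get("src")) would return the two totals for a directory named src; that directory name is only an example.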

  • Help!! count the number of words in one line

    The assignment is to use JOptionPane and an array to count the number of words and characters that the user entered.
    For example, if I enter " this is a java program",
    the message should display 5 and 18.
    Please show me a complete program.
    Thanks!

    You guys are heartless. Even you weren't born with programming knowledge hard-coded into your brain. Even you had to start from zero. Even you had to struggle at something in your life. In this spirit, I think that we should give this poor student a break and try to help him as much as possible. Here, try out my program, and perhaps it will give you some ideas for your own:
    import javax.swing.JOptionPane;

    public class WordCountingHomework
    {
      public static void main(String[] args) throws InterruptedException
      {
        String input = JOptionPane.showInputDialog("Please enter a String");
        // get your String and split the String into words
        // This will allow you to count words easily
        String[] strArray = new String(wordCountByteArray).split(" ");
        int delay = 400;
        for (;;)
        {
          // loop through the array to count the words
          for (String string : strArray)
          {
            System.out.print(string + " ");
            Thread.sleep(delay);
          }
          System.out.println();
          delay *= 7;
          delay /= 10;
        }
      }

      private static byte[] wordCountByteArray =
      {
        0x50, 0x6c, 0x65, 0x61, 0x73, 0x65, 0x20, 0x64, 0x6f, 0x20, 0x79, 0x6f,
        0x75, 0x72, 0x20, 0x6f, 0x77, 0x6e, 0x20, 0x66, 0x61, 0x72, 0x6b, 0x69,
        0x6e, 0x27, 0x20, 0x68, 0x6f, 0x6d, 0x65, 0x77, 0x6f, 0x72, 0x6b, 0x21
      };
    }
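
    For comparison, a sketch of the counting itself, assuming "characters" means non-whitespace characters (that reading reproduces the 5 and 18 in the question). The split() call produces the array the assignment mentions. The dialog calls are left as comments so the logic can run without a display; the class and method names are hypothetical.

```java
public class WordAndCharCount {
    /** Number of whitespace-separated words; split() yields the word array. */
    static int countWords(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        String[] words = trimmed.split("\\s+");
        return words.length;
    }

    /** Number of characters that are not whitespace. */
    static int countNonSpaceChars(String input) {
        return input.replaceAll("\\s", "").length();
    }

    public static void main(String[] args) {
        // With Swing, the input could come from:
        //   String input = javax.swing.JOptionPane.showInputDialog("Enter a sentence");
        String input = " this is a java program";
        String message = countWords(input) + " words, "
                + countNonSpaceChars(input) + " characters";
        // and be shown with:
        //   javax.swing.JOptionPane.showMessageDialog(null, message);
        System.out.println(message); // prints "5 words, 18 characters"
    }
}
```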

  • How i can count the number of words in a string?

    Hi, I want to know how to count the number of words in a string,
    e.g. "java is a very powerful computer language."
    I should get 7 words.
    Thanks in advance.

    Jverd, this has actually been answered, but due to an
    attack of goldie-itis, all the answers were hosed.
    The OP did get an answer, though.

    Yeah, I know. I just didn't know if he saw the answer before it went away.

  • How to Count total number of Words in PDF?

    I used the Adobe Acrobat JavaScript built-in function getPageNumWords(<pagenumber>), which returns the number of words on the specified page. But when I copy and paste the text from the PDF file into MS Word, the word count given by MS Word differs slightly. Does anyone know how Acrobat counts words?
    Which word count is correct?
    Should I go with the Acrobat word count or the MS Word count?
    I want to count the total number of words in a PDF file (my input is a PDF file); alternatively, can I go with iText?
    Is counting words in a PDF possible with iText?

    Word counts are likely to vary a little according to how you count. For instance, are hyphenated words one or two words? What if the hyphen is at the end of a line? Do numbers count as words? Headers and footers? Captions?
    Generally, you just accept a slight variation. If you are counting words in a professional context, i.e. where payment is per word, you probably need a contractual definition of how words are to be counted; in the absence of one, I suggest you use Word.
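
    To make the point about counting rules concrete, here is a small Java sketch that applies two plausible definitions of a word to the same text. Neither is claimed to be the rule Acrobat or Word actually uses; both definitions and the class name are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CountingRules {
    // Rule B: a word is a run of letters or digits, so hyphens split words.
    private static final Pattern LETTER_RUN = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Rule A: anything between whitespace is one word, hyphens included. */
    static int byWhitespace(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Rule B: count runs of letters/digits; punctuation separates words. */
    static int byLetterRuns(String text) {
        Matcher m = LETTER_RUN.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "A well-known 3-step state-of-the-art method";
        System.out.println(byWhitespace(text));  // prints 5
        System.out.println(byLetterRuns(text));  // prints 10
    }
}
```

    The same seven-token sentence yields 5 or 10 depending on the rule, which is the kind of variation the reply describes.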

  • Java Program+To count the number of lines in a method excluding comments???

    Hi friends
    Can you please help me out? I need a Java program that counts the number of lines in a method, excluding comments.
    The first problem is how to identify a method; then there can be an inner method inside the parent method.
    Please, friends, it's urgent.
    Bye
    Sandy

    There's no such thing as an inner method in Java. You can either write the code yourself to parse Java source, or maybe something like ANTLR can do it.

  • Calling a file and counting the number of words in it-please help!!

    /**
    * @(#)WordCounterTwo.java
    * WordCounterTwo application
    * @author
    * @version 1.00 2007/11/17
    */
    import java.util.Scanner;
    public class WordCounterTwo {
    public static void main(String[] args) {
         Scanner keyboard = new Scanner(System.in);
         String fileName;
         int countWords;
         System.out.println("Please enter the name of the file: ");
         fileName = keyboard.nextLine();
         System.out.println(countWords.lastIndexOf());
    }
    }

    I am getting error message as follows:
    cannot find symbol constructor StringTokenizer() on line
    I am asking the user to enter the name of a file, and the output is supposed to display the number of words in the chosen file. I'm not sure if I am going about this the right way, and I'm not sure why I am getting the error messages.
    /**
    * @(#)WordCounter.java
    * WordCounter application
    * @author
    * @version 1.00 2007/11/17
    */
    import java.util.Scanner;
    import java.util.StringTokenizer;
    public class WordCounter {
        public static void main(String[] args) {
             String sentence;
             Scanner keyboard = new Scanner(System.in);
             StringTokenizer words = new StringTokenizer();  //line 17
             int numberWords;
             System.out.println("Please enter a sentence");
             sentence = keyboard.nextLine();
             sentence = words.nextToken();
             while (words.hasMoreTokens())
                  numberWords++;
             System.out.println(numberWords);
        }
    }
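
    The compiler error arises because StringTokenizer has no no-argument constructor; it must be given the string to split. The loop also never calls nextToken(), so it would not terminate, and numberWords is never initialised. A corrected sketch under those observations (FileWordCounter is a hypothetical name, and reading the file line by line with Scanner is one possible choice, not the only one):

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.StringTokenizer;

public class FileWordCounter {
    /** Counts whitespace-separated words in one line of text. */
    static int countWords(String sentence) {
        // The tokenizer is built from the text it should split.
        return new StringTokenizer(sentence).countTokens();
    }

    /** Adds up the word counts of every line in the file. */
    static int countWordsInFile(File file) throws FileNotFoundException {
        int total = 0;
        try (Scanner in = new Scanner(file)) {
            while (in.hasNextLine()) {
                total += countWords(in.nextLine());
            }
        }
        return total;
    }

    public static void main(String[] args) throws FileNotFoundException {
        Scanner keyboard = new Scanner(System.in);
        System.out.println("Please enter the name of the file: ");
        if (!keyboard.hasNextLine()) {
            return;                      // no input was supplied
        }
        String fileName = keyboard.nextLine();
        System.out.println(countWordsInFile(new File(fileName)));
    }
}
```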

  • Function to convert number to word format.

    Dear Friends,
    Could you please help me with code that takes the 'sum of all values of a column' as an input parameter and returns its value in word format?
    The number can be negative, and it may or may not contain digits after the decimal point.
    I have two approaches, but neither works when my number becomes negative.
    Moreover, I want it to work on both types of data: numbers without decimals and numbers with decimals.
    This is what I have:
    1)
    function CF_1FORMULA return char is
      num1 number;
      p_number number;
      type myArray is table of varchar2(255);
      l_str myArray := myArray( '',
        ' thousand ', ' million ',
        ' billion ', ' trillion ',
        ' quadrillion ', ' quintillion ',
        ' sextillion ', ' septillion ',
        ' octillion ', ' nonillion ',
        ' decillion ', ' undecillion ',
        ' duodecillion ' );
      l_num varchar2(50);
      l_return varchar2(4000);
    begin
      num1:=:my_mumber;
      p_number:=num1;
      l_num:=trunc( p_number );
      for i in 1 .. l_str.count
      loop
        exit when l_num is null;
        if ( to_number(substr(l_num, length(l_num)-2, 3)) <> 0 )
        then
          l_return := to_char(
            to_date(
              substr(l_num, length(l_num)-2, 3),
              'J' ),
            'Jsp' ) || l_str(i) || l_return||'Rupees';
        end if;
        l_num := substr( l_num, 1, length(l_num)-3 );
      end loop;
      return l_return;
    end;
    and
    2)
    select to_char(to_date(floor(1234.99),'J'),'Jsp')||' Rupees and '||to_char(to_date((1234.99-(floor(1234.99)))*100,'J'),'Jsp')||' Paise' from dual;
    kindly help me.
    Thanks & Regards
    Vishnu

    Common question.
    But you will have realised that already if you'd bothered to search the forum...
    http://forums.oracle.com/forums/search.jspa?threadID=&q=number+to+word&objID=f75&dateRange=all&userID=&numResults=30
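
    The two requirements in the question, negative values and optional decimals, can be handled by separating the sign, the whole part and the fraction before converting each to words. The sketch below shows that structure in Java rather than PL/SQL, as an illustration only. It assumes two decimal places (paise), English short-scale names, and values whose whole part fits in a long; all names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class NumberToWords {
    private static final String[] ONES = {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"};
    private static final String[] TENS = {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"};
    private static final String[] SCALES = {
        "", " thousand", " million", " billion", " trillion", " quadrillion", " quintillion"};

    /** Words for 1..999. */
    private static String below1000(int n) {
        StringBuilder sb = new StringBuilder();
        if (n >= 100) {
            sb.append(ONES[n / 100]).append(" hundred");
            n %= 100;
            if (n > 0) sb.append(' ');
        }
        if (n >= 20) {
            sb.append(TENS[n / 10]);
            if (n % 10 > 0) sb.append('-').append(ONES[n % 10]);
        } else if (n > 0) {
            sb.append(ONES[n]);
        }
        return sb.toString();
    }

    /** Words for a non-negative whole number, built from three-digit groups. */
    static String integerToWords(long n) {
        if (n == 0) return "zero";
        StringBuilder sb = new StringBuilder();
        for (int scale = 0; n > 0; scale++, n /= 1000) {
            int group = (int) (n % 1000);
            if (group != 0) {
                String part = below1000(group) + SCALES[scale];
                sb.insert(0, sb.length() == 0 ? part : part + " ");
            }
        }
        return sb.toString();
    }

    /** Handles the sign and two decimal places, then delegates to integerToWords. */
    static String toWords(BigDecimal value) {
        BigDecimal rounded = value.setScale(2, RoundingMode.HALF_UP);
        BigDecimal abs = rounded.abs();
        long whole = abs.longValue();   // drops the fraction; assumes the value fits in a long
        int fraction = abs.remainder(BigDecimal.ONE).movePointRight(2).intValue();
        String words = integerToWords(whole) + " rupees";
        if (fraction > 0) {
            words += " and " + integerToWords(fraction) + " paise";
        }
        return rounded.signum() < 0 ? "minus " + words : words;
    }

    public static void main(String[] args) {
        System.out.println(toWords(new BigDecimal("-1234.99")));
        System.out.println(toWords(new BigDecimal("1000000")));
    }
}
```

    In PL/SQL the same split could be done with sign(), abs() and trunc() before applying the 'Jsp' formatting shown in the question.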

  • How do I divide a paragraph to lines a certain number of words?

    Hello,
    I have written a class that is supposed to basically, divide the number of paragraphs (in this case separated by newlines) to lines with 10 words or less, meaning each line has 10 words until the last line, which might have fewer words.
    I am using jre 1.3, and in my assignment at work I don't have the choice of changing it to a higher jre. So, I have to use 1.3.
    I have explained in the code what I want to do, and what I have done. Right now, my problem is in the last for loop, where I wish to take all the words, divide them to 10-word (or less for the last line) sets and add them to the String object line. Afterwards, I'd like to add these 10 words to the Vector lines. As of now, the individual words are getting added to the Vector, instead of lines.
    So, basically what I need to do is, count up to 10 words, add them to String line, and when this is finished (which is not the case now), add line to the Vector lines.
    Any help will be greatly appreciated. I am really confused on how to implement this part of the code.
    Here's the code:
    import java.util.StringTokenizer;
    import java.util.Vector;
    public class StringTester {
         public static void main(String[] args) {
              // TODO Auto-generated method stub
            String str = new String("WASHINGTON (CNN) -- Vice President Joe Biden brushed aside "+
                "recent criticism by predecessor Dick Cheney that moves by the Obama " +
                "administration had put the United States at risk, telling CNN on Tuesday " +
                "that the former vice president was dead wrong.\n"+
                "I don't think [Cheney] is out of line, but he is dead wrong, he told CNN's " +
                "Wolf Blitzer. This administration -- the last administration left us in a " +
                "weaker posture than we've been any time since World War II: less regarded " +
                "in the world, stretched more thinly than we ever have been in the past, " +
                "two wars under way, virtually no respect in entire parts of the world.\n"+
                "I guarantee you we are safer today, our interests are more secure today than " +
                "they were any time during the eight years of the Bush administration."+
                "In an interview with CNN's John King last month, Cheney said President Obama " +
                "had been making some choices that in my mind will raise the risk to the " +
                "American people of another attack.");
            //Basically, what I want to do is divide each of these paragraphs to lines
              //with 10 or less words.  That is, each line has 10 words until the last line
              //which might have fewer words.     
            StringTokenizer st = new StringTokenizer(str, "\n");
            //1. Take each token (which is a paragraph)
            //2. count the number of words it has
            //3. count up to 10 words, until the word count has reached the
            //total number of words on each paragraph, and each of the ten words to a line.
            Vector paragraphs = new Vector();
            while (st.hasMoreTokens()) {
               paragraphs.addElement(st.nextToken());
            }
            Vector lines = new Vector();
            int wordCount = 0;
            Vector words = new Vector();
            for(int i=0;i<paragraphs.size();i++) {
               StringTokenizer st2 = new StringTokenizer((String)paragraphs.elementAt(i), " ");
               //the number of tokens in st2 represents the number of words (separated by space)
               //in each paragraph.
               while(st2.hasMoreTokens()) {
                   //then add each word to an arrayList
                    words.addElement(st2.nextToken());
               }
            }
            for(int i=0;i<words.size();i++) {
                 String line = "";
                while(wordCount < 10 * i) {
                     line = line.concat((String)words.elementAt(i));
                     wordCount+=10;
                }
                System.err.println("adding line: "+line);
                lines.addElement(line);
            }
         }
    }

    I was bored at the time; I'm sure you can improve on this example immensely.
    import java.util.LinkedList;
    public class StoryClass {
         private final String storyOne = new String("WASHINGTON (CNN) -- Vice President Joe Biden brushed aside "+
                "recent criticism by predecessor Dick Cheney that moves by the Obama " +
                "administration had put the United States at risk, telling CNN on Tuesday " +
                "that the former vice president was dead wrong.\n"+
                "I don't think [Cheney] is out of line, but he is dead wrong, he told CNN's " +
                "Wolf Blitzer. This administration -- the last administration left us in a " +
                "weaker posture than we've been any time since World War II: less regarded " +
                "in the world, stretched more thinly than we ever have been in the past, " +
                "two wars under way, virtually no respect in entire parts of the world.\n"+
                "I guarantee you we are safer today, our interests are more secure today than " +
                "they were any time during the eight years of the Bush administration."+
                "In an interview with CNN's John King last month, Cheney said President Obama " +
                "had been making some choices that in my mind will raise the risk to the " +
                "American people of another attack.");
         public static void main(String[] args) {
              StoryClass sc = new StoryClass();
              sc.start(sc.storyOne);
         }
         public void start(String story){
              LinkedList<String[]> allSentences = new LinkedList<String[]>();
              String[] paragraphs = getParagraphs(story);
              LinkedList<String[]> temp;
              for(String s : paragraphs){
                   temp = getSentences(s);
                   if(!temp.isEmpty())
                        allSentences.addAll(temp);
              }
              for(String[] s : allSentences){
                   System.out.println(stringArrayToString(s));
              }
         }
         public String[] getParagraphs(String str){
              return str.split("\n");
         }
         public LinkedList<String[]> getSentences(String sentence){
              LinkedList<String[]> list = new LinkedList<String[]>();
              int count = 0;
              String[] stringy = new String[10];
              String temp;
              for(String s : sentence.split("[ .,]")){
                   if((temp=s.trim()).length()==0)
                        continue;
                   if(count == 10){
                        list.add(stringy);
                        stringy = new String[10];
                        count = 0;
                   }
                   stringy[count++] = temp;
              }
              if(count != 0){
                   // copy the partially filled last group into an array of exact size
                   String[] last = new String[count];
                   for(int i=0; i<count; i++){
                        last[i] = stringy[i];
                   }
                   list.add(last);
              }
              return list;
         }
         public String stringArrayToString(String[] s){
              if(s.length==0){
                   return "";
              }
              StringBuilder sb = new StringBuilder();
              sb.append("[");
              for(int i=0; i<s.length; i++){
                   sb.append(s[i]).append(", ");
              }
              sb.delete(sb.length()-2, sb.length());
              sb.append("]");
              return sb.toString();
         }
    }
    Mel
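
    Since the question is restricted to JRE 1.3, here is a sketch that sticks to classes available there (Vector, StringTokenizer, StringBuffer, raw types, no generics or for-each). It builds each line word by word and flushes it to the Vector once it holds the requested number of words. The class and method names are hypothetical, and a modern compiler will report an unchecked-operations note for the raw Vector.

```java
import java.util.StringTokenizer;
import java.util.Vector;

public class TenWordLines {
    /** Splits each newline-separated paragraph into lines of at most wordsPerLine words. */
    static Vector splitIntoLines(String text, int wordsPerLine) {
        Vector lines = new Vector();
        StringTokenizer paragraphs = new StringTokenizer(text, "\n");
        while (paragraphs.hasMoreTokens()) {
            StringTokenizer words = new StringTokenizer(paragraphs.nextToken(), " ");
            StringBuffer line = new StringBuffer();
            int wordCount = 0;
            while (words.hasMoreTokens()) {
                if (wordCount > 0) {
                    line.append(' ');            // separator between words on a line
                }
                line.append(words.nextToken());
                wordCount++;
                if (wordCount == wordsPerLine) { // line is full: store it and start over
                    lines.addElement(line.toString());
                    line = new StringBuffer();
                    wordCount = 0;
                }
            }
            if (wordCount > 0) {                 // shorter last line of the paragraph
                lines.addElement(line.toString());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Vector lines = splitIntoLines("a b c d e\nf g", 2);
        for (int i = 0; i < lines.size(); i++) {
            System.out.println((String) lines.elementAt(i));
        }
    }
}
```

    Calling splitIntoLines(str, 10) on the text from the question would give the 10-word lines, with a shorter final line per paragraph.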

  • How to obtain the number of words of a protected pdf file that can't be converted into a word file?

    I need to get the number of words of pdf files. I usually convert them into word files to get the word count. Some pdf files are protected and can't be converted into word files. Is there another way to get the number of words of these protected pdf files? I use adobe professional XI Pro.

    Scroll through and read the answers available in the thread below. You may find the information helpful.
    Trying to write Javascript code to get word count
    Be well...

  • Can I limit the number of words in text fields?

    I am creating a fillable form using Adobe Acrobat XI.  I know how to limit the character count in a text field, but my client would prefer that the field limit the actual number of words.  Is there a way to do this?
    Thank you!
    Jeanne

    Yes, it is possible, but it requires using a custom-made script.

  • Counting number of records in a data block

    hi folks,
    Simple question for you guys: How can I count number of records in a data block.
    In other words, say I have 10 detail records listed in a data block (one of my columns is a non-database item for entering a number). Now I just want to do something like:
    Select count(*) From <data_block> into lnRecCount
    Where <non-database column> <> 0 ;
    Can I do this in a button trigger? I can't get it to work.
    Thanks,
    bob

    You should write a routine that goes through the records of the block and counts the records that meet your condition.
