  1. #1
    KAS
    KAS is offline Member
    Join Date
    Mar 2011
    Posts
    19
    Rep Power
    0

    Default Counting occurrences of a word in a file

    I can't get the code below to count all occurrences of the specified word in the file. It should display 10, but it only counts 8. Is there an obvious reason why it doesn't count properly?

    Java Code:
    import java.io.*;
    
    class lestal {
    
        public static void main(String[] args) throws Exception {
    
            FileReader fil = new FileReader("d:\\lab9_3.txt");
            BufferedReader br = new BufferedReader(fil);
            StreamTokenizer st = new StreamTokenizer(br);
    
            
            String SearchFor = "Java", word = "";
            int input, counter = 0;
            
            while ((input = st.nextToken()) != StreamTokenizer.TT_EOF) {
    
                if ((input == StreamTokenizer.TT_WORD)) {
                    word = st.sval;
                }
                if (word.compareTo(SearchFor) == 0) {
                    System.out.println(st.sval);
                    counter++;
                }
            }
            System.out.println("Fant ordet \"Java\" " + counter + " ganger.");
            fil.close();
        }
    }
    This is the text it is supposed to count from: http://pastebin.com/P7UdF8g7
    Last edited by KAS; 05-15-2011 at 05:40 PM.

  2. #2
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,332
    Rep Power
    25

    Default

    Try debugging the code by adding print statements to show what tokens are being read and what Strings are being compared. The output should show you where the problem is.
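
    A minimal sketch of that kind of debugging, assuming nothing beyond the standard StreamTokenizer API: print the type and value of every token. The class name TokenDump and the sample sentence are made up for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical debugging helper (not from the original program):
// describes every token that a default StreamTokenizer produces.
class TokenDump {

    static List<String> describeTokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_WORD) {
                    out.add("WORD   " + st.sval);
                } else if (type == StreamTokenizer.TT_NUMBER) {
                    out.add("NUMBER " + st.nval);
                } else if (st.sval != null) {
                    // Quoted string: the token type is the quote character itself.
                    out.add("QUOTED " + st.sval);
                } else {
                    // Any other single character, such as ';' or ','.
                    out.add("CHAR   " + (char) type);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample; pass the file's contents instead to inspect real input.
        for (String line : describeTokens("The Java language; a variable's type in Java")) {
            System.out.println(line);
        }
    }
}
```

    Any occurrence of the search word that ends up inside a QUOTED line, instead of on a WORD line of its own, is one the counter will miss.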

    What does the input file contain?
    Last edited by Norm; 05-15-2011 at 06:22 PM.

  3. #3
    KAS
    KAS is offline Member
    Join Date
    Mar 2011
    Posts
    19
    Rep Power
    0

    Default

    Quote Originally Posted by Norm View Post
    Try debugging the code by adding print statements to show what tokens are being read and what Strings are being compared. The output should show you where the problem is.

    What does the input file contain?
    The input file is identical to the text here: Primitive Data Types The Java programming language is statically-typed, which m - Pastebin.com

  4. #4
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,332
    Rep Power
    25

    Default

    How does the program work with other input?

  5. #5
    KAS
    KAS is offline Member
    Join Date
    Mar 2011
    Posts
    19
    Rep Power
    0

    Default

    If I just type "Java" 13 times in the same file (removing the rest), it counts all 13. So the problem seems to lie elsewhere.

  6. #6
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,457
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by KAS View Post
    There are more than 10 occurrences of the word "Java" in that file ...

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  7. #7
    dlorde is offline Senior Member
    Join Date
    Jun 2008
    Posts
    339
    Rep Power
    7

    Default

    Just a reminder about StringTokenizer - the API docs say "StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead."

    For some reason they haven't deprecated it yet, but the sentiment is similar.
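
    As a rough illustration of the java.util.regex route, here is one way to count whole-word matches. The class and method names are invented for this sketch, and using \b word boundaries as the definition of a "word" is an assumption; under it, "Java." and "Java's" both count, while "JavaScript" and "java" do not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class: counts case-sensitive whole-word matches.
class RegexWordCounter {

    static int countWord(String text, String word) {
        // Pattern.quote keeps any regex metacharacters in the word literal;
        // \b on both sides rejects matches inside longer words.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        Matcher m = p.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample text for illustration.
        String sample = "Java, java. JavaScript and Java's";
        System.out.println(countWord(sample, "Java"));
    }
}
```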

  8. #8
    KAS
    KAS is offline Member
    Join Date
    Mar 2011
    Posts
    19
    Rep Power
    0

    Default

    Quote Originally Posted by JosAH View Post
    There are more than 10 occurrences of the word "Java" in that file ...

    kind regards,

    Jos
    If you also count "java." then yes, there are, but there are 10 that consist of exactly "Java".

  9. #9
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,332
    Rep Power
    25

    Default

    So have you tried debugging it using print statements to see where the problem is?

  10. #10
    KAS
    KAS is offline Member
    Join Date
    Mar 2011
    Posts
    19
    Rep Power
    0

    Default

    Quote Originally Posted by Norm View Post
    So have you tried debugging it using print statements to see where the problem is?
    It seems to skip part of this text (marked in bold):
    Primitive Data Types
    The Java programming language is statically-typed, which means that all variables must first be declared before they can be used. This involves stating the variable's type and name, as you've already seen:

    int gear = 1;

    Doing so tells your program that a field named "gear" exists, holds numerical data, and has an initial value of "1". A variable's data type determines the values it may contain, plus the operations that may be performed on it. In addition to int, the Java programming language supports seven other primitive data types. A primitive type is predefined by the language and is named by a reserved keyword. Primitive values do not share state with other primitive values. The eight primitive data types supported by the Java programming language are:

    byte: The byte data type is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive). The byte data type can be useful for saving memory in large arrays, where the memory savings actually matters. They can also be used in place of int where their limits help to clarify your code; the fact that a variable's range is limited can serve as a form of documentation.
    If I remove the ' it works just fine. Suggestions?
    Last edited by KAS; 05-15-2011 at 07:54 PM.

  11. #11
    masijade is offline Senior Member
    Join Date
    Jun 2008
    Posts
    2,571
    Rep Power
    9

    Default

    If the file will always be small, simply read the entire file into a String and then call indexOf(String, int) in a loop, advancing the start index past each match.
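
    A minimal sketch of that loop, with made-up class and method names. Note that indexOf matches substrings, so this also counts "Java" inside a longer word such as "JavaScript".

```java
// Hypothetical example class: counts non-overlapping substring matches.
class IndexOfCounter {

    static int countOccurrences(String text, String target) {
        if (target.isEmpty()) {
            return 0; // an empty target would otherwise match at every index
        }
        int count = 0;
        int from = 0;
        int hit;
        while ((hit = text.indexOf(target, from)) != -1) {
            count++;
            from = hit + target.length(); // continue searching after this match
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample; for a real file the text would be loaded first,
        // for instance with java.nio.file.Files.readAllBytes.
        String sample = "Java Java's JavaScript java";
        System.out.println(countOccurrences(sample, "Java"));
    }
}
```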

  12. #12
    KAS
    KAS is offline Member
    Join Date
    Mar 2011
    Posts
    19
    Rep Power
    0

    Default

    I tried writing the code a bit differently:
    Java Code:
    import java.io.*;
    
    class OcccurenceCounter {
    
        public static void main(String[] args) throws Exception {
    
            FileReader fil = new FileReader("d:\\lab9_3.txt");
            BufferedReader br = new BufferedReader(fil);
            StreamTokenizer st = new StreamTokenizer(br);
    
            
            String SearchFor = "Java", word = "";
            int input, counter = 0;
            
            do{
                do{
                input = st.nextToken();            
                } while (input != StreamTokenizer.TT_WORD);
                word = st.sval;            
                if(word.matches(SearchFor)){
                    System.out.println(word);
                    counter++;                
                }
            } while (input != StreamTokenizer.TT_EOF);
            fil.close();
            
            System.out.println("Fant \"Java\" " + counter + " ganger.");
            
        }
    }
    It still seems to get stuck at the same spot.

  13. #13
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,332
    Rep Power
    25

    Default

    StreamTokenizer has several methods that control how it splits the input, such as ordinaryChar or quoteChar.

    Have you tried some of those?
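
    For example, ordinaryChar can switch off the special meaning of the apostrophe. The sketch below is an illustration with invented names (OrdinaryQuoteCounter, countWord) and a made-up sample sentence, not the program from this thread; it counts a word with and without that call.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class: shows the effect of ordinaryChar('\'').
class OrdinaryQuoteCounter {

    static int countWord(String text, String word, boolean quoteIsOrdinary) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        if (quoteIsOrdinary) {
            // The apostrophe becomes a plain single-character token
            // instead of opening a quoted string.
            st.ordinaryChar('\'');
        }
        int count = 0;
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD && word.equals(st.sval)) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "A variable's type and the Java language";
        System.out.println("default:      " + countWord(sample, "Java", false));
        System.out.println("ordinaryChar: " + countWord(sample, "Java", true));
    }
}
```

    With the default settings, the apostrophe in "variable's" opens a quoted string that runs to the end of the line, so the count for the sample is 0; with ordinaryChar('\'') it is 1.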

  14. #14
    KAS
    KAS is offline Member
    Join Date
    Mar 2011
    Posts
    19
    Rep Power
    0

    Default

    It seems the StreamTokenizer is bugged somehow, so I'm trying to solve this in another way.

  15. #15
    JosAH is offline Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,457
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by KAS View Post
    It seems to skip here(marked in bold)

    If I remove the ' it works just fine. Suggestions?
    Yep, by default a StreamTokenizer treats a ' as the start of a quoted string, which ends when the matching quote character is read or the end of the line is reached. StreamTokenizer is a very old class; the API documentation quoted in post #7 discourages the similar StringTokenizer in new code, and the same caution applies here.
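
    To make that concrete, here is a small sketch that collects the text a default StreamTokenizer swallows as quoted strings. The class name QuoteSwallow and the two-line sample are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: lists the bodies of all quoted-string tokens.
class QuoteSwallow {

    static List<String> swallowedText(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<String> bodies = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                // For a quoted string, ttype is the quote character
                // and sval holds everything up to the closing quote or line end.
                if (st.ttype == '\'' || st.ttype == '"') {
                    bodies.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bodies;
    }

    public static void main(String[] args) {
        String sample = "a variable's Java type\nplain Java here";
        System.out.println(swallowedText(sample));
    }
}
```

    For the sample, the only swallowed piece is "s Java type": it starts at the apostrophe and stops at the line break, so the "Java" on the first line never appears as a word token, while the one on the second line does.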
    cenosillicaphobia: the fear for an empty beer glass

  16. #16
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    SW Missouri
    Posts
    17,332
    Rep Power
    25

    Default

    If you add
    st.wordChars('\'', '\'');

    my version of the program prints out: Found "Java" 10 times.


    See post #13
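
    For reference, a sketch of how that call might fit into a counter. This is not Norm's actual program (which is not shown in the thread); the class name, the method, and the sample text are invented for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class: counts a word after making ' a word character.
class WordCharsCounter {

    static int countWord(String text, String word) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        // Treat the apostrophe as part of a word, so "variable's" is one token
        // and no longer opens a quoted string.
        st.wordChars('\'', '\'');
        int count = 0;
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                // Compare only when the current token really is a word.
                if (st.ttype == StreamTokenizer.TT_WORD && word.equals(st.sval)) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "A variable's type and the Java language supports Java";
        System.out.println("Found \"Java\" " + countWord(sample, "Java") + " times.");
    }
}
```

    Because the apostrophe is now part of a word, "variable's" comes back as a single token. The flip side is that a token such as "Java's" would no longer equal "Java".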
