Problem with an exercise in Java
Hello, I have a Java exercise in which we need to find the frequency of words across documents.
For example, we have 3 documents: a1.txt, a2.txt, a3.txt.
a1 = {aaa, aaa, bbb, kkk, ccc, hhh}
a2 = {kkk, aaa, hhh, ddd}
a3 = {mmm, ccc, ccc, hhh}
So the df (document frequency, the number of documents a word appears in) for the distinct words is:
aaa = 2
bbb = 1
ccc = 2
ddd = 1
hhh = 3
mmm = 1
kkk = 2
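For reference, the counts above can be reproduced with a short sketch that counts each word at most once per document. This is only an illustration under assumptions: the class name DfExample, the method documentFrequency, and the in-memory lists standing in for a1.txt, a2.txt and a3.txt are made up for the example and are not part of the exercise code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DfExample {

    // Hypothetical helper: for each distinct word, counts how many documents contain it.
    static Map<String, Integer> documentFrequency(List<List<String>> documents) {
        Map<String, Integer> df = new TreeMap<>();
        for (List<String> document : documents) {
            // The HashSet drops repeats, so a word counts only once per document.
            for (String word : new HashSet<>(document)) {
                df.merge(word, 1, Integer::sum);
            }
        }
        return df;
    }

    public static void main(String[] args) {
        // In-memory stand-ins for a1.txt, a2.txt and a3.txt (assumption).
        List<List<String>> documents = Arrays.asList(
                Arrays.asList("aaa", "aaa", "bbb", "kkk", "ccc", "hhh"),
                Arrays.asList("kkk", "aaa", "hhh", "ddd"),
                Arrays.asList("mmm", "ccc", "ccc", "hhh"));
        // Prints {aaa=2, bbb=1, ccc=2, ddd=1, hhh=3, kkk=2, mmm=1}
        System.out.println(documentFrequency(documents));
    }
}
```

The set per document is what separates document frequency from a plain word count: "aaa" occurs twice in a1 but still adds only 1 for that document.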
I have an ArrayList of the class Word below, in which I store various information about the distinct words:
Code:
....
private ArrayList<Word> words = new ArrayList<Word>();
public static class Word {
    public int lineNum;
    public int index;
    private int df; // document frequency
    public String name;
    public int wordcount;
    private String filename;

    public Word(int lineNum, int index, String name, int wordcount, String filename, int documentfrequency) {
        this.lineNum = lineNum;
        this.index = index;
        this.name = name;
        this.wordcount = wordcount;
        this.filename = filename;
        this.df = documentfrequency;
    }
    public int getLineNum() {
        return lineNum;
    }
    public int getIndex() {
        return index;
    }
    public String getName() {
        return name;
    }
    // comparator to sort words by line number
    public static class CompLineNum implements Comparator<Word> {
        @Override
        public int compare(Word arg0, Word arg1) {
            return arg0.lineNum - arg1.lineNum;
        }
    }
    // comparator to sort words by name; descending order if desc is true
    public static class CompName implements Comparator<Word> {
        private int mod = 1;
        public CompName(boolean desc) {
            if (desc) mod = -1;
        }
        @Override
        public int compare(Word arg0, Word arg1) {
            return mod * arg0.name.compareTo(arg1.name);
        }
    }
}
....
So, to find the df I read the documents again, and I have the method below to find duplicate words in the documents:
Code:
@SuppressWarnings("empty-statement")
public void findDfOfWord() throws FileNotFoundException, UnsupportedEncodingException, IOException {
    String wordtemp = null;
    String filetemp = null;
    String[] word = null;
    int duplicatewordcount = 0;
    Map<String, Integer> unique = new LinkedHashMap<String, Integer>();
    //scan all files that we choose for the word
    for (int k = 0; k < filelist.length; k++) {
        filename = listOfFiles[filelist[k]].getName();
        FileInputStream fstream = new FileInputStream(folderpath + "/" + filename);
        DataInputStream in = new DataInputStream(fstream);
        BufferedReader input = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        String wordd = input.readLine();
        for (int j = 0; j < wordswithoutduplicates.size(); j++) {
            wordtemp = wordswithoutduplicates.get(j).name;
            filetemp = wordswithoutduplicates.get(j).filename;
            while (wordd != null) {
                word = wordd.split("\\s+");
                System.out.println("dddd:::" + wordtemp.compareTo(word.toString()));
                if (wordtemp.compareTo(word.toString()) == 0 && filename.compareTo(wordtemp) == 0
                        && duplicatewordcount < 1) {
                    if (unique.get(wordtemp) == null)
                        unique.put(wordtemp, 1);
                    else
                        unique.put(wordtemp, unique.get(wordtemp) + 1);
                    String uniqueString = join(unique.keySet(), ", ");
                    List<Integer> value = new ArrayList<Integer>(unique.values());
                    System.out.println("Output = " + uniqueString);
                    System.out.println("Values = " + value);
                    duplicatewordcount++;
                }
                wordd = input.readLine();
            }
        }
    }
}
But the if condition in the method above (the one that compares wordtemp with word.toString()) never matches, so it doesn't find the duplicates.
What is the error?
Re: Problem with an exercise in Java
Use some debugging.
Put println() statements in there so you can see what it's comparing, because I don't think this:
Code:
wordtemp.compareTo(word.toString())
is doing what you think it is.
What do you think the output of 'word.toString()' is going to be?
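To make the hint concrete: word is a String[], and Java arrays do not override Object.toString(), so the call returns a type name plus a hash code rather than the text of the line. Below is a minimal sketch of what that looks like, and of one possible way to compare the individual tokens instead. The class name ArrayToStringDemo and the helper containsWord are made up for this illustration, and the sketch assumes the goal is simply to test whether a line contains a given word.

```java
import java.util.Arrays;

public class ArrayToStringDemo {

    // Hypothetical helper: true if any whitespace-separated token of the line equals target.
    static boolean containsWord(String line, String target) {
        for (String token : line.trim().split("\\s+")) {
            if (token.equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] word = "aaa aaa bbb".split("\\s+");

        // Prints something like [Ljava.lang.String;@1b6d3586 (the hash part varies per run)
        System.out.println(word.toString());

        // Prints [aaa, aaa, bbb]
        System.out.println(Arrays.toString(word));

        // Comparing a word against the array's toString() can never give 0: prints false
        System.out.println("aaa".compareTo(word.toString()) == 0);

        // Comparing against each token does work: prints true
        System.out.println(containsWord("aaa aaa bbb", "aaa"));
    }
}
```

So the condition in the original method compares each stored word against a string starting with "[Ljava.lang.String;", which is why it is never true. Looping over the elements of word, or using a helper along these lines, is one way around it.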