Results 1 to 4 of 4
Thread: Spell checker
- 03-15-2012, 10:09 PM #1
Member
- Join Date
- Mar 2012
- Posts
- 2
- Rep Power
- 0
Spell checker
Hello guys, i'm new to the forum so i hope i am posting this in the right section , if not i apologize in advance.I need to implement a spell checker in java , let me give you an example for a string lets say "sch aproblm iseasili solved" my output is "such a problem is easily solved".The maximum length of the string to correct is 64.As you can see my string can have spaces inserted in the wrong places or not at all and even misspelled words.I need a little help in finding a efficient algorithm of coming up with the corrected string. I am currently trying to delete all spaces in my string and inserting spaces in every possible position , so lets say for the word (it apply to a sentence as well) "hot" i generate the next possible strings to afterwords be corrected word by word using levenshtein distance : h o t ; h ot; ho t; hot. As you can see i have generated 2^(string.length() -1) possible strings. So for a string with a length of 64 it will generate 2^63 possible strings, which is damn high, and afterwords i need to process them one by one and select the best one by a different set of parameters such as : - total editing distance (must take the smallest one)
-if i have more strings with same editing distance i have to choose the one with the fewer number of words
-if i have more strings with the same number of words i need to choose the one with the total maximum frequency the words have( i have a dictionary of the most frequent 8000 words along with their frequency )
-and finally if there are more strings with the same total frequency i have to take the smallest lexicographic one.
So basically i generate all possible strings (inserting spaces in all possible positions into the original string) and then one by one i calculate their total editing distance, nr of words ,etc. and then choose the best one, and output the corrected string. I want to know if there is a easier(in terms of efficiency) way of doing this , like not having to generate all possible combinations of strings etc.
- 03-18-2012, 04:31 AM #2
Member
- Join Date
- Feb 2012
- Posts
- 35
- Rep Power
- 0
Re: Spell checker
Well, if you want a spell checker that correct's something like "cukumber" to "cucumber" then you might need a source of ALL the english words.
- 03-18-2012, 01:16 PM #3
Member
- Join Date
- Mar 2012
- Posts
- 2
- Rep Power
- 0
Re: Spell checker
Well i have a dictionary ofcourse, i should state my current progress. I manage to correct a sentence of length ~10-12 characters. I would like a little bit of help in implementing a dynamic programming algorithm that splits my string into lets say n strings of length 10 and corrects them separately and then combines the solutions. A small problem will be the splitting because i have to somehow check if i split in the middle of a word . Let's say for "schaproblemcanbeeasilisolved" i will split it into something like "schaproblmcan" "beeasilisolved" , and i have to be careful to not split it like "schaproblmc" "anbeeasilisolved" (in the middle of can)
- 03-19-2012, 05:22 PM #4
Re: Spell checker
Hi,
This is an intresting project.
I am afriad that it is also quite a complex one, I would therfore suggest contacting someone who has worked on a similar project, such as openoffice.
see if you can contact someone who worked on the team and see how they conduct their tests on strings, Even if it is in another programming language, it will still be the same method.
I wish you luck in the project, and Hope to see some progress posts.
Similar Threads
-
Spell Checker using hash tables not working need help.
By chris1 in forum Advanced JavaReplies: 15Last Post: 12-11-2011, 09:20 PM -
Spell Checker in Swing
By dhaivat in forum New To JavaReplies: 2Last Post: 07-01-2010, 05:08 PM -
How to Spell out numbers?
By syntrax in forum New To JavaReplies: 5Last Post: 10-22-2009, 03:34 PM -
Spell check in Textarea.
By surya_dks in forum Advanced JavaReplies: 2Last Post: 08-22-2008, 12:33 AM -
spell check
By kazitula in forum Java AppletsReplies: 2Last Post: 12-20-2007, 11:37 AM


LinkBack URL
About LinkBacks
Reply With Quote
Bookmarks