1. Member
Join Date
Mar 2012
Posts
2
Rep Power
0

## Spell checker

Hello guys, i'm new to the forum so i hope i am posting this in the right section , if not i apologize in advance.I need to implement a spell checker in java , let me give you an example for a string lets say "sch aproblm iseasili solved" my output is "such a problem is easily solved".The maximum length of the string to correct is 64.As you can see my string can have spaces inserted in the wrong places or not at all and even misspelled words.I need a little help in finding a efficient algorithm of coming up with the corrected string. I am currently trying to delete all spaces in my string and inserting spaces in every possible position , so lets say for the word (it apply to a sentence as well) "hot" i generate the next possible strings to afterwords be corrected word by word using levenshtein distance : h o t ; h ot; ho t; hot. As you can see i have generated 2^(string.length() -1) possible strings. So for a string with a length of 64 it will generate 2^63 possible strings, which is damn high, and afterwords i need to process them one by one and select the best one by a different set of parameters such as : - total editing distance (must take the smallest one)
-if i have more strings with same editing distance i have to choose the one with the fewer number of words
-if i have more strings with the same number of words i need to choose the one with the total maximum frequency the words have( i have a dictionary of the most frequent 8000 words along with their frequency )
-and finally if there are more strings with the same total frequency i have to take the smallest lexicographic one.

So basically i generate all possible strings (inserting spaces in all possible positions into the original string) and then one by one i calculate their total editing distance, nr of words ,etc. and then choose the best one, and output the corrected string. I want to know if there is a easier(in terms of efficiency) way of doing this , like not having to generate all possible combinations of strings etc.

2. Member
Join Date
Feb 2012
Posts
35
Rep Power
0

## Re: Spell checker

Well, if you want a spell checker that correct's something like "cukumber" to "cucumber" then you might need a source of ALL the english words.

3. Member
Join Date
Mar 2012
Posts
2
Rep Power
0

## Re: Spell checker

Well i have a dictionary ofcourse, i should state my current progress. I manage to correct a sentence of length ~10-12 characters. I would like a little bit of help in implementing a dynamic programming algorithm that splits my string into lets say n strings of length 10 and corrects them separately and then combines the solutions. A small problem will be the splitting because i have to somehow check if i split in the middle of a word . Let's say for "schaproblemcanbeeasilisolved" i will split it into something like "schaproblmcan" "beeasilisolved" , and i have to be careful to not split it like "schaproblmc" "anbeeasilisolved" (in the middle of can)

4. ## Re: Spell checker

Hi,
This is an intresting project.
I am afriad that it is also quite a complex one, I would therfore suggest contacting someone who has worked on a similar project, such as openoffice.
see if you can contact someone who worked on the team and see how they conduct their tests on strings, Even if it is in another programming language, it will still be the same method.

I wish you luck in the project, and Hope to see some progress posts.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•