# Thread: A beginner's question on String matching

1. Originally Posted by nassar
@JosAH:
Well, I have actually taken your advice; my dictionary is now an instance of a Map class, complete with an alphabetized index key and Lists of strings as values, containing all anagrams. The problem is, the game starts with 9 provided letters. What i can do now is test those 9 letters against the dictionary, and it will give me an instantaneous (<1sec) response if the jumble of 9 letters is indexed in the dictionary. However, if and when the first test fails, I'm left trying to make as many possible different 8-letter words out of the 9 initial letters, and trying to match those against my dictionary; the same process has to be repeated if no match is found for any 8-letter word, and so on.
You don't need recursion for generating combinations, e.g. given the 'word' ABCD (all letters alphabetized), you want all the words ABCD, ABC, ABD, AB, ACD, AC, AD, A, BCD, BC, BD, B, CD, C, D and possibly the empty words. Let met put this is a table:

Java Code:
```ABCD present?
---- --------
ABCD 1111
ABC  1110
AB D 1101
AB   1100
A CD 1011
A C  1010
A  D 1001
A    1000
BCD 0111
BC  0110
B D 0101
B   0100
CD 0011
C  0010
D 0001
0000```
As you can see a 1 represents a letter present in the result, a 0 means absence of the letter in the result word. The second column is counting in binary; the counting is done backwards so the longest result comes first; there is no particular order w.r.t. the length of the other words.

Counting backwards in binary is easy: the following method finds the next combination of booleans in an array; it either returns the array or null when there is no next combination:

Java Code:
```boolean[] next(boolean[] present) {
for (int i= present.lenght; i-- > 0; )
if (present[i]) {
for (present[i]= false; ++i < present.lenght; present[i]= true);
return present;
}
return null;
}```
You can use that array of booleans to build up your result String, i.e. if an element is true, include the corresponding character in your result, otherwise skip the character.

kind regards,

Jos

2. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
7
You should also look into memoizing your results. That will make a big difference in efficiency. Keep track of the Strings you've seen already (hint: HashMap<String, String> or HashSet<String>) and either return the same result you got the first time, or simply return null if you get a repeat.

-Gary-

3. Originally Posted by gcalvin
You should also look into memoizing your results. That will make a big difference in efficiency. Keep track of the Strings you've seen already (hint: HashMap<String, String> or HashSet<String>) and either return the same result you got the first time, or simply return null if you get a repeat.
No need to do all that; see my previous reply.

kind regards,

Jos

4. Member
Join Date
May 2010
Posts
13
Rep Power
0
Wow JosAH the boolean arrays threw me off for a second there, I've mulled everything over, and well, from this standpoint, this is what I figure I have to do:

intialise a boolean array with true all over it,
write a toString method that takes a String and a boolean array and converts to a str,
and then this follows:

Java Code:
```
boolean[]=b
initialize(b);
while ( next(b)!=null){
String str= toString(String query, boolean[] b);
if(isInDictionary(str)){
return wordFoundInDictionary;
} else{
return null;
}
}```
I'm getting to work on it.. I'm just a bit hesitant about the condition for my while loop, what do you guys think?

5. Senior Member
Join Date
Mar 2010
Posts
952
Rep Power
7
The longest results still don't come first with this method. AD comes before BCD, for instance. But it is a good, efficient way to generate the combinations. There will still be repeated combinations if there are repeated letters in the initial set, but definitely fewer repeats than the recursive method would generate. Recursion generates N! combinations, while the binary countdown method generates 2^N. Even with N=4 that's 24 vs. 16. When N=5, it's 120 vs. 32, when N=6 it's 720 vs. 64, and after that it starts to make a really big difference. :)

-Gary-

6. Member
Join Date
May 2010
Posts
13
Rep Power
0
Alright, I just finished a 10-minute joy-jumping session. In the past 24 hours I've been literally running on caffeine and trying to beat this deadline. I am beside myself right now.

To Jos, Gary, I really can't thank you guys enough, all of this would have been completely impossible had it not been for you. If someday you find yourselves in the south of - nay, in France, just let me know. I owe you guys quite a few drinks. You guys are awesome. Just awesome.

7. Originally Posted by gcalvin
The longest results still don't come first with this method.
Well, the absolute longest combination does come first because that's where the algorithm starts. If you want all the combinations in descending order w.r.t. their length a more complicated (iterative) method is needed. Here's an example:

Java Code:
```import java.util.Arrays;

public class Combination {

public static boolean combine(int[] c, int n) {

for (int i= c.length; --i >= 0;) {
if (++c[i] <= n-(c.length-i)) {
for (; ++i < c.length; )
c[i]= c[i-1]+1;
return true;
}
}
return false;
}

public static void main(String[] args) {

int N= 5;

for (int m= N; m > 0; m--) {
int c[]= new int[m];
for (int i= 0; i < c.length; c[i]= i++);
do {
System.out.println(Arrays.toString(c));
}
while(combine(c, N));
}
}
}```
kind regards,

Jos
Last edited by JosAH; 05-23-2010 at 01:39 PM.

8. I've lost track of what you're trying to find. Is it the combination of n items taken n-1, n-2 ... to 1 times?
for "ABCD" we have "ABC", "ABD", "ACD", "AD", "AC", "AB" "BC", "BD", "CD", "A", "B", "C", "D"
The output from a program I found via Google gives:
Running: "C:\Program Files\Java\j2re1.4.2_08\bin\java.exe" -cp D:\JavaDevelopment;. GetCombinations

possibilities=4 for 3 out of 4
A B C
A B D
A C D
B C D
created 4
possibilities=6 for 2 out of 4
A B
A C
A D
B C
B D
C D
created 6
possibilities=4 for 1 out of 4
A
B
C
D
created 4

0 error(s)

9. Member
Join Date
May 2010
Posts
13
Rep Power
0
Hehe well, since I mapped my dictionary, the number of combinations of letters that i need to use has been drastically cut down, so I basically need all subsets of the original set, not taking into consideration the order in which elements of the subset/set are placed.

It's been solved, and rather elegantly I might add, thanks to you, Jos and Gary's help!

Page 2 of 2 First 12

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•