Page 1 of 3 123 LastLast
Results 1 to 20 of 50

Thread: dictionary

  1. #1
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default dictionary

    Hello,
    I have to write a code, that when it an input of several lines, such as the following

    bag
    sun
    day
    moon
    Sunday
    Monday
    airbag
    Moonbag
    and i have to find how many words that can be broken to exactly to sub-words
    the sub-words in this case are: bag, sun, day, moon.
    and the normal words are the Sunday, Monday, airbag, and Moonbag

    The rules are:
    * no sub-word is shorter than 3 letters
    * case is insignificant
    *each dictionary has atleast one word but no more than 50,000 words
    *each word is at least one character long, but no longer than 16 characters
    *all the words consist of letters only

    my output should be the following

    2
    which is the number of words that are divided into 2 sub-words, and these sub-words are found in the dictionary (bag, sun, day, moon)
    and these 2 words are Sunday, and Moonday

    I didnt start to write a code yet, nor a pseudo code becuase i dont know from where to start, and if someone can explain it to me I'd appreciate it cuz i didnt understand the problem quite well

    Thanks in advance

  2. #2
    user0 is offline Senior Member
    Join Date
    Dec 2010
    Posts
    100
    Rep Power
    0

    Default

    I think the first thing you need to do is consider how you will read the lines of input from the user and where to store those words? The below code should give you a start, it stores the words in an ArrayList of Strings and reads words from the console using a Scanner object. Change the name of the dictionary file to whatever it is on your system and also to signal the end of input after you have typed several words, press CTRL + D.

    Java Code:
    import java.util.*;
    import java.io.*;
    
    public class SubWords {
    	
    	public static void main(String[] args) throws IOException	{
    		Scanner console = new Scanner(System.in);
    		
    		ArrayList<String> words = new ArrayList<String>();
    		
    		System.out.println("Type each word followed by Enter: ");
    		
    		while(console.hasNext())	{
    			words.add(console.next());
    		}
    		
    		System.out.println();
    		System.out.println(countSubs(words));
    	}
    	
    	public static int countSubs(ArrayList<String> words) throws IOException 	{
    		int count = 0;
    		
    		File f = new File("dictionary.txt");
    		
    		
    		for(String eachWord: words)	{
    			
    			Scanner reader = new Scanner(f);
    
    			while(reader.hasNext())	{
    				String dictWord = reader.next();
    				if(dictWord.length() >= 3 && dictWord.length() < eachWord.length())	{
    					if(eachWord.contains(dictWord))	{
    						count ++;
    					}
    				}
    				else	{
    					if(reader.hasNext())	{
    						reader.next();
    					}
    				}
    			}
    		}
    		
    		return count;
    	}
    }
    --user0--

  3. #3
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    Quote Originally Posted by user0 View Post
    I think the first thing you need to do is consider how you will read the lines of input from the user and where to store those words? The below code should give you a start, it stores the words in an ArrayList of Strings and reads words from the console using a Scanner object. Change the name of the dictionary file to whatever it is on your system and also to signal the end of input after you have typed several words, press CTRL + D.

    Java Code:
    import java.util.*;
    import java.io.*;
    
    public class SubWords {
    	
    	public static void main(String[] args) throws IOException	{
    		Scanner console = new Scanner(System.in);
    		
    		ArrayList<String> words = new ArrayList<String>();
    		
    		System.out.println("Type each word followed by Enter: ");
    		
    		while(console.hasNext())	{
    			words.add(console.next());
    		}
    		
    		System.out.println();
    		System.out.println(countSubs(words));
    	}
    	
    	public static int countSubs(ArrayList<String> words) throws IOException 	{
    		int count = 0;
    		
    		File f = new File("dictionary.txt");
    		
    		
    		for(String eachWord: words)	{
    			
    			Scanner reader = new Scanner(f);
    
    			while(reader.hasNext())	{
    				String dictWord = reader.next();
    				if(dictWord.length() >= 3 && dictWord.length() < eachWord.length())	{
    					if(eachWord.contains(dictWord))	{
    						count ++;
    					}
    				}
    				else	{
    					if(reader.hasNext())	{
    						reader.next();
    					}
    				}
    			}
    		}
    		
    		return count;
    	}
    }
    wow
    great thnx for the code, but i dont know the arraylist, so can i substitute the arraylist with normal array
    if not can you tell me what should i change

  4. #4
    user0 is offline Senior Member
    Join Date
    Dec 2010
    Posts
    100
    Rep Power
    0

    Default

    The reason I use array list is because you never mentioned how many words you are going to type. If you know how many you are going to type, you can definitely use an array of String like follows: String[] words = new String[6]; this will create an array of six words. If you want you can ask the user how many words they want to type, and then create an array using the response from the user.

    if you do use an array instead, then you need to change the parameter in the countSubs() method to String[] words.

    Hope that helps.
    --user0--

  5. #5
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,344
    Blog Entries
    7
    Rep Power
    20

    Default

    That algorithm doesn't solve the problem; e.g. let the dictionary be:

    abba
    abb

    and let the addtional words be:

    abbaabb
    abbaaabb

    the algorithm produces 4 which is incorrect. My guess is that this problem is NP complete but I haven't thought about it any further.

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  6. #6
    user0 is offline Senior Member
    Join Date
    Dec 2010
    Posts
    100
    Rep Power
    0

    Default

    Oh I see what the problem is. I thought you were using a dictionary file. can you please define exactly what you mean by dictionary? Does the dictionary have only subwords? My code assumed you were using a full english word list dictionary file.
    --user0--

  7. #7
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    Sorry Jos but what is NP?

    @user0: the name of the problem and the class is dictionary

    I think i need to put the sub-words in an array and the complete words in another array, and compare the two arrays if at least 2 sub-words are found in an element on the complete words array, then increment 1 to a count

    but the problem is i dont know how to start my code, or how to start thinking of the program
    Last edited by aizen92; 12-30-2010 at 04:56 PM.

  8. #8
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,344
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by aizen92 View Post
    Sorry Jos but what is NP?
    I thought about it a bit more but your problem is O(n^3). Given a set of words w1, w2 ... wn and given a set of other words W1, W2 ... Wm find i, j and k such that Wk == wi wj; if you merge the sets w and W it is getting more difficult already. The classic Post Correspondence Problem comes to mind but I still haven't thought about it enough. b.t.w. NP == Non Polynomial bound which basically means that a problem of size n needs a^n steps to solve it. Those problems are terrible to solve but can be quite interesting. Also b.t.w. that Post Correspondence problem isn't even solvable, but that's an entirely different story ;-)

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  9. #9
    user0 is offline Senior Member
    Join Date
    Dec 2010
    Posts
    100
    Rep Power
    0

    Default

    Aizen - we can track the unique number of words using another integer array. Take a look at the below code and see if it makes sense. This should catch the case that JoSah was talking about earlier also. It's an interesting problem and I'm learning as well, so if anyone can suggest a more efficient solution please post! :)

    Java Code:
    import java.util.*;
    import java.io.*;
    
    public class SubWords {
    	
    	public static void main(String[] args) throws IOException	{
    		Scanner console = new Scanner(System.in);
    		
    		System.out.println("How many words: ");
    		
    		int num = console.nextInt();
    		
    		String[] words = new String[num];
    		
    		System.out.println("Type each word followed by Enter: ");
    		
    		for(int i = 0; i < words.length; i++)	{
    			words[i] = console.next();
    		}
    		
    		System.out.println();
    		System.out.println(countSubs(words));
    	}
    	
    	public static int countSubs(String[] words) throws IOException 	{
    		int count = 0;
    		
    		File f = new File("dictionary.txt");
    		
    		int[] counts = new int[words.length];
    		
    		for(int i = 0; i < words.length; i++)	{
    			
    			String eachWord = words[i];
    			
    			Scanner reader = new Scanner(f);
    
    			while(reader.hasNext())	{
    				String dictWord = reader.next();
    				
    				if(dictWord.length() >= 3 && eachWord.indexOf(dictWord) != -1)	{
    					
    					if(counts[i] == 0)	{
    						counts[i] = 1;
    					}
    					
    				}
    				else	{
    					if(reader.hasNext())	{
    						reader.next();
    					}
    				}
    			}
    		}
    		
    		for(int c: counts)	{
    			if(c == 1)	{
    				count++;
    			}
    		}
    		return count;
    	}
    	
    
    }
    --user0--

  10. #10
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,344
    Blog Entries
    7
    Rep Power
    20

    Default

    That is never going to work; better start with this skeleton:

    Java Code:
    public class T  {
        public static void main(String[] args) {
        	String[] words= { "ab", "abb", "ba", "abba", "abbba" };
        	
        	for (int k= 0; k < words.length; k++)
            	for (int i= 0; i < words.length; i++)
                		for (int j= 0; j < words.length; j++)
                	            	if (words[k].equals(words[i]+words[j]))
                				System.out.println(words[k]+"="+words[i]+"+"+words[j]);
        }
    }
    kind regards,

    Jos
    Last edited by JosAH; 12-30-2010 at 06:01 PM.
    cenosillicaphobia: the fear for an empty beer glass

  11. #11
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    well, im kinda getting lost in some parts of the code, but can anyone explain to me what are the conditions of the sub-words and the conditions of the complete word
    I didnt get them all

  12. #12
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,344
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by aizen92 View Post
    well, im kinda getting lost in some parts of the code, but can anyone explain to me what are the conditions of the sub-words and the conditions of the complete word
    I didnt get them all
    Huh? It was you who coined the problem so you should know the definitions of it all; my suggestion simply takes all pairs of words and checks if the concatenation of them is also a word in your dictionary. It's up to you to implement additional constraints.

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  13. #13
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    but what I meant is how can I know when to stop saving the words as a sub-word?

  14. #14
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,344
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by aizen92 View Post
    but what I meant is how can I know when to stop saving the words as a sub-word?
    Please elaborate on this because I don't think I understand your intentions. Feel free to show examples.

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  15. #15
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    what Iam trying to say is, in my input
    bag
    sun
    day
    moon
    Sunday
    Monday
    airbag
    Moonbag
    how can I know when to stop at moon for the sub-word array.
    from what iam trying to understand from the given, the requirement for a sub-word is to be greater or equal to 3 letters
    and that a word can be of 1 character and not more than 16 characters, how can it be 1 character long, and what a character is, a letter or what?

    these are the things that got me confused with this problem

    I really appreciate the help guys
    thnx for helping

  16. #16
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    Ok, i think ill be testing a little bit before heading to the whole array thingy from input, im gonna assume i already made the arrays and had them arranged, and made 2 arrays sub and word with each one's element in them, and i want to see how to test for if to count for the word or not.
    this is the code ive come up with

    Java Code:
    public class testing
    {
        public static void main (String[] args)
        {
            String [] sub = {"bag", "sun", "day", "moon"};
            String [] word = {"Sunday", "Monday", "airbag", "Moonbag"};
            
            int count = 0;
            
            for ( int i = 0; i < word.length; i++)
            {
                for ( int j = 0; j < sub.length; j++)
                {
                    for ( int k = 0; k < sub.length; k++)
                    {
                        if ( word[i].equals(sub[j] + sub[k]))
                        {
                            count++;
                        }
                    }
                }
            }
            
            System.out.println(count);
        }
    }
    but the problem is my output is "0", so i think there is something wrong with my code
    any help with this
    thanks in advance

  17. #17
    user0 is offline Senior Member
    Join Date
    Dec 2010
    Posts
    100
    Rep Power
    0

    Default

    JosaH guided you in the right direction. I now understand that you have to be able to split one word into exactly two subwords. But your code does not account for lowercase/uppercase. You mentioned that the case is insignificant. Therefore, you may want to use a toLowerCase() conversion of your word before comparing. You are using lowercase in your sub array but uppercase in your words array this is what is causing the wrong output of 0. Change the last IF statement to follows:
    Java Code:
                        if ( word[i].toLowerCase().equals(sub[j].toLowerCase() + sub[k].toLowerCase()))
                        {
                            count++;
                        }
    --user0--

  18. #18
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    Ok, it worked this way
    All what is left is to read from input file and put them in arrays
    any suggestions

    my input file is:

    dictionary.in:
    bag
    sun
    day
    moon
    Sunday
    Monday
    airbag
    Moonbag
    now as i said before, that I need to put the sub-words in an array, and the complete words in another array

  19. #19
    JosAH's Avatar
    JosAH is online now Moderator
    Join Date
    Sep 2008
    Location
    Voorschoten, the Netherlands
    Posts
    13,344
    Blog Entries
    7
    Rep Power
    20

    Default

    Quote Originally Posted by aizen92 View Post
    Ok, it worked this way
    All what is left is to read from input file and put them in arrays
    any suggestions
    Don't use arrays, stick all your words in a single set A. Use my algorithm and if you find a word that is the catenation of two other words (the inner if-statement), stick the word in another set B. When the loops have finished remove the words in the second set B from the first set A (read the API documentation for the Set interface). Set B contains the compound words and set A contains the sub-words.

    kind regards,

    Jos
    cenosillicaphobia: the fear for an empty beer glass

  20. #20
    aizen92 is offline Senior Member
    Join Date
    Nov 2010
    Posts
    155
    Rep Power
    4

    Default

    can explain it more to me, i didint understand it very well ur method
    what are sets? are they like arrays or what?

Page 1 of 3 123 LastLast

Similar Threads

  1. Replies: 2
    Last Post: 12-27-2010, 04:28 AM
  2. Phrases in Lucene dictionary?
    By TheShar in forum Lucene
    Replies: 0
    Last Post: 05-27-2010, 02:42 PM
  3. add dictionary
    By monir6464 in forum New To Java
    Replies: 2
    Last Post: 04-07-2008, 06:27 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •