  1. #1
    aaalex93 is offline Member
    Join Date
    Sep 2012
    Posts
    11
    Rep Power
    0

    Default Help with indexer

    Hi, I am completely new to Java and I am currently on my second assignment in my Java class.

    I am having a bit of trouble getting my code to do what I need it to do. I'm not getting any errors, but when I run the program the console just displays the word that was entered and doesn't search the web pages like it is supposed to. I know this is very simple, but again I am a complete beginner and anything helps at this point. Here is the code:


    public class SearchEngine {
    public static void main(String[] args) {
    System.out.println("Creating Index and Scraping Pages");


    // 1. Create a new Indexer object
    Indexer i = new Indexer();
    /* 2. Use the addPage method from this class
    * to add the following websites to your index
    * FSU's Program in Interdisciplinary Computing
    * FSU's Program in Interdisciplinary Computing
    * FSU's Program in Interdisciplinary Computing
    * FSU's Program in Interdisciplinary Computing */
    addPage(i, "http://pic.fsu.edu/");
    addPage(i, "http://pic.fsu.edu/about.php");
    addPage(i, "http://pic.fsu.edu/people.php");
    addPage(i, "http://pic.fsu.edu/students.php");
    System.out.println("System ready for queries");

    /* 3. Open a input dialog using the JOptionPane class
    * and ask for a word to search the index.
    * Get the word to search for as a result String.
    */
    String word = JOptionPane.showInputDialog(null, "Provide a Word to search the index:");
    // 4. Search the index for the word requested, get the result.
    i.search(word);



    /* 5. Output the search result either to the console
    * or using a JOptionPane message dialog.
    */
    System.out.println(word);
    }

    /**
    * Describe what this method does here.
    * @param i
    * @param location
    */
    public static void addPage(Indexer i, String location) {
    // 1. Get the html code from the location provided to this method
    String site = WebGet.httpget(location);
    // 2. Create a new page object and set its html code to the code you downloaded
    Page p = new Page(site);

    // 3. Get the array of words from the page object.
    p.getWords();
    // 4. Add an entry to the indexer, providing the location and the array of words
    i.addEntry(location, p.getWords());
    }
    }

  2. #2
    Norm is offline Moderator
    Join Date
    Jun 2008
    Location
    Eastern Florida
    Posts
    16,585
    Rep Power
    23

    Default Re: Help with indexer

    What packages are you using? The posted code does not have any import statements.
    What does the API doc for the Indexer class say?
    If you don't understand my response, don't ignore it, ask a question.

  3. #3
    aaalex93 is offline Member

    Default Re: Help with indexer

    Sorry at the top it says
    import javax.swing.JOptionPane;

    Also there are three other files that this is tied in with:

    First one:

    Java Code:
    import java.util.ArrayList;
    import java.util.Arrays;
    
    public class Page	{
    
    	private String html;
    	private ArrayList<String> links;
    	private ArrayList<String> words;
    	
    	/** Creates a page object with no HTML data.
    	*/
    	public Page()	{
    		this(new String(""));
    	}
    
    	/** Creates a page object and automatically calls the setHTML method to 
    	 * process the links on the page for the provided html String.
    	 * @param html The HTML page as a String
    	*/
    	public Page(String html)	{
    		setHTML(html);
    	}
    	
    	/** A setter for providing a String html page.  If you do not set any HTML page, the 
    	 * getter methods will return blank strings
    	 * @param html The HTML page as a String.
    	*/
    	public void setHTML(String html)	{
    		this.html = html;
    		processLinks();
    		processWords();
    	}
    	
    	/** Provides the number of links found on the page. */
    	public int numLinks()	{
    		return links.size();
    	}
    	
    	/** Gets a link at a specified index.  Will throw exceptions if you do not specify a valid index 
    	 * @param index The index to get from the list of links on the page.
    	 * @return The link as a String
    	*/
    	public String getLink(int index)	{
    		return links.get(index);
    	}
    
    	/** A simple way to get all of the links on the page as a String array.
    	 * @return A string array of all of the links found on the page */
    	public String[] getLinks()	{
    		return links.toArray(new String[links.size()]);
    	}
    
    	/** Returns all of the words on the page as a String array.  
    	 * These words are not checked by a dictionary, in the case the word 
    	 * is a proper noun or acronym.  This function will attempt to remove HTML code.
    	 * @return A String array of all of the words on the page.
    	*/
    	public String[] getWords()	{
    		return words.toArray(new String[words.size()]);
    	}
    	
    	private void processWords() {
    		String data = this.html;
    		// Attempt to remove HTML tags
    		String noHTML = data.replaceAll("\\<.*?\\>", "");
    		// Get all of the distinct phrases left, make them all lowercase
    		String[] tempWords = noHTML.split(" ");
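    		// Write the lowercased value back into the array; reassigning a for-each variable would not change it
    		for (int k = 0; k < tempWords.length; k++) {
    			tempWords[k] = tempWords[k].toLowerCase();
    		}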
    	
    		words = new ArrayList<String>(Arrays.asList(tempWords));
    	
    	}
    	
    	private void processLinks()	{
    		String data = this.html;
    		String link = "a href";
    		ArrayList<String> links = new ArrayList<String>();
    		int find;
    		int lastFind = 0;
    		try {
    			while((find = data.indexOf(link,lastFind)) != -1)	{
    				int startFind = data.indexOf("\"", find);
    				int endFind = data.indexOf("\"",startFind+1);
    				String theLink = data.substring(startFind+1,endFind);
    				links.add(theLink);
    				lastFind = endFind;
    			}
    		}
    		catch (Exception e)	{
    
    		}
    		this.links = links;
    	}
    
    }
    Second one:

    Java Code:
    import java.util.ArrayList;
    import java.util.HashMap;
    /** This class creates an index of words to locations.  One can search for locations 
     * by providing a word.
     *
     * <h6>This code is released for educational purposes only and does not come with any
     * warranty for the merchantability or fitness for any particular purpose.</h6>
     *
     * @author Geoffery Miller [email]geoffery.miller@gmail.com[/email]
     * @version 0.1
    */
    public class Indexer {
    	private HashMap<String, ArrayList<String>> index;
    	/** Creates an empty Index object.  You will need to use the method addEntry to 
    	 * add locations and words to the index.  Then, you will be able to search the index
    	 * for locations, given a search term.
    	*/
    	public Indexer() {
    		index = new HashMap<String, ArrayList<String>>();
    	}
    	/** This method takes a location String and words String array to 
    	 * create an index for those words to that location.  This method can be 
    	 * called multiple times with new locations and word arrays to increase the 
    	 * usefulness of the index.
    	 * 
    	 * @param location  The location of the data (the URL as a String)
    	 * @param words An array of words on the page at that location
    	 */
    	public void addEntry(String location, String[] words) {
    		for (int i = 0; i < words.length; i++) {
    			String w = words[i].toLowerCase();
    			if (index.containsKey(w)) {
    				ArrayList<String> locations = index.get(w);
    				if (!locations.contains(location)) {
    					locations.add(location);
    				}
    			}
    			else {
    				ArrayList<String> locations = new ArrayList<String>();
    				locations.add(location);
    				index.put(w, locations);
    			}
    		}
    	}
    	/**
    	 * This method allows you to search for a word in the index.  You will be returned
    	 * a String of locations which are the resulting websites found for the search term.
    	 * @param word The word you want to search for in the Index.
    	 * @return The websites found for this word as a String, comma separated.
    	 */
    	public String search(String word) {
    		String searchTerm = word.toLowerCase();
    		ArrayList<String> locations = index.get(searchTerm);
    
    		String result = "";
    		if (locations == null)
    			return result;
    		
    		for (int i = 0; i < locations.size(); i++) {
    			String location = locations.get(i);
    			result = result + location + ", ";
    		}
    		// Remove the last ", "
    		result = result.substring(0, (result.length()-2));
    		return result;
    	}
    }
    Third one:

    Java Code:
    import java.net.URL;
    import java.net.URLConnection;
    import java.io.InputStreamReader;
    import java.io.BufferedReader;
    /**
     * A utility class to get data from the web.  
     * Currently only supports an http get request without parameters.
     *
     * <h6>This code is released for educational purposes only and does not come with any 
     * warranty for the merchantability or fitness for any particular purpose.</h6>
     * 
     * @author Geoffery Miller [email]geoffery.miller@gmail.com[/email]
     * @version 0.2
    */
    public class WebGet	{
    	/**
    	 * Creates a connection to a URL provided as a String and downloads text data.  The data 
    	 * is returned as a String.
    	 * When this method cannot make a connection to the specified url, or there is an error in 
    	 * receiving data, it will return an empty String.
    	 * @param location The URL to download, this must include http:// prepended to the url.
    	 * @return The data from the web resource returned as a String.
    	*/
    	public static String httpget(String location)	{
    		String webpage = "";
    		try {
    			URL url = new URL(location);
    			URLConnection urlConnection = url.openConnection();
    			InputStreamReader isr = new InputStreamReader(urlConnection.getInputStream());
    			BufferedReader buff = new BufferedReader(isr);
    			String line;
    			while((line = buff.readLine()) != null)	{
    				webpage = webpage + line + "\n";
    			}
    		}
    		catch(Exception e)	{
    			webpage = "";
    		}
    		return webpage;
    	}
    	private WebGet() {}
    }
    Last edited by aaalex93; 09-29-2012 at 02:47 AM.

  4. #4
    aaalex93 is offline Member

    Default Re: Help with indexer


  5. #5
    Norm is offline Moderator

    Default Re: Help with indexer

    Please edit your posts and wrap the code in code tags. See: BB Code List - Java Programming Forum

    Also can you post what the program currently outputs and add some comments to the post showing what it is supposed to output.
    Last edited by Norm; 09-29-2012 at 02:49 AM.

  6. #6
    aaalex93 is offline Member

    Default Re: Help with indexer

    Ok fixed

  7. #7
    Norm is offline Moderator

    Default Re: Help with indexer

    Thanks. Also what is the input to the program? What should the value of word be?
    Can you post what the program currently outputs and add some comments to the post showing what it is supposed to output.

  8. #8
    aaalex93 is offline Member

    Default Re: Help with indexer

    The program outputs:
    Creating Index and Scraping Pages
    System ready for queries

    Then an input dialog pops up and asks for the word to search for
    but when I type in a word to search, the console displays the word then terminates instead of displaying the search results

  9. #9
    aaalex93 is offline Member

    Default Re: Help with indexer

    the input word can be anything, it searches the given URLs for the word that is given in the dialog box.

  10. #10
    Norm is offline Moderator

    Default Re: Help with indexer

    You didn't post all of what the program printed out. Where is the word you entered?
    the input word can be anything
    I need to know what was entered so I can do the same test.

    What did you enter for input? What should the program print out when it is given that input?

    Where in the code does it print the results of the search?

  11. #11
    aaalex93 is offline Member

    Default Re: Help with indexer

    I typed in "geo" as that is my instructor's name and certainly shows up on those pages.

    It should print out the websites found for the given word.

    And that is where I am confused, I'm not sure how to get it to print out the correct results

  12. #12
    Norm is offline Moderator

    Default Re: Help with indexer

    What lines of code did you write? Did you read the API doc for the methods that the code calls?
    There should be clues in the doc on how to get the results and display them.

    Who coded this statement:
    Java Code:
      System.out.println(word);
    You missed adding code tags to the first post.

  13. #13
    aaalex93 is offline Member

    Default Re: Help with indexer

    I coded that statement. Everything on the first post (Search Engine) is what I coded

  14. #14
    Norm is offline Moderator

    Default Re: Help with indexer

    Did you read the API doc for each of the Indexer class's methods that you called, to see what they do and what they return?

    I coded that statement.
    That statement prints out the contents of word on the screen.
    You said:
    console just displays the word that is given
    Your code does that.

    If you want to print more, you need to read the API doc and see where the results are and how to get them.

  15. #15
    aaalex93 is offline Member

    Default Re: Help with indexer

    The API says that search returns the websites found for the word as a String, comma separated.

  16. #16
    Norm is offline Moderator

    Default Re: Help with indexer

    OK. Now you need to write the code that receives the returned value and prints it.

  17. #17
    aaalex93 is offline Member

    Default Re: Help with indexer

    OK, I replaced i.search(word);
    with String foundWord = i.search(word);

    and changed the print statement to System.out.println(foundWord);

    This now prints out the URLs correctly.
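
    For anyone landing on this thread later, here is a minimal sketch of why the change matters. It does not use the course's Indexer, Page, or WebGet classes; TinyIndex and SearchResultDemo are hypothetical stand-ins made up for this example so it runs without network access, and the word lists are invented. The point is only that search returns a String, and a returned value that is not stored or printed is simply thrown away.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SearchResultDemo {

    // Hypothetical stand-in for the course's Indexer (assumption: same idea, simplified).
    public static class TinyIndex {
        private final Map<String, List<String>> index = new HashMap<>();

        // Maps every lowercased word to the locations it was seen at.
        public void addEntry(String location, String[] words) {
            for (String w : words) {
                List<String> locations =
                        index.computeIfAbsent(w.toLowerCase(), k -> new ArrayList<>());
                if (!locations.contains(location)) {
                    locations.add(location);
                }
            }
        }

        // Returns the matching locations, comma separated, or "" if there are none.
        public String search(String word) {
            List<String> locations = index.get(word.toLowerCase());
            return locations == null ? "" : String.join(", ", locations);
        }
    }

    public static void main(String[] args) {
        TinyIndex i = new TinyIndex();
        // Invented word lists; the real program gets these from the downloaded pages.
        i.addEntry("http://pic.fsu.edu/", new String[] {"Program", "in", "Computing"});
        i.addEntry("http://pic.fsu.edu/about.php", new String[] {"About", "the", "program"});

        String word = "program";

        i.search(word);                      // result is computed, then discarded
        System.out.println(word);            // prints only the search term

        String foundWord = i.search(word);   // keep the returned value
        System.out.println(foundWord);       // prints the matching locations
    }
}
```

    With the stand-in data above, the first println shows only "program", while the second shows both URLs separated by a comma. An exact-match index like this one only finds a search term that equals a whole indexed word.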

  18. #18
    Norm is offline Moderator

    Default Re: Help with indexer

    Is the problem solved now?

  19. #19
    aaalex93 is offline Member
