  1. #1
    yasmin k, Member
    Join Date: Mar 2009

    hello need help :)

    Hello, I am working on an application that uses recursion and sets.

    I need to write a program that, given a URL 'u', prints out the set of all URLs which are reachable from 'u' and are also in the same domain. The domain is a command line argument. I assume a link is in the domain if it contains the domain in the string, e.g. Google (two command line arguments) should print out all the URLs reachable from Google. I cannot visit links outside the domain (i.e. links which do not contain the second command line argument).

    I have already started the coding and have done most of it. I have three methods, and I need help with the first two: the first method needs to take the URL string and the domain, and the second method needs to take a string, the domain and a hash set.

    Java Code:
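    import org.htmlparser.util.*;
    import org.htmlparser.*;
    import org.htmlparser.tags.*;
    import org.htmlparser.filters.*;
    import java.util.HashSet;

    class betterRecursiveCrawler1 {
        // Returns the links found on the page at url (empty set if the page cannot be parsed).
        public static HashSet<String> visit(String url) {
            HashSet<String> s1 = new HashSet<String>();
            try {
                Parser parser1 = new Parser(url);
                // Keep only link tags whose target contains "http:"
                NodeList list1 = parser1.parse(new LinkStringFilter("http:"));
                for (int i = 0; i < list1.size(); i++) {
                    String st = ((LinkTag) (list1.elementAt(i))).extractLink();
                    s1.add(st); // store the link, otherwise the set stays empty
                }
                return s1;
            } catch (Exception e) {
                return new HashSet<String>();
            }
        }

        // Returns the pages found exactly depth links away from url, skipping pages already expanded.
        public static HashSet<String> visit(String url, int depth, HashSet<String> already) {
            HashSet<String> s = new HashSet<String>();
            if (depth == 0) { s.add(url); return s; }
            else {
                already.add(url); // assumed: mark this page as visited before following its links
                HashSet<String> t = visit(url);
                for (String u : t) {
                    if (!already.contains(u)) {
                        // assumed recursive step; the body of this if is missing from the post
                        s.addAll(visit(u, depth - 1, already));
                    }
                }
            }
            return s;
        }

        public static void main(String args[]) throws Exception {
            int depth = Integer.parseInt(args[1]);
            HashSet<String> already = new HashSet<String>();
            HashSet<String> s = visit(args[0], depth, already);
            for (String u : s) System.out.println(u);
        }
    }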
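
For the two methods asked about (one taking the URL and the domain, one also taking a hash set), here is a minimal sketch of how they might fit together. It is not the poster's code: the class name DomainCrawlSketch, the method name reachable and the links parameter are all made up for illustration, and it assumes Java 8 or later. The links function stands in for whatever fetches the links on a page, so the recursion can be tried on a small in-memory map without the HTML Parser library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

class DomainCrawlSketch {
    // Method 1 (hypothetical): takes the start URL and the domain,
    // creates the visited set and starts the recursion.
    static Set<String> reachable(String url, String domain,
                                 Function<String, Set<String>> links) {
        Set<String> seen = new HashSet<>();
        reachable(url, domain, seen, links);
        return seen;
    }

    // Method 2 (hypothetical): takes the URL, the domain and the hash set.
    // A URL counts as "in the domain" if it contains the domain string,
    // which is the rule stated in the post.
    static void reachable(String url, String domain, Set<String> seen,
                          Function<String, Set<String>> links) {
        if (!url.contains(domain)) return; // never visit links outside the domain
        if (!seen.add(url)) return;        // add() returns false if already visited
        for (String next : links.apply(url)) {
            reachable(next, domain, seen, links);
        }
    }

    public static void main(String[] args) {
        // Made-up link graph standing in for real pages.
        Map<String, Set<String>> graph = new HashMap<>();
        graph.put("http://example.com/", new HashSet<>(Arrays.asList(
                "http://example.com/a", "http://other.org/x")));
        graph.put("http://example.com/a", new HashSet<>(Arrays.asList(
                "http://example.com/", "http://example.com/b")));
        graph.put("http://other.org/x", new HashSet<>(Arrays.asList(
                "http://example.com/hidden")));

        Set<String> result = reachable("http://example.com/", "example.com",
                u -> graph.getOrDefault(u, Collections.<String>emptySet()));
        for (String u : new TreeSet<>(result)) System.out.println(u);
    }
}
```

Because the hash set records every page as soon as it is visited, link cycles end on their own and no depth limit is needed. The page that is only linked from other.org is never found, since pages outside the domain are not opened. To crawl real pages, the one-argument visit(String url) from the code above could presumably be passed as the links argument. On a very large site the recursion could get deep enough to overflow the stack, in which case an explicit work list would be the safer choice.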

  2. #2
    Turtle, Member
    Join Date: Nov 2007
    Location: New Zealand


    Hi Yasmin_K,

    Neat program. But I fail to understand what your problem is.

    --- Instructions to others interested in running this code... ---

    download library from: HTML Parser
    compile using: javac -cp .;htmllexer.jar;htmlparser.jar
    run using: java -cp .;htmllexer.jar;htmlparser.jar betterRecursiveCrawler1 Google 2
    (the ; classpath separator is for Windows; on Linux or macOS use : instead)

    --- end ---
    Last edited by Turtle; 02-02-2010 at 10:09 PM.
