Thread: hello need help :)
- 02-02-2010, 04:38 PM #1
Hello, I am working on an application of recursion and sets.
I need to write a program that, given a URL 'u', prints the set of all URLs that are reachable from 'u' and are also in the same domain. The domain is a command-line argument. I assume a link is in the domain if it contains the domain in its string, e.g. Google (two command-line arguments) should print all the URLs reachable from Google. I cannot visit links outside the domain (i.e. links which do not contain the second command-line argument).
I have already started coding and have done most of it. I have three methods and need help with the first two: the first method needs to take the string and the domain, and the second method needs to take a string, the domain and a hash set.
Java Code:
import org.htmlparser.util.*;
import org.htmlparser.*;
import org.htmlparser.tags.*;
import org.htmlparser.filters.*;
import java.util.HashSet;

class betterRecursiveCrawler1 {

    // Returns the links found on the page at url (empty set if the page cannot be parsed).
    public static HashSet<String> visit(String url) {
        HashSet<String> s1 = new HashSet<String>();
        try {
            Parser parser1 = new Parser(url);
            // keep only link tags whose URL contains "http:"
            NodeList list1 = parser1.parse(new LinkStringFilter("http:"));
            for (int i = 0; i < list1.size(); i++) {
                String st = ((LinkTag) (list1.elementAt(i))).extractLink();
                s1.add(st);
            }
            return s1;
        } catch (Exception e) {
            return new HashSet<String>();
        }
    }

    // Follows links recursively down to the given depth; 'already' holds the URLs seen so far.
    public static HashSet<String> visit(String url, int depth, HashSet<String> already) {
        HashSet<String> s = new HashSet<String>();
        if (depth == 0) {
            s.add(url);
            return s;
        } else {
            already.add(url);
            HashSet<String> t = visit(url);
            for (String u : t) {
                if (!already.contains(u)) {
                    already.add(u);
                    s.addAll(visit(u, depth - 1, already));
                }
            }
        }
        return s;
    }

    public static void main(String args[]) throws Exception {
        int depth = Integer.parseInt(args[1]);
        HashSet<String> already = new HashSet<String>();
        HashSet<String> s = visit(args[0], depth, already);
        //for (String u:s) System.out.println(u);
        System.out.println(s.size());
    }
}
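One possible shape for the two domain-aware methods described above, offered only as a rough sketch and not as the intended solution. It assumes the rule stated in the post (a link is in the domain if its string contains the domain). The class name DomainCrawlerSketch, the helper inDomain and the extra linksOf parameter are all made up for illustration; linksOf stands in for whatever fetches the links of a page, so the sketch can run against a small in-memory map without the HTML Parser library or network access.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class DomainCrawlerSketch {

    // Rule from the post: a link is "in the domain" if its string contains the domain.
    static boolean inDomain(String url, String domain) {
        return url.contains(domain);
    }

    // Method 1 (url + domain): starts the recursion with an empty "already visited" set.
    // linksOf is a hypothetical stand-in for the page fetcher.
    static HashSet<String> visit(String url, String domain,
                                 Function<String, Set<String>> linksOf) {
        HashSet<String> already = new HashSet<>();
        visit(url, domain, already, linksOf);
        return already;
    }

    // Method 2 (url + domain + set): records url, then recurses into each of its links.
    // Out-of-domain URLs are never fetched; URLs already in the set are skipped,
    // which is what makes the recursion terminate on cyclic links.
    static void visit(String url, String domain, HashSet<String> already,
                      Function<String, Set<String>> linksOf) {
        if (!inDomain(url, domain)) return;
        if (!already.add(url)) return; // add() returns false if url was already present
        for (String next : linksOf.apply(url)) {
            visit(next, domain, already, linksOf);
        }
    }

    public static void main(String[] args) {
        // Tiny made-up link graph used instead of real pages.
        Map<String, Set<String>> fakeWeb = new HashMap<>();
        fakeWeb.put("http://site.test/",
                new HashSet<>(Arrays.asList("http://site.test/a", "http://elsewhere.test/")));
        fakeWeb.put("http://site.test/a",
                new HashSet<>(Arrays.asList("http://site.test/", "http://site.test/b")));
        fakeWeb.put("http://elsewhere.test/",
                new HashSet<>(Arrays.asList("http://site.test/hidden")));

        Function<String, Set<String>> linksOf =
                u -> fakeWeb.getOrDefault(u, Collections.<String>emptySet());

        for (String u : visit("http://site.test/", "site.test", linksOf)) {
            System.out.println(u);
        }
    }
}
```

In the real program, linksOf could simply be the existing one-argument visit(url) from the code above, and the domain would come from args[1] in place of the depth. Because nothing limits the depth here, a very large site could make the recursion deep; an explicit stack or queue would avoid that.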
- 02-02-2010, 10:06 PM #2
Hi Yasmin_K,
Neat program. But I fail to understand what your problem is.
--- Instructions to others interested in running this code... ---
download the library from: HTML Parser
compile using: javac -cp .;htmlexer.jar;htmlparser.jar betterRecursiveCrawler1.java
run using: java -cp .;htmlexer.jar;htmlparser.jar betterRecursiveCrawler1 http://google.com 2
(the ; classpath separator is for Windows; on Linux or macOS use : instead)
--- end ---
Last edited by Turtle; 02-02-2010 at 10:09 PM.