01-08-2008, 05:54 AM
roots
I went through that code. Could you elaborate a bit more on what you want your web crawler to do?
Things to consider:
1. Java networking concepts
2. The HTTP protocol and HTML
3. Text parsing (remember how he extracted <a href ..)

Apache Commons HttpClient / HttpComponents, regular expressions, and simple I/O are welcome as well.
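
To make the points above concrete, here is a rough sketch of the two basic pieces a crawler needs: fetching a page over HTTP and pulling the href values out of its anchor tags with a regex. This is not the code discussed in this thread; the class name SimpleLinkExtractor, the method names fetch and extractLinks, and the sample HTML are all made up for illustration. It uses only the standard library (java.net.URL and java.util.regex), and the regex is deliberately simple: it only handles quoted href values and is no substitute for a real HTML parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not taken from the program discussed above.
public class SimpleLinkExtractor {

    // Matches href="..." or href='...' inside an <a ...> tag.
    // Assumption: href values are quoted; unquoted values are ignored.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Downloads the page at the given address and returns it as one string.
    // Assumption: the page is UTF-8 encoded.
    static String fetch(String address) throws IOException {
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new URL(address).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    // Returns every quoted href value found in <a> tags, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2)); // group 2 is the text between the quotes
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        // With a URL argument, fetch that page; otherwise use a small sample.
        String html = args.length > 0
                ? fetch(args[0])
                : "<p><a href=\"http://example.com/a\">A</a> <A HREF='/b'>B</A></p>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

A real crawler would also need to resolve relative links such as /b against the page URL, remember which pages it has already visited, and respect robots.txt. Those parts are left out here to keep the sketch short.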

The program you used is not ROBUST enough. You would be better off starting from scratch.