  1. #1
    rizzo3710 (Member, joined Nov 2010)

    Help with HREF Parser


    I am trying to write a Java program that takes an entire webpage's HTML code, scans it for the <a href> tags, and puts each full URL in an ArrayList. I'm able to view the page's HTML contents as one big string, but I can't figure out how to get beyond the first link. Here is what I have for the linkParser method:

    public static ArrayList<String> linkParser(String entirePage){
        //this method takes the html code and breaks it up by the "a href" tags and puts
        //each URL in the "links" ArrayList

        ArrayList<String> links = new ArrayList<String>();
        String fullURL = "";

        int i = entirePage.indexOf("a href")+8;
        while (entirePage.indexOf("a href")!=-1){
            while (entirePage.charAt(i)!= '>'){
                fullURL = entirePage.substring(entirePage.indexOf("a href")+8, i);
            }
        }

        return links;
    }
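
    One way the indexOf loop above could move past the first link is to remember where the previous match ended and pass that position as the second argument to indexOf. The following is only a minimal sketch, not a full HTML parser: the class name HrefScanSketch and the sample page are made up for illustration, and it assumes every link is written exactly as <a href="..."> with double quotes and href as the first attribute.

```java
import java.util.ArrayList;
import java.util.List;

public class HrefScanSketch {

    // Collects the value of every <a href="..."> found by a plain indexOf scan.
    public static List<String> linkParser(String entirePage) {
        List<String> links = new ArrayList<String>();
        String marker = "<a href=\"";   // assumption: href is the first attribute, double-quoted
        int from = 0;                   // where the next search starts

        while (true) {
            int start = entirePage.indexOf(marker, from);
            if (start == -1) {
                break;                  // no more links
            }
            int urlStart = start + marker.length();
            int urlEnd = entirePage.indexOf('"', urlStart);
            if (urlEnd == -1) {
                break;                  // unterminated attribute, stop scanning
            }
            links.add(entirePage.substring(urlStart, urlEnd));
            from = urlEnd + 1;          // continue after this link instead of restarting at 0
        }
        return links;
    }

    public static void main(String[] args) {
        String page = "<a href=\"http://example.com/a\">A</a> <a href=\"http://example.com/b\">B</a>";
        System.out.println(linkParser(page));
    }
}
```

    The key change is the "from" variable: calling indexOf without a start position always finds the first link again, which is why the original loop never advances.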

  2. #2
    eRaaaa (Senior Member, joined Oct 2010)


    Why don't you use a Java HTML parser, for example HTML Parser or Jericho?

    Not tested, and I'm sure this will not match all URLs (you could search for a better regex):
    Java Code:
    	import java.util.*;
    	import java.util.regex.*;

    	public static List<String> linkParser(String entirePage) {
    		List<String> links = new ArrayList<String>();
    		Pattern p = Pattern.compile("<a.*?href=\"(.+?)\"");
    		Matcher m = p.matcher(entirePage);
    		while (m.find()) {
    			links.add(m.group(1)); // get the string inside href="......"
    		}
    		return links;
    	}
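
    As a hedged illustration of what a "better regex" might look like, here is a sketch of a slightly more tolerant pattern. It is still not a substitute for a real HTML parser. The class name HrefRegexSketch and the method name extractLinks are hypothetical; the pattern assumes the href value is quoted (single or double quotes) and ignores unquoted values, comments, and scripts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefRegexSketch {

    // <a, whitespace, optionally other attributes (never crossing '>'),
    // then href = "value" or href = 'value'. Group 1 is the quote, group 2 the URL.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s+(?:[^>]*?\\s)?href\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    public static List<String> extractLinks(String entirePage) {
        List<String> links = new ArrayList<String>();
        Matcher m = HREF.matcher(entirePage);
        while (m.find()) {          // find() advances to the next match on each call
            links.add(m.group(2));
        }
        return links;
    }

    public static void main(String[] args) {
        String page = "<p><a href=\"http://example.com/one\">1</a> "
                + "<A class='k' HREF='/two'>2</A> "
                + "<abbr href=\"no\">x</abbr></p>";
        System.out.println(extractLinks(page));
    }
}
```

    Compared with the pattern above, this one is case-insensitive, accepts single quotes, does not run past the end of the tag while looking for href, and requires whitespace after "<a" so that tags such as <abbr> are skipped.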
