Question related to htmlParser
Hello, I have to use htmlParser, and I need to visit the nodes/children in the DOM tree that have an href attribute.
In other words, I need to parse the HTML and gather all the links enclosed in the " " marks.
For example: <a href=" url ">
So far I have the following code:
Code:
import org.htmlparser.Parser;
import org.htmlparser.Tag;
import org.htmlparser.util.ParserException;
import org.htmlparser.visitors.NodeVisitor;

public class MyVisitor extends NodeVisitor
{
    public MyVisitor ()
    {
    }

    // Prints the tag name and its start position.
    @Override
    public void visitTag (Tag tag)
    {
        System.out.println ("\n" + tag.getTagName () + tag.getStartPosition ());
    }

    public static void main (String[] args) throws ParserException
    {
        Parser parser = new Parser (" whatever, link");
        NodeVisitor visitor = new MyVisitor ();
        parser.visitAllNodesWith (visitor);
        // visitor.visitTag (...); // Here I cannot pass a String, only a Tag.
        // I would like to visit every Tag in the page and I'm stuck,
        // I don't know how to do that!
    }
}
Basically, I need to visit every child node and find the elements that have an href attribute.
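To make the goal concrete, here is a rough sketch of the kind of href collection I am after. Note that it does not use org.htmlparser at all: it swaps in the parser that ships with the JDK (javax.swing.text.html), only because that one can be run without extra jars. The class name HrefCollector, the method collectLinks and the sample URLs are made up for illustration. I assume the same idea would carry over to visitTag by reading the href attribute from the Tag, but I have not verified that.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the value of every href attribute in a page.
public class HrefCollector extends HTMLEditorKit.ParserCallback {
    private final List<String> links = new ArrayList<>();

    // Called by the parser for tags that have a matching end tag, e.g. <a>.
    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        record(attrs);
    }

    // Called by the parser for empty tags, e.g. <link> or <base>.
    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        record(attrs);
    }

    // Stores the href value, if the current tag has one.
    private void record(MutableAttributeSet attrs) {
        Object href = attrs.getAttribute(HTML.Attribute.HREF);
        if (href != null) {
            links.add(href.toString().trim());
        }
    }

    // Parses the given HTML text and returns the hrefs in document order.
    public static List<String> collectLinks(String html) {
        HrefCollector collector = new HrefCollector();
        try {
            // true = ignore any charset declared inside the document
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector.links;
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"http://example.com/a\">A</a>"
                + "<p>no link here</p><a href=\"/b\">B</a></body></html>";
        for (String link : collectLinks(html)) {
            System.out.println(link);
        }
    }
}
```

The point of the sketch is that I never call the callback myself: the parser walks the page and hands each tag to it, and the callback only has to check for href.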
Could you please help me?
Thanks in advance!