Parsing HTML
I am trying to parse HTML. For a fragment like this:
<span class="authorName">Paul Abbott </span>
I want to retrieve the value "Paul Abbott".
Right now I am using a parser to generate the HTML tree, with this code:
*********************************************************************
Tidy tidy = new Tidy();
tidy.setXHTML(xhtml);
d = tidy.parseDOM(in, out);
NodeList spanNode = d.getElementsByTagName("span");
int length = spanNode.getLength();
for (int i = 0; i < length; i++) {
    org.w3c.dom.Node span = spanNode.item(i);
    String tempAltText = span.getAttributes().getNamedItem("class").getNodeValue();
    if (tempAltText.equals("authorName")) {
        System.out.println("the item is " + tempAltText);
    }
}
*********************************************************************
This prints the value of tempAltText, which is "authorName", and not "Paul Abbott".
Please give me some suggestions on how I can get the text inside the span.
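
One possible direction, as a rough sketch and not a definitive answer: the code above reads the value of the class attribute, while the name itself normally sits in a text node that is a child of the span element. The example below walks the child nodes of each matching span and collects their text. To keep it runnable without extra libraries it parses a small, well-formed XHTML string with the JDK's own DocumentBuilder instead of Tidy; the class name AuthorNameExtractor and the helpers parse and authorNames are made up for illustration. It assumes the Document returned by tidy.parseDOM exposes the text as child text nodes in the same way, which I have not checked against Tidy itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AuthorNameExtractor {

    // Hypothetical helper: returns the trimmed text of every <span class="authorName">.
    static List<String> authorNames(Document d) {
        List<String> names = new ArrayList<>();
        NodeList spans = d.getElementsByTagName("span");
        for (int i = 0; i < spans.getLength(); i++) {
            Element span = (Element) spans.item(i);
            // getAttribute returns "" when the attribute is missing,
            // so spans without a class do not cause a NullPointerException.
            if (!"authorName".equals(span.getAttribute("class"))) {
                continue;
            }
            // The name is stored in the span's child text nodes, not in the attribute.
            StringBuilder text = new StringBuilder();
            for (Node c = span.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.TEXT_NODE) {
                    text.append(c.getNodeValue());
                }
            }
            names.add(text.toString().trim());
        }
        return names;
    }

    // Hypothetical helper: stands in for tidy.parseDOM(in, out) in this sketch.
    // It only accepts well-formed XHTML.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        Document d = parse(
            "<div><span class=\"authorName\">Paul Abbott </span><span>other</span></div>");
        System.out.println("the item is " + authorNames(d));
    }
}
```

In the original loop, the equivalent change would be to print the node value of the span's first child (after a null check) instead of tempAltText.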