  1. #1
    java_org_bsb

    Question related to htmlParser

    Hello, I have to use an HTML parser (htmlParser) to visit the nodes in the DOM tree that have an href attribute.
    In other words, I need to parse the HTML and gather all the links enclosed in quotation marks.

    For example: <a href=" url ">

    So I have the following code:

    Java Code:
    import org.htmlparser.Parser;
    import org.htmlparser.Tag;
    import org.htmlparser.util.ParserException;
    import org.htmlparser.visitors.NodeVisitor;

    public class MyVisitor extends NodeVisitor
    {
        public MyVisitor ()
        {
        }

        // Prints the name and start position of each tag passed to it.
        @Override
        public void visitTag (Tag tag)
        {
            System.out.println ("\n" + tag.getTagName () + tag.getStartPosition ());
        }

        public static void main (String[] args) throws ParserException
        {
            Parser parser = new Parser (" whatever, link");
            NodeVisitor visitor = new MyVisitor ();
            parser.visitAllNodesWith (visitor);

            // visitor.visitTag (...);
            // Here I cannot pass a String, only a Tag. I would like to visit
            // every Tag in the page and I'm stuck; I don't know how to do that!
        }
    }
    Hm... I just need to visit every child node and find the elements that have an href attribute.

    Could you please help me?
    Thanks in advance! :rolleyes:
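
For what it's worth, here is a minimal sketch of one way the visitor could collect the links. It assumes that parser.visitAllNodesWith() is what invokes visitTag() for each tag it encounters, so there is no need to call visitTag() by hand, and that Tag.getAttribute("href") returns null when the attribute is absent. The class name HrefCollector and the getLinks() method are made up for illustration, and args[0] is assumed to be a URL or a local file name.

```java
import java.util.ArrayList;
import java.util.List;

import org.htmlparser.Parser;
import org.htmlparser.Tag;
import org.htmlparser.util.ParserException;
import org.htmlparser.visitors.NodeVisitor;

// Hypothetical class name; gathers href values while the parser walks the page.
public class HrefCollector extends NodeVisitor
{
    private final List<String> links = new ArrayList<String> ();

    // Invoked by the parser for each start tag; not meant to be called directly.
    @Override
    public void visitTag (Tag tag)
    {
        String href = tag.getAttribute ("href");
        if (href != null)
        {
            // Trim so that <a href=" url "> yields "url".
            links.add (href.trim ());
        }
    }

    // Hypothetical accessor for the collected values.
    public List<String> getLinks ()
    {
        return links;
    }

    public static void main (String[] args) throws ParserException
    {
        // Assumption: args[0] is a URL or a file name the Parser can open.
        Parser parser = new Parser (args[0]);
        HrefCollector collector = new HrefCollector ();
        parser.visitAllNodesWith (collector);

        for (String link : collector.getLinks ())
        {
            System.out.println (link);
        }
    }
}
```

Because the check is on the attribute rather than on the tag name, this would also pick up href values on tags other than <a>, such as <link>; adding a test on tag.getTagName() would narrow it down if only anchors are wanted.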

  2. #2
    VeasMKII

    I've been doing something like this recently. I don't use an HTML parser, but maybe you could adapt my code somewhat:

    I used a BufferedReader to read the lines in one at a time, feeding each line to this code:

    Java Code:
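                          String slice = "<a href=";
                          int start = inputLine.indexOf(slice);
                          if (start >= 0) {
                              int beg = start + slice.length();
                              int end = inputLine.indexOf(">", beg);   // first '>' after the attribute
                              // Skip lines where the tag is not closed on the same line.
                              String sliced = (end > beg) ? inputLine.substring(beg, end) : null;
                          }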
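    It basically extracts the text between the two markers and stores it in a new string. Note that the result still includes the surrounding quotation marks and any other attributes that follow href in the tag.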
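
A variation on the same idea, in case a line contains more than one link or the href is not the first attribute: this sketch swaps the indexOf slicing for a regular expression from java.util.regex. The class name HrefSlicer and the method extractHrefs are made up for illustration, and the pattern only handles double-quoted values that sit on a single line, so it is not a substitute for a real parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; pulls double-quoted href values out of one line of HTML.
public class HrefSlicer
{
    // Matches <a ... href="value" and captures the text between the quotes.
    private static final Pattern HREF = Pattern.compile (
        "<a\\s[^>]*?href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    public static List<String> extractHrefs (String inputLine)
    {
        List<String> found = new ArrayList<String> ();
        Matcher m = HREF.matcher (inputLine);
        while (m.find ())
        {
            // Trim so that href=" url " yields "url".
            found.add (m.group (1).trim ());
        }
        return found;
    }

    public static void main (String[] args)
    {
        String inputLine =
            "<p><a href=\" url \">one</a> <a class=\"x\" href=\"other\">two</a></p>";
        System.out.println (extractHrefs (inputLine));   // prints [url, other]
    }
}
```

Each line read from the BufferedReader could be passed to extractHrefs in the same way; links whose tag is split across two lines would still be missed.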
    Last edited by VeasMKII; 02-01-2009 at 02:45 AM.
