
Thread: HTMLEditorKit

  1. #1
    makpandian is online now Senior Member
    Join Date
    Dec 2008
    Location
    Chennai
    Posts
    441
    Rep Power
    6

    HTMLEditorKit

    Hi all,
    I want to read the value of the "src" attribute from the IMG tags in an HTML file.
    I know that HTMLEditorKit is a class that helps with parsing HTML.
    I was able to get the HREF value of A tags using the program given on java.sun.com.

    I have attached that program here.

    How can I read attribute values from all the tags in an HTML file?


    I am waiting for your response.
    Mak
    (Living @ Virtual World)

  2. #2
    Eranga is offline Moderator
    Join Date
    Jul 2007
    Location
    Colombo, Sri Lanka
    Posts
    11,372
    Blog Entries
    1
    Rep Power
    20

  3. #3
    Webuser is offline Senior Member
    Join Date
    Dec 2008
    Posts
    526
    Rep Power
    0


    What do you mean? Do you want to analyse the HTML code with Java code, or something else?

  4. #4
    makpandian is online now Senior Member
    Join Date
    Dec 2008
    Location
    Chennai
    Posts
    441
    Rep Power
    6


    I want to analyze an HTML file with Java. I know that HTMLEditorKit is used for HTML parsing. I can extract the anchor (A) tags, but I am not able to extract the IMG tags.
    Please look at my code below and send your comments.
    import java.io.*;
    import java.net.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;

    class GetLinks {
        public static void main(String[] args) {
            EditorKit kit = new HTMLEditorKit();
            Document doc = kit.createDefaultDocument();

            // The Document class does not yet
            // handle charsets properly.
            doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
            try {

                // Create a reader on the HTML content.
                Reader rd = getReader(args[0]);

                // Parse the HTML.
                kit.read(rd, doc, 0);

                // Iterate through the elements
                // of the HTML document.
                ElementIterator it = new ElementIterator(doc);
                javax.swing.text.Element elem;
                while ((elem = it.next()) != null) {
                    // Anchors are stored as an attribute keyed by HTML.Tag.A.
                    MutableAttributeSet s = (MutableAttributeSet)
                            elem.getAttributes().getAttribute(HTML.Tag.A);
                    System.out.println(s);
                    if (s != null) {
                        System.out.println(s.getAttribute(HTML.Attribute.HREF));
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            // Exit with status 0; a non-zero status would signal failure.
            System.exit(0);
        }

        // Returns a reader on the HTML data. If 'uri' begins
        // with "http:", it's treated as a URL; otherwise,
        // it's assumed to be a local filename.
        static Reader getReader(String uri) throws IOException {
            if (uri.startsWith("http:")) {

                // Retrieve from Internet.
                URLConnection conn = new URL(uri).openConnection();
                return new InputStreamReader(conn.getInputStream());
            } else {

                // Retrieve from file.
                return new FileReader(uri);
            }
        }
    }


    Thanking You.
    Mak
    (Living @ Virtual World)
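One way to adapt the program above to IMG tags, offered as a minimal sketch rather than a definitive answer: in the document model built by HTMLEditorKit, an image appears to be stored as its own element whose name attribute is HTML.Tag.IMG, with "src" kept directly in that element's attribute set, unlike anchors, which are looked up through HTML.Tag.A. The class name ImgSrcExtractor, the method extractImageSources and the sample HTML are made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.StyleConstants;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, not part of the original program.
class ImgSrcExtractor {

    // Parses the HTML and returns the "src" value of every IMG element found.
    static List<String> extractImageSources(Reader html)
            throws IOException, BadLocationException {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        // Same workaround as in the original program: ignore charset directives.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(html, doc, 0);

        List<String> sources = new ArrayList<>();
        ElementIterator it = new ElementIterator(doc);
        Element elem;
        while ((elem = it.next()) != null) {
            AttributeSet attrs = elem.getAttributes();
            // The element's name attribute holds its HTML.Tag.
            if (attrs.getAttribute(StyleConstants.NameAttribute) == HTML.Tag.IMG) {
                Object src = attrs.getAttribute(HTML.Attribute.SRC);
                if (src != null) {
                    sources.add(src.toString());
                }
            }
        }
        return sources;
    }

    public static void main(String[] args) throws Exception {
        // Sample input, assumed for the demo only.
        String html = "<html><body><p><a href=\"index.html\">Home</a>"
                + "<img src=\"logo.png\" alt=\"logo\"></p>"
                + "<img src=\"photo.jpg\"></body></html>";
        for (String src : extractImageSources(new StringReader(html))) {
            System.out.println(src);
        }
    }
}
```

The same loop can be pointed at a file or URL by passing the Reader returned by getReader(args[0]) from the program above.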

  5. #5
    Webuser is offline Senior Member
    Join Date
    Dec 2008
    Posts
    526
    Rep Power
    0


    Why do you need to analyse the HTML code? Are you writing an applet that connects to a *.js script, or something else? Or do you just want to find the img value in a locally stored HTML file?

  6. #6
    makpandian is online now Senior Member
    Join Date
    Dec 2008
    Location
    Chennai
    Posts
    441
    Rep Power
    6


    I want to get the names of all the files referenced in an HTML file. The files may be HTML, JPG, JS, MP3 and so on. I actually want to download all of those files, and before doing that I want to extract all the file names from the HTML.
    That's why I am asking.
    Thanking You.
    Mak
    (Living @ Virtual World)
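For collecting every referenced file regardless of tag, a possible sketch is to use ParserDelegator with a callback that looks at the attributes of each tag and keeps any "href" or "src" value. It overrides both handleStartTag (tags such as A) and handleSimpleTag (empty tags such as IMG), and matches attributes by name so that it does not depend on how the parser keys them. The class name ReferencedFiles, the method find and the sample HTML are assumptions for illustration; resolving relative paths and downloading are left out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical callback that gathers href/src values from all tags.
class ReferencedFiles extends HTMLEditorKit.ParserCallback {

    // LinkedHashSet drops duplicates and keeps first-seen order.
    private final Set<String> references = new LinkedHashSet<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        collect(attrs);
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        collect(attrs);
    }

    // Keeps the value of any attribute whose name is "href" or "src".
    private void collect(MutableAttributeSet attrs) {
        Enumeration<?> names = attrs.getAttributeNames();
        while (names.hasMoreElements()) {
            Object name = names.nextElement();
            String key = name.toString();
            if (key.equalsIgnoreCase("href") || key.equalsIgnoreCase("src")) {
                Object value = attrs.getAttribute(name);
                if (value != null) {
                    references.add(value.toString());
                }
            }
        }
    }

    // Parses the HTML and returns every href/src value that was seen.
    static Set<String> find(Reader html) throws IOException {
        ReferencedFiles callback = new ReferencedFiles();
        new ParserDelegator().parse(html, callback, true);
        return callback.references;
    }

    public static void main(String[] args) throws IOException {
        // Sample input, assumed for the demo only.
        String html = "<html><head><link rel=\"stylesheet\" href=\"style.css\"></head>"
                + "<body><a href=\"page2.html\">next</a>"
                + "<a href=\"song.mp3\">song</a>"
                + "<img src=\"pic.jpg\"></body></html>";
        for (String ref : find(new StringReader(html))) {
            System.out.println(ref);
        }
    }
}
```

Other attributes that can point at files (for example "background") could be added to the name check in collect in the same way.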

  7. #7
    shanthini is offline Member
    Join Date
    Jan 2011
    Posts
    1
    Rep Power
    0


    private void AnchorExtractionActionPerformed(java.awt.event.ActionEvent evt) {
        // TODO add your handling code here:
        FileInputStream s1 = null;

        try {
            // URL webURL = new URL("http://www.google.com");
            // URLConnection conn = webURL.openConnection();

            s1 = new FileInputStream("page.html");
            DataInputStream n = new DataInputStream(s1);
            BufferedReader br = new BufferedReader(new InputStreamReader(n));
            // BufferedReader br = new BufferedReader(new InputStreamReader("page.html"));

            CallBack callback = new CallBack();
            ParserDelegator delegator = new ParserDelegator();
            // delegator.parse(br, cb, ignoreCharSet)
            delegator.parse(br, callback, true);
            System.out.println(callback.pageText.length());
            System.out.println(callback.pageText);

        } catch (IOException ex) {
            Logger.getLogger(title.class.getName()).log(Level.SEVERE, null, ex);
        }

    }

    class CallBack extends HTMLEditorKit.ParserCallback {

        Stack<HTML.Tag> stack = new Stack<>();

        public String pageText = "";

        @Override
        public void handleStartTag(HTML.Tag tag, MutableAttributeSet a, int pos) {
            // Get a tag and push it onto a stack
            String http;

            stack.push(tag);
            if (tag.toString().equals("a")) {

                String link = (String) a.getAttribute(HTML.Attribute.HREF);
                if (link != null && link.length() > 0) {

                    System.out.println(link);

                }
            }
        }
    }

    This is my code. I need to analyse the anchor tags of a web page. The code above displays all the anchor tags present in the HTML page, but I don't know how to extract them one by one for analysis. "link" holds the href of each "a" tag, but I need to count how many anchor tags there are. If I use a counter, it displays 0 along with all the extracted links. Can you suggest a way to count the number of links?
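A possible way to get the count, sketched under assumptions since the counter code itself is not shown in the post: keep the links in a field of the callback (not in a local variable inside handleStartTag, which would start from scratch on every call) and read the size only after parse() has returned. The class name LinkCounter, the method parseLinks and the sample HTML are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical callback that remembers every href it sees on an A tag.
class LinkCounter extends HTMLEditorKit.ParserCallback {

    // A field, so the collected links survive across callback invocations.
    private final List<String> links = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (tag == HTML.Tag.A) {
            Object href = attrs.getAttribute(HTML.Attribute.HREF);
            if (href != null && !href.toString().isEmpty()) {
                links.add(href.toString());
            }
        }
    }

    // Parses the HTML and returns the collected links, one entry per anchor.
    static List<String> parseLinks(Reader html) throws IOException {
        LinkCounter callback = new LinkCounter();
        new ParserDelegator().parse(html, callback, true);
        return callback.links;
    }

    public static void main(String[] args) throws IOException {
        // Sample input, assumed for the demo only.
        String html = "<html><body><a name=\"top\"></a>"
                + "<a href=\"one.html\">1</a><a href=\"two.html\">2</a></body></html>";
        List<String> links = parseLinks(new StringReader(html));
        // Each link can now be analysed individually.
        for (String link : links) {
            System.out.println(link);
        }
        System.out.println("Number of links: " + links.size());
    }
}
```

Because the links end up in a list, they can be processed one at a time after parsing, and the list size serves as the counter.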
