  1. #1
    Nerd is offline Member
    Join Date
    Nov 2010
    Posts
    5
    Rep Power
    0

    [HELP] Java with URLs/HTML

    So, I'm pretty new to Java. I was wondering if it would be possible for Java to open a webpage, and assign text found between HTML tags to a variable.
    For example, I want Java to open Google.
    Then it will check the source code for "<title>___</title>"
    Now it should assign "Google" to x.

    Help would be appreciated

  2. #2
    eRaaaa is offline Senior Member
    Join Date
    Oct 2010
    Location
    Germany
    Posts
    787
    Rep Power
    5

    Java Code:
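    	public static void main(String[] args) throws Exception {
    		final URL url = new URL("http://www.java-forums.org/java-applets/35032-help-java-urls-html.html");
    		Scanner sc = new Scanner(url.openStream());
    		// A horizon of 0 searches the whole input; the call returns null if nothing matches.
    		sc.findWithinHorizon("<title>(.+?)</title>", 0);
    		// group(1) is the text between the tags; match() throws IllegalStateException
    		// if the search above found nothing.
    		System.out.println(sc.match().group(1));
    		sc.close();
    	}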
    Or use a Java HTML parser :)

  3. #3
    Nerd is offline Member

    Thank you. But I still get some errors. What imports do I need for this?

    Here's the error:
    each red word has the same error.
    IDK about you, they look declared to me xD
    Last edited by Nerd; 11-20-2010 at 08:51 PM.

  4. #4
    eRaaaa is offline Senior Member

    Java Code:
    import java.net.URL;
    import java.util.Scanner;
    Scanner (Java Platform SE 6)
    URL (Java Platform SE 6)

  5. #5
    Nerd is offline Member

    Thank you SO much!! It works!!

  6. #6
    Nerd is offline Member

    Oops, sorry. I thought this forum had auto-merge.

    Hmm, it isn't fully working for me. :[
    I'm trying to create a hiscore scanner that prints usernames. But it doesn't seem to check the HTML correctly. I get an exception:
    Java Code:
    Exception in thread "main" java.lang.IllegalStateException: No match result available
            at java.util.Scanner.match(Scanner.java:1269)
            at hiscoresbot.HiscoresBot2.main(HiscoresBot2.java:12)
    Java Result: 1
    I'm using this website: RuneScape - MMORPG - The No.1 Free Online Multiplayer Game

    Check the HTML: on line 16 you can clearly see <title>___</title>. (Yes, I'm using the title for now, before I do usernames.)

    EDIT: Oh, is it because the URL doesn't end in .html?
    Last edited by Nerd; 11-20-2010 at 09:14 PM.
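
The exception above is what `Scanner.match()` throws when the preceding `findWithinHorizon` call returned `null`, i.e. when the pattern did not match anything in the input. Why it failed on that particular page cannot be determined from the thread. One possible cause (an assumption, not confirmed here) is that the title is written as `<TITLE>` or spans several lines, which `<title>(.+?)</title>` would miss. A minimal sketch of a guarded version follows; the class name `SafeTitleScan`, the method `findTitle`, and the relaxed pattern are illustrative, not from the original posts:

```java
import java.util.Scanner;
import java.util.regex.Pattern;

class SafeTitleScan {
    // Relaxed pattern (an assumption for illustration): ignores case, tolerates
    // attributes on the tag, and lets the title text span multiple lines.
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed title text, or null when no title is found. */
    static String findTitle(Scanner sc) {
        // A horizon of 0 means "search the whole input".
        if (sc.findWithinHorizon(TITLE, 0) == null) {
            return null; // nothing matched, so match() must not be called
        }
        return sc.match().group(1).trim();
    }

    public static void main(String[] args) {
        // In-memory inputs stand in for url.openStream() so the sketch runs offline.
        String withTitle = "<html><head>\n<TITLE>\n  Example page\n</TITLE></head></html>";
        String withoutTitle = "<html><body>nothing here</body></html>";
        System.out.println(findTitle(new Scanner(withTitle)));
        System.out.println(findTitle(new Scanner(withoutTitle)));
    }
}
```

To run it against a live page, pass `new Scanner(url.openStream())` to `findTitle` instead of the in-memory strings.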

  7. #7
    eRaaaa is offline Senior Member

    ?? The code above, with the URL "http://services.runescape.com/m=hiscore/hiscores.ws", prints "RuneScape - MMORPG - The No.1 Free Online Multiplayer Game".

    But if you want to get usernames or other stuff, you should use a real HTML parser library like htmlparser.sourceforge.net or Jericho or ...

  8. #8
    Nerd is offline Member

    It doesn't for me :[
    But if you think so, can you give me an example code? Link me or something?
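
As a rough illustration of the parser-based approach suggested above, here is a minimal sketch using the HTML parser that ships with the JDK (`javax.swing.text.html`) rather than the third-party libraries named in the thread, so it needs no extra download. This is a swapped-in technique, not what eRaaaa proposed, and the class name `TitleParserSketch` and method `extractTitle` are made up for the example. It reads the title only; extracting usernames would depend on the page's actual markup, which the thread does not show.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class TitleParserSketch {

    /** Returns the text of the first title element, or null if there is none. */
    static String extractTitle(String html) throws IOException {
        final StringBuilder title = new StringBuilder();
        final boolean[] found = {false};

        // The parser reports tags and text through callbacks as it reads the input.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private boolean inTitle = false;

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TITLE && !found[0]) {
                    inTitle = true;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (inTitle) {
                    title.append(data); // collect text only while inside the title
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.TITLE && inTitle) {
                    inTitle = false;
                    found[0] = true;
                }
            }
        };

        // The third argument tells the parser to ignore charset declarations in the document.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return found[0] ? title.toString().trim() : null;
    }

    public static void main(String[] args) throws IOException {
        // In-memory HTML keeps the sketch runnable without a network connection.
        String html = "<html><head><title>Google</title></head>"
                + "<body><p>Hello</p></body></html>";
        System.out.println(extractTitle(html));
    }
}
```

For a live page, the `StringReader` could be replaced with `new InputStreamReader(url.openStream())`. The same callback pattern extends to other tags by checking for a different `HTML.Tag` constant.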
