  1. #1
    anthropamorphic (Senior Member, joined Jun 2011)

    Retrieve Text From A Webpage

    Hello, I would like to know how I could retrieve text from a webpage without getting all the code in it. I tried this:
    Java Code:
    // Needs java.io.*, java.net.* and javax.swing.*; url and pan are assumed to be declared elsewhere
    try {
        url = new URL("");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String str;
        pan = new JEditorPane();
        JScrollPane p = new JScrollPane(pan);
        // Append every line of the raw response (markup included) to the editor pane
        while ((str = in.readLine()) != null) {
            pan.setText(pan.getText() + "\n" + str);
        }
        // Close the reader once the whole page has been read
        in.close();
    } catch (MalformedURLException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    but of course it ends up giving me everything:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "">
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <meta http-equiv="Content-Style-Type" content="text/css">
    <meta name="Generator" content="Cocoa HTML Writer">
    <meta name="CocoaVersion" content="1138">
    <style type="text/css">
    p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Times}
    <p class="p1">@1@</p>
    <p class="p1">@Nothing New Has Been Added@</p>

    <!-- Free Web Hosting with PHP, MySQL and cPanel, No Ads Analytics Code -->
    <script type="text/javascript" src=""></script>
    <noscript><a href=""><img src="" alt="web hosting" /></a></noscript>
    <!-- End Of Analytics Code -->
    What I would like to get out of this is just the "@1@" and the "@Nothing New Has Been Added@".
    Last edited by anthropamorphic; 10-13-2011 at 02:02 AM.

  2. #2
    coiner (Member, joined Jan 2010)

    Re: Retrieve Text From A Webpage

    The JDK has built-in classes to parse XML. Take time to look at the DOM manipulation examples that use the classes in javax.xml, and you will easily be able to parse the data you want out of the page you are downloading.
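
    As a rough illustration of the DOM approach, here is a minimal sketch that collects the text of every <p> element. The class name ParagraphText, the method paragraphText and the sample page string are made up for this example. It also assumes the input is well-formed XML: the HTML 4.01 page quoted in post #1 has unclosed <meta> tags, so a strict XML parser would reject it as-is, and a real page would probably need tidying (or a lenient HTML parser) first.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not taken from the original posts.
class ParagraphText {

    // Returns the trimmed text of every <p> element in a well-formed XML/XHTML string.
    static List<String> paragraphText(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            NodeList paragraphs = doc.getElementsByTagName("p");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < paragraphs.getLength(); i++) {
                // getTextContent() returns the text of the element without its tags
                result.add(paragraphs.item(i).getTextContent().trim());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // A strict XML parser fails on markup that is not well-formed
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample: a well-formed stand-in for the page from post #1
        String page = "<html><head><title>demo</title></head><body>"
                + "<p class=\"p1\">@1@</p>"
                + "<p class=\"p1\">@Nothing New Has Been Added@</p>"
                + "</body></html>";

        for (String text : paragraphText(page)) {
            System.out.println(text);
        }
    }
}
```

    Run on the sample string, this prints @1@ and @Nothing New Has Been Added@ on separate lines. To use it with the code from post #1, the lines read from the BufferedReader would be collected into one String and passed to paragraphText, provided the page really is well-formed.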
