  1. #1
    africanhacker is offline Senior Member
    Join Date
    Feb 2011
    Posts
    107
    Rep Power
    0

    Parse HTML, regex help

    I am working on a project where I need to extract specific content from an HTML page template on a newspaper site.

    I need to get the heading and the body of the article.

    Java Code:
    <html>
    
    Ignore everything until you meet this
    
    <h1 id="article_headline">Spoilt ballot papers  spark controversy</h1>
    
    Strip the h1 tag and keep only: Spoilt ballot papers  spark controversy
    
    That text is then passed to a String variable.
    
    Then move on ignoring everything until we meet this:
    
    <span class="article_body">Whole article is here, but there is a catch here. Wait for it</span>
    
    We remove the tags, take everything in the middle, and pass it to a variable
    
    </html>
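
The steps above can be sketched with `java.util.regex`. This is only a minimal sketch, not a definitive solution: the class name `ArticleExtractor` and its method names are made up for illustration, and it assumes the attributes are written exactly as in the sample (`id="article_headline"`, `class="article_body"`) and that the body contains no nested `<span>`, since the reluctant match stops at the first closing tag. For markup less regular than this, a real HTML parser such as jsoup is usually the more robust choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative, not from the original post.
class ArticleExtractor {
    // DOTALL lets '.' match line breaks; '.*?' stops at the first closing tag.
    private static final Pattern HEADLINE = Pattern.compile(
            "<h1\\s+id=\"article_headline\"[^>]*>(.*?)</h1>", Pattern.DOTALL);
    // Assumption: no nested <span> inside the article body.
    private static final Pattern BODY = Pattern.compile(
            "<span\\s+class=\"article_body\"[^>]*>(.*?)</span>", Pattern.DOTALL);

    // Returns the text between the headline tags, or null if not found.
    static String extractHeadline(String html) {
        return firstGroup(HEADLINE, html);
    }

    // Returns the raw content between the body tags, or null if not found.
    static String extractBody(String html) {
        return firstGroup(BODY, html);
    }

    private static String firstGroup(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }
}
```

Calling `ArticleExtractor.extractHeadline(html)` and `ArticleExtractor.extractBody(html)` on the page source gives the two String variables described above; everything before and between the two tags is simply skipped by `find()`.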

    The catch is that the main article contains an advert block, along with some accompanying text, that I want to take out:

    Java Code:
    <div class='articlecontinues'><img src='/images/icon_downarrow.gif' /> CONTINUES BELOW 
    <img src='/images/icon_downarrow.gif' /></div><center><div style='width:300px; height:250px;'>
    <iframe marginwidth='0' marginheight='0' scrolling='no' frameborder='0' width='300' height='250'
     src='/adframe/3/49/frame/843474552'/></iframe></div> </center>
    Can someone please help? I know this is a lot to ask, but I really need the help.
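
One possible way to drop the advert block from the extracted body is a second regex pass. This is a rough sketch under stated assumptions: the class name `AdStripper` is hypothetical, and it assumes every advert starts with the `articlecontinues` div and ends at the first `</center>` that follows, as in the snippet above. If the site changes that markup, the pattern will need adjusting.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the name is illustrative, not from the original post.
class AdStripper {
    // Assumption: the advert runs from the 'articlecontinues' div to the
    // first following </center>. DOTALL lets the match span line breaks.
    private static final Pattern AD_BLOCK = Pattern.compile(
            "<div\\s+class=['\"]articlecontinues['\"].*?</center>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Replaces each advert block with a space, then collapses runs of
    // whitespace so the surrounding text joins up cleanly.
    static String stripAds(String body) {
        String withoutAds = AD_BLOCK.matcher(body).replaceAll(" ");
        return withoutAds.replaceAll("\\s+", " ").trim();
    }
}
```

Running `AdStripper.stripAds(body)` on the article body before storing it leaves text without an advert unchanged apart from the whitespace normalisation.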
    Last edited by africanhacker; 04-01-2011 at 12:12 PM.

  2. #2
    doWhile is offline Moderator
    Join Date
    Jul 2010
    Location
    California
    Posts
    1,638
    Rep Power
    13
