  1. #1
    africanhacker

    Parse HTML, regex help

    I am working on a project where I need to get specific content from an HTML page template on a newspaper site.

    I need to get the heading and the body of the article.

    Java Code:
    <html>
    
    Ignore everything until you meet this
    
    <h1 id="article_headline">Spoilt ballot papers  spark controversy</h1>
    
    Operate on that h1 tag and keep only its text: Spoilt ballot papers  spark controversy
    
    That is then passed to a String variable.
    
    Then move on ignoring everything until we meet this:
    
    <span class="article_body">Whole article is here, but there is a catch here. Wait for it</span>
    
    We remove the tags, take everything in the middle and pass it to a variable
    
    </html>
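
The steps above could be sketched with `java.util.regex` roughly as follows. This is a minimal sketch, not a definitive solution: the class name `ArticleExtractor` and its method names are made up for illustration, and it assumes the page contains exactly one `article_headline` h1 and that the `article_body` span has no nested `<span>` inside it (a reluctant `.*?` stops at the first `</span>`). For markup less regular than this, a real HTML parser is usually the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative, not from any library.
class ArticleExtractor {
    // Captures the text between <h1 id="article_headline"> and </h1>.
    private static final Pattern HEADLINE = Pattern.compile(
            "<h1\\s+id=[\"']article_headline[\"'][^>]*>(.*?)</h1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Captures everything between <span class="article_body"> and the
    // first </span> that follows (assumes no nested spans).
    private static final Pattern BODY = Pattern.compile(
            "<span\\s+class=[\"']article_body[\"'][^>]*>(.*?)</span>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the first capture group of the first match, or null if none.
    private static String firstGroup(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    static String headline(String html) {
        return firstGroup(HEADLINE, html);
    }

    static String body(String html) {
        return firstGroup(BODY, html);
    }

    public static void main(String[] args) {
        String html = "<html><p>ignored</p>"
                + "<h1 id=\"article_headline\">Spoilt ballot papers  spark controversy</h1>"
                + "<p>ignored</p>"
                + "<span class=\"article_body\">Whole article is here</span></html>";
        System.out.println(headline(html));
        System.out.println(body(html));
    }
}
```

`Pattern.DOTALL` lets `.` match line breaks, which matters because an article body will normally span many lines.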

    The catch is that the main article contains some text, shown together with an advert, that I want to take out:

    Java Code:
    <div class='articlecontinues'><img src='/images/icon_downarrow.gif' /> CONTINUES BELOW 
    <img src='/images/icon_downarrow.gif' /></div><center><div style='width:300px; height:250px;'>
    <iframe marginwidth='0' marginheight='0' scrolling='no' frameborder='0' width='300' height='250'
     src='/adframe/3/49/frame/843474552'/></iframe></div> </center>
    Can someone please help? I know this is a lot to ask, but I really need the help.
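
One possible way to drop the advert block quoted above before stripping the remaining tags is sketched here. It is only a sketch under assumptions: the class name `AdStripper` and its methods are hypothetical, and it assumes the advert always starts at `<div class='articlecontinues'` and ends at the first `</center>` after it, as in the snippet above. If the site changes that markup, the pattern will need adjusting.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative, not from any library.
class AdStripper {
    // Matches from the "articlecontinues" div up to the first </center>.
    private static final Pattern AD_BLOCK = Pattern.compile(
            "<div\\s+class=['\"]articlecontinues['\"].*?</center>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches any single remaining tag such as <p> or <br />.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]+>");

    // Removes the advert block, leaving a space where it was.
    static String stripAd(String body) {
        return AD_BLOCK.matcher(body).replaceAll(" ");
    }

    // Removes the advert, then all other tags, then collapses whitespace.
    static String toPlainText(String body) {
        String withoutAd = stripAd(body);
        String withoutTags = ANY_TAG.matcher(withoutAd).replaceAll(" ");
        return withoutTags.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String body = "<p>First half.</p>"
                + "<div class='articlecontinues'><img src='/images/icon_downarrow.gif' /> CONTINUES BELOW \n"
                + "<img src='/images/icon_downarrow.gif' /></div><center><div style='width:300px; height:250px;'>\n"
                + "<iframe marginwidth='0' marginheight='0' scrolling='no' frameborder='0' width='300' height='250'\n"
                + " src='/adframe/3/49/frame/843474552'/></iframe></div> </center>"
                + "<p>Second half.</p>";
        System.out.println(toPlainText(body)); // prints: First half. Second half.
    }
}
```

The advert is removed first on purpose: if the tags were stripped first, the words "CONTINUES BELOW" would be left behind in the article text.
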
    Last edited by africanhacker; 04-01-2011 at 01:12 PM.

  2. #2
    doWhile
