  1. #1
    jeffsak

    Manipulating Data from html file and converting to .txt

    I am trying to write a program that takes the source code of an HTML file and manipulates the data. So far, I have been able to strip all of the HTML and write the result to a .txt file, which is great, but I am struggling with the next part. I am trying to model a class for an entertainment review (the website I am taking the data from is a blog of movie and play reviews). I want the new .txt file to be sorted by the type of entertainment (i.e. film, play, etc.), and to show the cost of each outing and whether it was recommended or not. A sample of the source is below. I really don't know where to start, or how to organize the data. I think I am supposed to write a constructor that takes each line of data, but I could be wrong.


    <p>Tonight I saw <em class="film">You Will Meet a Tall Dark Stranger,</em> a film by Woody Allen but without him anywhere in it. I'd say it's okay to see once, but not critical.... lots of intertwining relationship stories with endings you can never anticipate.
    </p><p>
    Today I saw <em class="play">Vanya and Sonia and Masha and Spike</em> at the Santa Paula Theater Center. <!-- $18. --> This is a funny play, definitely worth seeing, and the performances were good. There's one scene near the end where a woman receives a phone call from "Joe." She goes through all kinds of emotions and really, really pulls it off.
    </p><p>
    Tonight I saw <em class="film">Ant-Man</em> at the Mall. <!-- $8 --> Chris is going to watch this again, catching the very next showing (with Bob). I don't think it's worth seeing a second time.
    </p><p>
    Tonight I saw <em class="film">The Imitation Game</em> at Village Twin. <!-- $8 -->
    </p><p>

  2. #2
    Tolls

    Re: Manipulating Data from html file and converting to .txt

    So each <p></p> pair is one review?
    If not, then you need to figure out how to separate the individual reviews.
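
    If the one-paragraph-per-review guess holds, the separation step could look roughly like the sketch below. It is only an illustration working on the raw HTML string with a regular expression; the class name ReviewSplitter and the method split are made up for this example, and it assumes the markup is as regular as the sample posted above.

```java
import java.util.ArrayList;
import java.util.List;

class ReviewSplitter {
    // Splits raw HTML into one chunk per paragraph.
    // Assumption: each <p> ... </p> block holds exactly one review.
    static List<String> split(String html) {
        List<String> reviews = new ArrayList<>();
        // Break on any opening or closing <p> tag, ignoring case.
        for (String chunk : html.split("(?i)</?p\\b[^>]*>")) {
            String trimmed = chunk.trim();
            if (!trimmed.isEmpty()) {
                reviews.add(trimmed); // skip the empty pieces between "</p>" and "<p>"
            }
        }
        return reviews;
    }

    public static void main(String[] args) {
        String html = "<p>Tonight I saw <em class=\"film\">Ant-Man</em> at the Mall. <!-- $8 -->\n"
                + "</p><p>\n"
                + "Tonight I saw <em class=\"film\">The Imitation Game</em> at Village Twin. <!-- $8 -->\n"
                + "</p><p>\n";
        for (String review : split(html)) {
            System.out.println("REVIEW: " + review);
        }
    }
}
```

    Note that this has to run before the tags are stripped, since the <p> boundaries are the only thing separating one review from the next.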

    What did you use to break down the HTML?
    I would have expected that using JSoup or a similar parser would make it a little easier to search through the markup.

    I can see things that would tell you what is being reviewed (and what type it is) in those <em> tags, and the price in the comment tags, but there's nothing there to say whether it is a good or bad review.
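
    To make that concrete, here is a minimal sketch of a review class that pulls the type and title out of the <em> tag and the price out of the comment, then sorts by type. It uses java.util.regex rather than JSoup only so that it runs without an extra library. The class name Review, its fields, and the parse method are all invented for illustration, and it assumes every review looks like the sample above. There is deliberately no "recommended" field, because the markup carries nothing to fill it from.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Review {
    // <em class="film">Title</em>: group 1 is the type, group 2 is the title.
    private static final Pattern EM =
            Pattern.compile("<em\\s+class=\"([^\"]+)\">(.*?)</em>", Pattern.DOTALL);
    // <!-- $18. -->: group 1 is the amount; a trailing period is ignored.
    private static final Pattern PRICE =
            Pattern.compile("<!--\\s*\\$(\\d+(?:\\.\\d+)?)");

    private final String type;
    private final String title;
    private final Double cost; // null when the review has no price comment

    Review(String type, String title, Double cost) {
        this.type = type;
        this.title = title;
        this.cost = cost;
    }

    String getType() { return type; }
    String getTitle() { return title; }
    Double getCost() { return cost; }

    // Builds a Review from one chunk of raw HTML, or returns null if
    // the chunk has no <em class="..."> tag.
    static Review parse(String chunk) {
        Matcher em = EM.matcher(chunk);
        if (!em.find()) {
            return null;
        }
        // Some titles carry a trailing comma inside the tag ("Stranger,").
        String title = em.group(2).trim().replaceAll(",$", "");
        Matcher price = PRICE.matcher(chunk);
        Double cost = price.find() ? Double.valueOf(price.group(1)) : null;
        return new Review(em.group(1), title, cost);
    }

    @Override
    public String toString() {
        String shownCost = (cost == null)
                ? "cost unknown"
                : String.format(Locale.ROOT, "$%.2f", cost);
        return type + "\t" + title + "\t" + shownCost;
    }

    public static void main(String[] args) {
        String[] chunks = {
            "Today I saw <em class=\"play\">Vanya and Sonia and Masha and Spike</em> at the Santa Paula Theater Center. <!-- $18. -->",
            "Tonight I saw <em class=\"film\">You Will Meet a Tall Dark Stranger,</em> a film by Woody Allen",
            "Tonight I saw <em class=\"film\">Ant-Man</em> at the Mall. <!-- $8 -->"
        };
        List<Review> reviews = new ArrayList<>();
        for (String chunk : chunks) {
            Review review = parse(chunk);
            if (review != null) {
                reviews.add(review);
            }
        }
        // Sort by type first (film before play), then by title within a type.
        reviews.sort(Comparator.comparing(Review::getType).thenComparing(Review::getTitle));
        for (Review review : reviews) {
            System.out.println(review);
        }
    }
}
```

    Writing the sorted list to the .txt file is then just a matter of printing each review to a file writer instead of System.out.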

