I want to write a program in Java which gets information from a standard HTML file and generates an XML file as output.
In detail: it takes an IMDB movie identifier as input (for example tt0361862), downloads the HTML page (http://www.imdb.com/title/tt0361862), extracts information such as the director and the title, and writes it to an XML file.
I think this can be done because the HTML structure is always the same.
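To make the three steps concrete (download, extract, write XML), here is a rough sketch using only the standard library. It is not a definitive implementation: the class name `MovieXmlSketch`, the method names, the XML element names and especially the two regular expressions are assumptions made for illustration. The real IMDB markup is not shown here, and the patterns simply assume the title sits in `<title>...</title>` and the director in a hypothetical `<span class="director">...</span>`. The download step assumes Java 11 or later for `java.net.http.HttpClient`.

```java
import java.io.StringWriter;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class MovieXmlSketch {

    // Assumption: the page title is the text inside <title>...</title>.
    private static final Pattern TITLE = Pattern.compile(
            "<title>\\s*(.*?)\\s*</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Assumption: hypothetical markup; the real page structure may differ.
    private static final Pattern DIRECTOR = Pattern.compile(
            "<span class=\"director\">\\s*(.*?)\\s*</span>", Pattern.DOTALL);

    // Step 1: download the HTML for a movie identifier (requires Java 11+).
    static String download(String movieId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://www.imdb.com/title/" + movieId)).build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    // Step 2: pull single fields out of the HTML; returns "" when nothing matches.
    static String extractTitle(String html) {
        return firstGroup(TITLE, html);
    }

    static String extractDirector(String html) {
        return firstGroup(DIRECTOR, html);
    }

    private static String firstGroup(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? m.group(1) : "";
    }

    // Step 3: build a small DOM tree and serialize it to an XML string.
    static String toXml(String movieId, String title, String director) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element movie = doc.createElement("movie");
        movie.setAttribute("id", movieId);
        doc.appendChild(movie);

        Element titleEl = doc.createElement("title");
        titleEl.setTextContent(title);       // special characters are escaped on output
        movie.appendChild(titleEl);

        Element directorEl = doc.createElement("director");
        directorEl.setTextContent(director);
        movie.appendChild(directorEl);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String id = args.length > 0 ? args[0] : "tt0361862";
        String html = download(id);
        System.out.println(toXml(id, extractTitle(html), extractDirector(html)));
    }
}
```

Regular expressions are fragile against real-world HTML, so a proper HTML parser would probably be the more robust choice for the extraction step; the sketch only shows how the pieces could fit together.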
Can you give me any tips on which classes/APIs I should use?
Thanks in advance.