Thread: html web page parsing/scraping
- 05-02-2007, 03:32 AM #1
html web page parsing/scraping
Hi, I am trying to automate some routine web browsing. I need to log in, enter information, and so on. The tricky part is that at some point, after I submit information from a page, the links returned are not known in advance: the results are not always the same, in either number or naming. I need a way to access the returned links, read their text, and then follow specific links from there. Some screen scrapers out there come very close to doing what I want, with the exception of that last part. Is there any Java API that handles this kind of thing? I've tried HttpUnit and something very similar (I forget the name), but they didn't work, I think because of issues with JavaScript. I'm looking for a language or, preferably, a Java API geared specifically toward this kind of task. If anyone has any insight, I would greatly appreciate it. Thanks.
- 05-02-2007, 03:35 AM #2
For any given page, you should use an HTML parser to parse the document and process it in whatever way you see fit. This lets you retrieve all of the links, among other things. Apache also has some really nice libraries in the HttpComponents subproject.
HTML Parser
HttpComponents Overview
Also, if you decide not to use Java for the task, I would suggest Python.
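To make the parser suggestion concrete, here is a minimal sketch of pulling every link and its text out of a page. It is only an illustration: it uses the HTML parser that ships with the JDK (javax.swing.text.html) rather than the HTML Parser library linked above, and the names LinkExtractor, Link, extract and findByText are made up for this example. It does not run JavaScript, so it only sees links that are present in the HTML the server returns.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of any library mentioned in this thread.
class LinkExtractor {

    // One anchor found in the page: its href attribute and its visible text.
    static final class Link {
        final String href;
        final String text;

        Link(String href, String text) {
            this.href = href;
            this.text = text;
        }
    }

    // Parses the given HTML and returns every <a href="..."> in document order.
    static List<Link> extract(String html) {
        final List<Link> links = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private String currentHref;        // non-null while inside an <a href>
            private StringBuilder currentText; // text collected for that anchor

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        currentHref = href.toString();
                        currentText = new StringBuilder();
                    }
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (currentHref != null) {
                    currentText.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.A && currentHref != null) {
                    links.add(new Link(currentHref, currentText.toString().trim()));
                    currentHref = null;
                }
            }
        };

        try {
            // The third argument tells the parser to ignore charset declarations.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return links;
    }

    // Returns the first link whose text contains the fragment, or null if none does.
    static Link findByText(List<Link> links, String fragment) {
        for (Link link : links) {
            if (link.text.contains(fragment)) {
                return link;
            }
        }
        return null;
    }
}
```

Once you have the response body as a String, LinkExtractor.extract(body) gives you the full list no matter how many links came back, and findByText lets you pick the one to follow by its text. Fetching the page, keeping the login session, and resolving relative hrefs would still be up to your HTTP client.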
- 10-21-2010, 01:31 PM #3
Hello orchid, I'm francojava1. I suggest you take a look at this sample: the HTML Scraper recipe in the Python recipes on ActiveState Code. Please tell me which part of that code you want to implement. I could also build a Java API based on this parser.
Thanks.
-
Perhaps we shouldn't wake up long-dead posts. Locking.