05-12-2010, 11:11 AM
Parsing Real World HTML with XPath support
Dear All:
I am using TagSoup + XOM, following the approach described in:
BadMagicNumber » Using XPath on real-world HTML documents
This seems to work well, except for the namespace problem described in:
Dom4j + XPath + TagSoup – Namespaces = sweet! :: Kelvin Tan - Lucene Solr Nutch Consultant
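To make the namespace problem concrete, here is a minimal sketch that uses only the JDK's built-in DOM and XPath APIs rather than TagSoup or XOM. It assumes the cleaned-up document places every element in the XHTML namespace, which as far as I know is what TagSoup does by default. The class name `XhtmlXPathDemo`, the prefix `h`, and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlXPathDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Stand-in for cleaned-up parser output (assumption): all elements
    // live in the XHTML namespace via the default xmlns declaration.
    static final String SAMPLE =
        "<html xmlns=\"" + XHTML_NS + "\"><body>"
        + "<a href=\"/one\">one</a><a href=\"/two\">two</a>"
        + "</body></html>";

    // Binds the illustrative prefix "h" to the XHTML namespace.
    static final NamespaceContext XHTML_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "h".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return XHTML_NS.equals(uri) ? "h" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return (prefix == null
                    ? Collections.<String>emptyList()
                    : Collections.singletonList(prefix)).iterator();
        }
    };

    // Parses xml namespace-aware and counts the nodes matched by expr.
    static int countMatches(String xml, String expr, boolean bindPrefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            if (bindPrefix) {
                xpath.setNamespaceContext(XHTML_CONTEXT);
            }
            NodeList nodes =
                (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names only match elements in no namespace.
        System.out.println(countMatches(SAMPLE, "//a", false));
        // A prefix bound to the XHTML namespace finds the links.
        System.out.println(countMatches(SAMPLE, "//h:a", true));
        // local-name() ignores the namespace, at the cost of verbosity.
        System.out.println(countMatches(SAMPLE, "//*[local-name()='a']", false));
    }
}
```

The plain `//a` query finds nothing because XPath 1.0 treats unprefixed names as belonging to no namespace. The title of the second link suggests the other way out: turn namespace reporting off in the parser so that plain `//a` works.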
Other parsers also seem to be available:
Open Source HTML Parsers in Java
and some of them support XPath.
Does anyone know which is fastest for real-world HTML?
Is XOM the best way to go, or would Dom4j or another library be a better choice?
Thank you
Misha