Jericho HTML Parser is a powerful Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML. It also provides high-level HTML form manipulation functions.
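The "reproducing verbatim" behaviour described above can be pictured as position-based editing: changes are recorded against character ranges of the original text, and everything outside those ranges is copied through untouched, however invalid it is. The following is a minimal sketch of that idea using only the Java standard library. It is not the Jericho API; the class VerbatimEditSketch and its methods replaceFirst and render are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only: this is NOT the Jericho API.
// It shows how edits keyed to source positions leave all other text verbatim.
final class VerbatimEditSketch {

    // A pending replacement of source[begin, end) with new text.
    private static final class Edit {
        final int begin;
        final int end;
        final String replacement;

        Edit(int begin, int end, String replacement) {
            this.begin = begin;
            this.end = end;
            this.replacement = replacement;
        }
    }

    private final String source;
    private final List<Edit> edits = new ArrayList<>();

    VerbatimEditSketch(String source) {
        this.source = source;
    }

    // Registers a replacement for the first occurrence of target.
    // Returns false (and records nothing) if target does not occur.
    boolean replaceFirst(String target, String replacement) {
        int begin = source.indexOf(target);
        if (begin < 0) {
            return false;
        }
        edits.add(new Edit(begin, begin + target.length(), replacement));
        return true;
    }

    // Builds the output: untouched spans are copied character for character,
    // edited spans are swapped for their replacement text.
    String render() {
        List<Edit> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt(e -> e.begin));
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Edit e : sorted) {
            if (e.begin < pos) {
                throw new IllegalStateException("overlapping edits");
            }
            out.append(source, pos, e.begin);
            out.append(e.replacement);
            pos = e.end;
        }
        out.append(source, pos, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The server tag and the malformed <bogus attr=> tag pass through unchanged.
        VerbatimEditSketch doc =
                new VerbatimEditSketch("<p>Hello <% server.tag() %><bogus attr=>world</p>");
        doc.replaceFirst("Hello", "Goodbye");
        System.out.println(doc.render());
    }
}
```

Because the sketch never re-serialises a parsed tree, the server tag and the malformed tag in the sample input come out exactly as they went in. The real library locates segments by parsing tags rather than by plain string search.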

Changes

This version includes important bug fixes and the following enhancements. Non-server tags are no longer recognized inside server tags. Microsoft downlevel-revealed conditional comments are now recognized. All unnecessary white space can now be removed from a source document. Various other enhancements were made to existing features.
