SAX for Survival
04-27-2012 at 05:14 PM
Compared to the DOM API, the SAX API is an attractive approach. SAX doesn't build a generic object model, so it doesn't have the memory or performance problems associated with overusing the new operator to allocate a node object for every piece of the document. And with SAX, there is no generic object model to ignore if you plan to use a specific problem-domain object model instead. Moreover, since SAX processes the XML document in a single pass, it requires much less processing time.
SAX does have a few drawbacks, but they are mostly related to the programmer, not the runtime performance of the API. Let's look at a few.
The first drawback is conceptual. Programmers are accustomed to navigating to get data: to find a file on a file server, you navigate by changing directories. Similarly, to get data from a database, you write an SQL query for the data you need. With SAX, this model is inverted. That is, you set up code that listens as the parser reports every piece of data in the XML document, one event at a time. That code acts only when the parser reports data it is interested in. At first, the SAX API seems odd, but after a while, thinking in this inverted way becomes second nature.
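To make the inverted model concrete, here is a minimal sketch using Python's standard xml.sax module, chosen only for illustration because its ContentHandler callbacks mirror the SAX event model. The class name TitleListener, the helper extract_titles, and the title element are hypothetical names invented for this example.

```python
import xml.sax


class TitleListener(xml.sax.ContentHandler):
    """Hypothetical listener: reacts only to <title> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []        # collected text of each <title>
        self._in_title = False  # True while the parser is inside a <title>
        self._buffer = []       # text chunks for the current <title>

    def startElement(self, name, attrs):
        # The parser reports every element; we only care about one kind.
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def extract_titles(xml_text):
    """Parse xml_text in a single pass and return the text of every <title>."""
    handler = TitleListener()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles
```

Note that the code never asks the parser for a title. It waits for the parser to announce one, which is the inversion described above.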
The second drawback is more dangerous. With SAX code, the naive "let's take a hack at it" approach will backfire fairly quickly, because the SAX parser exhaustively navigates the XML structure while simultaneously supplying the data stored in the XML document. Most people focus on the data-mapping aspect and neglect the navigational aspect. If you don't directly address the navigational aspect of SAX parsing, the code that keeps track of the location within the XML structure will become spread out and have many subtle interactions. This problem is similar to those associated with overdependence on global variables. But if you learn to structure SAX code properly so that it doesn't become unwieldy, working with SAX is more straightforward than using the DOM API.
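One possible way to address the navigational aspect directly, sketched here under assumptions rather than offered as the definitive structure, is to keep the current location in a single stack of element names instead of scattering boolean flags through the handler. The example again uses Python's xml.sax for illustration; PathTrackingHandler, collect_text, and the slash-separated path format are hypothetical, and the sketch assumes the wanted elements are leaves that contain only text.

```python
import xml.sax


class PathTrackingHandler(xml.sax.ContentHandler):
    """Hypothetical handler that keeps all navigation state in one stack."""

    def __init__(self, wanted_path):
        super().__init__()
        self.wanted_path = wanted_path  # e.g. "library/book/title"
        self.path = []                  # element names from the root down
        self.matches = []               # text found at wanted_path
        self._text = []                 # text chunks of the current element

    def startElement(self, name, attrs):
        self.path.append(name)  # descend one level
        self._text = []         # start collecting text for this element

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        # The stack alone says where we are; no extra flags are needed.
        if "/".join(self.path) == self.wanted_path:
            self.matches.append("".join(self._text).strip())
        self.path.pop()         # climb back up one level
        self._text = []


def collect_text(xml_text, wanted_path):
    """Return the text of every leaf element located at wanted_path."""
    handler = PathTrackingHandler(wanted_path)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.matches
```

Because the location lives in one place, two elements with the same name but different parents (a book title and a magazine title, say) are told apart without any additional state.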