Chapter 8. Pull Parsing With StAX

The two APIs we’ve examined thus far—SAX and DOM—take two different approaches to XML document parsing. A SAX parser notifies your code, through predefined interfaces, of various events as the parser traverses the XML document. DOM creates a tree structure in memory that is then returned to your code as one whole piece.

This chapter looks at an additional API, StAX, which takes a third approach to XML parsing, commonly referred to as pull parsing. Pull parsing is similar to SAX in that your code interacts with the document as it is being read by the parser. The difference lies in how this interaction occurs. With SAX, the parser pushes events to your code by calling it; with a pull parser, as the name implies, your code asks the parser for the next event. Unlike with SAX, your code need not implement any special interfaces. As a result, code that uses a pull parser may be more concise and easier to read than the corresponding SAX code.
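As a rough illustration of the pull model, here is a minimal sketch that uses the cursor-style XMLStreamReader from the javax.xml.stream package. The class name PullParseSketch, the elementNames helper, and the sample document are assumptions made up for this example. Wrapping XMLStreamException in an unchecked exception is likewise a convenience of the sketch, not a recommendation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of the StAX API.
class PullParseSketch {

    // Collects the local name of every start tag, in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                // The application drives the loop: it asks for each event in turn
                // instead of waiting for callbacks as it would with SAX.
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        names.add(reader.getLocalName());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Assumption for this sketch: report malformed input as unchecked.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<book><title>Java and XML</title><chapter number=\"8\"/></book>";
        System.out.println(elementNames(xml)); // prints [book, title, chapter]
    }
}
```

Note that no handler interface is implemented anywhere: the while loop itself decides which events matter and simply ignores the rest.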

In addition, StAX provides a set of classes for writing XML documents, something SAX doesn’t handle at all. Unlike DOM or any other tree-based API, StAX does not hold the whole document in memory while it is being built; output is written to the underlying stream as your code produces it.
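The writing side can be sketched in the same spirit with XMLStreamWriter, also from javax.xml.stream. The class name StreamWriteSketch, the writeBook helper, and the element names are hypothetical, chosen only for illustration. The exact form of the XML declaration that comes out depends on the StAX implementation in use, so it is not shown here.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class; not part of the StAX API.
class StreamWriteSketch {

    // Writes a tiny document one piece at a time; no tree is ever built.
    static String writeBook(String title) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            writer.writeStartElement("book");
            writer.writeStartElement("title");
            writer.writeCharacters(title); // reserved characters such as & are escaped
            writer.writeEndElement();      // closes title
            writer.writeEndElement();      // closes book
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            // Assumption for this sketch: surface write failures as unchecked.
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeBook("Java & XML"));
    }
}
```

A StringWriter is used here only to keep the sketch self-contained; in practice the writer would more likely wrap a file or network stream, so that a large document never has to exist in memory all at once.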

We will also look at an alternative pull parser API—XmlPull—which was the predecessor to StAX but continues to be useful in memory-constrained applications, specifically those that use J2ME.

StAX Basics

StAX is an acronym for Streaming API for XML. It is Java Specification Request (JSR) 173, sponsored by BEA with the goal of standardizing the ...
