Pull-Based XML Processing

The most recent entrant into the XML processing arena is the so-called pull processing model. One of the most widely used pull processors is the Microsoft .NET XmlReader class. The pull model most closely resembles the event-based model in that it makes the contents of the XML document available progressively as the document is parsed.

Unlike the event model, the pull approach relies on the client application to request content from the parser at its own pace. For example, a pull client might include the following code to parse the simple document shown in Example 18-1:

reader.ReadStartElement("name")
reader.ReadStartElement("given")
givenName = reader.ReadString( )
reader.ReadEndElement( )
reader.ReadStartElement("family")
familyName = reader.ReadString( )
reader.ReadEndElement( )
reader.ReadEndElement( )

The pull client asks the pull parser for the XML content it expects to see, in the order it expects to see it. In practice, this makes pull client code easier to read and understand than the corresponding event-based code would be. It also tends to reduce the need for stacks and other structures that hold document state, because the code itself can be written like a recursive descent parser, with its control flow mirroring the nesting of the document.
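To make the idea concrete, here is a minimal sketch of the same request sequence in Java, using the cursor-style XMLStreamReader from the javax.xml.stream package (the StAX API mentioned below). The class name NamePullDemo, the helper parseName, and the sample names in main are hypothetical, chosen only for illustration. The document layout is assumed from the element names in the fragment above, not copied from Example 18-1.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; not part of any library.
class NamePullDemo {

    // Pulls the given and family names out of a <name> document.
    // Returns a two-element array: {given, family}.
    static String[] parseName(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                // nextTag() skips whitespace and comments, then stops on a tag.
                // require() throws if the current event is not the one expected.
                reader.nextTag();
                reader.require(XMLStreamConstants.START_ELEMENT, null, "name");

                reader.nextTag();
                reader.require(XMLStreamConstants.START_ELEMENT, null, "given");
                // getElementText() reads the text and leaves the cursor on </given>.
                String given = reader.getElementText();

                reader.nextTag();
                reader.require(XMLStreamConstants.START_ELEMENT, null, "family");
                // Likewise, the cursor ends up on </family>.
                String family = reader.getElementText();

                reader.nextTag();
                reader.require(XMLStreamConstants.END_ELEMENT, null, "name");

                return new String[] { given, family };
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Malformed XML or an unexpected element ends up here.
            throw new IllegalArgumentException("Unexpected document structure", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample input; the names are made up.
        String xml = "<name>\n  <given>Keith</given>\n  <family>Johnson</family>\n</name>";
        String[] parts = parseName(xml);
        System.out.println(parts[1] + ", " + parts[0]);
    }
}
```

As in the fragment above, no stack or state flag is needed: the position in the code records the position in the document. A document that does not match the expected sequence causes require( ) or getElementText( ) to throw, which this sketch reports as an IllegalArgumentException.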

In the Java world, BEA, Sun, and several individual developers have collaborated to create the Streaming API for XML (StAX). StAX and other pull parsers share with SAX the advantages of streaming, such as speed, parallelism, and memory efficiency, while offering an API that is more comfortable to ...
