8.2. SAX: An Event-Based API

Since XML hit the scene, hundreds of XML products have appeared, from validators to editors to digital asset management systems. All these products share some common traits: they deal with files, parse XML, and handle XML markup. Developers know that reinventing the wheel with software is costly, but that's exactly what they were doing with XML products. It soon became obvious that an application programming interface, or API, for XML processing was needed.

An API is a foundation for writing programs: it handles the low-level work so you can concentrate on the real meat of your program. An XML API takes care of things like reading from files, parsing, and routing data to event handlers, while you just write the event-handling routines.
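To make that division of labor concrete, here is a minimal sketch in Python, whose standard library includes a SAX binding, xml.sax. The choice of Python is an assumption made for brevity, as are the TitleCollector class and the sample document; none of them come from the text. The library does the reading and parsing, and the only code written by hand is the set of event-handling routines.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler: gathers the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # a list while inside <title>, otherwise None

    def startElement(self, name, attrs):
        # Called by the parser each time an opening tag is read.
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate the pieces.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        # Called by the parser each time a closing tag is read.
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


# Made-up sample document, used only for illustration.
SAMPLE = (
    b"<library>"
    b"<book><title>Learning XML</title></book>"
    b"<book><title>Another Book</title></book>"
    b"</library>"
)

handler = TitleCollector()
xml.sax.parseString(SAMPLE, handler)  # the parser drives the handler
print(handler.titles)
```

Nothing in the handler opens a file or tokenizes markup. It only reacts to the events the parser routes to it.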

The Simple API for XML (SAX) is an attempt to define a standard event-based XML API (see Appendix B). Some of the early pioneers of XML were involved in this project. The collaborators worked through the XML-DEV mailing list, and the final result was a Java package called org.xml.sax. This is a good example of how a group of people can work together efficiently and develop a system--the whole thing was finished in five months.

SAX is built around an event-driven model that uses callbacks to handle processing. No tree representation is built, so processing happens in a single pass through the document. Think of it as "serial access" for XML: the program can't jump around to random places in the document. On the ...
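The single-pass behavior can be sketched with Python's xml.sax module. This is an illustration under assumptions: the EventLogger class and the tiny sample document are hypothetical, and Python stands in for the Java package described above. Because no tree is available to consult, the handler has to carry any context it needs itself, here a simple depth counter.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler: records events in the order the parser emits them."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.depth = 0  # context the handler tracks itself; no tree exists

    def startElement(self, name, attrs):
        self.events.append(("start", name, self.depth))
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1
        self.events.append(("end", name, self.depth))


logger = EventLogger()
# Made-up document; events arrive strictly in document order.
xml.sax.parseString(b"<a><b/><c><d/></c></a>", logger)

for kind, name, depth in logger.events:
    print("  " * depth + kind + " " + name)
```

Once the end event for an element has fired, the parser keeps nothing about that element. A handler that needs the data later must have stored it when the event arrived.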
