SAX

The SAX API provides a procedural approach to parsing an XML file. As a SAX parser iterates through an XML file, it performs callbacks to a user-specified object. These calls indicate the start or end of an element, the presence of character data, and other significant events during the life of the parser.
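The callback sequence described above can be made visible with a small trace. This is a minimal sketch, not code from the book: the class name SaxEventTrace and its trace method are hypothetical, and it assumes the JAXP parser bundled with the JDK. It extends org.xml.sax.helpers.DefaultHandler and records each event as a string.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of SAX): records each callback as a short string.
final class SaxEventTrace {

    static List<String> trace(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            // A parser may deliver one run of text in several characters() calls,
            // so text is buffered and emitted at the next element boundary.
            private final StringBuilder text = new StringBuilder();

            private void flushText() {
                String s = text.toString().trim();
                if (!s.isEmpty()) {
                    events.add("text:" + s);
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                flushText();
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                events.add("end:" + qName);
            }
        };

        // The parser drives the handler; the application never walks a tree.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<order><item>pen</item></order>").forEach(System.out::println);
    }
}
```

The events arrive strictly in document order, which is why the handler has to keep any state it needs (here, the text buffer) itself.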

SAX doesn’t provide random access to the structure of the XML file; each tag must be handled as it is encountered by the parser. In exchange, SAX provides a relatively fast and efficient method of parsing. Because the SAX parser deals with only one element at a time, implementations can be extremely memory-efficient, which often makes SAX the only reasonable choice for dealing with particularly large files.
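To illustrate the memory argument, the following sketch counts the occurrences of one element while reading from a stream. The class ElementCounter and its count method are hypothetical names chosen for this example. The only state the application keeps is a single counter, so its own memory use does not grow with the size of the input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts elements with a given name in a stream.
final class ElementCounter {

    static long count(InputStream in, String elementName)
            throws ParserConfigurationException, SAXException, IOException {
        long[] total = {0}; // single-element array so the anonymous class can update it

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Each element is seen once and then forgotten; no tree is built.
                if (elementName.equals(qName)) {
                    total[0]++;
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return total[0];
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<list><item/><item/><note/></list>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(count(new ByteArrayInputStream(xml), "item"));
    }
}
```

In practice the InputStream would come from a large file rather than an in-memory byte array; the handler code stays the same.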

SAX Handlers

The SAX API allows programs to define three kinds of handler objects, implementing the org.xml.sax.ContentHandler, org.xml.sax.ErrorHandler, and org.xml.sax.DTDHandler interfaces, respectively. Processing a document with SAX involves passing a handler implementation to the parser and calling the parse( ) method of SAXParser. The parser reads the contents of the XML file, calling the appropriate method on the handler when significant events (such as the start of a tag) occur. Nearly all handler methods are declared to throw SAXException, so a handler can stop the parse when it detects an error.
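As a rough sketch of how these pieces fit together, the example below passes one handler object to SAXParser.parse( ). The names RecordingHandler and SaxHandlerDemo are hypothetical. The handler extends org.xml.sax.helpers.DefaultHandler, which supplies empty implementations of the ContentHandler, ErrorHandler, and DTDHandler methods, and overrides one content callback plus the three ErrorHandler callbacks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers element names and any reported problems.
final class RecordingHandler extends DefaultHandler {
    final List<String> elements = new ArrayList<>();
    final List<String> problems = new ArrayList<>();

    // ContentHandler callback
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        elements.add(qName);
    }

    // ErrorHandler callbacks
    @Override
    public void warning(SAXParseException e) {
        problems.add("warning: " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        problems.add("error: " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        problems.add("fatal at line " + e.getLineNumber() + ": " + e.getMessage());
        throw e; // rethrowing makes parse( ) fail with this exception
    }
}

// Hypothetical driver class for the sketch.
final class SaxHandlerDemo {

    static RecordingHandler parse(String xml)
            throws ParserConfigurationException, IOException {
        RecordingHandler handler = new RecordingHandler();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Parsing stopped early; the handler has already recorded the details.
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        RecordingHandler bad = parse("<a><b></a>"); // mismatched end tag
        System.out.println(bad.problems);
    }
}
```

Catching the SAXException inside the driver is a choice made for this sketch so that the recorded problems can be inspected; an application could just as well let the exception propagate to its caller.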

We’ll take a closer look at the ContentHandler and ErrorHandler interfaces next.

ContentHandler

Most, if not all, SAX applications implement the ContentHandler interface. The SAX parser will call methods on a ContentHandler ...

