Content Handlers

To let our application do something useful with XML data as it is being parsed, we must register handlers with the SAX parser. A handler is nothing more than a set of callbacks that SAX defines to let us interject application code at important events within a document’s parsing. These events take place as the document is parsed, not after parsing has finished. This is one of the reasons that SAX is such a powerful interface: it allows a document to be handled sequentially, without first reading the entire document into memory. We will later look at the Document Object Model (DOM), which does require the entire document to be read into memory first.
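
As a rough illustration of callbacks firing while the document is being read, the sketch below records each event in the order the parser reports it. The class name EventLogger and its parse helper are hypothetical, and the parser is obtained through the JAXP SAXParserFactory, which is an assumption made here for convenience rather than the approach taken in the text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records SAX events in the order they are reported.
public class EventLogger extends DefaultHandler {
    public final List<String> events = new ArrayList<>();

    // Prefer the local name; fall back to the qualified name if it is empty.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + name(localName, qName));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + name(localName, qName));
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    // Parses the given XML text and returns the recorded event sequence.
    public static List<String> parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        EventLogger logger = new EventLogger();
        reader.setContentHandler(logger); // register the callbacks before parsing
        reader.parse(new InputSource(new StringReader(xml)));
        return logger.events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<book><title>Java and XML</title></book>"));
    }
}
```

Each callback runs as soon as the parser reaches the corresponding point in the input, so the start of the outer element is recorded before the parser has seen any of its children.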

There are four core handler interfaces defined by SAX 2.0: org.xml.sax.ContentHandler, org.xml.sax.ErrorHandler, org.xml.sax.DTDHandler, and org.xml.sax.EntityResolver. In this chapter, we discuss ContentHandler, which allows standard data-related events within an XML document to be handled, and take a first look at ErrorHandler, which receives notifications from the parser when errors in the XML data are found. DTDHandler will be examined in Chapter 5. We briefly discuss EntityResolver at various points in the text; it is enough for now to understand that EntityResolver works just like the other handlers, and is built specifically for resolving external entities specified within an XML document. Custom application classes that perform specific actions within the parsing process can implement each of these interfaces. ...
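
One way to picture how the four interfaces are registered is the minimal sketch below. It assumes the parser is obtained through the JAXP SAXParserFactory and relies on org.xml.sax.helpers.DefaultHandler, a convenience class that implements all four interfaces with do-nothing defaults. The names HandlerRegistration, AppHandler, and newReader are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerRegistration {

    // Hypothetical application handler: counts fatal errors before rethrowing them.
    public static class AppHandler extends DefaultHandler {
        public int fatalErrors = 0;

        @Override
        public void fatalError(SAXParseException e) throws SAXParseException {
            fatalErrors++;
            throw e; // stop parsing; the caller sees the exception
        }
    }

    // Creates a reader and registers one object in all four handler roles.
    public static XMLReader newReader(DefaultHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);  // org.xml.sax.ContentHandler
        reader.setErrorHandler(handler);    // org.xml.sax.ErrorHandler
        reader.setDTDHandler(handler);      // org.xml.sax.DTDHandler
        reader.setEntityResolver(handler);  // org.xml.sax.EntityResolver
        return reader;
    }

    public static void main(String[] args) throws Exception {
        AppHandler handler = new AppHandler();
        XMLReader reader = newReader(handler);
        try {
            // Deliberately malformed input: the element is never closed.
            reader.parse(new InputSource(new StringReader("<open>")));
        } catch (SAXParseException e) {
            System.out.println("Fatal errors reported: " + handler.fatalErrors);
        }
    }
}
```

A single object fills every role here only to keep the sketch short; separate classes, each implementing one interface, can be registered in exactly the same way.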
