Content Handlers

In order to let an application do something useful with XML data as it is being parsed, you must register handlers with the SAX parser. A handler is nothing more than a set of callbacks that SAX defines to let programmers insert application code at important events within a document’s parsing. These events take place as the document is parsed, not after the parsing has occurred. This is one of the reasons that SAX is such a powerful interface: it allows a document to be handled sequentially, without having to first read the entire document into memory. Later, we will look at the Document Object Model (DOM), which has this limitation.[3]

There are four core handler interfaces defined by SAX 2.0: org.xml.sax.ContentHandler , org.xml.sax.ErrorHandler, org.xml.sax.DTDHandler, and org.xml.sax.EntityResolver. In this chapter, I will discuss ContentHandler and ErrorHandler. I’ll leave discussion of DTDHandler and EntityResolver for the next chapter; it is enough for now to understand that EntityResolver works just like the other handlers, and is built specifically for resolving external entities specified within an XML document. Custom application classes that perform specific actions within the parsing process can implement each of these interfaces. These implementation classes can be registered with the reader using the methods setContentHandler( ) , setErrorHandler( ), setDTDHandler( ), and setEntityResolver( ). Then the reader invokes the callback methods ...

Get Java and XML, Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.