Name

The LexicalHandler Interface

Synopsis

LexicalHandler is a callback interface that provides information about aspects of the document that are not normally relevant, specifically:

  • CDATA sections

  • Entity boundaries

  • DTD boundaries

  • Comments

Without a LexicalHandler, the parser silently drops comments and reports the content of entity references and CDATA sections as ordinary character data. By using the LexicalHandler interface, however, you can read the comments and learn which text came from regular character data, which came from a CDATA section, and which came from a particular entity reference.
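As a rough illustration of that distinction, the following sketch sorts text by origin and collects comments. The class names TextOriginHandler and TextOriginDemo and the sample document are invented for this example. It assumes the JDK's bundled SAX parser, extends org.xml.sax.ext.DefaultHandler2 for convenience, and registers the handler under the standard SAX2 property name http://xml.org/sax/properties/lexical-handler:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler. DefaultHandler2 supplies no-op versions of both the
// ContentHandler and LexicalHandler methods, so only four need overriding.
class TextOriginHandler extends DefaultHandler2 {
  final List<String> comments = new ArrayList<>();
  final StringBuilder plainText = new StringBuilder();
  final StringBuilder cdataText = new StringBuilder();
  private boolean inCdata = false;

  // LexicalHandler callbacks
  @Override public void startCDATA() { inCdata = true; }
  @Override public void endCDATA() { inCdata = false; }
  @Override public void comment(char[] ch, int start, int length) {
    comments.add(new String(ch, start, length));
  }

  // ContentHandler callback: text arrives here whether or not it was in CDATA.
  // A parser may split one run of text across several calls, so append.
  @Override public void characters(char[] ch, int start, int length) {
    (inCdata ? cdataText : plainText).append(ch, start, length);
  }
}

class TextOriginDemo {
  static TextOriginHandler parse(String xml) {
    try {
      XMLReader parser =
          SAXParserFactory.newInstance().newSAXParser().getXMLReader();
      TextOriginHandler handler = new TextOriginHandler();
      parser.setContentHandler(handler);
      parser.setProperty(
          "http://xml.org/sax/properties/lexical-handler", handler);
      parser.parse(new InputSource(new StringReader(xml)));
      return handler;
    } catch (ParserConfigurationException | SAXException | IOException ex) {
      throw new IllegalStateException("Parse failed", ex);
    }
  }

  public static void main(String[] args) {
    TextOriginHandler h =
        parse("<doc><!-- note -->plain<![CDATA[a < b]]></doc>");
    System.out.println("comments: " + h.comments);
    System.out.println("plain:    " + h.plainText);
    System.out.println("CDATA:    " + h.cdataText);
  }
}
```

The same object serves as both ContentHandler and LexicalHandler here, which keeps the inCdata flag next to the characters( ) method that consults it. Separate handler objects would work as well, provided they share that state.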

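To configure an XMLReader with a LexicalHandler, pass an instance of your handler class to the reader’s setProperty( ) method with the name http://xml.org/sax/properties/lexical-handler:

try {
  parser.setProperty(
    "http://xml.org/sax/properties/lexical-handler",
    new YourLexicalHandlerClass(  )
  );
}
catch (SAXNotRecognizedException ex) {
  // The parser does not know the lexical-handler property at all
  System.out.println("This parser does not provide lexical events.");
}
catch (SAXNotSupportedException ex) {
  // The property is known but cannot be set right now (e.g., mid-parse)
  System.out.println("This parser cannot install a LexicalHandler now.");
}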

If the parser does not provide lexical events, setProperty( ) throws a SAXNotRecognizedException. If the parser cannot install a LexicalHandler at that moment (generally because it’s in the middle of parsing a document), setProperty( ) throws a SAXNotSupportedException. If neither exception is thrown, the parser calls back to the methods in the LexicalHandler as it encounters entity references, comments, and CDATA sections. The content of the resolved entities and CDATA sections is still reported through the ContentHandler interface, as normal:

package org.xml.sax.ext; ...
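To make the division of labor between the two interfaces concrete, here is a minimal sketch in which the LexicalHandler callbacks mark the DTD and entity boundaries while the replacement text itself still arrives through characters( ). The class names EntityBoundaryHandler and EntityBoundaryDemo, the entity co, and the sample document are hypothetical. The sketch assumes the JDK's bundled parser, which reports internal general entities referenced in element content:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler that logs boundary events and tracks which text
// was delivered while an entity was open.
class EntityBoundaryHandler extends DefaultHandler2 {
  final List<String> lexicalEvents = new ArrayList<>();
  final StringBuilder allText = new StringBuilder();
  final StringBuilder entityText = new StringBuilder();
  private int entityDepth = 0;  // entities can nest, so count rather than flag

  // LexicalHandler callbacks: boundaries only, no content
  @Override public void startDTD(String name, String publicId, String systemId) {
    lexicalEvents.add("startDTD " + name);
  }
  @Override public void endDTD() {
    lexicalEvents.add("endDTD");
  }
  @Override public void startEntity(String name) {
    entityDepth++;
    lexicalEvents.add("startEntity " + name);
  }
  @Override public void endEntity(String name) {
    entityDepth--;
    lexicalEvents.add("endEntity " + name);
  }

  // ContentHandler callback: receives the entity's replacement text as usual
  @Override public void characters(char[] ch, int start, int length) {
    allText.append(ch, start, length);
    if (entityDepth > 0) {
      entityText.append(ch, start, length);
    }
  }
}

class EntityBoundaryDemo {
  static final String SAMPLE =
      "<!DOCTYPE doc [<!ENTITY co \"Example Corp\">]>"
      + "<doc>Made by &co;.</doc>";

  static EntityBoundaryHandler parse(String xml) {
    try {
      XMLReader parser =
          SAXParserFactory.newInstance().newSAXParser().getXMLReader();
      EntityBoundaryHandler handler = new EntityBoundaryHandler();
      parser.setContentHandler(handler);
      parser.setProperty(
          "http://xml.org/sax/properties/lexical-handler", handler);
      parser.parse(new InputSource(new StringReader(xml)));
      return handler;
    } catch (ParserConfigurationException | SAXException | IOException ex) {
      throw new IllegalStateException("Parse failed", ex);
    }
  }

  public static void main(String[] args) {
    EntityBoundaryHandler h = parse(SAMPLE);
    System.out.println("lexical events: " + h.lexicalEvents);
    System.out.println("all text:       " + h.allText);
    System.out.println("from entities:  " + h.entityText);
  }
}
```

The handler never sees the entity's content in startEntity( ) or endEntity( ). It only learns that whatever characters( ) delivers between those two calls came from the named entity.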
