Filters

A SAX filter sits between the parser and the client application and intercepts the messages that these two objects pass to each other. It can pass these messages unchanged or modify, replace, or block them. To a client application, the filter looks like a parser, that is, an XMLReader. To the parser, the filter looks like a client application, that is, a ContentHandler.

SAX filters are implemented by subclassing the org.xml.sax.helpers.XMLFilterImpl class.[1] This class implements all the required interfaces of SAX for both parsers and client applications. That is, its signature is as follows:

public class XMLFilterImpl implements XMLFilter, XMLReader,
 EntityResolver, ContentHandler, DTDHandler, ErrorHandler
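
Because XMLFilterImpl implements both sides of the conversation, an instance that overrides nothing can be dropped between a parser and a handler and simply forwards every event. The following is a minimal sketch of that arrangement, not code from the SAX distribution; the names PassThroughDemo, EventLog, and parseThroughFilter are made up for illustration, and the parser is assumed to be whatever JAXP supplies by default.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class PassThroughDemo {

  // Hypothetical client application: records the events it receives.
  static class EventLog extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
      events.add("start:" + qName);
    }

    @Override
    public void processingInstruction(String target, String data) {
      events.add("pi:" + target);
    }
  }

  // Parses the given document through an unmodified XMLFilterImpl
  // and returns the events that reached the client.
  static List<String> parseThroughFilter(String xml) throws Exception {
    XMLReader parser =
        SAXParserFactory.newInstance().newSAXParser().getXMLReader();

    // To the client the filter is an XMLReader; to the parser it is
    // a ContentHandler. The real parser becomes the filter's parent.
    XMLFilterImpl filter = new XMLFilterImpl(parser);

    EventLog log = new EventLog();
    filter.setContentHandler(log);
    filter.parse(new InputSource(new StringReader(xml)));
    return log.events;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(
        parseThroughFilter("<?note keep?><doc><item/></doc>"));
  }
}
```

The client never touches the underlying parser directly. It configures the filter and calls parse() on it, and the filter installs itself as the parent's handler before delegating the parse.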

Your own filters will extend this class and override those methods that correspond to the messages you want to filter. For example, if you wanted to filter out all processing instructions, you would write a filter that would override the processingInstruction() method to do nothing, as shown in Example 20-5.

Example 20-5. A SAX filter that removes processing instructions
import org.xml.sax.helpers.XMLFilterImpl;

public class ProcessingInstructionStripper extends XMLFilterImpl {

  @Override
  public void processingInstruction(String target, String data) {
    // Because this method does nothing, processing instructions read
    // from the document are *not* passed on to the client application
  }

}
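
To see the effect of such a filter, it can be installed over a real parser and compared against an unfiltered parse. The sketch below is one way to do this under stated assumptions: StripperDemo, PiCounter, and countPis are hypothetical names, the stripper is repeated as a nested class only so the listing compiles on its own, and the parser is the JAXP default.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class StripperDemo {

  // Same logic as Example 20-5, nested here to keep the sketch
  // self-contained.
  static class ProcessingInstructionStripper extends XMLFilterImpl {
    @Override
    public void processingInstruction(String target, String data) {
      // Intentionally empty: the event is not forwarded.
    }
  }

  // Hypothetical client application: counts the processing
  // instructions that reach it.
  static class PiCounter extends DefaultHandler {
    int count = 0;

    @Override
    public void processingInstruction(String target, String data) {
      count++;
    }
  }

  // Parses xml, optionally through the stripper, and returns the
  // number of processing instructions the client saw.
  static int countPis(String xml, boolean strip) throws Exception {
    XMLReader reader =
        SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    if (strip) {
      XMLFilter filter = new ProcessingInstructionStripper();
      filter.setParent(reader);  // the real parser feeds the filter
      reader = filter;           // the client now talks to the filter
    }
    PiCounter counter = new PiCounter();
    reader.setContentHandler(counter);
    reader.parse(new InputSource(new StringReader(xml)));
    return counter.count;
  }

  public static void main(String[] args) throws Exception {
    String xml = "<doc><?a one?><item/><?b two?></doc>";
    System.out.println("unfiltered: " + countPis(xml, false));
    System.out.println("filtered:   " + countPis(xml, true));
  }
}
```

Because an XMLFilter is itself an XMLReader, the client code after the if block is identical in both cases. Only the object it holds changes.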

If instead you wanted to replace a processing instruction with an element whose name was the same as the processing instruction’s ...
