Advanced Parser Factory Usage

PyXML features several parsers, and multiple ways to instantiate them, depending on whether you’re using SAX, trying to create a DOM tree, or doing something completely different. Designed for portable code, a ParserFactory class is provided that supplies a SAX-ready parser guaranteed available in your runtime environment. Additionally, you can explicitly create a parser (or SAX driver) by dipping into any specific package, such as PyExpat. We illustrate an example of both, but normally you should rely on the parser factory to instantiate a parser.

The make_parser function (imported from xml.sax) returns a SAX driver for the first available parser in the list that you supply, or returns an available parser if no list is specified or if the list contains parsers that are not found or cannot be loaded. The make_parser function has its roots as part of the xml.sax.saxexts.ParserFactory class, but it is better to import the method from xml.sax (more on this in a bit). For example:

from xml.sax import make_parser
parser = make_parser(  )

At the time of this writing, if you have PyXML installed, a call to make_parser without an argument is sure to return either a PyExpat or xmlproc driver. If you dig into the source of the xml.sax module, you will see this list supplied to the ParserFactory class. If you instantiate a parser factory directly out of xml.sax.saxexts, you need to be sure to supply a list containing the name of at least one valid parser, or it ...

Get Python & XML now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.