Reading XML as Events with SAX

SAX is PHP’s original XML extension. It’s also the best supported, because expat, the parser behind it, has been bundled with PHP since the release of PHP 4. SAX support is the same in PHP 5 as in PHP 4; the only difference is a behind-the-scenes change.

Since PHP 5 unbundles expat in favor of libxml2, a compatibility layer maps SAX calls from one parser to the other. In theory, your SAX applications should work exactly as they did under expat. In practice, a few differences can trip you up.
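For reference, here is a minimal sketch of the kind of SAX application that the compatibility layer is meant to keep working unchanged. The sample document and the EventCollector class are hypothetical names chosen for illustration; only the xml_* functions come from PHP's XML extension.

```php
<?php
// Hypothetical collector; the class and method names are illustrative only.
class EventCollector
{
    public $events = array();

    // Called for each opening tag; $attrs holds the tag's attributes.
    public function startElement($parser, $name, $attrs)
    {
        $this->events[] = "start:$name";
    }

    // Called for each closing tag.
    public function endElement($parser, $name)
    {
        $this->events[] = "end:$name";
    }

    // Called for text content; long runs of text may arrive in several pieces.
    public function characterData($parser, $data)
    {
        $this->events[] = "text:$data";
    }
}

// Hypothetical sample document; any well-formed XML would do.
$xml = '<book><title>Upgrading to PHP 5</title></book>';

$collector = new EventCollector();
$parser = xml_parser_create();

// Case folding is on by default and upper-cases element names; turn it off.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    array($collector, 'startElement'),
    array($collector, 'endElement')
);
xml_set_character_data_handler($parser, array($collector, 'characterData'));

// The third argument marks this chunk as the final piece of the document.
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)) . "\n");
}

print implode("\n", $collector->events) . "\n";
```

Because the script only touches the xml_* functions, nothing in it needs to know whether expat or libxml2 is doing the parsing underneath.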

Here’s a list of the major incompatibilities:

Namespace parsing with xml_parser_create_ns( )

The xml_parser_create_ns( ) function, which is a namespace-aware version of xml_parser_create( ), has problems with default namespaces under old versions of libxml2. Therefore, this function is disabled unless you build PHP using libxml2 Version 2.6 or greater.

Fallback handling with xml_set_default_handler( )

With expat, all events that lack a handler are processed by the default handler. With libxml2, you must define a specific handler for each event. The default handler only handles comments and internal entities, such as &amp;.

External entity handling with xml_set_external_entity_ref_handler( )

This function works under libxml2. Under expat, the default handler captures external entities like <!ENTITY rasmus SYSTEM "rasmus.ent">.
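One defensive way to write code around the first two items is sketched below: check that xml_parser_create_ns( ) exists before calling it, and register a specific handler for every event you care about instead of relying on the default handler as a catch-all. The EventLog class, the sample document, and the | separator are assumptions made for this example. What the default handler actually receives can vary with the parser and PHP version, so the sketch only records it.

```php
<?php
// Hypothetical logger; the class, method, and property names are illustrative only.
class EventLog
{
    public $elements = array();
    public $text = '';
    public $leftovers = array();

    public function startElement($parser, $name, $attrs)
    {
        $this->elements[] = $name;
    }

    public function endElement($parser, $name)
    {
        // Nothing to record, but the element handlers are registered as a pair.
    }

    public function characterData($parser, $data)
    {
        $this->text .= $data;
    }

    // Receives whatever the parser hands to the default handler.
    public function defaultHandler($parser, $data)
    {
        $this->leftovers[] = $data;
    }
}

// Hypothetical sample document with a default namespace and a comment.
$xml = '<feed xmlns="http://example.com/ns"><!-- note --><entry>Hello</entry></feed>';

$log = new EventLog();

// Prefer the namespace-aware parser, but fall back when it is unavailable
// (for example, when PHP was built against a libxml2 older than 2.6).
if (function_exists('xml_parser_create_ns')) {
    $parser = xml_parser_create_ns('UTF-8', '|');
} else {
    $parser = xml_parser_create('UTF-8');
}
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

// Register a specific handler for each event the application needs...
xml_set_element_handler(
    $parser,
    array($log, 'startElement'),
    array($log, 'endElement')
);
xml_set_character_data_handler($parser, array($log, 'characterData'));

// ...and use the default handler only for what is left over, such as comments.
xml_set_default_handler($parser, array($log, 'defaultHandler'));

if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)) . "\n");
}

print_r($log->elements);
print_r($log->leftovers);
```

With the namespace-aware parser, each element name is reported with its namespace URI in front of the local name, joined by the separator passed to xml_parser_create_ns( ). With the fallback parser, the names arrive exactly as written in the document.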
