12.3. Parsing XML with the DOM

Problem

You want to parse an XML file using the DOM API. This puts the file into a tree, which you can process using DOM functions. With the DOM, it’s easy to search for and retrieve elements that fit a certain set of criteria.

Solution

Use PHP’s DOM XML extension (the domxml functions that ship with PHP 4). Here’s how to read XML from a file:

$dom = domxml_open_file('books.xml');

Here’s how to read XML from a variable:

$dom = domxml_open_mem($books);

You can also get just a single node. Here’s how to get the root node:

$root = $dom->document_element();

Here’s how to do a depth-first recursion to process all the nodes in a document:

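function process_node($node) {
    // Recurse into the children first (depth-first traversal)
    if ($node->has_child_nodes()) {
        foreach ($node->child_nodes() as $n) {
            process_node($n);
        }
    }

    // Process leaves: print the content of non-empty text nodes
    if ($node->node_type() == XML_TEXT_NODE) {
        $content = rtrim($node->node_value());
        // Compare against '' rather than using empty(), which would skip "0"
        if ($content !== '') {
            print "$content\n";
        }
    }
}
process_node($root);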

Discussion

The W3C’s DOM provides a platform- and language-neutral way to represent the structure and content of a document. Using the DOM, you can read an XML document into a tree of nodes and then move through the tree to find a particular element, or all the elements that match your criteria. This is called tree-based parsing. In contrast, the non-DOM XML functions allow you to do event-based parsing, in which your code reacts to each tag as the parser reaches it instead of holding the whole document in memory.
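As a rough illustration of locating elements in a parsed tree, here is a minimal sketch. It rests on two assumptions that are not part of the recipe: it uses the DOMDocument class from the DOM extension in PHP 5 and later (the domxml functions shown in the Solution are PHP 4 only), and the sample XML and its element names are hypothetical, because the contents of books.xml are not shown here.

```php
<?php
// Hypothetical sample data; the real structure of books.xml is not shown in the text.
$books = <<<XML
<books>
    <book>
        <title>First Book</title>
    </book>
    <book>
        <title>Second Book</title>
    </book>
</books>
XML;

// Assumption: PHP 5+ DOM extension, not the PHP 4 domxml functions.
$dom = new DOMDocument();
$dom->loadXML($books);

// Tree-based parsing: the whole document is already in memory,
// so we can ask for every element with a given tag name.
$titles = array();
foreach ($dom->getElementsByTagName('title') as $node) {
    $titles[] = $node->nodeValue;
}

print implode("\n", $titles) . "\n";
```

Because the tree is built before any searching happens, the same $dom object can be queried again with different criteria without rereading the XML.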

Additionally, you can modify the structure by creating, editing, and deleting nodes. In fact, you can use the DOM XML functions to author a new XML ...
