Getting familiar with the XPath notation

XPath is a set of rules used to get information from an XML document. In XPath, XML documents are treated as trees of nodes. There are several kinds of nodes, including elements, attributes, and text nodes. In the sample file, world, country, and isofficial are some of the nodes.

Nodes are related to one another. Depending on where the other nodes sit in the hierarchy, a node can have a parent, zero or more children, siblings, ancestors, and descendants.

In the sample countries file, country is the parent of the elements name, capital, and language. These three elements are the children of country.
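The parent and child relationships described above can be explored with a short sketch. It uses Python's standard xml.etree.ElementTree module rather than Pentaho Data Integration, and the XML fragment is a hypothetical stand-in for the sample countries file: the values, and the assumption that isofficial is an attribute of language, are for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the sample countries file.
# The real file's values and exact layout may differ.
SAMPLE = """
<world>
  <country>
    <name>Japan</name>
    <capital>Tokyo</capital>
    <language isofficial="T">
      <name>Japanese</name>
    </language>
  </country>
</world>
"""

world = ET.fromstring(SAMPLE)
country = world.find("country")

# Children: iterating over an element yields its direct child elements.
child_tags = [child.tag for child in country]

# Parent: ElementTree elements do not store a reference to their parent,
# so build a child -> parent lookup by walking the whole tree once.
parent_of = {child: parent for parent in world.iter() for child in parent}
capital = country.find("capital")
parent_tag = parent_of[capital].tag

# Descendants: iter() walks the subtree in document order, starting with
# the element itself, which is filtered out here.
descendant_tags = [el.tag for el in country.iter() if el is not country]

print(child_tags)
print(parent_tag)
print(descendant_tags)
```

Here name, capital, and language are siblings because they share the parent country, while the name nested inside language is a descendant of country but not one of its children.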

To select a node in an XML document, you have to use a path expression relative to a current node. ...
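As a rough illustration of how a path expression is evaluated relative to a current node, the following sketch again uses Python's xml.etree.ElementTree, which implements only a subset of XPath. The XML fragment is hypothetical, and treating isofficial as an attribute of language is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data shaped like the sample countries file.
SAMPLE = """<world>
  <country>
    <name>Japan</name>
    <capital>Tokyo</capital>
    <language isofficial="T"><name>Japanese</name></language>
  </country>
  <country>
    <name>Peru</name>
    <capital>Lima</capital>
    <language isofficial="T"><name>Spanish</name></language>
    <language isofficial="F"><name>Aymara</name></language>
  </country>
</world>"""

world = ET.fromstring(SAMPLE)

# Current node: world. The path walks down to each country, then to its capital.
capitals = [node.text for node in world.findall("country/capital")]

# Current node: the second country. The same kind of path now starts there,
# and the predicate keeps only languages whose isofficial attribute is "T".
peru = world.findall("country")[1]
official = [lang.findtext("name")
            for lang in peru.findall("language[@isofficial='T']")]

# ".//name" selects every name descendant of the current node, at any depth.
all_names = [node.text for node in world.findall(".//name")]

print(capitals)
print(official)
print(all_names)
```

The same expression can return different nodes depending on the current node: "language" evaluated from world matches nothing, because language is a child of country, not of world.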
