XPath and XPointer

Book description

Referring to specific information inside an XML document is a little like finding a needle in a haystack: how do you differentiate the information you need from everything else? XPath and XPointer are two closely related languages that play a key role in XML processing by allowing developers to find these needles and manipulate embedded information. XPath describes a route for finding specific items by defining a path through the hierarchy of an XML document, abstracting only the information that's relevant for identifying the data. XPointer extends XPath to identify more complex parts of documents. The two technologies are critical for developers seeking needles in haystacks in various types of processing.

XPath and XPointer fills an essential need for XML developers by focusing directly on a critical topic that has been covered only briefly elsewhere. Written by John Simpson, an author with considerable XML experience, the book offers practical knowledge of the two languages that underpin XML technologies such as XSLT and XLink. XPath and XPointer cuts through basic theory and provides real-world examples that you can use right away.

Written for XML and XSLT developers and anyone else who needs to address information in XML documents, the book assumes a working knowledge of XML and XSLT. It begins with an introduction to XPath basics. You'll learn about location steps and paths, XPath functions, and numeric operators. Once you've covered XPath in depth, you'll move on to XPointer: its background, syntax, and forms of addressing. By the time you've finished the book, you'll know how to construct a full XPointer (one that uses an XPath location path to address document content) and completely understand both the XPath and XPointer features it uses.

XPath and XPointer contains material on the forthcoming XPath 2.0 spec and EXSLT extensions, as well as version 1.0 of both XPath and XPointer. A succinct but thorough hands-on guide, it provides comprehensive information on these two key XML technologies in one place, which no other book on the market does.

Table of Contents

  1. XPath and XPointer
    1. SPECIAL OFFER: Upgrade this ebook with O’Reilly
    2. A Note Regarding Supplemental Files
    3. Preface
      1. Who Should Read This Book?
      2. Who Should Not Read This Book?
      3. Organization of the Book
      4. Conventions Used in This Book
      5. Comments and Questions
      6. Acknowledgments
    4. 1. Introducing XPath and XPointer
      1. 1.1. Why XPath and XPointer?
      2. 1.2. Antecedents/History
        1. 1.2.1. DSSSL
        2. 1.2.2. XSL
        3. 1.2.3. TEI
        4. 1.2.4. Intermedia
      3. 1.3. XPath, XPointer, and Other XML-Related Specs
        1. 1.3.1. Specs Dependent on XPath and XPointer
      4. 1.4. XPath and XPointer Versus XQuery
    5. 2. XPath Basics
      1. 2.1. The Node Tree: An Introduction
      2. 2.2. XPath Expressions
        1. 2.2.1. Location Steps and Location Paths
        2. 2.2.2. Expression Syntax
          1. 2.2.2.1. Tokens
          2. 2.2.2.2. Delimiters
          3. 2.2.2.3. Combining tokens and delimiters into complete expressions
      3. 2.3. XPath Data Types
        1. 2.3.1. Strings
        2. 2.3.2. Numeric Values
        3. 2.3.3. Boolean Values
      4. 2.4. Nodes and Node-Sets
        1. 2.4.1. Node Properties
          1. 2.4.1.1. Node names
          2. 2.4.1.2. Document order
          3. 2.4.1.3. Family relationships
        2. 2.4.2. Node-Sets
        3. 2.4.3. Node Types
          1. 2.4.3.1. The root node
          2. 2.4.3.2. Element nodes
          3. 2.4.3.3. Attribute nodes
          4. 2.4.3.4. PI nodes
          5. 2.4.3.5. Comment nodes
          6. 2.4.3.6. Text nodes
          7. 2.4.3.7. Namespace nodes
          8. 2.4.3.8. XPath node types and the XML Infoset
      5. 2.5. Node-Set Context
      6. 2.6. String-Values
        1. 2.6.1. String-Value of a Node-Set
    6. 3. Location Steps and Paths
      1. 3.1. XPath Expressions
        1. 3.1.1. The Filesystem Analogy
        2. 3.1.2. Points of Similarity, Points of Difference
      2. 3.2. Location Paths
        1. 3.2.1. The Importance of Context
        2. 3.2.2. Absolute Versus Relative Location Paths
        3. 3.2.3. Compound Location Paths
      3. 3.3. Location Steps
        1. 3.3.1. The Big Picture
        2. 3.3.2. The Node Test
        3. 3.3.3. The Axis
          1. 3.3.3.1. Defaults and shortcuts
          2. 3.3.3.2. Restrictions by context node type
          3. 3.3.3.3. Axes and efficiency
        4. 3.3.4. The Predicate
          1. 3.3.4.1. Nesting predicates
          2. 3.3.4.2. Compound predicates
          3. 3.3.4.3. Predicates with a single value and no operator
          4. 3.3.4.4. Special case: numeric-valued predicates
          5. 3.3.4.5. "Stacked" predicates
      4. 3.4. Compound Location Paths Revisited
    7. 4. XPath Functions and Numeric Operators
      1. 4.1. Introduction to Functions
        1. 4.1.1. What Functions Do
        2. 4.1.2. Functions Within Functions
      2. 4.2. XPath Function Types
        1. 4.2.1. Node-Set Functions
          1. 4.2.1.1. last( )
          2. 4.2.1.2. position( )
          3. 4.2.1.3. count(nodeset)
          4. 4.2.1.4. id(anytype)
          5. 4.2.1.5. id( ) and node-set arguments
          6. 4.2.1.6. local-name(nodeset?)
          7. 4.2.1.7. namespace-uri(nodeset?)
          8. 4.2.1.8. name(nodeset?)
        2. 4.2.2. String Functions
          1. 4.2.2.1. string(anytype?)
          2. 4.2.2.2. concat(string1, string2, ...)
          3. 4.2.2.3. starts-with(string1, string2)
          4. 4.2.2.4. contains(string1, string2)
          5. 4.2.2.5. substring(string, number1, number2?)
          6. 4.2.2.6. substring-before(string1, string2) and substring-after(string1, string2)
          7. 4.2.2.7. string-length(string?)
          8. 4.2.2.8. normalize-space(string?)
          9. 4.2.2.9. translate(string1, string2, string3)
        3. 4.2.3. Boolean Functions
          1. 4.2.3.1. boolean(anytype)
          2. 4.2.3.2. not(boolean)
          3. 4.2.3.3. true( ) and false( )
          4. 4.2.3.4. lang(string)
        4. 4.2.4. Numeric Functions
          1. 4.2.4.1. number(anytype?)
          2. 4.2.4.2. sum(nodeset)
          3. 4.2.4.3. floor(number) and ceiling(number)
          4. 4.2.4.4. round(number)
      3. 4.3. XPath Numeric Operators
        1. 4.3.1. div
        2. 4.3.2. mod
    8. 5. XPath in Action
      1. 5.1. XPath Visualiser: Some Background
      2. 5.2. Sample XML Document
      3. 5.3. General to Specific, Common to Far-Out
        1. 5.3.1. The Node Test
        2. 5.3.2. Axes
        3. 5.3.3. Predicates
        4. 5.3.4. Functions
        5. 5.3.5. Sublimely Ridiculous
    9. 6. XPath 2.0
      1. 6.1. General Goals
        1. 6.1.1. Simplify Manipulation of XML Schema-Typed Content
        2. 6.1.2. Simplify Manipulation of String Content
        3. 6.1.3. Support Related XML Standards
        4. 6.1.4. Improve Ease of Use
        5. 6.1.5. Improve Interoperability
        6. 6.1.6. Improve i18n Support
        7. 6.1.7. Maintain Backward Compatibility
        8. 6.1.8. Enable Improved Processor Efficiency
      2. 6.2. Specific Requirements
        1. 6.2.1. XPath 2.0 MUSTs
          1. 6.2.1.1. Express its data model in terms of the XML Infoset (1.1)
          2. 6.2.1.2. Provide common core syntax and semantics for XSLT and XML Query (1.2)
          3. 6.2.1.3. Support explicit "for any" and "for all" Boolean operations (1.3)
          4. 6.2.1.4. Extend the existing set of aggregate functions (1.4)
          5. 6.2.1.5. Loosen restrictions on location steps (2.1)
          6. 6.2.1.6. Provide a conditional expression (2.2)
          7. 6.2.1.7. Define consistent implicit semantics for collection-valued subexpressions (2.3)
          8. 6.2.1.8. Support string matching with regular expressions (3)
          9. 6.2.1.9. Define the operator matrix and conversions (4.1)
          10. 6.2.1.10. Allow scientific notation for numbers (4.2)
          11. 6.2.1.11. Define cast and constructor functions (4.3)
          12. 6.2.1.12. Support accessing the simple-type values of elements and attributes (4.5)
          13. 6.2.1.13. Define the behavior of operators for null arguments (4.6)
        2. 6.2.2. XPath 2.0 SHOULDs
          1. 6.2.2.1. Maintain backward compatibility with XPath 1.0 (1.5)
          2. 6.2.2.2. Provide intersection and difference functions (1.6)
          3. 6.2.2.3. Support the unary plus operator (1.7)
          4. 6.2.2.4. Simplify string replacement (2.4.1)
          5. 6.2.2.5. Simplify string padding (2.4.2)
          6. 6.2.2.6. Simplify string case conversions (2.4.3)
          7. 6.2.2.7. Support aggregation functions over collection-valued expressions (2.5)
          8. 6.2.2.8. Add a "list" data type (4.4)
          9. 6.2.2.9. Select elements/attributes based on an explicit XML Schema type (5.1)
          10. 6.2.2.10. Select elements/attributes based on an XML Schema type hierarchy (5.2)
          11. 6.2.2.11. Select elements based on XML Schema substitution groups (5.3)
          12. 6.2.2.12. Support lookups based on XML Schema unique constraints and keys (5.4)
    10. 7. XPointer Background
      1. 7.1. XPointer and Media Types
      2. 7.2. Some Definitions
        1. 7.2.1. Resource
        2. 7.2.2. Subresource
        3. 7.2.3. Location
        4. 7.2.4. Location-set
        5. 7.2.5. Point
        6. 7.2.6. Range
        7. 7.2.7. Points and Ranges: Flattening the Logical Hierarchy
      3. 7.3. The Framework
      4. 7.4. Error Types
        1. 7.4.1. Syntax Errors
        2. 7.4.2. Resource Errors
        3. 7.4.3. Subresource Errors
      5. 7.5. Encoding and Escaping Characters in XPointer
        1. 7.5.1. Characters Significant to XPointer Itself
        2. 7.5.2. URI-Significant Characters
          1. 7.5.2.1. URIs versus IURIs
        3. 7.5.3. Characters in XML Documents
        4. 7.5.4. Progressive Escaping
          1. 7.5.4.1. Progressive escaping: a (perverse) example
    11. 8. XPointer Syntax
      1. 8.1. Shorthand Pointers
      2. 8.2. Scheme-Based XPointer Syntax
        1. 8.2.1. The Scheme
        2. 8.2.2. The schemedata
        3. 8.2.3. Contents of the xmlns( ) Scheme
        4. 8.2.4. Contents of the element( ) Scheme
        5. 8.2.5. Combining Names and Child Sequences
        6. 8.2.6. Contents of the xpointer( ) Scheme
        7. 8.2.7. Custom Schemes
        8. 8.2.8. Multiple Pointer Parts
          1. 8.2.8.1. "Failure-proofing" XPointers
          2. 8.2.8.2. Declaring and using namespaces
          3. 8.2.8.3. Mixing it up
      3. 8.3. Using XPointers in a URI
    12. 9. XPointer Beyond XPath
      1. 9.1. Why Extend XPath?
      2. 9.2. Points and Ranges
        1. 9.2.1. Points
          1. 9.2.1.1. Node points versus character points
          2. 9.2.1.2. Point syntax
          3. 9.2.1.3. Points as "nodes"
          4. 9.2.1.4. Points and general entities
        2. 9.2.2. Ranges
          1. 9.2.2.1. What can be in a range
          2. 9.2.2.2. Range syntax
          3. 9.2.2.3. Ranges as "nodes"
          4. 9.2.2.4. Covering ranges
      3. 9.3. XPointer Extensions to Document Order
        1. 9.3.1. XPointer Document Order Extensions: Examples
      4. 9.4. XPointer Functions
        1. 9.4.1. start-point(locset)
        2. 9.4.2. end-point(locset)
        3. 9.4.3. range-to(locset)
        4. 9.4.4. string-range(locset, string, number1?, number2?)
        5. 9.4.5. range(locset)
        6. 9.4.6. range-inside(locset)
        7. 9.4.7. here( )
        8. 9.4.8. origin( )
    13. A. Extension Functions for XPath in XSLT
      1. A.1. Additional Functions in XSLT 1.0
      2. A.2. EXSLT Extensions
        1. A.2.1. EXSLT Functions Module
        2. A.2.2. EXSLT Dates-and-Times Module
        3. A.2.3. EXSLT Dynamic Module
        4. A.2.4. EXSLT Common Module
        5. A.2.5. EXSLT Math Module
        6. A.2.6. EXSLT Regular Expressions Module
        7. A.2.7. EXSLT Sets Module
        8. A.2.8. EXSLT Strings Module
    14. Colophon