XQuery: The XML Query Language

Book Description

“An excellent, early look at the emerging XML Query standard. The chapters on surprises and gotchas alone are worth the price of admission!”

         —Ashok Malhotra, Architect, Microsoft

“XQuery is the most important XML standard to emerge in recent years, and is a language with which anyone using XML on a regular basis should become acquainted. Michael Brundage's accessible introduction to XQuery provides enough information on all aspects of the standard, including its dark corners, to allow any XML developer to jump right in and start coding.”

         —Damien Fisher, Kernel Team Member, Soda Technologies Pty Ltd

“This book does an excellent job of distilling the essentials of XQuery in an understandable, straightforward, and easily digestible manner. This book has already become an indispensable part of my library and is a welcome addition to my XML repertoire.”

         —Dare Obasanjo, Program Manager, Microsoft Corporation

“Simply put, the emerging XQuery standard adds enormous value to XML data and this book is your key to unlocking that value. Here in one stop you will find an accessible introduction to XQuery and a complete reference. Practitioners will particularly value the sections on XQuery idioms and surprises where Michael shares his tricks of the trade.”

         —Dave Van Buren, Project Manager, Jet Propulsion Laboratory

“It’s both a stupendous reference on XQuery and a good read. Michael writes with verve, authority, and an eminently readable style. What a rare delight to discover all this, and in a technical book too! When the sequel comes along, sign me up.”

         —Howard Katz, Owner, Fatdog Software Inc., Editor, XQuery from the Experts (Addison-Wesley, 2003)

From corporate IT departments to academic institutions, XML has become the language of choice for storing and transmitting data across diverse application domains. XQuery, an XML Query Language invented by the World Wide Web Consortium, offers a powerful, standardized way to query all of that XML-encapsulated information. With its ability to integrate XML and non-XML data, XQuery seems poised to do for XML what SQL has done for relational data.

Written by the Technical Lead for XML query processing at Microsoft, XQuery: The XML Query Language is an invaluable resource for XQuery novices and experts alike. For those new to XQuery, this example-rich text serves as a tutorial that brings readers quickly up to speed on XQuery's data model, type system, and core language features. More experienced XML and database developers will find an excellent reference on the nuances of various expressions, as well as a guide to using XQuery to accomplish specific tasks.

Drawing on his experiences using XQuery, Michael Brundage offers an objective, inside look at this emerging technology. His unique perspective translates into an accessible and authoritative guide for readers using XML for documents, Web services, or databases.

Key coverage includes:

  • Data model and type system

  • Path navigation

  • Iteration, construction, arithmetic, text processing, type operators, and user-defined functions

  • Information beyond the standard—such as a look at update languages, performance benchmarks, query optimization, XQuery style, and much, much more

  • Hundreds of examples

  • The future of XQuery

The appendixes provide in-depth information on XQuery's type system, core expressions, built-in functions, regular expressions, and grammar. Meanwhile, the companion Web site offers downloadable source code for all of the examples in the book, the latest on the XQuery standard, answers to readers' questions, XQuery tips and strategies, and more.

XQuery will show developers, programmers, and database administrators how a single line of this deep and powerful new language can accomplish the equivalent of hundreds of lines written in C, C#, Java, and other general-purpose programming languages.



Table of Contents

  1. Copyright
    1. Dedication
  2. Praise for XQuery
  3. List of Figures
  4. Foreword
  5. Preface
    1. Who Should Read This Book?
    2. Organization
    3. Resources
    4. Acknowledgments
  6. I. Foundations
    1. 1. A Tour of XQuery
      1. 1.1. Introduction
      2. 1.2. Getting Started
      3. 1.3. Notational Conventions
      4. 1.4. Why XQuery?
        1. 1.4.1. Query Languages Versus Programming Languages
        2. 1.4.2. XQuery Versus XPath, XSLT, and SQL
      5. 1.5. Documents and Databases
      6. 1.6. Typed and Untyped Data
        1. 1.6.1. XML Schema Redux
        2. 1.6.2. The Types You Need
        3. 1.6.3. The Types You Don't Need
      7. 1.7. A Sample Query
      8. 1.8. Processing Model
      9. 1.9. Comments and Whitespace
      10. 1.10. Prolog
      11. 1.11. Constants
        1. 1.11.1. Boolean Constants
        2. 1.11.2. String Constants
        3. 1.11.3. Numeric Constants
        4. 1.11.4. Other Constants
      12. 1.12. XML
      13. 1.13. Built-in Functions
      14. 1.14. Operators
        1. 1.14.1. Logic Operators
        2. 1.14.2. Arithmetic Operators
        3. 1.14.3. Text Operators
        4. 1.14.4. Comparison Operators
      15. 1.15. Paths
      16. 1.16. Variables
      17. 1.17. FLWOR
      18. 1.18. Error Handling
      19. 1.19. Conclusion
      20. 1.20. Further Reading
    2. 2. Data Model and Type System
      1. 2.1. Introduction
      2. 2.2. An Overview of XML Data Models
        1. 2.2.1. Two Examples
        2. 2.2.2. The Document Object Model (DOM)
        3. 2.2.3. The XPath 1.0 Data Model
        4. 2.2.4. The XML Information Set (Infoset)
        5. 2.2.5. The Post-Schema-Validation Infoset (PSVI)
        6. 2.2.6. The XQuery Data Model
      3. 2.3. Structure of the XQuery Data Model
        1. 2.3.1. Items and Sequences
        2. 2.3.2. Atomic Values
        3. 2.3.3. Nodes
      4. 2.4. Atomic Types
        1. 2.4.1. Untyped Data
        2. 2.4.2. Boolean Types
        3. 2.4.3. Numeric Types
          1. 2.4.3.1. Numerics Background
          2. 2.4.3.2. XQuery Numeric Types
        4. 2.4.4. String Types
        5. 2.4.5. Calendar Types
        6. 2.4.6. Qualified Name Type
        7. 2.4.7. Other Types
      5. 2.5. Node Kinds
        1. 2.5.1. Kind, Identity, and Order
        2. 2.5.2. Hierarchy
        3. 2.5.3. Node Name
        4. 2.5.4. Node Type and Values
        5. 2.5.5. Other Node Properties
      6. 2.6. Common Type Conversions
        1. 2.6.1. Atomization
        2. 2.6.2. Effective Boolean Value
        3. 2.6.3. Sequence Type Matching
          1. 2.6.3.1. Type Matching Algorithm
        4. 2.6.4. Subtype Substitution
        5. 2.6.5. Numeric Type Promotion
      7. 2.7. Conclusion
      8. 2.8. Further Reading
    3. 3. Navigation
      1. 3.1. Introduction
      2. 3.2. Paths
        1. 3.2.1. Beginnings
        2. 3.2.2. Axes
        3. 3.2.3. Node Tests
          1. 3.2.3.1. Name Tests
          2. 3.2.3.2. Node Kind Tests
          3. 3.2.3.3. Wildcards
        4. 3.2.4. Other Axes
        5. 3.2.5. Predicates
          1. 3.2.5.1. Numeric Predicates
          2. 3.2.5.2. Boolean Predicates
          3. 3.2.5.3. Successive and Nested Predicates
      3. 3.3. Navigation Functions
      4. 3.4. Navigation Context
        1. 3.4.1. Input Sequence
        2. 3.4.2. Focus
        3. 3.4.3. Variable Declarations
        4. 3.4.4. Namespace Declarations
        5. 3.4.5. Function Declarations
        6. 3.4.6. Collations
      5. 3.5. Navigation Examples
      6. 3.6. Navigation Complexities
        1. 3.6.1. Namespaces
        2. 3.6.2. Node Identity
        3. 3.6.3. Other Context Information
      7. 3.7. Conclusion
      8. 3.8. Further Reading
    4. 4. Functions and Modules
      1. 4.1. Introduction
      2. 4.2. Built-in Function Library
      3. 4.3. Function Invocation
      4. 4.4. Function Conversion Rules
      5. 4.5. User-Defined Functions
      6. 4.6. Recursion
      7. 4.7. External Functions
      8. 4.8. Modules
      9. 4.9. Conclusion
  7. II. Core Language Features
    1. 5. Basic Expressions
      1. 5.1. Introduction
      2. 5.2. Comparisons
        1. 5.2.1. Value Comparisons
        2. 5.2.2. General Comparisons
        3. 5.2.3. Node Comparisons
        4. 5.2.4. Sequence and Tree Comparisons
      3. 5.3. Sequences
        1. 5.3.1. Constructing Sequences
        2. 5.3.2. Processing Sequences
      4. 5.4. Arithmetic
      5. 5.5. Logic
      6. 5.6. Query Prolog
        1. 5.6.1. Version Declaration
        2. 5.6.2. XML Space Declaration
        3. 5.6.3. Base URI Declaration
        4. 5.6.4. Default Collation Declaration
        5. 5.6.5. Namespace Declarations
        6. 5.6.6. Global Variable Declarations
        7. 5.6.7. Module Imports and Declaration
        8. 5.6.8. Schema Imports and Validation Declaration
      7. 5.7. Conclusion
      8. 5.8. Further Reading
    2. 6. Iteration
      1. 6.1. Introduction
      2. 6.2. FLWOR
        1. 6.2.1. Compared to SQL
        2. 6.2.2. Compared to XSLT
        3. 6.2.3. Introducing Variables
        4. 6.2.4. Tuples
      3. 6.3. Quantification
      4. 6.4. Joins
      5. 6.5. Comparing Sequences
        1. 6.5.1. Existential Comparison
        2. 6.5.2. Memberwise Comparison
        3. 6.5.3. Universal Comparison
      6. 6.6. Sorting
      7. 6.7. Grouping
      8. 6.8. Conclusion
      9. 6.9. Further Reading
    3. 7. Constructing XML
      1. 7.1. Introduction
      2. 7.2. Element Nodes
        1. 7.2.1. Direct Element Constructor
        2. 7.2.2. Computed Element Constructor
      3. 7.3. Attribute Nodes
        1. 7.3.1. Direct Attribute Constructors
        2. 7.3.2. Computed Attribute Constructors
      4. 7.4. Text Nodes
      5. 7.5. Document Nodes
      6. 7.6. Comment Nodes
      7. 7.7. Processing Instruction Nodes
      8. 7.8. Namespace Nodes
      9. 7.9. Composition
      10. 7.10. Validation
      11. 7.11. Element and Attribute Content
        1. 7.11.1. Character Escapes
        2. 7.11.2. Whitespace
        3. 7.11.3. Content Sequence
          1. 7.11.3.1. Attribute Content
          2. 7.11.3.2. Element Content
      12. 7.12. Conclusion
      13. 7.13. Further Reading
    4. 8. Text Processing
      1. 8.1. Introduction
      2. 8.2. The XML Character Model
        1. 8.2.1. Background
        2. 8.2.2. Code Points
        3. 8.2.3. Normalization
      3. 8.3. Character Encodings
      4. 8.4. Collations
      5. 8.5. Text Operators
      6. 8.6. Text Functions
      7. 8.7. Conclusion
      8. 8.8. Further Reading
    5. 9. Type Operators
      1. 9.1. Introduction
      2. 9.2. Cast and Castable
      3. 9.3. Type Conversion Rules
        1. 9.3.1. Converting Up and Down the Type Hierarchy
        2. 9.3.2. Converting Across the Type Hierarchy
          1. 9.3.2.1. Conversion to/from String
          2. 9.3.2.2. Conversion to/from Numeric Types and Boolean
          3. 9.3.2.3. Conversion to/from Calendar Types
          4. 9.3.2.4. Conversion to/from Binary Types
      4. 9.4. treat as
      5. 9.5. instance of and typeswitch
      6. 9.6. User-Defined Types
        1. 9.6.1. Schema Imports
        2. 9.6.2. Typed Content
        3. 9.6.3. Validation
      7. 9.7. Conclusion
      8. 9.8. Further Reading
  8. III. Application
    1. 10. Practical Examples
      1. 10.1. Introduction
      2. 10.2. Style
        1. 10.2.1. Case
        2. 10.2.2. Spaces
        3. 10.2.3. Braces
        4. 10.2.4. Slice
        5. 10.2.5. Concise
      3. 10.3. Idioms
        1. 10.3.1. Text Idioms
          1. 10.3.1.1. Reverse a String
          2. 10.3.1.2. Comma-Separated Values
          3. 10.3.1.3. Lower-, Upper-, and Title-Case
          4. 10.3.1.4. Test for ASCII Characters
        2. 10.3.2. Navigation Idioms
          1. 10.3.2.1. Select Elements and Attributes Simultaneously
          2. 10.3.2.2. Navigate Case-Insensitively
          3. 10.3.2.3. Select Nodes by Type
          4. 10.3.2.4. Test Whether Two Nodes Are in the Same Tree
          5. 10.3.2.5. Select All Leaf Elements
          6. 10.3.2.6. Select All Ancestors
          7. 10.3.2.7. Select the First Common Ancestor
          8. 10.3.2.8. Select All Siblings
          9. 10.3.2.9. Calculate the Maximum Depth
        3. 10.3.3. Sequence Idioms
          1. 10.3.3.1. Union, Intersection, and Difference
          2. 10.3.3.2. Select Every Other Member
          3. 10.3.3.3. Permutations
        4. 10.3.4. Type Idioms
          1. 10.3.4.1. Binary Data
        5. 10.3.5. Logic Idioms
          1. 10.3.5.1. Boolean XOR
          2. 10.3.5.2. Three-Valued Logic
        6. 10.3.6. Arithmetic Idioms
          1. 10.3.6.1. IsNaN
          2. 10.3.6.2. Median
          3. 10.3.6.3. Rounding Modes
          4. 10.3.6.4. Random Number Generation
          5. 10.3.6.5. Factorial
          6. 10.3.6.6. Square Root
          7. 10.3.6.7. Complex Numbers
          8. 10.3.6.8. Linear Algebra
      4. 10.4. Conclusion
      5. 10.5. Further Reading
    2. 11. Surprises
      1. 11.1. Introduction
      2. 11.2. Confusion over Meaning
        1. 11.2.1. Numbers Like 3.14 Are Decimal, Not Double
        2. 11.2.2. Predicates Index Sequences, Not Strings
        3. 11.2.3. not(=) Is Not !=
        4. 11.2.4. Arithmetic Isn't Associative
        5. 11.2.5. Predicates + Abbreviated Axes = Confusion
        6. 11.2.6. Node Sequences Are Different from Node Siblings
        7. 11.2.7. No Sub-Tree Pruning
        8. 11.2.8. Type Conversions
        9. 11.2.9. What's in a Name?
        10. 11.2.10. FLWOR Doesn't Move the Current Context
        11. 11.2.11. Nested FLWOR Is Different
      3. 11.3. Confusion over Syntax
        1. 11.3.1. Punctuation Is Tricky
        2. 11.3.2. The <foo/><foo/> Problem
        3. 11.3.3. XML Numbers Aren't XQuery Numbers
        4. 11.3.4. Wacky Paths
      4. 11.4. Conclusion
    3. 12. XQuery Serialization
      1. 12.1. Introduction
      2. 12.2. XQuery Serialization
        1. 12.2.1. Sequences of Values
        2. 12.2.2. The Root
        3. 12.2.3. Serialization Parameters
      3. 12.3. XQueryX
      4. 12.4. Conclusion
    4. 13. Query Optimization
      1. 13.1. Introduction
      2. 13.2. Common Query Optimizations
        1. 13.2.1. Lazy Evaluation
        2. 13.2.2. Early Evaluation
        3. 13.2.3. Streaming and Database Evaluation
      3. 13.3. Barriers to Optimization
        1. 13.3.1. Node Identity
        2. 13.3.2. Sequence Order
        3. 13.3.3. Error Preservation
        4. 13.3.4. Side Effects
      4. 13.4. Formal Semantics
      5. 13.5. Conclusion
      6. 13.6. Further Reading
    5. 14. Beyond the Standard
      1. 14.1. Introduction
      2. 14.2. Potential Changes
        1. 14.2.1. Namespaces
        2. 14.2.2. Modules and Prolog
        3. 14.2.3. Additional Types
        4. 14.2.4. Simplify, Simplify
        5. 14.2.5. Built-in Functions
      3. 14.3. Standards Roadmap
        1. 14.3.1. Today
        2. 14.3.2. Tomorrow
      4. 14.4. XQuery 1.1
      5. 14.5. Data Manipulation
        1. 14.5.1. XQuery DML
        2. 14.5.2. SiXDML
        3. 14.5.3. XUpdate
      6. 14.6. Full-Text Search
      7. 14.7. Performance Benchmarks
      8. 14.8. Conclusion
      9. 14.9. Further Reading
  9. IV. Reference
    1. A. Data Model and Type System Reference
      1. A.1. Introduction
      2. A.2. Overview
      3. A.3. Node Kinds
      4. A.4. Atomic Types
      5. A.5. Primitive Type Conversions
      6. A.6. Built-in Atomic Types
    2. B. Expression Reference
      1. B.1. Introduction
        1. for
        2. let
        3. where
        4. order by
        5. return
    3. C. Function Reference
      1. C.1. Introduction
    4. D. Regular Expressions
      1. D.1. Introduction
      2. D.2. Overview
      3. D.3. Advanced Regexps
      4. D.4. Regexp Language
      5. D.5. Character Properties
    5. E. Grammar
      1. E.1. Introduction
      2. E.2. The XQuery Grammar
      3. E.3. Reserved Keywords
      4. E.4. Operator Precedence
  10. Bibliography
    1. Standards
    2. Working Drafts and Notes
    3. Further Reading