XSLT, 2nd Edition

Book Description

After years of anticipation and delay, the W3C finally released the XSLT 2.0 standard in January 2007. The revised edition of this classic book offers practical, real-world examples that demonstrate how you can apply XSLT stylesheets to XML data using either the new specification or the older XSLT 1.0 standard. XSLT is a critical language for converting XML documents into other formats, such as HTML code or a PDF file. With XSLT, you get a thorough understanding of XSLT and XPath and their relationship to other web standards, along with recommendations for a honed toolkit in an open, platform-neutral, standards-based environment. This book:

  • Covers the XSLT basics, including simple stylesheets and methods for setting up transformation engines

  • Walks you through the many parts of XSLT, particularly XSLT's template-based approach to transformations

  • Applies both XSLT 1.0 and 2.0 solutions to the same problems, helping you decide which version of XSLT is more appropriate for your project

  • Includes profuse examples that complement both the tutorial and the reference material

The new edition of XSLT has been thoroughly updated to explain XSLT 2.0's many dependencies, notably XML Schema and XPath 2.0. Want to find out how the 2.0 specification improves on the old one? This book will explain.

Table of Contents

  1. XSLT
  2. Dedication
  3. A Note Regarding Supplemental Files
  4. Preface
    1. About This Book
    2. Where I’m Coming From
      1. I Believe in Open, Platform-Neutral, Standards-Based Computing
      2. I Assume You’re Busy
      3. I Don’t Care Which Standards-Compliant Tools You Use
      4. XSLT Is a Tool, Not a Religion
      5. You Shouldn’t Migrate All of Your Stylesheets Just Because There’s a New Version of XSLT
    3. How This Book Is Organized
    4. Conventions Used in This Book
    5. How to Contact Us
    6. Safari® Enabled
    7. Acknowledgments for the Second Edition
    8. Acknowledgments from the First Edition
  5. 1. Getting Started
    1. The Design of XSLT
      1. [2.0] The Design of XSLT 2.0
    2. XML Basics
      1. XML’s Heritage
      2. XML Document Rules
        1. An XML document must be contained in a single element
        2. All elements must be nested
        3. All attributes must be quoted
        4. XML tags are case-sensitive
        5. All end tags are required
        6. Empty tags can contain the end marker
        7. XML declarations
        8. Document Type Definitions (DTDs) and XML Schemas
        9. Well-formed versus valid documents
        10. Tags versus elements
        11. Namespaces
        12. [2.0] Datatypes
      3. Programming Interfaces for XML: DOM, SAX, and Others
        1. DOM
          1. A sample DOM tree
        2. SAX
        3. Other programming interfaces
      4. XSLT Standards
        1. XSL transformations (XSLT) version 1.0
        2. XML path language (XPath) version 1.0
        3. XSL transformations (XSLT) version 2.0
        4. XML path language (XPath) version 2.0
        5. XQuery 1.0 and XPath 2.0 Data Model (XDM)
        6. XQuery 1.0 and XPath 2.0 functions and operators
        7. XQuery 1.0 and XPath 2.0 formal semantics
        8. XSLT 2.0 and XQuery 1.0 serialization
        9. XQuery 1.0: an XML query language
        10. XML syntax for XQuery 1.0 (XQueryX)
      5. XML Standards
        1. XML 1.0
        2. XML 1.1
        3. The Extensible Stylesheet Language (XSL)
        4. XML Schemas
        5. RelaxNG
        6. Schematron
        7. The Simple API for XML (SAX)
        8. Document Object Model (DOM)
        9. Namespaces in XML
        10. Associating stylesheets with XML documents
        11. Scalable Vector Graphics (SVG)
        12. XML pointer language (XPointer) version 1.0
        13. XML linking language (XLink) version 1.0
    3. Installing XSLT Processors
      1. Installing Xalan
      2. Installing Saxon
      3. Installing the Microsoft XSLT Processor
      4. Installing the Altova XSLT Engine
    4. Summary
  6. 2. The Obligatory Hello World Example
    1. Goals of This Chapter
    2. Transforming Hello World
      1. Our Sample Document
      2. A Sample Stylesheet
      3. Transforming the XML Document
      4. Stylesheet Results
    3. How a Stylesheet Is Processed
      1. Parsing the Stylesheet
      2. Parsing the Transformee
      3. Lather, Rinse, Repeat
      4. Walking Through Our Example
    4. Stylesheet Structure
      1. The <xsl:stylesheet> Element
      2. The <xsl:output> Element
      3. Our First <xsl:template>
      4. The <xsl:template> for <greeting> Elements
      5. Built-in Template Rules
        1. Built-in template rule for element and document nodes
        2. Built-in template rule for modes
        3. Built-in template rule for text and attribute nodes
        4. Built-in template rule for comment and processing instruction nodes
        5. Built-in template rule for namespace nodes
      6. Top-Level Elements
      7. Other Approaches
    5. Sample Gallery
      1. The Hello World SVG File
      2. The Hello World PDF File
      3. The Hello World Java Program
      4. The Hello World VRML File
    6. Summary
  7. 3. XPath: A Syntax for Describing Needles and Haystacks
    1. The XPath Data Model
      1. Node Types
        1. The root node
        2. Element nodes
        3. Attribute nodes
        4. Text nodes
        5. Comment nodes
        6. Processing instruction nodes
        7. Namespace nodes
      2. Node Tests
        1. [2.0] New node tests in XPath 2.0
      3. [2.0] Sequences and Atomic Values
    2. Location Paths
      1. The Context
        1. [1.0] The XPath 1.0 context
        2. [2.0] The XPath 2.0 context
      2. Simple Location Paths
      3. Relative and Absolute Expressions
      4. Selecting Things Besides Elements with Location Paths
        1. Selecting attributes
        2. Selecting the text of an element
        3. Selecting comments, processing instructions, and namespace nodes
      5. Using Wildcards
      6. Axes
        1. Unabbreviated syntax
        2. Axis roll call
      7. Predicates
        1. Numbers in predicates
        2. Functions in predicates
    3. Attribute Value Templates
    4. Datatypes
      1. Datatypes in XPath 1.0
      2. Datatypes in XPath 2.0
    5. XPath Operators
      1. Mathematical Operators
        1. Addition (+)
        2. Subtraction (–)
        3. Multiplication (*)
        4. Division (div)
        5. [2.0] Integer division (idiv)
        6. Modulo (mod)
        7. Unary minus (–x)
        8. Unary plus (+x)
      2. Boolean Operators
        1. Comparing expressions
        2. [2.0] Comparing atomic values
        3. [2.0] Comparing sequences
      3. [2.0] Conditional Expressions—if, then, and else
      4. [2.0] Iterators Over Sequences—The for Operator
      5. [2.0] Quantified Expressions—some and every
      6. [2.0] Range Expressions—The to Operator
      7. [2.0] Constructor Functions
      8. [2.0] Datatype Operators—instance of, castable as, cast as, and treat as
        1. instance of
        2. cast as
        3. castable as
        4. treat as
      9. [2.0] Set Operators—except, intersect, and union
        1. except
        2. intersect
        3. union
      10. [2.0] Node Operators
        1. The is operator
        2. node-after (>>)
        3. node-before (<<)
    6. [2.0] Comments in XPath Expressions
    7. [2.0] Types of XSLT 2.0 Processors
    8. The XPath View of an XML Document
      1. Output View
      2. The Stylesheet
    9. Summary
  8. 4. Creating Output
    1. Goals of This Chapter
    2. Generating Text
      1. Creating Simple Text
      2. Outputting the Value of Something
      3. [2.0] Changes to <xsl:value-of> in XSLT 2.0
    3. Numbering Things
      1. [2.0] Changes to <xsl:number> in XSLT 2.0
    4. Formatting Decimal Numbers
    5. [2.0] Formatting Dates and Times
    6. Using <xsl:copy> and <xsl:copy-of>
      1. A Stylesheet That Reproduces Its Input Document
      2. A Stylesheet That Doesn’t Quite Reproduce Its Input Document
    7. Dealing with Whitespace
      1. Whitespace Basics
      2. Using <xsl:preserve-space> and <xsl:strip-space>
      3. The normalize-space() Function
      4. A Simple Technique for Adding Whitespace to Text Output
    8. Summary
  9. 5. Branching and Control Elements
    1. Goals of This Chapter
    2. Branching Elements of XSLT
      1. The <xsl:if> Element
        1. Converting to boolean values
        2. Boolean examples
      2. The <xsl:choose> Element
        1. <xsl:choose> example
      3. The <xsl:for-each> Element
        1. <xsl:for-each> example
    3. Invoking Templates by Name
      1. How It Works
      2. Templates à la mode
    4. Parameters
      1. Defining a Parameter in a Template
      2. Passing Parameters
      3. Global Parameters
        1. Setting global parameters in a Java program
        2. Setting global parameters in .NET
      4. [2.0] Important Differences in XSLT 2.0
        1. New values for the mode attribute
        2. Undefined parameters are illegal
        3. Required parameters
        4. Datatyping support
        5. Tunnel parameters
    5. Variables
      1. Are These Things Really Variables?
      2. Variable Scope
    6. Using Recursion to Do Most Anything
      1. Implementing a String Replace Function
        1. Procedural design
        2. Recursive design
      2. [2.0] Using the XPath 2.0 replace() Function to Avoid Recursion
    7. A Stylesheet That Emulates a for Loop
      1. Template Design
      2. Implementation
      3. The Complete Example
    8. Summary
  10. 6. Creating Links and Cross-References
    1. Using the XML ID, IDREF, and IDREFS Datatypes
      1. The Datatypes and How They Work
      2. Linking Parts of an XML Document
      3. A Stylesheet That Uses the id() Function
      4. [2.0] The idref() Function
      5. Generating HTML Documents with Links
      6. Limitations of IDs
    2. XSLT’s Key Facility
      1. Defining a Key with <xsl:key>
      2. Generating Links with the key() Function
      3. Advantages of the key() Function
    3. Generating Links in Unstructured Documents
      1. An Unstructured XML Document in Need of Links
      2. The generate-id() Function
    4. Summary
  11. 7. Sorting and Grouping Elements
    1. Sorting Data with <xsl:sort>
      1. Our First Example
      2. The Details on the <xsl:sort> Element
        1. What’s the deal with that syntax?
        2. Attributes
        3. Where can you use <xsl:sort>?
      3. Another Example
    2. [2.0] The <xsl:perform-sort> Element
    3. Grouping Nodes
      1. Our First Attempt
      2. A Brute-Force Approach
      3. Grouping with <xsl:variable>
      4. The <xsl:key> Approach
    4. [2.0] New Grouping Syntax in XSLT 2.0
      1. The Most Common Grouping Style: group-by
      2. Another Type of Grouping: group-adjacent
      3. Grouping Using group-starting-with
      4. Grouping Using group-ending-with
    5. Summary
  12. 8. Combining Documents
    1. The document() Function
      1. An Aside: Doing Math with Recursion
        1. Recursive design
        2. Using format-number() to control output
      2. Base URIs and the document() Function
    2. The document() Function and Sorting
    3. Implementing Lookup Tables
    4. Grouping Across Multiple Documents
    5. [2.0] Using XSLT 2.0 to Simplify Things
      1. Grouping by Distinct Values
      2. Doing Math Without Recursion
      3. Implementing Lookup Tables with <xsl:function>
      4. Using if Instead of <xsl:choose>
      5. Using the format-date() Function
      6. The Complete XSLT 2.0 Solution
    6. [2.0] The doc() and doc-available() Functions
    7. [2.0] The collection() Function
    8. [2.0] The unparsed-text() and unparsed-text-available() Functions
    9. Summary
  13. 9. Extending XSLT
    1. The XSLT Extension Mechanism
      1. Extension Elements
      2. Extension Functions
      3. Fallback Processing
      4. Namespaces for Extensions
    2. [2.0] Creating New Functions with <xsl:function>
    3. Example: Generating Multiple Output Files
    4. Creating Custom Collations
      1. Using a Custom Collation for Sorting
      2. Using a Custom Collation for Comparing Text
    5. Generating Hidden Word Graphics
      1. Java Version
      2. .NET Version
    6. Example: Generating an SVG Pie Chart
    7. Writing Extensions in Other Languages
      1. Jython
      2. JRuby
      3. JavaScript
      4. Jacl
    8. Using Extension Functions from the EXSLT Library
    9. Accessing a Database with an Extension Element
      1. Accessing a Database in Saxon
      2. Accessing a Database in Xalan
    10. Creating a Photo Album with an Extension Element
      1. Xalan Java Version
      2. Saxon Java Version
      3. .NET Version
    11. Summary
  14. A. XSLT Reference
    1. [2.0] Attributes common to all XSLT elements
    2. [2.0] <xsl:analyze-string>
    3. <xsl:apply-imports>
    4. <xsl:apply-templates>
    5. <xsl:attribute>
    6. <xsl:attribute-set>
    7. <xsl:call-template>
    8. [2.0] <xsl:character-map>
    9. <xsl:choose>
    10. <xsl:comment>
    11. <xsl:copy>
    12. <xsl:copy-of>
    13. <xsl:decimal-format>
    14. [2.0] <xsl:document>
    15. <xsl:element>
    16. <xsl:fallback>
    17. <xsl:for-each>
    18. [2.0] <xsl:for-each-group>
    19. [2.0] <xsl:function>
    20. <xsl:if>
    21. <xsl:import>
    22. [2.0 – Schema] <xsl:import-schema>
    23. <xsl:include>
    24. <xsl:key>
    25. [2.0] <xsl:matching-substring>
    26. <xsl:message>
    27. [2.0] <xsl:namespace>
    28. <xsl:namespace-alias>
    29. [2.0] <xsl:next-match>
    30. [2.0] <xsl:non-matching-substring>
    31. <xsl:number>
    32. <xsl:otherwise>
    33. <xsl:output>
    34. [2.0] <xsl:output-character>
    35. <xsl:param>
    36. [2.0] <xsl:perform-sort>
    37. <xsl:preserve-space>
    38. <xsl:processing-instruction>
    39. [2.0] <xsl:result-document>
    40. [2.0] <xsl:sequence>
    41. <xsl:sort>
    42. <xsl:strip-space>
    43. <xsl:stylesheet>
    44. <xsl:template>
    45. <xsl:text>
    46. <xsl:transform>
    47. <xsl:value-of>
    48. <xsl:variable>
    49. <xsl:when>
    50. <xsl:with-param>
  15. B. XPath Reference
    1. XPath Node Types
      1. The Root Node
      2. Element Nodes
      3. Attribute Nodes
      4. Text Nodes
      5. Comment Nodes
      6. Processing-Instruction Nodes
      7. Namespace Nodes
    2. XPath Node Tests
    3. XPath Axes
    4. The XPath Context
    5. XPath 1.0 Datatypes
    6. [2.0] XPath 2.0 Datatypes
    7. Operators and Keywords
    8. Operator Precedence—XPath 1.0
    9. [2.0] Operator Precedence—XQuery 1.0 and XPath 2.0
  16. C. XSLT, XPath, and XQuery Function Reference
    1. Kinds of Functions
      1. Accessor Functions
      2. Boolean Functions
      3. Constructor Functions
      4. Context Functions
      5. Cross-Referencing and Grouping Functions
      6. Date, Time, and Duration Functions
      7. Node Functions
      8. Numeric Functions
      9. QName Functions
      10. Regular Expression Functions
      11. Sequence or Node-Set Functions
      12. String Functions
      13. Miscellaneous Functions
      14. Collation Functions
        1. [2.0] abs()
        2. [2.0] adjust-date-to-timezone()
        3. [2.0] adjust-dateTime-to-timezone()
        4. [2.0] adjust-time-to-timezone()
        5. [2.0] avg()
        6. [2.0] base-uri()
        7. boolean()
        8. ceiling()
        9. [2.0] codepoint-equal()
        10. [2.0] codepoints-to-string()
        11. [2.0] collection()
        12. [2.0] compare()
        13. concat()
        14. contains()
        15. count()
        16. current()
        17. [2.0] current-date()
        18. [2.0] current-dateTime()
        19. [2.0] current-group()
        20. [2.0] current-grouping-key()
        21. [2.0] current-time()
        22. [2.0] data()
        23. [2.0] dateTime()
        24. [2.0] day-from-date()
        25. [2.0] day-from-dateTime()
        26. [2.0] days-from-duration()
        27. [2.0] deep-equal()
        28. [2.0] default-collation()
        29. [2.0] distinct-values()
        30. [2.0] doc()
        31. [2.0] doc-available()
        32. document()
        33. [2.0] document-uri()
        34. element-available()
        35. [2.0] empty()
        36. [2.0] encode-for-uri()
        37. [2.0] ends-with()
        38. [2.0] error()
        39. [2.0] escape-html-uri()
        40. [2.0] exactly-one()
        41. [2.0] exists()
        42. false()
        43. floor()
        44. [2.0] format-date()
        45. [2.0] format-dateTime()
        46. format-number()
        47. [2.0] format-time()
        48. function-available()
        49. generate-id()
        50. [2.0] hours-from-dateTime()
        51. [2.0] hours-from-duration()
        52. [2.0] hours-from-time()
        53. id()
        54. [2.0] idref()
        55. [2.0] implicit-timezone()
        56. [2.0] in-scope-prefixes()
        57. [2.0] index-of()
        58. [2.0] insert-before()
        59. [2.0] iri-to-uri()
        60. key()
        61. lang()
        62. last()
        63. local-name()
        64. [2.0] local-name-from-QName()
        65. [2.0] lower-case()
        66. [2.0] matches()
        67. [2.0] max()
        68. [2.0] min()
        69. [2.0] minutes-from-dateTime()
        70. [2.0] minutes-from-duration()
        71. [2.0] minutes-from-time()
        72. [2.0] month-from-date()
        73. [2.0] month-from-dateTime()
        74. [2.0] months-from-duration()
        75. name()
        76. namespace-uri()
        77. [2.0] namespace-uri-for-prefix()
        78. [2.0] namespace-uri-from-QName()
        79. [2.0 – Schema] nilled()
        80. [2.0] node-name()
        81. normalize-space()
        82. [2.0] normalize-unicode()
        83. not()
        84. number()
        85. [2.0] one-or-more()
        86. position()
        87. [2.0] prefix-from-QName()
        88. [2.0] QName()
        89. [2.0] regex-group()
        90. [2.0] remove()
        91. [2.0] replace()
        92. [2.0] resolve-QName()
        93. [2.0] resolve-uri()
        94. [2.0] reverse()
        95. [2.0] root()
        96. round()
        97. [2.0] round-half-to-even()
        98. [2.0] seconds-from-dateTime()
        99. [2.0] seconds-from-duration()
        100. [2.0] seconds-from-time()
        101. starts-with()
        102. [2.0] static-base-uri()
        103. string()
        104. [2.0] string-join()
        105. string-length()
        106. [2.0] string-to-codepoints()
        107. [2.0] subsequence()
        108. substring()
        109. substring-after()
        110. substring-before()
        111. sum()
        112. system-property()
        113. [2.0] timezone-from-date()
        114. [2.0] timezone-from-dateTime()
        115. [2.0] timezone-from-time()
        116. [2.0] tokenize()
        117. [2.0] trace()
        118. translate()
        119. true()
        120. [2.0] type-available()
        121. [2.0] unordered()
        122. [2.0] unparsed-entity-public-id()
        123. unparsed-entity-uri()
        124. [2.0] unparsed-text()
        125. [2.0] unparsed-text-available()
        126. [2.0] upper-case()
        127. [2.0] year-from-date()
        128. [2.0] year-from-dateTime()
        129. [2.0] years-from-duration()
        130. [2.0] zero-or-one()
  17. D. XML Schema Overview
    1. Declaring Elements and Attributes
      1. Creating an Empty Element
      2. Creating an Empty Element with Attributes
      3. Creating an Element with Text
      4. Creating an Element with Text and Attributes
      5. Creating an Element with Mixed Content
    2. Defining Datatypes
      1. Anonymous Types
      2. Groups
      3. Creating New Datatypes by Restriction
      4. Creating New Datatypes by Extension
      5. Casting Between Datatypes
      6. Creating List Types
      7. Creating Union Types
      8. Substitution Groups
      9. Abstract Elements and Datatypes
    3. Using an XML Schema in a Stylesheet
      1. Importing XML Schemas with <xsl:import-schema>
      2. Using XML Schemas Without Namespaces
      3. Using XML Schemas with Namespaces
  18. E. [2.0] Regular Expressions
    1. Simple Expressions
    2. Subexpressions
    3. Quantifiers
    4. [XPath] Reluctant Quantifiers
    5. Processing Modes
    6. [XPath] Anchors
    7. Back-references
    8. Metacharacters
    9. Single-Character Escapes
    10. Multiple-Character Escapes
    11. Character Groups
      1. Letters
      2. Marks
      3. Numbers
      4. Punctuation
      5. Separators
      6. Symbols
      7. Everything Else
      8. Block Escapes
  19. F. XSLT Formatting Codes
    1. Formatting Codes for Numbers
      1. Parts of Numbers
      2. Parts of Decimal Formats
    2. Formatting Codes for Dates and Times
      1. Parts of Dates and Times
      2. Presentation Modifiers
      3. Calendars
  20. G. XSLT 2.0 Migration Guide
    1. Powerful New Features in XSLT 2.0 and XPath 2.0
      1. Recursion Isn’t Necessary Nearly as Often
      2. Grouping Is Much, Much Easier
      3. Datatypes and XML Schemas Are Supported
      4. Regular Expressions Are Supported
    2. Potential Errors
      1. Passing Undefined Parameters with <xsl:call-template> Causes an Error
      2. Math Works Differently in Some Cases
      3. Type Checking Is Much Stricter
      4. Calling Some Functions with More Than One Node Causes an Error
    3. Approaches to Migration
      1. Write (or Rewrite) Your Stylesheets from Scratch
      2. Change the Version to 2.0 and See What Happens
      3. Replace Awkward XSLT 1.0 Code with XSLT 2.0 Features
      4. Mix XSLT 1.0 and XSLT 2.0 in the Same Stylesheet
      5. Don’t Migrate at All
  21. Glossary
  22. Index
  23. Colophon
  24. Copyright