XQuery, 2nd Edition by Priscilla Walmsley

Chapter 19. Regular Expressions

Regular expressions are patterns that describe strings. They can be used as arguments to four XQuery built-in functions: to determine whether a string value matches a particular pattern (matches), to replace the parts of a string that match a pattern (replace), to tokenize a string based on a delimiter pattern (tokenize), and to split a string into matching and non-matching parts (analyze-string). This chapter explains the regular expression syntax used by XQuery.
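As a quick orientation, the following is a minimal sketch of one call to each of the four functions. The input strings and patterns are made up for illustration, and the sketch assumes a processor that supports XQuery 3.0 or later, since analyze-string is not available in earlier versions.

```xquery
xquery version "3.1";

(: Illustrative values only; the strings and patterns are invented for this sketch. :)
(
  (: true, because "q" occurs somewhere in the string :)
  matches("query", "q"),

  (: "a#b#", each run of digits is replaced by "#" :)
  replace("a1b22", "[0-9]+", "#"),

  (: ("a", "b", "c"), split on a comma plus any following whitespace :)
  tokenize("a, b,c", ",\s*"),

  (: "a", "1", "b": the string values of the non-match and match elements :)
  analyze-string("a1b", "[0-9]")/*/string()
)
```

Note that analyze-string returns an element structure rather than a sequence of strings; the trailing path step here simply flattens it to text for readability.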

The Structure of a Regular Expression

The regular expression syntax of XQuery is based on that of XML Schema, with some additions. Regular expressions, also known as regexes, can be composed of a number of different parts: atoms, quantifiers, and branches.

Atoms

An atom is the most basic unit of a regular expression. It might describe a single character, such as d, or an escape sequence that represents one or more characters, like \s or \p{Lu}. It could also be a character class expression that represents a range or choice of several characters, such as [a-z]. These kinds of atoms are described later in this chapter.
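To make the three kinds of atoms concrete, here is a small sketch that uses each of the atoms mentioned above as a complete pattern. The test strings are invented for illustration.

```xquery
(
  (: single-character atom: true, "dog" contains a d :)
  matches("dog", "d"),

  (: escape sequence: true, \s matches the space between a and b :)
  matches("a b", "\s"),

  (: category escape: true, \p{Lu} matches the uppercase X :)
  matches("Xquery", "\p{Lu}"),

  (: character class expression: false, there is no lowercase letter to match [a-z] :)
  matches("XQUERY", "[a-z]")
)
```

Because matches looks for the pattern anywhere in the string, a single atom succeeds as soon as one character in the input satisfies it.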

Quantifiers

Atoms may indicate required, optional, or repeating strings. The number of times a matching string may appear is indicated by a quantifier, which appears directly after an atom. For example, to indicate that the letter d must appear one or more times, you can use the expression d+, where the + means “one or more.” The different quantifiers are listed in Table 19-1.
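The d+ expression can be tried out with a sketch like the following. It assumes the anchors ^ and $, which are among the XQuery additions to the XML Schema syntax, so that the pattern has to account for the whole string rather than just part of it. The test strings are invented for illustration.

```xquery
(
  matches("d", "^d+$"),    (: true: one d :)
  matches("ddd", "^d+$"),  (: true: several d's :)
  matches("", "^d+$"),     (: false: + requires at least one d :)
  matches("dad", "^d+$")   (: false: the a is not covered by d+ :)
)
```

Without the anchors, the last call would return true, because d+ would match the first d on its own.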

Table 19-1. ...
