Back-References

XQuery supports the use of back-references. Back-references allow you to ensure that certain characters in a string match each other. For example, suppose you want to ensure that a string is a product number delimited by either single or double quotes. The product number must be three digits, followed by a dash, followed by two uppercase letters. You could write the expression:

('|")\d{3}-[A-Z]{2}('|")

However, this would allow a string that starts with a single quote and ends with a double quote. You want to be sure the quotes match. You could write the expression:

'\d{3}-[A-Z]{2}'|"\d{3}-[A-Z]{2}"

but this requires repeating the entire pattern for the product number. Instead, you can parenthesize the expression representing the quotes and refer back to it using an escaped digit. For example, the expression:

('|")\d{3}-[A-Z]{2}\1

is equivalent to the prior example, but it is shorter and simpler. The atom \1 is a back-reference to the first parenthesized sub-expression, namely ('|"). It does not re-apply that sub-expression; it matches only the exact characters that the sub-expression matched. This means that the regular expression does not match a string that starts with a single quote and ends with a double quote.
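The difference between the three expressions can be checked with a small experiment. The sketch below uses Python's re module as a stand-in, on the assumption that it treats these particular patterns the same way; it is not XQuery code, and the two regex dialects differ in other respects. The names PRODUCT, QUOTE, LOOSE, REPEATED, BACKREF, and accepts are hypothetical, introduced only for this illustration.

```python
import re

PRODUCT = r"\d{3}-[A-Z]{2}"   # three digits, a dash, two uppercase letters
QUOTE = r"""('|")"""          # parenthesized group: a single or a double quote

LOOSE = QUOTE + PRODUCT + QUOTE                     # closing quote chosen independently
REPEATED = "'" + PRODUCT + "'|\"" + PRODUCT + "\""  # product pattern written out twice
BACKREF = QUOTE + PRODUCT + r"\1"                   # closing quote must repeat group 1


def accepts(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


samples = ["'123-AB'", '"123-AB"', "'123-AB\""]
results = {
    s: (accepts(LOOSE, s), accepts(REPEATED, s), accepts(BACKREF, s))
    for s in samples
}
for text, outcome in results.items():
    print(text, outcome)
```

Under these assumptions, all three patterns accept the two strings with matching quotes, and only the first pattern accepts the string that opens with a single quote and closes with a double quote.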

The parenthesized sub-expressions are numbered in order from left to right based on the position of the opening parenthesis, starting with 1 (not 0). You can reference any of them by number. You can use as many digits as you want, provided ...
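The left-to-right numbering by opening parenthesis matters most when sub-expressions are nested. The following sketch again uses Python's re module as a stand-in, since it numbers groups by the same rule; the pattern and the sample strings are made up for illustration and do not come from the text.

```python
import re

# Opening parentheses, left to right:
#   group 1 is the outer pair (digit plus letter),
#   group 2 is the digit, group 3 is the letter.
PATTERN = r"((\d)([A-Z]))-\3\2"

m = re.fullmatch(PATTERN, "7Q-Q7")
groups = m.groups() if m else None
print(groups)                 # ('7Q', '7', 'Q')

# \3\2 demands letter-then-digit after the dash, so the same order fails.
same_order = re.fullmatch(PATTERN, "7Q-7Q")
print(same_order is None)     # True

# \1 refers to the outer group, so it repeats the whole "7Q".
whole = re.fullmatch(r"((\d)([A-Z]))-\1", "7Q-7Q")
print(whole is not None)      # True
```

The outer group receives number 1 because its opening parenthesis comes first, even though it closes last.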
