Unicode Support

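This package supports Unicode 4.0, although the shorthand classes \w, \W, \d, \D, \s, and \S match only ASCII characters. For Unicode-aware matching, use the corresponding properties \p{L}, \P{L}, \p{Nd}, \P{Nd}, \p{Z}, and \P{Z}. These are close counterparts rather than exact equivalents: \p{L} covers letters only (not digits or the underscore), and \p{Z} covers separators only, so it does not match tab or newline. The word boundary sequences, \b and \B, do understand Unicode.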
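
The difference between the ASCII-only shorthands and the Unicode properties is easiest to see side by side. The following is a minimal sketch, assuming the package in question is Java's java.util.regex (this excerpt does not name it) and that no Unicode-related compile flags are set. The class and method names are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex with default flags.
class UnicodeClassDemo {
    // Returns true if the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String lambda = "\u03bb";      // U+03BB GREEK SMALL LETTER LAMDA
        String arabicThree = "\u0663"; // U+0663 ARABIC-INDIC DIGIT THREE
        String nbsp = "\u00a0";        // U+00A0 NO-BREAK SPACE (category Zs)

        // Shorthand classes see only ASCII; the properties see Unicode.
        System.out.println(matches("\\w", lambda));          // false
        System.out.println(matches("\\p{L}", lambda));       // true
        System.out.println(matches("\\d", arabicThree));     // false
        System.out.println(matches("\\p{Nd}", arabicThree)); // true
        System.out.println(matches("\\s", nbsp));            // false
        System.out.println(matches("\\p{Z}", nbsp));         // true

        // \p{Z} is not a superset of \s: tab is a control character.
        System.out.println(matches("\\s", "\t"));            // true
        System.out.println(matches("\\p{Z}", "\t"));         // false
    }
}
```

Because the two families overlap without being equal, a pattern that must accept both ASCII whitespace and Unicode separators can combine them in one character class, for example [\s\p{Z}].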

For supported Unicode properties and blocks, see Table 2. This package supports only the short property names, such as \p{Lu}, and not long forms such as \p{Uppercase_Letter}. Block names require the In prefix and must be written without spaces or underscores: for example, \p{InGreekExtended}, not \p{In_Greek_Extended} or \p{In Greek Extended}.
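
As a rough illustration of the accepted forms, the sketch below compiles one short property name and one block name. It again assumes java.util.regex, which this excerpt does not name, and the class and method names are hypothetical. It deliberately exercises only the spellings the text describes as supported.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes java.util.regex.
class UnicodeBlockDemo {
    // Block name: In prefix, no spaces or underscores.
    static final Pattern GREEK_EXTENDED = Pattern.compile("\\p{InGreekExtended}");
    // Short general-category name for uppercase letters.
    static final Pattern UPPERCASE = Pattern.compile("\\p{Lu}");

    static boolean inGreekExtended(String s) {
        return GREEK_EXTENDED.matcher(s).matches();
    }

    static boolean isUppercaseLetter(String s) {
        return UPPERCASE.matcher(s).matches();
    }

    public static void main(String[] args) {
        // U+1F00 lies in the Greek Extended block (U+1F00 to U+1FFF).
        System.out.println(inGreekExtended("\u1f00"));   // true
        // U+03B1 (alpha) belongs to the Greek and Coptic block instead.
        System.out.println(inGreekExtended("\u03b1"));   // false
        // U+039B is an uppercase letter; U+03BB is lowercase.
        System.out.println(isUppercaseLetter("\u039b")); // true
        System.out.println(isUppercaseLetter("\u03bb")); // false
    }
}
```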
