Unicode Support

.NET provides built-in support for Unicode 3.1, and the \w, \d, and \s shorthand classes match the full Unicode range rather than only ASCII. Turning on ECMAScript mode restricts these classes to ASCII characters. Case-insensitive matching follows the casing rules of the current culture, as defined in Thread.CurrentCulture, unless the CultureInvariant option is set.
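As a minimal sketch of how these options change matching, the following C# program compares \d with and without ECMAScript mode, then runs a case-insensitive match with and without CultureInvariant. The class name and the choice of the tr-TR culture are illustrative assumptions, not something the text prescribes, and the culture-sensitive result can vary with the runtime version and the globalization data installed.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;
using System.Threading;

// Hypothetical demo class; the name is an assumption for illustration.
class UnicodeOptionsDemo
{
    static void Main()
    {
        // U+0663 ARABIC-INDIC DIGIT THREE belongs to Unicode category Nd.
        string arabicDigit = "\u0663";

        // Default mode: \d matches any Unicode decimal digit.
        Console.WriteLine(Regex.IsMatch(arabicDigit, @"\d"));                          // True

        // ECMAScript mode: \d is limited to the ASCII digits 0-9.
        Console.WriteLine(Regex.IsMatch(arabicDigit, @"\d", RegexOptions.ECMAScript)); // False

        // Assumption: a Turkish culture is available on this machine.
        // Turkish casing pairs "i" with dotted capital I (U+0130), not with "I".
        Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");

        // Culture-sensitive comparison: the result depends on the current culture.
        Console.WriteLine(Regex.IsMatch("I", "i", RegexOptions.IgnoreCase));

        // CultureInvariant ignores the current culture when comparing case.
        Console.WriteLine(Regex.IsMatch("I", "i",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant));                 // True
    }
}
```

The third line of output is left unannotated on purpose: under Turkish casing rules it is typically False, which is the situation CultureInvariant is meant to avoid.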

.NET supports the standard Unicode properties (see Table 2) and blocks. Only the short form of a property name is supported, such as Lu rather than Uppercase_Letter. Block names require the Is prefix and must use the simple name form, without spaces or underscores.
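The naming rules can be checked with a short C# sketch. It matches against a short-form property (Lu) and an Is-prefixed block name (IsGreek), then shows that a long-form property name is rejected when the pattern is parsed. The class name and sample strings are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the name is an assumption for illustration.
class UnicodePropertyDemo
{
    static void Main()
    {
        // Short-form general category: Lu = uppercase letter.
        Console.WriteLine(Regex.IsMatch("A", @"\p{Lu}"));           // True

        // Block name with the required Is prefix.
        // U+03A9 GREEK CAPITAL LETTER OMEGA lies in the Greek block.
        Console.WriteLine(Regex.IsMatch("\u03A9", @"\p{IsGreek}")); // True

        // The same letter is outside the Basic Latin block.
        Console.WriteLine(Regex.IsMatch("\u03A9", @"\p{IsBasicLatin}")); // False

        // Long-form property names are not recognized, so the pattern
        // fails to parse and the constructor throws ArgumentException.
        try
        {
            new Regex(@"\p{Uppercase_Letter}");
            Console.WriteLine("pattern accepted");
        }
        catch (ArgumentException)
        {
            Console.WriteLine("long-form property name rejected");
        }
    }
}
```

Because an unknown property or block name raises an exception at construction time, a misspelled name surfaces immediately instead of silently matching nothing.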
