Unicode Support
Perl provides built-in support for Unicode 3.2, including full support in the \w, \d, \s, and \b metasequences.
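As a minimal sketch of what this means in practice, the snippet below runs \w, \d, \s, and \b against a string of non-ASCII characters. The sample string and variable names are illustrative assumptions, not from the text. Every character is above U+00FF, so Perl applies Unicode rules to the string without any extra pragmas.

```perl
use strict;
use warnings;

# Built from code points so the result does not depend on the file's encoding:
# Greek alpha, beta, gamma; EM SPACE; ARABIC-INDIC DIGITS THREE and FOUR.
my $text = "\x{3B1}\x{3B2}\x{3B3}\x{2003}\x{0663}\x{0664}";

my ($word)    = $text =~ /^(\w+)/;       # \w matches the Greek letters
my ($digits)  = $text =~ /(\d+)/;        # \d matches the Arabic-Indic digits
my $has_space = $text =~ /\s/ ? 1 : 0;   # \s matches EM SPACE
my @tokens    = $text =~ /\b\w+\b/g;     # \b finds boundaries around both runs

# Print code points rather than raw characters to avoid "Wide character" warnings.
sub code_points {
    my ($str) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split(//, $str);
}

print "word:   ", code_points($word),   "\n";   # U+03B1 U+03B2 U+03B3
print "digits: ", code_points($digits), "\n";   # U+0663 U+0664
print "space:  $has_space\n";                   # 1
print "tokens: ", scalar(@tokens), "\n";        # 2
```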
The following constructs respect the current locale if use locale is in effect: case-insensitive (i) mode, \L, \l, \U, \u, \w, and \W.
Perl supports the standard Unicode properties (see Table 1-3) as well as Perl-specific composite properties (see Table 1-10). Scripts and properties may have an Is prefix, but do not require it. Blocks require an In prefix only if the block name conflicts with a script name.
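The prefix rules can be sketched with a short test. Katakana is used here as an illustrative assumption because it names both a script and a block; the sample characters and the %match hash are likewise hypothetical. U+30A2 lies in the Katakana block, while U+31F0 belongs to the Katakana script but sits in a different block (Katakana Phonetic Extensions).

```perl
use strict;
use warnings;

my $letter_a = "\x{30A2}";   # KATAKANA LETTER A: Katakana script, Katakana block
my $small_ku = "\x{31F0}";   # KATAKANA LETTER SMALL KU: Katakana script, other block

my %match = (
    # General category: the Is prefix is optional.
    'A ~ Lu'             => ("A" =~ /\p{Lu}/               ? 1 : 0),
    'A ~ IsLu'           => ("A" =~ /\p{IsLu}/             ? 1 : 0),
    # Script: the Is prefix is optional here too.
    'a ~ Katakana'       => ($letter_a =~ /\p{Katakana}/   ? 1 : 0),
    'a ~ IsKatakana'     => ($letter_a =~ /\p{IsKatakana}/ ? 1 : 0),
    # Block: the In prefix separates the block from the same-named script.
    'a ~ InKatakana'     => ($letter_a =~ /\p{InKatakana}/ ? 1 : 0),
    'ku ~ Katakana'      => ($small_ku =~ /\p{Katakana}/   ? 1 : 0),
    'ku ~ InKatakana'    => ($small_ku =~ /\p{InKatakana}/ ? 1 : 0),
);

printf "%-18s %d\n", $_, $match{$_} for sort keys %match;
```

Only the last entry should report 0: the character is in the Katakana script but outside the Katakana block.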
Table 1-10. Perl composite Unicode properties
Property | Equivalent |
---|---|
IsASCII | [\x00-\x7f] |
IsAlnum | [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}] |
IsAlpha | [\p{Ll}\p{Lu}\p{Lt}\p{Lo}] |
IsCntrl | \p{C} |
IsDigit | \p{Nd} |
IsGraph | [^\p{C}\p{Space}] |
IsLower | \p{Ll} |
IsPrint | \P{C} |
IsPunct | \p{P} |
IsSpace | [\t\n\f\r\p{Z}] |
IsUpper | [\p{Lu}\p{Lt}] |
IsWord | [_\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}] |
IsXDigit | [0-9a-fA-F] |
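One way to read the table is to match the same character against a composite property and against its listed equivalent and compare the two results. The sketch below does this for a few hand-picked sample characters. The property names (IsASCII, IsDigit, and so on), the samples, and the compare_samples helper are assumptions made for illustration. The table describes the Perl version covered by the text, and other Perl versions may define some composite properties differently, so agreement on these samples does not prove the classes are identical.

```perl
use strict;
use warnings;

# Each row: property name, composite-property pattern, table equivalent, sample.
my @rows = (
    [ 'IsASCII',  qr/^\p{IsASCII}\z/,  qr/^[\x00-\x7f]\z/, "A"        ],
    [ 'IsASCII',  qr/^\p{IsASCII}\z/,  qr/^[\x00-\x7f]\z/, "\x{E9}"   ],
    [ 'IsDigit',  qr/^\p{IsDigit}\z/,  qr/^\p{Nd}\z/,      "\x{0663}" ],
    [ 'IsLower',  qr/^\p{IsLower}\z/,  qr/^\p{Ll}\z/,      "\x{E9}"   ],
    [ 'IsPunct',  qr/^\p{IsPunct}\z/,  qr/^\p{P}\z/,       "!"        ],
    [ 'IsXDigit', qr/^\p{IsXDigit}\z/, qr/^[0-9a-fA-F]\z/, "f"        ],
    [ 'IsXDigit', qr/^\p{IsXDigit}\z/, qr/^[0-9a-fA-F]\z/, "g"        ],
);

# Hypothetical helper: returns one hash per row with both match results.
sub compare_samples {
    my @results;
    for my $row (@rows) {
        my ($name, $composite, $equivalent, $char) = @$row;
        push @results, {
            name       => $name,
            code_point => sprintf('U+%04X', ord($char)),
            composite  => ($char =~ $composite  ? 1 : 0),
            equivalent => ($char =~ $equivalent ? 1 : 0),
        };
    }
    return @results;
}

for my $r (compare_samples()) {
    printf "%-9s %-7s composite=%d equivalent=%d\n",
        $r->{name}, $r->{code_point}, $r->{composite}, $r->{equivalent};
}
```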