utf8

Although Perl’s Unicode support is incomplete, the use utf8 pragma tells the Perl parser to allow UTF-8 in the program text within the current lexical scope. no utf8 switches back to treating the program text as literal bytes for the rest of that scope. You’ll probably need the pragma only for compatibility, since future versions of Perl are expected to standardize on the UTF-8 encoding for source text.

use utf8 has the following effects: bytes with the high bit set in the program text (in identifiers, string constants, constant regular expressions, and package names) are treated as literal UTF-8 characters, and regular expressions within the scope of the pragma default to character semantics instead of byte semantics. For example:

@bytes_or_chars = split //, $data;  # May split into bytes if
                                    # $data isn't UTF-8
{
    use utf8;                       # Forces char semantics
    @chars = split //, $data;       # Splits into characters
}
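
The effect on string constants can be seen by measuring the same literal inside and outside the pragma's scope. The following is only a minimal sketch: it assumes the script is saved as UTF-8 and run under Perl 5.8 or later, and the variable names $char_len and $byte_len are invented for illustration.

```perl
use strict;
use warnings;

# Assumption: this file is saved in the UTF-8 encoding.
my ($char_len, $byte_len);

{
    use utf8;                # literals in this block are UTF-8 characters
    $char_len = length("é"); # one character (U+00E9)
}

{
    no utf8;                 # literals in this block are raw bytes
    $byte_len = length("é"); # two bytes (0xC3 0xA9)
}

print "with use utf8: $char_len\n";
print "with no utf8:  $byte_len\n";
```

Because the pragma is lexically scoped, each setting ends at the closing brace of its block, so the two measurements do not interfere with each other.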
