Building Blocks

Every language has a set of basic components (words or parts of words) and a set of syntax rules for combining them. The “words” in rules are literal characters (or symbols), some metacharacters (or metasymbols), and escape sequences, while the combining syntax includes other metacharacters, quantifiers, bracketing characters, and assertions.

Metacharacters

The “word”-like metacharacters are ., ^, ^^, $, and $$. The . matches any single character, even a newline character. More precisely, what it matches by default is a Unicode grapheme, but you can change that behavior with a pragma in your code or a modifier on the rule. (We’ll discuss modifiers in Section 7.3 later in this chapter.) The ^ and $ metacharacters are zero-width matches on the beginning and end of a string. Each has a doubled alternate, ^^ and $$, that matches at the beginning and end of every line within a string.
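As a minimal sketch of the difference between the string anchors and the line anchors, assuming a present-day Raku (Perl 6) compiler such as Rakudo, whose syntax may differ in detail from the design described here. The variable name $text and the sample strings are made up for illustration.

```raku
# Hypothetical sample data: three lines in one string.
my $text = "one\ntwo\nthree";

# . matches any single character, including the newline between a and b.
say so "a\nb" ~~ / a . b /;     # True

# ^ anchors only at the very beginning of the string.
say so $text ~~ / ^ two /;      # False

# ^^ anchors at the beginning of any line within the string.
say so $text ~~ / ^^ two /;     # True

# $ anchors only at the very end of the string.
say so $text ~~ / two $ /;      # False

# $$ anchors at the end of any line within the string.
say so $text ~~ / two $$ /;     # True
```

The anchors consume no characters, so each pattern above still matches only the three letters of the word itself.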

The |, &, \, #, and := metacharacters are all syntax structure elements. The | is an alternation between two options. The & matches two patterns simultaneously (the patterns must be the same length). The \ turns literal characters into metacharacters (the escape sequences) or turns metacharacters into literal characters. The # marks a comment to the end of the line. Whitespace insensitivity (the old /x modifier) is on by default, so you can start a comment at any point on any line in a rule. Just make sure you don’t comment out the symbol that terminates the rule. The := binds a hypothetical ...
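The alternation, escape, and comment metacharacters can be sketched together in one pattern. This again assumes a current Raku (Perl 6) compiler rather than the exact syntax of the design described here, and the name $rule and the test strings are invented for the example. It leaves out & and := on purpose.

```raku
# Whitespace inside the pattern is ignored, so it can span several
# lines and carry a comment on each one.
my $rule = /
    ^ [ cat | dog ]   # alternation between two options, grouped by brackets
    \.                # backslash makes the metacharacter . a literal dot
    \d+               # backslash makes the literal d a digit escape
    $                 # the closing delimiter sits on its own line below
/;

say so "cat.42" ~~ $rule;   # True
say so "dog.7"  ~~ $rule;   # True
say so "dogx42" ~~ $rule;   # False: x is not a literal dot
```

Putting the closing delimiter on its own line is one way to avoid accidentally commenting out the symbol that terminates the rule.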
