Metacharacters and Metasymbols

Now that we've admired all the fancy cages, we can go back to looking at the critters in the cages, those funny-looking symbols you put inside the patterns. By now you'll have cottoned to the fact that these symbols aren't regular Perl code like function calls or arithmetic operators. Regular expressions are their own little language nestled inside of Perl. (There's a bit of the jungle in all of us.)

For all their power and expressivity, patterns in Perl recognize the same 12 traditional metacharacters (the Dirty Dozen, as it were) found in many other regular expression packages:

\ | ( ) [ { ^ $ * + ? .

Some of those bend the rules, making otherwise normal characters that follow them special. We don't like to call the longer sequences "characters", so when they make longer sequences, we call them metasymbols (or sometimes just "symbols"). But at the top level, those twelve metacharacters are all you (and Perl) need to think about. Everything else proceeds from there.
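To make the idea concrete, here is a minimal sketch of what "special" means for the twelve metacharacters. It assumes a backslash in front of any of them yields a pattern that matches that character literally, and it borrows Perl's built-in quotemeta function, which this section does not discuss, to do the escaping automatically. The variable names and the sample string are invented for illustration.

```perl
use strict;
use warnings;

# The twelve metacharacters from the list above.
my @meta = ('\\', '|', '(', ')', '[', '{', '^', '$', '*', '+', '?', '.');

# Put a backslash in front of each one and check that it now matches itself.
my $literal_matches = 0;
for my $char (@meta) {
    my $pattern = '\\' . $char;            # for example \. or \* or \\
    $literal_matches++ if $char =~ /^${pattern}$/;
}
print "escaped metacharacters matching themselves: ",
      "$literal_matches of ", scalar(@meta), "\n";

# Left unescaped, + is a quantifier, so this pattern does not match its own text.
my $text         = '1+1=2';                # hypothetical sample string
my $unescaped_ok = ($text =~ /^1+1=2$/)  ? 1 : 0;
my $escaped_ok   = ($text =~ /^1\+1=2$/) ? 1 : 0;

# quotemeta adds the backslashes for us.
my $quoted    = quotemeta($text);
my $quoted_ok = ($text =~ /^${quoted}$/) ? 1 : 0;

print "unescaped: $unescaped_ok, escaped: $escaped_ok, quotemeta: $quoted_ok\n";
```

Run as is, the loop should count all twelve, and only the escaped and quotemeta'd patterns should match the sample string.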

Some simple metacharacters stand by themselves, like . and ^ and $. They don't directly affect anything around them. Some metacharacters work like prefix operators, governing what follows them, like \. Others work like postfix operators, governing what immediately precedes them, like *, +, and ?. One metacharacter, |, acts like an infix operator, standing between the operands it governs. There are even bracketing metacharacters that work like circumfix operators, governing something ...
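The operator analogy can be sketched with one small pattern per position. This is only an illustration: the sample strings and labels are made up, and the circumfix rows assume that parentheses and square brackets are the bracketing metacharacters in question.

```perl
use strict;
use warnings;

# Each entry: a label, a sample string, and a pattern expected to match it.
my @cases = (
    # Stand-alone: ^ and $ anchor the match; . matches any one character
    # (other than newline, by default).
    [ 'stand-alone . ^ $', 'cat',    qr/^c.t$/     ],
    # Prefix: \ governs the character after it, here making . literal.
    [ 'prefix \\',         'a.b',    qr/^a\.b$/    ],
    # Postfix: * + ? govern whatever comes immediately before them.
    [ 'postfix *',         'ct',     qr/^ca*t$/    ],
    [ 'postfix +',         'caaat',  qr/^ca+t$/    ],
    [ 'postfix ?',         'color',  qr/^colou?r$/ ],
    # Infix: | stands between the alternatives it governs.
    [ 'infix |',           'dog',    qr/cat|dog/   ],
    # Circumfix: brackets govern what they enclose.
    [ 'circumfix ( )',     'catcat', qr/^(cat)+$/  ],
    [ 'circumfix [ ]',     'cut',    qr/^c[aou]t$/ ],
);

my $passed = 0;
for my $case (@cases) {
    my ($label, $string, $pattern) = @$case;
    my $ok = $string =~ $pattern;
    $passed++ if $ok;
    printf "%-20s %-8s %s\n", $label, $string, $ok ? 'matches' : 'no match';
}
print "$passed of ", scalar(@cases), " patterns matched\n";
```

Every row is written so that its pattern matches its sample string, so the final count should equal the number of rows.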
