Alternation

Inside a pattern or subpattern, use the | metacharacter to specify a set of possibilities, any one of which could match. For instance:

/Gandalf|Saruman|Radagast/

matches Gandalf or Saruman or Radagast. The alternation extends only as far as the innermost enclosing parentheses (whether capturing or not):

/prob|n|r|l|ate/    # Match prob, n, r, l, or ate
/pro(b|n|r|l)ate/   # Match probate, pronate, prorate, or prolate
/pro(?:b|n|r|l)ate/ # Match probate, pronate, prorate, or prolate

The second and third forms match the same strings, but the second form captures the variant character in $1 and the third form does not.
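As a rough illustration of the three forms, the sketch below runs each against a sample word. The sample strings and variable names are assumptions chosen for the demonstration, and the outer parentheses in the first pattern are added only so the matched text can be inspected.

```perl
use strict;
use warnings;

# Ungrouped: the whole pattern is five separate alternatives,
# so a lone "r" anywhere in the string is enough to match.
my ($ungrouped) = "translate" =~ /(prob|n|r|l|ate)/;    # "r", the leftmost hit

# Capturing group: only the variant character alternates, and it lands in $1.
my $variant = "prorate" =~ /pro(b|n|r|l)ate/ ? $1 : undef;    # "r"

# Non-capturing group: the same strings match, but $1 is left undefined.
my $has_capture = ("prorate" =~ /pro(?:b|n|r|l)ate/ && defined($1)) ? 1 : 0;    # 0

print "ungrouped=$ungrouped variant=$variant has_capture=$has_capture\n";
```

If the capture is not needed, the non-capturing form avoids setting $1 at all.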

At any given position, the Engine tries to match the first alternative, and then the second, and so on. The relative length of the alternatives does not matter, which means that in this pattern:

/(Sam|Samwise)/

$1 will never be set to Samwise no matter what string it's matched against, because Sam will always match first. When one alternative is a prefix of another like this, put the longer one first.
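A minimal sketch of the ordering effect, using an assumed sample string and illustrative variable names:

```perl
use strict;
use warnings;

my $name = "Samwise Gamgee";    # sample input, chosen for illustration

# Shorter alternative listed first: it matches at once, so Samwise is never tried.
my ($short_first) = $name =~ /(Sam|Samwise)/;    # "Sam"

# Longer alternative listed first: it gets its chance before Sam does.
my ($long_first) = $name =~ /(Samwise|Sam)/;     # "Samwise"

print "short_first=$short_first long_first=$long_first\n";
```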

But the ordering of the alternatives only matters at a given position. The outer loop of the Engine does left-to-right matching, so the following always matches the first Sam:

"'Sam I am,' said Samwise" =~ /(Samwise|Sam)/;   # $1 eq "Sam"

But you can force right-to-left scanning by making use of greedy quantifiers, as discussed earlier in "Quantifiers":

"'Sam I am,' said Samwise" =~ /.*(Samwise|Sam)/; # $1 eq "Samwise"
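To make the difference between the two matches visible, this sketch records where the captured text starts, using the built-in @- array (the offset of group 1 is $-[1]). The variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "'Sam I am,' said Samwise";

# Leftmost match wins: the Engine stops at the first position
# where any alternative fits, which is the Sam right after the quote.
$text =~ /(Samwise|Sam)/ or die "no match";
my ($first, $first_at) = ($1, $-[1]);    # "Sam" at offset 1

# Greedy .* first consumes the whole string, then gives characters back
# until one of the alternatives fits, so the rightmost candidate is found.
$text =~ /.*(Samwise|Sam)/ or die "no match";
my ($last, $last_at) = ($1, $-[1]);      # "Samwise" at offset 17

printf "%s at %d, %s at %d\n", $first, $first_at, $last, $last_at;
```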

You can defeat any left-to-right (or right-to-left) matching ...
