Patterns allow you to group portions of your pattern together into subpatterns and to remember the strings matched by those subpatterns. We call the first behavior clustering and the second one capturing.
To capture a substring for later use, put parentheses around the subpattern that matches it. The first pair of parentheses stores its substring in $1, the second pair in $2, and so on. You may use as many parentheses as you like; Perl just keeps defining more numbered variables for you to represent these captured strings.
    /(\d)(\d)/  # Match two digits, capturing them into $1 and $2
    /(\d+)/     # Match one or more digits, capturing them all into $1
    /(\d)+/     # Match a digit one or more times, capturing the last into $1
Note the difference between the second and third patterns. The second form is usually what you want. The third form does not create multiple variables for multiple digits. Parentheses are numbered when the pattern is compiled, not when it is matched.
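As a rough illustration of the three patterns above, the following sketch runs each of them against the same string and copies the captures into ordinary variables. The sample string and the variable names are assumptions made for this example only.

```perl
use strict;
use warnings;

my $text = "order 4711";    # hypothetical sample input

my ($d1, $d2, $run, $last);

# Two pairs of parentheses: each pair gets its own numbered variable.
if ($text =~ /(\d)(\d)/) {
    ($d1, $d2) = ($1, $2);
    print "first two digits: $d1 and $d2\n";
}

# Quantifier inside the parentheses: the whole run of digits lands in $1.
if ($text =~ /(\d+)/) {
    $run = $1;
    print "all digits: $run\n";
}

# Quantifier outside the parentheses: still only one group, so $1
# holds just the digit from the final repetition.
if ($text =~ /(\d)+/) {
    $last = $1;
    print "last digit only: $last\n";
}
```

Each capture is copied out of $1 or $2 immediately after its match, because the next successful match overwrites the numbered variables.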
Captured strings are often called backreferences because they refer back to parts of the captured text. There are actually two ways to get at these backreferences. The numbered variables you've seen are how you get at backreferences outside of a pattern, but inside the pattern, that doesn't work. You have to use \1, \2, etc. So to find doubled words like "the the" or "had had", you might use a pattern that captures a word and then requires \1 to match the same word again, such as /\b(\w+) \1\b/i.
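The doubled-word search described above might look like the following minimal sketch. The sub name first_doubled_word and the sample sentences are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Return the first word that appears twice in a row, or undef if none does.
# (\w+) captures a word; \1 then demands the same text again after one space.
# The \b anchors keep "this is" from matching on the shared "is".
sub first_doubled_word {
    my ($line) = @_;
    return $line =~ /\b(\w+) \1\b/i ? $1 : undef;
}

for my $line ("Paris in the the spring", "He had had enough", "this is fine") {
    my $word = first_doubled_word($line);
    print defined $word ? "doubled '$word' in: $line\n"
                        : "no doubled word in: $line\n";
}
```

Inside the pattern the repeat is written \1, while the code after the match reads the same captured text through $1.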
But most often, you'll be using the $1 form, because you'll ...