You want to use regular expressions on a string containing more than
one line, but the special characters
. (any character but newline),
^ (start of string), and
$ (end of string) don’t seem to work for
you. This might happen if you’re reading in multiline records
or the whole file at once.

Use /m, /s, or both as pattern modifiers. The /s modifier lets .
match newline (normally it doesn’t). If the string had more
than one line in it, then
/foo.*bar/s could match
"foo" on one line and "bar"
on a following line. This doesn’t affect dots in character
classes like [#%.], since they are regular periods anyway.

The /m modifier lets ^ and
$ match next to a newline.
/^=head[1-7]$/m would match that pattern not just
at the beginning of the record, but anywhere right after a newline as
well.

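
The effect of each modifier is easiest to see by running the same pattern with and without it. The following is a minimal sketch; the `$record` string and the variable names are invented for illustration and do not come from any real file.

```perl
use strict;
use warnings;

# Hypothetical multiline record, invented for this illustration.
my $record = "intro line\n=head2\nfoo starts here\nand bar ends here\n";

# Without /s, . stops at a newline, so "foo" and "bar" on
# different lines cannot be joined by .*
my $without_s = $record =~ /foo.*bar/  ? 1 : 0;

# With /s, . also matches newline, so the match spans both lines.
my $with_s    = $record =~ /foo.*bar/s ? 1 : 0;

# Without /m, ^ anchors only at the very start of the string,
# which here is "intro line", not "=head2".
my $without_m = $record =~ /^=head[1-7]$/  ? 1 : 0;

# With /m, ^ and $ also match next to the embedded newlines.
my $with_m    = $record =~ /^=head[1-7]$/m ? 1 : 0;

# A dot inside a character class is a literal period, so /s
# does not make it match a newline.
my $class_dot = "a\nb" =~ /a[#%.]b/s ? 1 : 0;

print "foo.*bar without /s: $without_s\n";   # prints 0
print "foo.*bar with /s:    $with_s\n";      # prints 1
print "=head without /m:    $without_m\n";   # prints 0
print "=head with /m:       $with_m\n";      # prints 1
print "class dot with /s:   $class_dot\n";   # prints 0
```
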
A common, brute-force approach to parsing documents where newlines
are not significant is to read the file one paragraph at a time (or
sometimes even the entire file as one string) and then extract tokens
one by one. To match across newlines, you need to make
. match a newline; it ordinarily does not. In
cases where newlines are important and you’ve read more than
one line into a string, you’ll probably prefer to have ^ and
$ match beginning- and
end-of-line, not just beginning- and end-of-string.
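
As a rough sketch of the paragraph-at-a-time approach, the code below sets `$/` to the empty string (Perl's paragraph mode) and then uses /m so that ^ and $ anchor at each line inside a record. The sample data, the in-memory filehandle, and the `Name`/`Role` field names are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical input; an in-memory filehandle stands in for a real file.
my $text = "Name: Alice\nRole: admin\n\nName: Bob\nRole: guest\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";

my @records;
{
    local $/ = "";    # paragraph mode: a record ends at one or more blank lines
    while (my $para = <$fh>) {
        # /m lets ^ and $ anchor at every line inside the paragraph;
        # /g in list context returns all captured key/value pairs.
        my %field = $para =~ /^(\w+):[ \t]*(.*)$/mg;
        push @records, \%field;
    }
}
close($fh);

# prints "Alice is admin" and then "Bob is guest"
print "$_->{Name} is $_->{Role}\n" for @records;
```

Without /m, the pattern could match only a field that begins at the start of the whole paragraph, so every later line in each record would be missed.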

The difference between /m and
/s is important: /m makes ^ and
$ match next to a
newline, while /s makes
. match newlines. You can even use them together; they’re not ...
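
One way to see the two modifiers working together is to split a pod-like string into sections. This is only a sketch under assumed input: the `$doc` text and the `%section` hash are hypothetical, and the pattern handles just the simple layout shown here.

```perl
use strict;
use warnings;

# Hypothetical pod-like text, invented for this illustration.
my $doc = "=head1 NAME\n\nfrobnicate - does things\n\n"
        . "=head1 SYNOPSIS\n\nfrobnicate [options]\n";

# /m lets ^ match at the start of every line, so each =head1 is found.
# /s lets . cross newlines, so (.*?) can collect a multiline body.
# The lookahead ends the body at the next heading or at end of string.
my %section = $doc =~ /^=head1[ \t]+(\w+)\n(.*?)(?=^=head1|\z)/msg;

# Trim surrounding whitespace from each body (values are aliased).
s/\A\s+|\s+\z//g for values %section;

# prints "NAME: frobnicate - does things"
# then   "SYNOPSIS: frobnicate [options]"
print "$_: $section{$_}\n" for sort keys %section;
```

Dropping /s would stop each body at its first newline, and dropping /m would let only a heading at the very start of the string match.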