O'Reilly logo

Perl Cookbook by Nathan Torkington, Tom Christiansen

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Matching Multiple Lines

Problem

You want to use regular expressions on a string containing more than one line, but the special characters . (any character but newline), ^ (start of string), and $ (end of string) don’t seem to work for you. This might happen if you’re reading in multiline records or the whole file at once.

Solution

Use /m , /s, or both as pattern modifiers. /s lets . match newline (normally it doesn’t). If the string had more than one line in it, then /foo.*bar/s could match a "foo" on one line and a "bar" on a following line. This doesn’t affect dots in character classes like [#%.], since they are regular periods anyway.

The /m modifier lets ^ and $ match next to a newline. /^=head[1-7]$/m would match that pattern not just at the beginning of the record, but anywhere right after a newline as well.

Discussion

A common, brute-force approach to parsing documents where newlines are not significant is to read the file one paragraph at a time (or sometimes even the entire file as one string) and then extract tokens one by one. To match across newlines, you need to make . match a newline; it ordinarily does not. In cases where newlines are important and you’ve read more than one line into a string, you’ll probably prefer to have ^ and $ match beginning- and end-of-line, not just beginning- and end-of-string.

The difference between /m and /s is important: /m makes ^ and $ match next to a newline, while /s makes . match newlines. You can even use them together—they’re not ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required