Chapter 14. Understanding Regular Expressions, Part II

Jeffrey Friedl

Maybe I’m somewhat of a kook, but I get a lot of pleasure (and instruction) from taking some simple task and really investigating all the ways to go about solving it, comparing and contrasting the various solutions.

Usually, the end result is only a better understanding of Perl (and often, Perl’s regular expressions), but sometimes there is a tangible benefit. For example, the Perl FAQ about removing whitespace from strings is the result of a long day of benchmarking.

Not long ago in comp.lang.perl.misc, someone asked if there was an “and” for regular expressions comparable to the “or” in /this|that/. That is, he wanted to find lines that matched two otherwise unrelated expressions. The quick answer provided by many was an && between two regular expressions: /this/ && /that/, although some offered solutions such as /this.*that|that.*this/.
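The two suggestions from the newsgroup can be compared side by side. The following is a minimal sketch, not code from the original thread; the sample lines and variable names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample lines, for illustration only.
my @lines = (
    "this comes before that",
    "that comes before this",
    "only this here",
    "neither one",
);

# Approach 1: two independent matches joined with &&.
my @both_and = grep { /this/ && /that/ } @lines;

# Approach 2: a single regex that spells out both possible orders.
my @both_alt = grep { /this.*that|that.*this/ } @lines;

print "and: $_\n" for @both_and;
print "alt: $_\n" for @both_alt;
```

On this sample both approaches select the same two lines. They are not equivalent in general: the single-regex form needs the two matches to sit one after the other, so a string such as "thathis", where "that" and "this" share a letter, satisfies /this/ && /that/ but not the alternation.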

Randal Schwartz came up with the silly but ingenious /^(?=.*one)(?=.*two)/, and when Tom Christiansen asked why the ^ was included, I got the itch to delve a bit deeper.
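To make the lookahead version concrete, here is a small sketch that runs it with and without the leading ^. The sample lines and variable names are assumptions made up for this illustration. The sketch only compares which lines are selected; it does not measure how much work the regex engine does in each case.

```perl
use strict;
use warnings;

# Hypothetical sample lines, for illustration only.
my @lines = ("one then two", "two then one", "just one", "nothing");

# Each (?=...) is a zero-width lookahead: it tests whether its
# subpattern can match from the current position, consuming no text.
my @anchored   = grep { /^(?=.*one)(?=.*two)/ } @lines;

# The same pattern without the ^ anchor.
my @unanchored = grep { /(?=.*one)(?=.*two)/ } @lines;

print "anchored:   $_\n" for @anchored;
print "unanchored: $_\n" for @unanchored;
```

Both patterns pick out the same two lines here, the ones containing "one" and "two" in either order, so whatever the ^ contributes, it is not a change in which of these lines match.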

Knowing Versus Knowing on Paper

Japanese has an expression that translates to English as “paper driver.” These people have driver’s licenses (hard to get in Japan) but don’t have a car and hardly ever drive; they’re drivers only on paper. I tend to think twice before getting in the car when they’re behind the wheel. Textbook knowledge without experience to back it up doesn’t mean much.

In Understanding Regular Expressions, ...
