5.11. Match Complete Lines That Do Not Contain a Word

Problem

You want to match complete lines that do not contain the word ninja.

Solution

^(?:(?!\bninja\b).)*$
Regex options: Case insensitive, ^ and $ match at line breaks (“dot matches line breaks” must not be set)
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Discussion

To match a line that does not contain something, use negative lookahead (described in Recipe 2.16). In this regular expression, a noncapturing group repeats a negative lookahead together with a dot: before the dot consumes each character, the lookahead checks that \bninja\b cannot match at that position. Because the group repeats across the whole line, \bninja\b must fail at every position in it. The ^ and $ anchors at the edges of the regular expression make sure you match a complete line.
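As a rough illustration of the line-by-line behavior, the following Python sketch applies the recipe's regex with the case-insensitive and multiline options. The helper name lines_without_ninja and the sample text are made up for this example and are not part of the recipe.

```python
import re

# The recipe's regex: case insensitive, ^ and $ match at line breaks.
# re.DOTALL is deliberately NOT set, so the dot stops at line breaks.
NO_NINJA_LINE = re.compile(r"^(?:(?!\bninja\b).)*$", re.IGNORECASE | re.MULTILINE)


def lines_without_ninja(text):
    """Return each complete line of text that lacks the word 'ninja'.

    Hypothetical helper for illustration only.
    """
    return [match.group(0) for match in NO_NINJA_LINE.finditer(text)]


# Illustrative sample: two lines contain the word, two do not.
sample = "The ninja hides\nNothing to see here\nNINJA!\nninjas are plural"
result = lines_without_ninja(sample)
print(result)
```

The line "ninjas are plural" is returned because the word boundary after "ninja" fails when the next character is a letter, so the lookahead never finds the whole word.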

The options you apply to this regular expression determine whether it tries to match the entire subject string or just one line at a time. With the option to let ^ and $ match at line breaks enabled and the option to let dot match line breaks disabled, this regular expression works as described and matches line by line. If you invert the state of these two options, the regular expression will match any string that does not contain the word “ninja”.
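The inverted configuration can be sketched in Python as follows, assuming re.DOTALL stands in for "dot matches line breaks" and that omitting re.MULTILINE keeps ^ and $ at the edges of the string. The function name string_lacks_ninja is a hypothetical helper introduced only for this example.

```python
import re

# Same regex, options inverted: the dot matches line breaks (re.DOTALL),
# and ^ and $ do not match at embedded line breaks (no re.MULTILINE).
NO_NINJA_STRING = re.compile(r"^(?:(?!\bninja\b).)*$", re.IGNORECASE | re.DOTALL)


def string_lacks_ninja(text):
    """Return True if the whole string does not contain the word 'ninja'.

    Hypothetical helper for illustration only.
    """
    return NO_NINJA_STRING.search(text) is not None


# The whole multi-line string is now treated as one unit.
clean = string_lacks_ninja("line one\nline two")
tainted = string_lacks_ninja("line one\nA Ninja appears")
print(clean, tainted)
```

With these options a single occurrence of the word anywhere in the subject causes the match to fail for the entire string, rather than only for the line that contains it.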

Caution

Testing a negative lookahead against every position in a line or string is rather inefficient. This solution is only intended to be used in situations where one regular expression is all that can be used, such as when using an application that can’t be programmed. When programming, ...
