Cover by Steven Levithan, Jan Goyvaerts

2.17. Match One of Two Alternatives Based on a Condition

Problem

Create a regular expression that matches a comma-delimited list of the words one, two, and three. Each word can occur any number of times in the list, and the words can occur in any order, but each word must appear at least once.

Solution

\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}(?(1)|(?!))(?(2)|(?!))(?(3)|(?!))
Regex options: None
Regex flavors: .NET, PCRE, Perl, Python
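
As a rough illustration, the conditional regex can be tried out in Python, one of the flavors listed above. This is only a sketch: the names WORDS and has_all_three are made up for this example and are not part of the recipe.

```python
import re

# The recipe's regex, copied verbatim and split over two string literals.
# Each (?(n)|(?!)) matches nothing if group n took part in the match,
# and fails (via the empty negative lookahead) if it did not.
WORDS = re.compile(
    r'\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}'
    r'(?(1)|(?!))(?(2)|(?!))(?(3)|(?!))'
)


def has_all_three(text):
    """Return True if text contains a list that uses all three words."""
    return WORDS.search(text) is not None


if __name__ == "__main__":
    for sample in ("one,two,three", "three,one,two,one", "one,two,two"):
        print(sample, has_all_three(sample))
```

The first two samples contain every word at least once, so the search succeeds; the last one never matches three, so the third conditional forces the match to fail.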

Java, JavaScript, and Ruby do not support conditionals. When programming in these languages (or in any other language), you can use the regular expression without the conditionals and write some extra code to check whether each of the three capturing groups matched something.

\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
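
The "extra code" could look something like the following sketch. It is written in Python purely for illustration and relies on Python keeping the value a capturing group matched in an earlier iteration of the repeated group; other flavors may treat captures inside a repeated group differently, so check yours before porting it. LIST_RE and find_complete_list are hypothetical names.

```python
import re

# The regex without the conditionals, copied verbatim from the recipe.
LIST_RE = re.compile(r'\b(?:(?:(one)|(two)|(three))(?:,|\b)){3,}')


def find_complete_list(text):
    """Return the first list in text that uses all three words, or None."""
    for match in LIST_RE.finditer(text):
        # A group that never took part in the match is reported as None.
        if all(match.group(n) is not None for n in (1, 2, 3)):
            return match.group(0)
    return None


if __name__ == "__main__":
    print(find_complete_list("one,two,three"))
    print(find_complete_list("one,two,two"))
```

Looping over every candidate match, rather than checking only the first, lets the code skip a list that is missing a word and still find a complete list later in the same text.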

Discussion

.NET, PCRE, Perl, and Python support conditionals using numbered capturing groups. (?(1)then|else) is a conditional that checks whether the first capturing group has already matched something. If it has, the regex engine attempts to match then. If the capturing group has not participated in the match attempt thus far, the else part is attempted.
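
A much smaller pattern may make the then and else behavior easier to see. The sketch below uses Python's re module; the pattern (a)?b(?(1)c|d) is an illustrative example chosen here, not one taken from the recipe. It requires c after the b when the optional a was matched, and d otherwise.

```python
import re

# Group 1 captures an optional "a". The conditional then demands "c"
# if group 1 took part in the match, and "d" if it did not.
COND = re.compile(r'(a)?b(?(1)c|d)')


def fully_matches(text):
    """Return True if the whole of text matches the pattern."""
    return COND.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("abc", "bd", "bc", "abd"):
        print(sample, fully_matches(sample))
```

Here abc and bd match in full, while bc and abd do not: bc lacks the a that would select the c branch, and abd has the a but is followed by d.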

The parentheses, question mark, and vertical bar are all part of the syntax for the conditional. They don’t have their usual meaning. You can use any kind of regular expression for the then and else parts. The only restriction is that if you want to use alternation for one of the parts, you have to use a group ...
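
The grouping restriction can be sketched in Python as follows. The pattern is a made-up example, and the helper names compiles and fully_matches are hypothetical. Wrapping the alternation in a noncapturing group keeps its vertical bar from being read as the separator between the then and else parts.

```python
import re

# The then part is the alternation c|d, wrapped in a noncapturing group.
# The else part is a plain "e".
GROUPED = re.compile(r'(a)?b(?(1)(?:c|d)|e)')


def fully_matches(text):
    """Return True if the whole of text matches GROUPED."""
    return GROUPED.fullmatch(text) is not None


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    for sample in ("abc", "abd", "be", "bc"):
        print(sample, fully_matches(sample))
    # Without the group, the conditional appears to have three branches,
    # which Python refuses to compile.
    print(compiles(r'(a)?b(?(1)c|d|e)'))
```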
