5.2. Find Any of Multiple Words
Problem
You want to find any one out of a list of words, without having to search through the subject string multiple times.
Solution
Using alternation
The simple solution is to alternate between the words you want to match:
\b(?:one|two|three)\b
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
More complicated examples of matching similar words are shown in Recipe 5.3.
Example JavaScript solution
var subject = 'One times two plus one equals three.';
var regex = /\b(?:one|two|three)\b/gi;

subject.match(regex);
// returns an array with four matches: ['One','two','one','three']

// This function does the same thing but accepts an array of words to
// match. Any regex metacharacters within the accepted words are escaped
// with a backslash before searching.
function match_words (subject, words) {
    var regex_metachars = /[(){}[\]*+?.\\^$|,\-]/g;

    for (var i = 0; i < words.length; i++) {
        words[i] = words[i].replace(regex_metachars, '\\$&');
    }

    var regex = new RegExp('\\b(?:' + words.join('|') + ')\\b', 'gi');

    return subject.match(regex) || [];
}

match_words(subject, ['one','two','three']);
// returns an array with four matches: ['One','two','one','three']
Discussion
Using alternation
There are three parts to this regular expression: the word boundaries on both ends, the noncapturing group, and the list of words (each separated by the ‹|› alternation operator). The word boundaries ensure that the regex does not match part of ...
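The effect of the word boundaries can be checked with a small comparison. The following is a minimal sketch, not part of the recipe itself: the subject string and the variable names are illustrative assumptions, chosen so that the listed words also appear inside longer words.

```javascript
// Illustrative subject string (an assumption, not taken from the recipe).
// "Someone", "stones", "threescore", and "networks" each contain one of
// the listed words as a substring.
var subject = 'Someone took two stones and threescore networks.';

// Same alternation, without the word boundaries.
var unbounded = /(?:one|two|three)/gi;

// The recipe's regex, with a word boundary on each end.
var bounded = /\b(?:one|two|three)\b/gi;

// match() returns null when nothing matches, so fall back to an empty array.
var unboundedMatches = subject.match(unbounded) || [];
var boundedMatches = subject.match(bounded) || [];

console.log(unboundedMatches); // ['one', 'two', 'one', 'three', 'two']
console.log(boundedMatches);   // ['two']
```

Without the boundaries, the alternation finds five matches, four of them inside longer words. With the boundaries, only the standalone word "two" is returned.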