5.2. Find Any of Multiple Words

Problem

You want to find any one out of a list of words, without having to search through the subject string multiple times.

Solution

Using alternation

The simple solution is to alternate between the words you want to match:

\b(?:one|two|three)\b
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

More complex examples of matching similar words are shown in Recipe 5.3.

Example JavaScript solution

var subject = "One times two plus one equals three.";

// Solution 1:

var regex = /\b(?:one|two|three)\b/gi;

subject.match(regex);
// Returns an array with four matches: ["One","two","one","three"]

// Solution 2 (reusable):

// This function does the same thing but accepts an array of words to
// match. Any regex metacharacters within the accepted words are escaped
// with a backslash before searching. Note that the escaping is done in
// place, so the caller's words array is modified.

function matchWords(subject, words) {
    var regexMetachars = /[(){[*+?.\\^$|]/g;

    for (var i = 0; i < words.length; i++) {
        words[i] = words[i].replace(regexMetachars, "\\$&");
    }

    var regex = new RegExp("\\b(?:" + words.join("|") + ")\\b", "gi");

    return subject.match(regex) || [];
}

matchWords(subject, ["one","two","three"]);
// Returns an array with four matches: ["One","two","one","three"]
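
The escaping step in Solution 2 matters as soon as a word contains a metacharacter such as a dot. The following is a minimal sketch of a variant that shows this. The function name matchWordsNoMutate, the copy made with map(), and the empty-list guard are assumptions added for illustration and are not part of the original recipe.

```javascript
// Hypothetical variant of matchWords; the name, the copy via map(), and the
// empty-list guard are illustrative assumptions, not the book's code.
function matchWordsNoMutate(subject, words) {
    if (words.length === 0) {
        // An empty alternation would match the empty string at every
        // word boundary, so return no matches instead.
        return [];
    }

    var regexMetachars = /[(){[*+?.\\^$|]/g;

    // map() builds a new array, so the caller's list is left untouched.
    var escaped = words.map(function (word) {
        return word.replace(regexMetachars, "\\$&");
    });

    var regex = new RegExp("\\b(?:" + escaped.join("|") + ")\\b", "gi");

    return subject.match(regex) || [];
}

var versions = ["1.5", "2.0"];
var found = matchWordsNoMutate("Use 1.5 or 2.0, not 105 or 200.", versions);
// found is ["1.5", "2.0"]: because the dot is escaped, "105" and "200"
// are not matched. versions still holds ["1.5", "2.0"].
```

Without the escaping, the dot would match any character and "105" and "200" would be returned as well.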

Discussion

Using alternation

There are three parts to this regular expression: the word boundaries on both ends, the noncapturing group, and the list of words (each separated by the | alternation operator). The word boundaries ensure that the regex ...
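
As a small illustration of what the word boundaries contribute, the sketch below runs the same alternation with and without \b against a sentence in which the listed words also occur inside longer words. The sample sentence and variable names are made up for this example.

```javascript
// Sample text chosen so that "one" and "two" also appear inside longer words.
var sample = "Someone phoned the network about two stones.";

// With word boundaries: only the standalone word "two" qualifies.
var withBoundaries = sample.match(/\b(?:one|two|three)\b/gi) || [];

// Without word boundaries: fragments of "Someone", "phoned", "network",
// and "stones" are matched too.
var withoutBoundaries = sample.match(/(?:one|two|three)/gi) || [];

console.log(withBoundaries);    // ["two"]
console.log(withoutBoundaries); // ["one", "one", "two", "two", "one"]
```

The noncapturing group is what lets the two \b tokens apply to the alternation as a whole. Without the group, the first boundary would belong only to the first alternative and the last boundary only to the last one.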
