Words and Whitespace Regexps

While there are many other patterns for use in regular expressions, they generally aren't very common. So far we've looked at all but five of the most common ones, which leaves us with . (a period), \s, \S, \b, and \B.

The pattern . will match any single character except \n (new line). Therefore, c.t will match "cat," but not "cart."
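As a minimal sketch of that behavior, the following uses only the built-in preg_match(); the sample words and variable names are illustrative assumptions, not part of the original text. The last two lines also show the s modifier, which PCRE provides to let the period match a new line as well.

```php
<?php
// "." stands for exactly one character, so c.t needs
// precisely one character between the "c" and the "t".
$words = ["cat", "cot", "c t", "cart", "ct"];
$matched = [];
foreach ($words as $word) {
    // preg_match() returns 1 when the pattern matches, 0 when it does not
    $matched[$word] = preg_match('/c.t/', $word);
}

// By default "." refuses to match a new line...
$newline = preg_match('/c.t/', "c\nt");
// ...unless the s modifier is added to the pattern
$newlineDotAll = preg_match('/c.t/s', "c\nt");
```

Here "cat," "cot," and "c t" match, while "cart" (two characters between the "c" and the "t") and "ct" (none) do not.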

The next two, \s and \S, equate to "Match any whitespace" and "Match any non-whitespace," respectively. That is, if you specify [\s\S], your regular expression will match any single character, regardless of what it is, including a new line; if you use [\s\S]*, your regular expression will match anything, even an empty string. For example:

    $string = "Foolish child!";
    preg_match("/[\S]{7}[\s]{1}[\S]{6}/", $string);

That matches precisely seven non-whitespace characters, followed by one whitespace character, followed by six non-whitespace characters—the exact string.
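To see how [\s\S] differs from the period, here is a small sketch that runs both against a string containing a new line. The sample text and variable names are assumptions made for illustration. It also checks that the square brackets around a lone \S or \s are optional, so the pattern above can be written more compactly.

```php
<?php
$text = "first line\nsecond line";

// "." stops at the new line, so .* captures only the first line
preg_match('/.*/', $text, $dot);

// [\s\S] accepts every character, new lines included
preg_match('/[\s\S]*/', $text, $any);

// \S{7}\s\S{6} is a shorter spelling of [\S]{7}[\s]{1}[\S]{6}
$long  = preg_match('/[\S]{7}[\s]{1}[\S]{6}/', "Foolish child!");
$short = preg_match('/\S{7}\s\S{6}/', "Foolish child!");
```

After this runs, $dot[0] holds only "first line", whereas $any[0] holds the whole two-line string.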

The last two patterns, \b and \B, equate to "On a word boundary" and "Not on a word boundary," respectively. That is, if you use the regexp /oo\b/, it will match "foo," "moo," "boo," and "zoo," because the "oo" is at the end of the word, but not "fool," "wool," or "pool," because the "oo" is inside the word. The \B pattern is the opposite, which means it would match only patterns that aren't on the edges of a word—using the previous example, "fool," "wool," and "pool" would be matched, whereas "foo," "moo," "boo," and "zoo" would not.

For example:

    $string = "Foolish child!";
    if (preg_match("/oo\b/i", $string)) ...
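The word lists described above can be checked directly. This is a rough sketch with hypothetical variable names, using single-quoted patterns so that the backslashes reach the regular expression engine untouched. It also runs the /oo\b/i test against "Foolish child!".

```php
<?php
$words = ["foo", "zoo", "fool", "pool"];
$atEdge = [];
$inside = [];
foreach ($words as $word) {
    // \b: the "oo" must be followed by a word boundary
    $atEdge[$word] = preg_match('/oo\b/', $word);
    // \B: the "oo" must NOT be followed by a word boundary
    $inside[$word] = preg_match('/oo\B/', $word);
}

// In "Foolish" the "oo" is followed by "l", so there is no boundary
$foolish = preg_match('/oo\b/i', "Foolish child!");
```

With these inputs, /oo\b/ matches only "foo" and "zoo", /oo\B/ matches only "fool" and "pool", and $foolish is 0 because the "oo" in "Foolish" sits inside the word.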
