13.8. Escaping Special Characters in a Regular Expression
Problem
You want characters such as * or + to be treated as literals, not as metacharacters, inside a regular expression. This is useful when you let users type in search strings that you then use inside a regular expression.
Solution
Use preg_quote( ) to escape Perl-compatible regular-expression metacharacters:
$pattern = preg_quote('The Education of H*Y*M*A*N K*A*P*L*A*N').':(\d+)';
if (preg_match("/$pattern/",$book_rank,$matches)) {
    print "Leo Rosten's book ranked: ".$matches[1];
}
Use quotemeta( ) to escape POSIX metacharacters:
// The parentheses capture the rank so that $matches[1] is set
$pattern = quotemeta('M*A*S*H').':([0-9]+)';
if (ereg($pattern,$tv_show_rank,$matches)) {
    print 'Radar, Hot Lips, and the gang ranked: '.$matches[1];
}
Discussion
Here are the characters that preg_quote( ) escapes:
. \ + * ? ^ $ [ ] ( ) { } < > = ! | :
Later PHP releases also escape - (as of PHP 5.3.0) and # (as of PHP 7.3.0).
Here are the characters that quotemeta( ) escapes:
. \ + * ? ^ $ [ ] ( )
Both functions escape each metacharacter by putting a backslash in front of it.
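To make the effect of the backslashes concrete, here is a minimal sketch. The search string and the variable names are illustrative assumptions, not taken from the recipe. It compares an unescaped and an escaped version of the same string when each is used as a pattern:

```php
<?php
// Hypothetical search string containing the * metacharacter.
$title = 'M*A*S*H';

// preg_quote() puts a backslash in front of each *.
$quoted = preg_quote($title);   // M\*A\*S\*H

// Unescaped, each * means "zero or more of the preceding letter",
// so the pattern matches "MASH" but not the literal title.
$rawMatchesLiteral = preg_match('/^' . $title . '$/', 'M*A*S*H');   // 0
$rawMatchesMash    = preg_match('/^' . $title . '$/', 'MASH');      // 1

// Escaped, the pattern matches the literal title only.
$quotedMatchesLiteral = preg_match('/^' . $quoted . '$/', 'M*A*S*H'); // 1
$quotedMatchesMash    = preg_match('/^' . $quoted . '$/', 'MASH');    // 0

echo $quoted, "\n";
```

For this particular string quotemeta( ) produces the same escaped result, because * is in both lists.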
The quotemeta( ) function doesn’t escape all POSIX metacharacters. The characters {, }, and | are also valid metacharacters but aren’t escaped. This is another good reason to use preg_match( ) instead of ereg( ), which was deprecated in PHP 5.3 and removed in PHP 7.0.
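The gap can be seen by running both functions over the same input. This is a sketch under stated assumptions: the input string is made up, and because ereg( ) is unavailable in current PHP, the quotemeta( ) result is tested with preg_match( ) purely to show what the unescaped {, }, and | do.

```php
<?php
// Hypothetical input that uses {, }, | and +.
$input = 'a{2}|b+';

$posix = quotemeta($input);    // a{2}|b\+       (only + is escaped)
$pcre  = preg_quote($input);   // a\{2\}\|b\+    (everything is escaped)

// With quotemeta(), {2} is still a quantifier and | still an alternation,
// so the pattern matches "aa" even though a literal match was intended.
$posixMatchesAa = preg_match('/^' . $posix . '$/', 'aa');       // 1

// With preg_quote(), only the literal input matches.
$pcreMatchesAa      = preg_match('/^' . $pcre . '$/', 'aa');    // 0
$pcreMatchesLiteral = preg_match('/^' . $pcre . '$/', $input);  // 1

echo $posix, "\n", $pcre, "\n";
```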
You can also pass preg_quote( ) an additional character to escape as a second argument. It’s useful to pass your pattern delimiter (usually /) as this argument so it also gets escaped. This is important if you incorporate user input into a regular-expression pattern. The following code expects $_REQUEST['search_term'] ...
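As a minimal sketch of the delimiter argument (this is not the recipe's own listing), the code below uses a hard-coded $searchTerm as a hypothetical stand-in for user input; the sample text and variable names are likewise assumptions:

```php
<?php
// Hypothetical stand-in for a user-supplied search term.
$searchTerm = 'TCP/IP (v4)';

// Passing '/' as the second argument escapes the delimiter as well as
// the usual metacharacters.
$quoted  = preg_quote($searchTerm, '/');   // TCP\/IP \(v4\)
$pattern = '/' . $quoted . '/i';

// Hypothetical text to search.
$text  = 'Notes on tcp/ip (v4) routing';
$found = preg_match($pattern, $text);      // 1

echo $quoted, "\n";
```

Without the second argument, the / inside the search term would be left unescaped and would end the pattern early, so whatever follows it would be read as pattern modifiers.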