Matching Pattern Metacharacters Literally

Problem

You want to perform a pattern match for a literal instance of a character that’s special in patterns.

Solution

Escape the special character with a backslash. Or maybe two.

Discussion

Pattern matching is based on the use of metacharacters that have a special meaning and thus stand for something other than themselves. This means that to match a literal instance of a metacharacter, you must turn off its special meaning somehow. Do this by using a backslash character (\). Assume that a table metachar contains the following rows:

mysql> SELECT c FROM metachar;
+------+
| c    |
+------+
| %    |
| _    |
| .    |
| ^    |
| $    |
| \    |
+------+

A pattern consisting only of either SQL metacharacter matches all the values in the table, not just the metacharacter itself:

mysql> SELECT c, c LIKE '%', c LIKE '_' FROM metachar;
+------+------------+------------+
| c    | c LIKE '%' | c LIKE '_' |
+------+------------+------------+
| %    |          1 |          1 |
| _    |          1 |          1 |
| .    |          1 |          1 |
| ^    |          1 |          1 |
| $    |          1 |          1 |
| \    |          1 |          1 |
+------+------------+------------+

To match a literal instance of a SQL pattern metacharacter, precede it with a backslash:

mysql> SELECT c, c LIKE '\%', c LIKE '\_' FROM metachar;
+------+-------------+-------------+
| c    | c LIKE '\%' | c LIKE '\_' |
+------+-------------+-------------+
| %    |           1 |           0 |
| _    |           0 |           1 |
| .    |           0 |           0 |
| ^    |           0 |           0 |
| $    |           0 |           0 |
| \    |           0 |           0 |
+------+-------------+-------------+

The principle is somewhat similar for matching regular expression metacharacters. For example, each of the following regular expressions matches every row in the table:

mysql> SELECT c, c REGEXP '.', c REGEXP '^', c REGEXP '$' FROM metachar;
+------+--------------+--------------+--------------+
| c    | c REGEXP '.' | c REGEXP '^' | c REGEXP '$' |
+------+--------------+--------------+--------------+
| %    |            1 |            1 |            1 |
| _    |            1 |            1 |            1 |
| .    |            1 |            1 |            1 |
| ^    |            1 |            1 |            1 |
| $    |            1 |            1 |            1 |
| \    |            1 |            1 |            1 |
+------+--------------+--------------+--------------+

To match the metacharacters literally, just add a backslash, right? Well, try it:

mysql> SELECT c, c REGEXP '\.', c REGEXP '\^', c REGEXP '\$' FROM metachar;
+------+---------------+---------------+---------------+
| c    | c REGEXP '\.' | c REGEXP '\^' | c REGEXP '\$' |
+------+---------------+---------------+---------------+
| %    |             1 |             1 |             1 |
| _    |             1 |             1 |             1 |
| .    |             1 |             1 |             1 |
| ^    |             1 |             1 |             1 |
| $    |             1 |             1 |             1 |
| \    |             1 |             1 |             1 |
+------+---------------+---------------+---------------+

It didn’t work, because regular expressions are processed a bit differently than SQL patterns. With REGEXP, you need a double backslash to match a metacharacter literally:

mysql> SELECT c, c REGEXP '\\.', c REGEXP '\\^', c REGEXP '\\$' FROM metachar;
+------+----------------+----------------+----------------+
| c    | c REGEXP '\\.' | c REGEXP '\\^' | c REGEXP '\\$' |
+------+----------------+----------------+----------------+
| %    |              0 |              0 |              0 |
| _    |              0 |              0 |              0 |
| .    |              1 |              0 |              0 |
| ^    |              0 |              1 |              0 |
| $    |              0 |              0 |              1 |
| \    |              0 |              0 |              0 |
+------+----------------+----------------+----------------+

Because backslash suppresses the special meaning of metacharacters, backslash itself is special. To match a backslash literally, use double backslashes in SQL patterns or quadruple backslashes in regular expressions:

mysql> SELECT c, c LIKE '\\', c REGEXP '\' FROM metachar;
+------+-------------+-----------------+
| c    | c LIKE '\\' | c REGEXP '\\\\' |
+------+-------------+-----------------+
| %    |           0 |               0 |
| _    |           0 |               0 |
| .    |           0 |               0 |
| ^    |           0 |               0 |
| $    |           0 |               0 |
| \    |           1 |               1 |
+------+-------------+-----------------+

It’s even worse trying to figure out how many backslashes to use when you’re issuing a query from within a program. It’s more than likely that backslashes are also special to your programming language, in which case you’ll need to double each one.

Within a character class, use these marks to include literal instances of the following class constructor characters:

  • To include a literal ] character, list it first.

  • To include a literal - character, list it first or last.

  • To include a literal ^ character, list it somewhere other than as the first character.

  • To include a literal \ character, double it.

Get MySQL Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.