5.14. Escape Regular Expression Metacharacters

Problem

You want to use a literal string provided by a user or from some other source as part of a regular expression. However, you want to escape all regular expression metacharacters within the string before embedding it in your regex, so that every character is matched literally instead of being interpreted as regex syntax.

Solution

By adding a backslash before any characters that potentially have special meaning within a regular expression, you can safely use the resulting pattern to match a literal sequence of characters. Of the programming languages covered by this book, all except JavaScript have a built-in function or method to perform this task (listed in Table 5-3). However, for the sake of completeness, we’ll show how to pull this off using your own regex, even in the languages that have a ready-made solution.
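As a rough illustration of the backslash-insertion idea, here is a minimal Python sketch. The character class in `METACHARS` and the helper name `escape_metachars` are assumptions made for this example, not the regex this recipe goes on to present. The class covers only common metacharacters, so the sketch does not escape whitespace or `#`, which matter in free-spacing mode.

```python
import re

# Illustrative set of metacharacters (an assumption for this sketch):
# . ^ $ * + ? ( ) [ ] { } | \ -
METACHARS = re.compile(r"[.^$*+?()\[\]{}|\\-]")


def escape_metachars(text):
    """Return text with a backslash inserted before each metacharacter."""
    # In the replacement, \\ is a literal backslash and \g<0> is the match.
    return METACHARS.sub(r"\\\g<0>", text)


user_input = "1+1=2 (approx.)"
escaped = escape_metachars(user_input)
print(escaped)  # 1\+1=2 \(approx\.\)

# The escaped string now matches only the literal input text.
print(bool(re.fullmatch(escaped, user_input)))  # True
```

Replacing every match, rather than only the first, is what makes the result safe to embed; `re.sub` does this by default.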

Built-in solutions

Table 5-3 lists the native functions designed to solve this problem.

Table 5-3. Native solutions for escaping regular expression metacharacters

Language      Function
----------    ----------------------------
C#, VB.NET    Regex.Escape(str)
Java          Pattern.quote(str)
Perl          quotemeta(str)
PHP           preg_quote(str, [delimiter])
Python        re.escape(str)
Ruby          Regexp.escape(str)

Notably absent from the list is JavaScript, which does not have a built-in function designed for this purpose.
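To show how one of these built-in functions is typically used, here is a short Python sketch built around `re.escape`. The sample string and the surrounding pattern (`^`, `!*`, `$`) are made up for illustration.

```python
import re

# Hypothetical user-supplied text containing several metacharacters.
user_text = "Price: $5.00 (USD)"

# re.escape returns the text with its special characters backslash-escaped.
literal = re.escape(user_text)

# Embed the escaped text in a larger pattern: the literal text,
# optionally followed by exclamation marks, spanning the whole string.
pattern = re.compile(r"^" + literal + r"!*$")

print(bool(pattern.search("Price: $5.00 (USD)!!")))  # True
print(bool(pattern.search("Price: $5x00 (USD)")))    # False: "." is literal

# Without escaping, "$" acts as an end-of-string anchor and the
# parentheses form a group, so the raw text does not even match itself.
unescaped = re.compile(user_text)
print(bool(unescaped.search(user_text)))  # False
```

The last check is the kind of unintended consequence that escaping prevents: the unescaped string is still a valid regex, but it means something different from the literal text.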

Regular expression

Although it’s best to use a built-in solution if available, you can pull this off on your own by using the following regular expression along with the appropriate replacement string (shown next). Just make sure to replace all matches, ...
