Multiline Comments

Problem

You want to match a comment that starts with /* and ends with */. Nested comments are not permitted: any /* between /* and */ is simply part of the comment. Comments can span multiple lines.

Solution

/\*.*?\*/
Regex options: Dot matches line breaks
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

/\*[\s\S]*?\*/
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
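
As a quick check, both regexes can be tried in Python, one of the flavors listed above. This is only a minimal sketch: the sample text and the variable names are assumptions made for illustration and are not part of the recipe.

```python
import re

# Hypothetical sample input: one single-line and one multiline comment.
source = "int a; /* first */ int b; /* second\n   comment */ int c;"

# First regex: the dot needs the "dot matches line breaks" option,
# which Python calls re.DOTALL.
dot_regex = re.compile(r"/\*.*?\*/", re.DOTALL)

# Second regex: [\s\S] matches any character, so no option is needed.
class_regex = re.compile(r"/\*[\s\S]*?\*/")

dot_matches = dot_regex.findall(source)
class_matches = class_regex.findall(source)

print(dot_matches)
print(class_matches)
```

On this input, both regexes should find the same two comments, including the one that spans a line break.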

Discussion

The forward slash has no special meaning in regular expressions, but the asterisk does, so we escape each asterisk with a backslash. This gives /\* and \*/ to match /* and */. Backslashes and forward slashes may take on other special meanings when you add literal regular expressions to your source code, so you may need to escape the forward slashes as explained in Recipe 3.1.
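
To illustrate the escaping, here is a small Python sketch. It assumes Python's raw string syntax, which passes the backslash through to the regex engine unchanged. Python has no /.../ regex literal, so the forward slashes need no escaping here, though other languages may differ. The sample text and variable names are hypothetical.

```python
import re

text = "x = 1; /* note */"  # hypothetical sample input

# Raw strings keep the backslash intact, so \* means a literal asterisk.
opener = re.compile(r"/\*")
closer = re.compile(r"\*/")

opener_pos = opener.search(text).start()
closer_pos = closer.search(text).start()

# Without the backslash, the asterisk is a quantifier: "zero or more
# forward slashes". That succeeds at the very start of the text with an
# empty match instead of finding the comment delimiter.
unescaped = re.search(r"/*", text).group()

print(opener_pos, closer_pos, repr(unescaped))
```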

We use .*? to match anything between the two delimiters of the comment. Most regex engines have a “dot matches line breaks” option, which lets the match span multiple lines. We need a lazy quantifier to make sure that the comment stops at the first */ after the /*, rather than at the last */ in the file.
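
The difference between the lazy and the greedy quantifier can be seen in a short Python sketch. The input strings and variable names are invented for illustration, and re.DOTALL is Python's name for the “dot matches line breaks” option.

```python
import re

# Hypothetical input: two separate comments with code between them.
source = "/* one */ keep_me(); /* two */"

# Lazy quantifier: each match stops at the first */ after its /*.
lazy = re.findall(r"/\*.*?\*/", source, re.DOTALL)

# Greedy quantifier: the match runs on to the last */ in the input,
# swallowing the code between the two comments.
greedy = re.findall(r"/\*.*\*/", source, re.DOTALL)

# Without the option, the dot stops at the line break, so a comment
# that spans two lines is not matched at all.
no_option = re.findall(r"/\*.*?\*/", "/* a\n b */")

print(lazy)
print(greedy)
print(no_option)
```

With the greedy version, keep_me() ends up inside the match. The lazy quantifier prevents this.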

JavaScript is the only regex flavor in this book that does not have an option to make the dot match line breaks. If you’re using JavaScript without the XRegExp library, you can use [\s\S] to accomplish the same. Although you could use [\s\S] with the other regex flavors too, we do not recommend it, as regex engines generally have optimized code to handle the dot, ...
