All Comments

Problem

You want a regex that matches both single-line and multiline comments, as described in the "Problem" sections of the preceding two recipes.

Solution

(?-s://.*)|(?s:/\*.*?\*/)
Regex options: None
Regex flavors: .NET, Java, PCRE, Perl

(?-m://.*)|(?m:/\*.*?\*/)
Regex options: None
Regex flavors: Ruby

//[^\r\n]*|/\*.*?\*/
Regex options: Dot matches line breaks
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

//.*|/\*[\s\S]*?\*/
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
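
As a rough illustration of how the last two solutions behave, the following Python sketch runs both over a small made-up snippet. The sample text and the helper name find_comments are assumptions made for this example; they are not part of the recipe.

```python
import re

# Hypothetical sample input, invented for this illustration.
SAMPLE = (
    "int a = 1; // single-line comment\n"
    "/* multiline\n"
    "   comment */\n"
    "int b = 2; /* inline */ // trailing\n"
)

# Third solution: negated character class for the single-line comment,
# with "dot matches line breaks" (re.DOTALL in Python) turned on.
NEGATED_CLASS = re.compile(r"//[^\r\n]*|/\*.*?\*/", re.DOTALL)

# Fourth solution: no options set; [\s\S] matches any character,
# including line breaks, inside the multiline comment.
ANY_CHAR_CLASS = re.compile(r"//.*|/\*[\s\S]*?\*/")


def find_comments(pattern, text):
    """Return every comment matched by the compiled pattern, in order."""
    return pattern.findall(text)


comments = find_comments(ANY_CHAR_CLASS, SAMPLE)
for comment in comments:
    print(repr(comment))
# '// single-line comment'
# '/* multiline\n   comment */'
# '/* inline */'
# '// trailing'
```

On this input, both patterns return the same four comments. Note that neither regex knows about string literals, so a // or /* inside a quoted string would also be reported as a comment.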

Discussion

You might think that you could simply use alternation to combine the solutions from the previous two recipes: //.*|/\*.*?\*/. That won’t work, because the first alternative needs “dot matches line breaks” turned off, whereas the second alternative needs it turned on. To combine the two regular expressions while still using the dot, you need mode modifiers that turn on “dot matches line breaks” for the second alternative only. The solutions shown here also explicitly turn the option off for the first alternative. Strictly speaking, this isn’t necessary, but it makes the intent obvious and prevents mistakes with the “dot matches line breaks” option if this regex is later embedded in a longer regex.
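
The failure of the naive alternation can be seen in a short Python sketch, where re.DOTALL is Python's name for "dot matches line breaks". The sample text is made up for illustration.

```python
import re

# The naive combination of the two earlier recipes.
NAIVE = r"//.*|/\*.*?\*/"

# Hypothetical input: one single-line comment followed by one multiline comment.
TEXT = "x = 1; // first\n/* spans\ntwo lines */\ny = 2;\n"

# Option off: the dot stops at line breaks, so the multiline
# comment is never matched.
without_dotall = re.findall(NAIVE, TEXT)
print(without_dotall)
# ['// first']

# Option on: the dot crosses line breaks, so the single-line
# alternative runs on to the end of the text.
with_dotall = re.findall(NAIVE, TEXT, re.DOTALL)
print(with_dotall)
# ['// first\n/* spans\ntwo lines */\ny = 2;\n']
```

Neither setting gives the intended two matches, which is why the option has to differ between the two alternatives.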

Python and JavaScript (with or without XRegExp) do not support mode modifiers in the middle of the regular expression. For Python and JavaScript with XRegExp, we can use the negated character class ...
