Operators

Problem

You are developing a syntax coloring scheme for your favorite text editor. You need a regular expression that matches any of the characters that can be used as operators in the programming language for which you’re creating the scheme: -, +, *, /, =, <, >, %, &, ^, |, !, ~, and ?. The regex doesn’t need to check whether the combination of characters forms a valid operator. That is not a job for a syntax coloring scheme; instead, it should simply highlight all operator characters as such.

Solution

[-+*/=<>%&^|!~?]
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
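
As a quick check, the character class can be tried from Python's standard re module. This is only a minimal sketch: the sample string and the variable names are made up for illustration, and the hyphen stays first in the class so that it is read as a literal character rather than as a range.

```python
import re

# The character class from the Solution. The leading hyphen is a literal.
OPERATOR_CHAR = re.compile(r"[-+*/=<>%&^|!~?]")

# Hypothetical line of source code to highlight.
sample = "a += b * (c - 1) != d"

# Each match is one operator character; "+=" and "!=" yield two matches each,
# because the regex does not try to recognize complete operators.
matches = [(m.start(), m.group()) for m in OPERATOR_CHAR.finditer(sample)]

for position, char in matches:
    print(position, char)
```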

Discussion

If you read Recipe 2.3, the solution is obvious. You may wonder why we included this as a separate recipe.

The focus of this chapter is on regular expressions that will be used in larger systems, such as syntax coloring schemes. Such systems often combine regular expressions using alternation. That can lead to pitfalls that are not obvious when you look at a regular expression in isolation.

One pitfall is that a system using this regular expression will likely have other regular expressions that match the same characters. Many programming languages use / as the division operator and // to start a comment. If you combine the regular expression from this recipe with the one from Single-Line Comments into (?<operator>[-+*/=<>%&^|!~?])|(?<comment>//.*), then you will find that your system never matches any comments. All forward slashes will be matched as ...
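
The behavior of the combined regex can be observed with a short Python sketch. Python's re module spells named groups as (?P<name>...) rather than (?<name>...), so the pattern below is adapted accordingly; the sample line and variable names are assumptions made for illustration only.

```python
import re

# The combined regex from the text, using Python's (?P<name>...) syntax.
combined = re.compile(r"(?P<operator>[-+*/=<>%&^|!~?])|(?P<comment>//.*)")

# Hypothetical source line with a division and a trailing comment.
line = "x = y / 2 // halve y"

# m.lastgroup names the alternative that produced each match.
kinds = [(m.lastgroup, m.group()) for m in combined.finditer(line)]

for kind, text in kinds:
    print(kind, repr(text))
```

Because the operator alternative is tried first at every position, each forward slash is consumed on its own, and the comment group never gets a chance to match in this sample.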
