Greed

There’s one last feature of the repetition operators that can cause unexpected results: by default, they’re greedy. This isn’t a question of computing virtue, but rather one of how much content a regular expression can match in one go. It comes up often when matching markup like HTML, where you might see something like:

<a href="http://example.com">Example.com</a>

You might think you could match the HTML tags simply with an expression like:

/<.*>/

But instead of matching the opening tag and closing tag separately, that expression will grab everything from the opening < to the closing > of </a>, because .* matches as many characters as it can. If you want to restrain a given expression so that it takes the smallest possible matching bite, add a ? after any of the repetition operators:

/<.*?>/
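
To see the difference, you can try both expressions against the sample link in irb. This is a minimal sketch: the variable names are made up for illustration, and it uses String#[] with a regular expression, which returns the matched text.

```ruby
html = '<a href="http://example.com">Example.com</a>'

# Greedy: .* runs as far as it can, so the match ends at the last > in the string.
greedy = html[/<.*>/]

# Non-greedy: .*? stops at the first > that lets the match succeed.
lazy = html[/<.*?>/]

puts greedy  # <a href="http://example.com">Example.com</a>
puts lazy    # <a href="http://example.com">
```

The greedy version swallows the whole string, while the version with ? stops at the end of the opening tag.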

Greed matters more when you use regular expressions to extract content from long strings, but it can yield confusing results even in supposedly simple matching. If you have mysterious problems, greed is a good thing to check for.
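
Here is a rough illustration of how greed changes extraction results. The sample text is invented for this sketch, and String#scan is used to collect every match into an array.

```ruby
# Hypothetical sample text, all on one line so that . can match every character.
text = '<p>See <a href="http://example.com">Example.com</a> for details.</p>'

# Greedy: one giant match from the first < to the last >.
greedy_tags = text.scan(/<.*>/)

# Non-greedy: each tag is matched on its own.
lazy_tags = text.scan(/<.*?>/)

puts greedy_tags.length  # 1
puts lazy_tags.inspect   # ["<p>", "<a href=\"http://example.com\">", "</a>", "</p>"]
```

If you expected four tags and got one long string instead, greed is the likely culprit.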
