Cover by Alasdair Allan


Regular Expressions

Regular expressions, commonly known as regexes, are a pattern-matching standard for text processing, and are a powerful tool when dealing with strings. With regular expressions, an expression serves as a pattern to compare with the text being searched. You can use regular expressions to search for patterns in a string, replace text, and extract substrings from the original string.
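As a rough illustration of these three uses, here is a minimal sketch using Python's built-in re module. The choice of Python is an assumption made for illustration, since the text names no particular language, and the sample string is simply the one used in the introduction that follows.

```python
import re

text = "this is a string"

# Search: look for the pattern anywhere in the text.
match = re.search("string", text)
found = match is not None

# Replace: substitute the matched text with something else.
replaced = re.sub("string", "pattern", text)

# Extract: slice the matched substring out of the original text.
extracted = text[match.start():match.end()]

print(found, replaced, extracted)
```

Other regex libraries offer equivalent operations under different names.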

Introduction to Regular Expressions

In its simplest form, you can use a regular expression to match a literal string; for example, the regular expression “string” will match the string “this is a string”. Each character in the expression will match itself, unless it is one of the special characters +, ?, ., *, ^, $, (, ), [, {, |, or \. The special meaning of these characters can be escaped by prepending a backslash character, \.
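The difference between a literal character and an escaped special character can be sketched as follows. This again assumes Python's re module, chosen only for illustration; the sample pattern “a+b” is hypothetical and does not come from the text.

```python
import re

# A literal pattern: each ordinary character matches itself.
literal = re.search("string", "this is a string")

# Unescaped, "+" is a quantifier, so "a+b" means "one or more a, then b".
# It therefore does not match the literal text "a+b".
unescaped = re.search("a+b", "a+b")

# Escaped with a backslash, "+" matches only a literal plus sign.
escaped = re.search(r"a\+b", "a+b")

# re.escape produces the escaped form of an arbitrary literal string.
auto_escaped = re.search(re.escape("a+b"), "a+b")

print(literal is not None, unescaped is None, escaped.group(), auto_escaped.group())
```

The raw-string prefix (r"...") is a Python convenience that stops the language itself from interpreting the backslash before the regex engine sees it.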

We can also anchor our expression to the start of a string (^string) or the end of a string (string$). For the string “this is a string”, ^string will not match, because the text does not begin with “string”, while string$ will, because the text ends with it.
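The anchoring behavior described above can be checked with a short sketch, again assuming Python's re module for illustration. The third pattern, ^this, is an added example and not from the text.

```python
import re

text = "this is a string"

# "^string" requires "string" at the very start of the text.
at_start = re.search("^string", text)

# "string$" requires "string" at the very end of the text.
at_end = re.search("string$", text)

# For comparison (hypothetical extra case): the text does begin with "this".
this_at_start = re.search("^this", text)

print(at_start is None, at_end is not None, this_at_start is not None)
```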

We can also use quantified patterns. Here, * matches zero or more times, ? matches zero or one time, and + matches one or more times. Each quantifier applies to the character immediately before it. So, the regular expression “23*4” would match “1245”, “12345”, and “123345”, while the expression “23?4” would match “1245” and “12345” but not “123345”. Finally, the expression “23+4” would match “12345” and “123345” but not “1245”.
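The three quantifiers can be compared side by side on the same sample strings. This is a minimal sketch assuming Python's re module; the helper name matching is hypothetical.

```python
import re

candidates = ["1245", "12345", "123345"]

def matching(pattern):
    """Return the candidate strings in which the pattern is found."""
    return [s for s in candidates if re.search(pattern, s)]

star = matching("23*4")      # "2", zero or more "3", then "4"
optional = matching("23?4")  # "2", at most one "3", then "4"
plus = matching("23+4")      # "2", at least one "3", then "4"

print(star)
print(optional)
print(plus)
```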

Unless told otherwise, quantifiers in regular expressions are greedy; they will match the longest string possible.
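Greedy matching can be seen by asking what a quantified pattern actually consumed. The sketch below assumes Python's re module, where appending ? to a quantifier is one way of "telling it otherwise" and requesting the shortest match; other engines may use different syntax or lack this feature.

```python
import re

text = "123345"

# Greedy (default): "3*" takes as many "3" characters as it can.
greedy = re.search("23*", text).group()

# Non-greedy: "3*?" takes as few "3" characters as it can, here none.
lazy = re.search("23*?", text).group()

print(greedy, lazy)
```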

While a backslash ...
