Chapter 13. Regular Expressions

Introduction

Regular expressions are a powerful tool for matching and manipulating text. While not as fast as plain-vanilla string matching, regular expressions are extremely flexible: they let you construct patterns that match almost any conceivable combination of characters using a simple, albeit terse and somewhat opaque, syntax.
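To make that trade-off concrete, here is a minimal sketch contrasting a literal substring search with a pattern match. The sample string and variable names are invented for illustration and are not part of the original text.

```php
<?php
// Hypothetical sample text, used only for illustration.
$line = 'Order #4821 shipped on 2004-03-15';

// Plain string matching: locates one fixed substring and nothing more.
$hasShipped = (strpos($line, 'shipped') !== false);

// Regular expression: matches any run shaped like YYYY-MM-DD,
// without knowing the actual date in advance.
$hasDate = (preg_match('/\d{4}-\d{2}-\d{2}/', $line, $m) === 1);
$date = $hasDate ? $m[0] : null;
```

The strpos() call can only answer whether one known string is present, while the pattern describes a whole family of strings. That extra flexibility is what the additional matching cost buys.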

In PHP, you can use regular expression functions to find text that matches certain criteria. Once you have located the matches, you can modify or replace all or part of each matching substring. For example, this call to preg_replace() turns plain-text email addresses into mailto: hyperlinks:

$html = preg_replace('/[^@\s]+@([-a-z0-9]+\.)+[a-z]{2,}/i',
                     '<a href="mailto:$0">$0</a>', $text);
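As a rough usage sketch, the same pattern can be run against a sample string. The contents of $text and the names $pattern and $found are assumptions made for illustration; preg_match_all() is included only to show the "find" step on its own before the replacement.

```php
<?php
// Hypothetical input; the addresses are placeholders.
$text = 'Write to alice@example.com or bob@mail.example.org today.';

// The same pattern as above, stored in a variable so it can be reused.
$pattern = '/[^@\s]+@([-a-z0-9]+\.)+[a-z]{2,}/i';

// Find: collect every substring that matches the whole pattern.
preg_match_all($pattern, $text, $matches);
$found = $matches[0];

// Replace: $0 in the replacement string stands for the entire match.
$html = preg_replace($pattern, '<a href="mailto:$0">$0</a>', $text);
```

Here $found holds the two addresses and $html contains the sentence with each address wrapped in a link. Note that the pattern is deliberately permissive: the leading [^@\s]+ accepts any run of characters other than @ and whitespace, so punctuation that touches an address, such as an opening parenthesis, would be captured as part of it.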

As you can see, regular expressions are handy when transforming plain text into HTML and vice versa. Luckily, because these tasks are so common, PHP has many built-in functions to handle them. Recipe 9.9 explains how to escape HTML entities, Recipe 11.12 covers stripping HTML tags, and Recipe 11.10 and Recipe 11.11 show how to convert ASCII to HTML and HTML to ASCII, respectively. For more on matching and validating email addresses, see Recipe 13.7.
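Two built-in functions of this kind are htmlspecialchars() and strip_tags(). The sketch below uses invented sample strings, and pairing these particular functions with the recipes cited above is an assumption; the recipes themselves may take a different approach.

```php
<?php
// Hypothetical plain-text input containing characters special to HTML.
$plain = 'Use <b> & "quotes" carefully';

// Plain text to HTML: escape &, <, >, and both quote styles as entities.
$escaped = htmlspecialchars($plain, ENT_QUOTES, 'UTF-8');

// HTML to plain text: remove the tags and keep the text between them.
$markup = '<p>Hello, <em>world</em>!</p>';
$stripped = strip_tags($markup);
```

For jobs like these, a dedicated function is usually a safer choice than a hand-written pattern, since it already accounts for the edge cases of HTML syntax.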

Over the years, the functionality of regular expressions has grown from its basic roots to incorporate increasingly useful features. As a result, PHP offers two different sets of regular-expression functions. The first set includes the traditional (or POSIX) functions, all beginning with ereg (for ...
