Regular-Expression Syntax

In their most basic form, regular expressions match characters on a one-for-one basis. Thus, you can match the first occurrence of the letter “a” in a search string as follows:

REFind("a", "abcdefg")

Likewise, you can search for the letters “can” in succession with:

REFind("can", "Watch the candle burn")

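Using “can” as your regular expression matches those three letters wherever they appear in succession, whether as the word “can” by itself or inside a longer word such as “candle” or “scan”. Regular expressions that look for individual characters in this manner are said to be single-character regular expressions.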
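To make the return values concrete, here is a minimal sketch of what REFind() reports for a few single-character expressions. The variable names (Pos1 through Pos4) and the last two search strings are illustrative and do not come from the text above:

```cfml
<!--- REFind() returns the 1-based position of the first match, or 0 when nothing matches --->
<cfset Pos1 = REFind("a", "abcdefg")>                  <!--- 1: "a" is the first character --->
<cfset Pos2 = REFind("can", "Watch the candle burn")>  <!--- 11: the "can" inside "candle" --->
<cfset Pos3 = REFind("can", "Scan the horizon")>       <!--- 2: the "can" inside "Scan" --->
<cfset Pos4 = REFind("can", "No match here")>          <!--- 0: no match --->

<cfoutput>#Pos1# #Pos2# #Pos3# #Pos4#</cfoutput>
```

REFind() is case-sensitive, so the third call matches only because the “can” in “Scan” is lowercase. For a case-insensitive search, REFindNoCase() takes the same arguments.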

Single-character regular expressions are great when you know the exact series of characters that you want to match. But what if you don’t know? What if you need to find and remove all the HTML tags in a particular block of text? You certainly don’t want to have to code a regular expression for every single HTML tag. Instead, you can use what is known as a multicharacter regular expression. Multicharacter regular expressions use special characters (covered in a moment) to define ranges of characters to be matched. The following example takes the contents of a form variable (Form.MyString) and removes all HTML tags from it:

<cfset NoHTML = REReplace(Form.MyString, "<[^>]*>", "", "All")>

At first glance, the regular expression looks like a string of unrelated and meaningless characters. If you take the regular expression and break it down, it is easier to see what is happening. In this case, the first part of the expression, “<”, matches the open angle bracket used to ...
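To see the tag-stripping expression at work without a form submission, the sketch below runs it against a literal string. SampleText is a hypothetical stand-in for Form.MyString. Read left to right, the pattern matches an open angle bracket, then any run of characters other than a close angle bracket, then the close angle bracket itself:

```cfml
<!--- SampleText is a hypothetical stand-in for Form.MyString --->
<cfset SampleText = "<p>Watch the <b>candle</b> burn</p>">

<!--- Replace every match of the pattern with an empty string --->
<cfset NoHTML = REReplace(SampleText, "<[^>]*>", "", "All")>

<!--- Displays: Watch the candle burn --->
<cfoutput>#NoHTML#</cfoutput>
```

The final argument, "All", tells REReplace() to replace every match; with the default scope of "One", only the first tag would be removed. Note that this simple pattern assumes no tag contains a “>” inside an attribute value, so treat it as a convenience for tidy markup rather than a full HTML parser.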
