Searching with More Complex Patterns

The regular expression mechanism of grep provides powerful patterns that can fit most of your needs.

A regular expression describes patterns for matching against strings. Any alphabetic character just matches that character in the string. “A” matches “A”, “B” matches “B”; no surprise there. But regular expressions define other special characters that can be used by themselves or in combination with other characters to make more complex patterns.

We already said that any character without some special meaning simply matches itself—“A” to “A” and so on. The next important rule is to combine letters just by position, so “AB” matches “A” followed by “B”. This, too, seems obvious.

The first special character is (.). A period (.) matches any single character. Therefore .... matches any four characters; A. matches an “A” followed by any character; and .A. matches any character, then an “A”, then any character (not necessarily the same character as the first).
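As a quick check of the period rules, here is a minimal sketch. The sample strings and the helper function name sample_dot are made up for illustration; only grep and printf are assumed to be available.

```shell
# Hypothetical sample data: one candidate string per line.
sample_dot() { printf '%s\n' 'BAT' 'AB' 'A' 'xAy' 'abcd'; }

# 'A.' needs an "A" followed by any one character.
sample_dot | grep 'A.'      # prints BAT, AB, and xAy

# '.A.' needs a character on both sides of the "A".
sample_dot | grep '.A.'     # prints BAT and xAy

# '....' needs four characters in a row.
sample_dot | grep '....'    # prints abcd
```

Note that the lone “A” line never matches A. because there is nothing after the “A” for the period to match, and that grep is case-sensitive by default, so “abcd” does not count as containing an “A”.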

An asterisk (*) matches zero or more occurrences of the previous character. So A* means zero or more “A” characters, and .* means zero or more characters of any sort (such as “abcdefg”, “aaaabc”, “sdfgf ;lkjhj”, or even an empty line).
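The asterisk can be tried out the same way. This is only a sketch: the sample lines (including one empty line) and the function name sample_star are invented for the example.

```shell
# Hypothetical sample data; the second line is deliberately empty.
sample_star() { printf '%s\n' 'BAAAT' '' 'BT' 'BxT'; }

# 'BA*T' is a "B", then zero or more "A" characters, then a "T".
sample_star | grep 'BA*T'    # prints BAAAT and BT, but not BxT

# '.*' is satisfied by zero characters, so every line matches,
# including the empty one. The -c option counts matching lines.
sample_star | grep -c '.*'   # prints 4
```

“BT” matches BA*T because zero occurrences of “A” are allowed between the “B” and the “T”.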

So what does ..* mean? It means any single character followed by zero or more of any character (i.e., one or more characters), so it does not match an empty line.
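The difference between .* and ..* shows up when the input contains an empty line. A small sketch, with made-up sample lines and a hypothetical helper named sample_blank:

```shell
# Hypothetical sample data: a one-character line, an empty line,
# and a three-character line.
sample_blank() { printf '%s\n' 'a' '' 'abc'; }

# '.*' matches all three lines, the empty one included.
sample_blank | grep -c '.*'    # prints 3

# '..*' requires at least one character, so the empty line drops out.
sample_blank | grep -c '..*'   # prints 2
```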

Speaking of lines, the caret ^ matches the beginning of a line of text and the dollar sign $ matches the end ...
