Chapter 3. Searching and Substitutions

As we discussed in Section 1.2, Unix programmers prefer to work on lines of text. Textual data is more flexible than binary data, and Unix systems provide a number of tools that make slicing and dicing text easy.

In this chapter, we look at two fundamental operations that show up repeatedly in shell scripting: text searching—looking for specific lines of text—and text substitution—changing the text that is found.

Simple constant text strings go a long way, but regular expressions provide a much more powerful notation: a single expression can match many different text fragments. This chapter introduces the two regular expression "flavors" provided by various Unix programs, and then covers the most important tools for text extraction and rearranging.
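As a small illustration of that difference, the sketch below searches the same three lines twice with grep (introduced in the next section): once with a constant string and once with a bracket expression. The sample text and the variable names are made up for this example.

```shell
# Hypothetical sample data: three lines of text.
colors=$(printf '%s\n' 'gray cat' 'grey dog' 'green frog')

# A constant string matches only that exact spelling.
fixed_hits=$(printf '%s\n' "$colors" | grep 'gray')

# A bracket expression lets one pattern cover both spellings.
regex_hits=$(printf '%s\n' "$colors" | grep 'gr[ae]y')

printf 'constant string:\n%s\n' "$fixed_hits"
printf 'regular expression:\n%s\n' "$regex_hits"
```

The first search finds only the "gray" line, while the second finds both the "gray" and "grey" lines and still skips "green frog".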

Searching for Text

The workhorse program for finding text (or "matching text," in Unix jargon) is grep. On POSIX systems, grep can use either of the two regular expression flavors, or match simple strings.
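The three matching modes can be compared side by side. This is a minimal sketch, assuming a POSIX-conforming grep that accepts the -F (fixed string) and -E (extended regular expression) options. The sample lines and variable names are invented for illustration.

```shell
# Hypothetical sample data: five one-word lines.
sample=$(printf '%s\n' 'a.c' 'abc' 'cat' 'dog' 'bird')

# -F treats the pattern as a simple string, so the dot is a literal dot.
literal=$(printf '%s\n' "$sample" | grep -F 'a.c')

# With no option, the pattern is a BRE, and the dot matches any one character.
bre=$(printf '%s\n' "$sample" | grep 'a.c')

# -E selects EREs, where | means "either alternative".
ere=$(printf '%s\n' "$sample" | grep -E 'cat|dog')

printf 'simple string: %s\n' "$literal"
printf 'BRE:\n%s\n' "$bre"
printf 'ERE:\n%s\n' "$ere"
```

Here the simple-string search finds only the line a.c, the BRE search finds both a.c and abc, and the ERE search finds cat and dog.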

Traditionally, there were three separate programs for searching through text files:

grep

The original text-matching program. It uses Basic Regular Expressions (BREs) as defined by POSIX, and as we describe later in the chapter.

egrep

"Extended grep." This program uses Extended Regular Expressions (EREs), which are a more powerful regular expression notation. The cost of EREs is that they can be more computationally expensive to use. On ...
