To Join a Phrase

We have covered all the advanced constructs of sed and are now ready to look at a shell script named phrase that uses nearly all of them. This script is a general-purpose, grep-like program that lets you search for a phrase (a series of words) that might be split across two lines.

An essential element of this program is that, like grep, it prints only the lines that match the pattern. You might think we’d use the -n option to suppress the default output of lines. However, this sed script takes a less usual approach: it creates an input/output loop and controls for itself whether each line is output.
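To make the contrast concrete, here is a minimal sketch of two generic ways to get grep-like output from sed. Neither is taken from the phrase script; the function names grep_n and grep_d are hypothetical, and the pattern is assumed to contain no slash characters.

```shell
#!/bin/sh
# Illustrative only: two generic ways to print just the matching lines.

# grep_n: suppress the default output with -n, then print matches explicitly.
grep_n() {
    sed -n '/'"$1"'/p'
}

# grep_d: keep the default output, but delete every line that does not match.
# Deleted lines never reach the end of the script, so they are never printed.
grep_d() {
    sed '/'"$1"'/!d'
}
```

Both functions read standard input and produce the same output. The second form is closer in spirit to a script that decides for itself which lines survive to be printed.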

The logic of this script is to first look for the pattern on one line and print the line if it matches. If no match is found, we read another line into the pattern space (as in previous multiline scripts). Then we copy the two-line pattern space to the hold space for safekeeping. The line we just read might match the search pattern on its own, so the next match we attempt is on the second line only. Once we’ve determined that the pattern is not found on either the first or the second line, we remove the newline between the two lines and look for the pattern spanning them.
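The steps just described can be sketched as a small shell function. This is a reconstruction from the prose above, not the book's listing. Several details are assumptions: the function name phrase_sketch, the `$d` guard that drops an unmatched final line, replacing the newline with a single space so the words stay separated, and using D to retry from the second line. The pattern is assumed to contain no slash characters.

```shell
#!/bin/sh
# phrase_sketch PATTERN [FILE...]  (hypothetical name; illustrative sketch only)
phrase_sketch() {
    search=$1    # first argument: the search pattern
    shift        # any remaining arguments are filenames (stdin if none)
    sed '
# Step 1: if the current line matches by itself, branch to the end and print it.
/'"$search"'/b
# Assumed guard: an unmatched last line has no partner to join, so drop it.
$d
# Step 2: append the next input line, then save the two-line pair in hold space.
N
h
# Step 3: strip the first line and test the second line alone.
s/.*\n//
/'"$search"'/b
# Step 4: restore the pair, turn the newline into a space, and test across lines.
g
s/ *\n/ /
/'"$search"'/{
# On a match, restore the original two lines so they print unchanged.
g
b
}
# No match: restore the pair, discard its first line, and loop with the second.
g
D' "$@"
}
```

For example, `printf 'the quick\nbrown fox\njumps over\n' | phrase_sketch 'quick brown'` prints the first two lines and nothing else, because the phrase is found only after the two lines are joined.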

The script is designed to accept arguments from the command line. The first argument is the search pattern. All other command-line arguments will be interpreted as filenames. Let’s look at the entire script before analyzing it:

#! /bin/sh # phrase -- search for words across ...
