The awk Programming Model

awk views an input stream as a collection of records, each of which can be further subdivided into fields. Normally, a record is a line, and a field is a word of one or more nonwhitespace characters. However, what constitutes a record and a field is entirely under the control of the programmer, and their definitions can even be changed during processing.
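As a rough illustration of records and fields, the commands below show the default splitting and then override it. The sample input strings are made up for this sketch. `FS` and `RS` are awk's standard field-separator and record-separator variables, `NF` is the number of fields in the current record, and `NR` is the running record count.

```shell
# Default: each line is a record, each run of nonwhitespace is a field.
printf 'alpha beta gamma\n' | awk '{ print NF, $2 }'
# prints: 3 beta

# Redefine a field: split on colons (the -F option sets FS).
printf 'root:x:0:0\n' | awk -F: '{ print $1, $3 }'
# prints: root 0

# Redefine a record: split the stream on semicolons instead of newlines.
printf 'a;b;c' | awk -v RS=';' '{ print NR ": " $0 }'
# prints: 1: a
#         2: b
#         3: c

# Change the definition during processing: a new FS value applies
# to the next record read, not to the one already split.
printf 'a b\nc:d\n' | awk '{ print $1; FS = ":" }'
# prints: a
#         c
```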

An awk program consists of pairs of patterns and braced actions, possibly supplemented by functions that implement the details of the actions. awk examines every pattern for every input record, and executes the action of each pattern that matches.

Either part of a pattern/action pair may be omitted. If the pattern is omitted, the action is applied to every input record. If the action is omitted, the default action is to print the matching record on standard output. Here is the typical layout of an awk program:

            pattern  { action }                      Run action if pattern matches
            pattern                                  Print record if pattern matches
                     { action }                      Run action for every record
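The three forms can be tried from the command line. This is a minimal sketch: the name/score data and the threshold of 75 are invented for illustration. The last command shows that every pattern is tested against every record, so one record can trigger more than one action.

```shell
# Hypothetical sample input: a name and a score on each line.
data='ann 91
bob 58
cat 77'

# pattern { action }: run the action only for matching records.
printf '%s\n' "$data" | awk '$2 >= 75 { print $1, "passes" }'
# prints: ann passes
#         cat passes

# pattern alone: the default action prints the matching record.
printf '%s\n' "$data" | awk '/^b/'
# prints: bob 58

# action alone: run the action for every record.
printf '%s\n' "$data" | awk '{ print NR, $1 }'
# prints: 1 ann
#         2 bob
#         3 cat

# Two pairs: both patterns are examined for each record.
printf '%s\n' "$data" | awk '/a/ { print "has a:", $1 }  $2 > 70 { print "high:", $1 }'
# prints: has a: ann
#         high: ann
#         has a: cat
#         high: cat
```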

Input is switched automatically from one input file to the next, and awk itself normally handles the opening, reading, and closing of each input file, allowing the user program to concentrate on record processing. The code details are presented later in Section 9.5.
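One way to watch the automatic switching is to print awk's built-in `FILENAME`, `FNR` (record number within the current file), and `NR` (record number overall) for each record. The file names `first.txt` and `second.txt` are hypothetical, and the sketch assumes the widely available, though not POSIX-mandated, `mktemp -d` command for a scratch directory.

```shell
# Create two small input files in a scratch directory.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first.txt"
printf 'c\n'    > "$tmp/second.txt"

# The program contains no file handling at all: awk opens, reads,
# and closes each named file in turn. FNR restarts at 1 for each
# file, while NR keeps counting across files.
(cd "$tmp" && awk '{ print FILENAME, FNR, NR, $0 }' first.txt second.txt)
# prints: first.txt 1 1 a
#         first.txt 2 2 b
#         second.txt 1 3 c

rm -rf "$tmp"
```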

Although the patterns are often numeric or string expressions, awk also provides two special patterns with the reserved words BEGIN and END.

The action associated with BEGIN is performed just ...
