wc

wc stands for Word Count, but it can also count characters and lines, which makes it a flexible tool for counting almost anything. It is most commonly used to count the lines in a file or, as with most Unix tools, in any other data sent to it on standard input.
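As a minimal sketch of the difference between counting a named file and counting data sent on standard input, the following uses a throwaway sample file. The file contents and variable names are invented for illustration, and mktemp is assumed to be available.

```shell
# Create a small three-line sample file (hypothetical data).
sample=$(mktemp)
printf 'one two\nthree four five\nsix\n' > "$sample"

# Given a filename, wc prints the count followed by the name.
wc -l "$sample"

# Reading standard input, wc prints only the count.
from_file=$(wc -l < "$sample")

# The same applies to data piped in from another command.
piped=$(printf 'a\nb\nc\n' | wc -l)

echo "lines via redirect: $from_file"
echo "lines via pipe:     $piped"

rm -f "$sample"
```

Redirecting or piping into wc is often the more convenient form in scripts, because there is no filename to strip from the result.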

Although it is often used against only one file, or to read standard input via a pipe, wc is also capable of counting and totaling multiple files at once. The three main flags to wc are -w (count words), -c (count characters; strictly speaking, bytes), and -l (count lines). Of these, by far the most commonly used is the line count. Counting the number of lines in a file is often useful, as is counting the number of results from a pipeline. A lot of the code and recipes in this book use wc -l as an automatic way of counting the number of results.
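The three flags, the per-file totals, and the pipeline use can be sketched as follows. The scratch directory, file names, and contents are hypothetical, and mktemp is assumed to be available.

```shell
# Two small sample files in a scratch directory (hypothetical data).
dir=$(mktemp -d)
printf 'alpha beta\ngamma\n' > "$dir/a.txt"
printf 'delta epsilon zeta\n' > "$dir/b.txt"

# Each flag on its own, reading standard input so only the number is printed.
words=$(wc -w < "$dir/a.txt")   # words in a.txt
chars=$(wc -c < "$dir/a.txt")   # bytes in a.txt (same as characters for plain ASCII)
lines=$(wc -l < "$dir/a.txt")   # lines in a.txt
echo "a.txt: $lines lines, $words words, $chars characters"

# With several files, wc prints one row per file followed by a "total" row.
wc -l "$dir"/*.txt

# Counting the results of a pipeline: how many lines contain the letter "a"?
matches=$(cat "$dir"/*.txt | grep a | wc -l)
echo "matching lines: $matches"

rm -r "$dir"
```

The last command is the pattern used most often in scripts: whatever the pipeline produces, wc -l turns it into a single number.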

Most implementations of wc pad their output when processing multiple files so that the columns line up nicely. This can be a pain when scripting, because a padded “ 14” is not as easily interpreted as the number fourteen as a plain “14”. The Unix implementation of wc always pads, so a workaround is required; awk will happily strip the whitespace, so the command below using awk works fine. This snippet shows multiple files with padding and how this affects the common task of assigning the length of a file to a variable.

$ wc -l /etc/hosts*
  18 /etc/hosts
  14 /etc/hosts.allow
  87 /etc/hosts.deny
 119 total
$ wc -l /etc/hosts
  18 /etc/hosts
...
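A minimal sketch of the awk workaround described above, not necessarily the exact command the book uses. The sample file and variable names are invented for illustration, and mktemp is assumed to be available.

```shell
# Sample file with three lines (hypothetical data).
f=$(mktemp)
printf 'a\nb\nc\n' > "$f"

# Raw output: may carry leading spaces and, with a filename argument, the name too.
raw=$(wc -l "$f")

# awk splits on whitespace, so $1 is the bare number whether or not wc padded it.
len=$(wc -l "$f" | awk '{ print $1 }')

echo "raw: [$raw]"
echo "len: [$len]"

# The clean value can be used directly in a numeric test.
if [ "$len" -eq 3 ]; then
    echo "three lines"
fi

rm -f "$f"
```

Because awk ignores leading whitespace when splitting fields, the same line works on implementations that pad and on those that do not, which keeps the script portable.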
