Tag Lists

Use of the tr command to obtain lists of words, or more generally, to transform one set of characters to another set, as in Example 5-5 in the preceding section, is a handy Unix tool idiom to remember. It leads naturally to a solution of a problem that we had in writing this book: how do we ensure consistent markup through about 50K lines of manuscript files? For example, a command might be marked up with <command>tr</command> when we talk about it in the running text, but elsewhere, we might give an example of something that you type, indicated by the markup <literal>tr</literal>. A third possibility is a manual-page reference in the form <emphasis>tr</emphasis>(1).

The taglist program in Example 5-6 provides a solution. It finds all begin/end tag pairs written on the same line and outputs a sorted list that associates tag use with input files. Additionally, it flags with an arrow cases where the same word is marked up in more than one way. Here is a fragment of its output from just the file for a version of this chapter:

$ taglist ch05.xml
...
      2 cut                             command         ch05.xml
      1 cut                             emphasis        ch05.xml <----
...
      2 uniq                            command         ch05.xml
      1 uniq                            emphasis        ch05.xml <----
      1 vfstab                          filename        ch05.xml
...

The tag listing task is reasonably complex, and would be quite hard to do in most conventional programming languages, even ones with large class libraries, such as C++ and Java, and even if you started with the Knuth or Hanson literate programs for the somewhat similar word-frequency problem. ...

Get Classic Shell Scripting now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.