Effective awk Programming, 3rd Edition

Book description

Effective awk Programming, 3rd Edition, focuses entirely on awk, exploring it in the greatest depth of the three awk titles we carry. It's an excellent companion piece to the more broadly focused second edition. This book provides complete coverage of the gawk 3.1 language as well as the most up-to-date coverage of the POSIX standard for awk available anywhere. Author Arnold Robbins clearly distinguishes standard awk features from GNU awk (gawk)-specific features, shines light into many of the "dark corners" of the language (areas to watch out for when programming), and devotes two full chapters to example programs. A brand new chapter is devoted to TCP/IP networking with gawk. He includes a summary of how the awk language evolved. The book also covers:

  • Internationalization of gawk

  • Interfacing to i18n at the awk level

  • Two-way pipes

  • TCP/IP networking via the two-way pipe interface

  • The new PROCINFO array, which provides information about running gawk

  • Profiling and pretty-printing awk programs

In addition to covering the awk language, this book serves as the official "User's Guide" for the GNU implementation of awk (gawk), describing in an integrated fashion the extensions available to the System V Release 4 version of awk that are also available in gawk. As the official gawk User's Guide, this book will also be available electronically, and can be freely copied and distributed under the terms of the Free Software Foundation's Free Documentation License (FDL). A portion of the proceeds from sales of this book will go to the Free Software Foundation to support further development of free and open source software.

The third edition of Effective awk Programming is a GNU Manual and is published by O'Reilly & Associates under the Free Software Foundation's Free Documentation License (FDL). A portion of the proceeds from the sale of this book is donated to the Free Software Foundation to further development of GNU software. This book is also available in electronic form; you have the freedom to modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.

    Table of contents

    1. A Note Regarding Supplemental Files
    2. Dedication
    3. Foreword
    4. Preface
      1. History of awk and gawk
      2. A Rose by Any Other Name
      3. Using This Book
      4. Typographical Conventions
        1. Dark Corners
      5. The GNU Project and This Book
      6. How to Contribute
      7. Acknowledgments
    5. I. The awk Language and gawk
      1. 1. Getting Started with awk
        1. 1.1. How to Run awk Programs
          1. 1.1.1. One-Shot Throwaway awk Programs
          2. 1.1.2. Running awk Without Input Files
          3. 1.1.3. Running Long Programs
          4. 1.1.4. Executable awk Programs
          5. 1.1.5. Comments in awk Programs
          6. 1.1.6. Shell-Quoting Issues
        2. 1.2. Datafiles for the Examples
        3. 1.3. Some Simple Examples
        4. 1.4. An Example with Two Rules
        5. 1.5. A More Complex Example
        6. 1.6. awk Statements Versus Lines
        7. 1.7. Other Features of awk
        8. 1.8. When to Use awk
      2. 2. Regular Expressions
        1. 2.1. How to Use Regular Expressions
        2. 2.2. Escape Sequences
        3. 2.3. Regular Expression Operators
        4. 2.4. Using Character Lists
        5. 2.5. gawk-Specific Regexp Operators
        6. 2.6. Case Sensitivity in Matching
        7. 2.7. How Much Text Matches?
        8. 2.8. Using Dynamic Regexps
      3. 3. Reading Input Files
        1. 3.1. How Input Is Split into Records
        2. 3.2. Examining Fields
        3. 3.3. Non-constant Field Numbers
        4. 3.4. Changing the Contents of a Field
        5. 3.5. Specifying How Fields Are Separated
          1. 3.5.1. Using Regular Expressions to Separate Fields
          2. 3.5.2. Making Each Character a Separate Field
          3. 3.5.3. Setting FS from the Command Line
          4. 3.5.4. Field-Splitting Summary
        6. 3.6. Reading Fixed-Width Data
        7. 3.7. Multiple-Line Records
        8. 3.8. Explicit Input with getline
          1. 3.8.1. Using getline with No Arguments
          2. 3.8.2. Using getline into a Variable
          3. 3.8.3. Using getline from a File
          4. 3.8.4. Using getline into a Variable from a File
          5. 3.8.5. Using getline from a Pipe
          6. 3.8.6. Using getline into a Variable from a Pipe
          7. 3.8.7. Using getline from a Coprocess
          8. 3.8.8. Using getline into a Variable from a Coprocess
          9. 3.8.9. Points to Remember About getline
          10. 3.8.10. Summary of getline Variants
      4. 4. Printing Output
        1. 4.1. The print Statement
        2. 4.2. Examples of print Statements
        3. 4.3. Output Separators
        4. 4.4. Controlling Numeric Output with print
        5. 4.5. Using printf Statements for Fancier Printing
          1. 4.5.1. Introduction to the printf Statement
          2. 4.5.2. Format-Control Letters
          3. 4.5.3. Modifiers for printf Formats
          4. 4.5.4. Examples Using printf
        6. 4.6. Redirecting Output of print and printf
        7. 4.7. Special Filenames in gawk
          1. 4.7.1. Special Files for Standard Descriptors
          2. 4.7.2. Special Files for Process-Related Information
          3. 4.7.3. Special Files for Network Communications
          4. 4.7.4. Special Filename Caveats
        8. 4.8. Closing Input and Output Redirections
      5. 5. Expressions
        1. 5.1. Constant Expressions
          1. 5.1.1. Numeric and String Constants
          2. 5.1.2. Octal and Hexadecimal Numbers
          3. 5.1.3. Regular Expression Constants
        2. 5.2. Using Regular Expression Constants
        3. 5.3. Variables
          1. 5.3.1. Using Variables in a Program
          2. 5.3.2. Assigning Variables on the Command Line
        4. 5.4. Conversion of Strings and Numbers
        5. 5.5. Arithmetic Operators
        6. 5.6. String Concatenation
        7. 5.7. Assignment Expressions
        8. 5.8. Increment and Decrement Operators
        9. 5.9. True and False in awk
        10. 5.10. Variable Typing and Comparison Expressions
        11. 5.11. Boolean Expressions
        12. 5.12. Conditional Expressions
        13. 5.13. Function Calls
        14. 5.14. Operator Precedence (How Operators Nest)
      6. 6. Patterns, Actions, and Variables
        1. 6.1. Pattern Elements
          1. 6.1.1. Regular Expressions as Patterns
          2. 6.1.2. Expressions as Patterns
          3. 6.1.3. Specifying Record Ranges with Patterns
          4. 6.1.4. The BEGIN and END Special Patterns
            1. 6.1.4.1. Startup and cleanup actions
            2. 6.1.4.2. Input/Output from BEGIN and END rules
          5. 6.1.5. The Empty Pattern
        2. 6.2. Using Shell Variables in Programs
        3. 6.3. Actions
        4. 6.4. Control Statements in Actions
          1. 6.4.1. The if-else Statement
          2. 6.4.2. The while Statement
          3. 6.4.3. The do-while Statement
          4. 6.4.4. The for Statement
          5. 6.4.5. The break Statement
          6. 6.4.6. The continue Statement
          7. 6.4.7. The next Statement
          8. 6.4.8. Using gawk’s nextfile Statement
          9. 6.4.9. The exit Statement
        5. 6.5. Built-in Variables
          1. 6.5.1. Built-in Variables That Control awk
          2. 6.5.2. Built-in Variables That Convey Information
          3. 6.5.3. Using ARGC and ARGV
      7. 7. Arrays in awk
        1. 7.1. Introduction to Arrays
        2. 7.2. Referring to an Array Element
        3. 7.3. Assigning Array Elements
        4. 7.4. Basic Array Example
        5. 7.5. Scanning All Elements of an Array
        6. 7.6. The delete Statement
        7. 7.7. Using Numbers to Subscript Arrays
        8. 7.8. Using Uninitialized Variables as Subscripts
        9. 7.9. Multidimensional Arrays
        10. 7.10. Scanning Multidimensional Arrays
        11. 7.11. Sorting Array Values and Indices with gawk
      8. 8. Functions
        1. 8.1. Built-in Functions
          1. 8.1.1. Calling Built-in Functions
          2. 8.1.2. Numeric Functions
          3. 8.1.3. String-Manipulation Functions
            1. 8.1.3.1. More about \ and & with sub, gsub, and gensub
          4. 8.1.4. Input/Output Functions
          5. 8.1.5. Using gawk’s Timestamp Functions
          6. 8.1.6. Bit-Manipulation Functions of gawk
          7. 8.1.7. Using gawk’s String-Translation Functions
        2. 8.2. User-Defined Functions
          1. 8.2.1. Function Definition Syntax
          2. 8.2.2. Function Definition Examples
          3. 8.2.3. Calling User-Defined Functions
          4. 8.2.4. The return Statement
          5. 8.2.5. Functions and Their Effects on Variable Typing
      9. 9. Internationalization with gawk
        1. 9.1. Internationalization and Localization
        2. 9.2. GNU gettext
        3. 9.3. Internationalizing awk Programs
        4. 9.4. Translating awk Programs
          1. 9.4.1. Extracting Marked Strings
          2. 9.4.2. Rearranging printf Arguments
          3. 9.4.3. awk Portability Issues
        5. 9.5. A Simple Internationalization Example
        6. 9.6. gawk Can Speak Your Language
      10. 10. Advanced Features of gawk
        1. 10.1. Allowing Nondecimal Input Data
        2. 10.2. Two-Way Communications with Another Process
        3. 10.3. Using gawk for Network Programming
        4. 10.4. Using gawk with BSD Portals
        5. 10.5. Profiling Your awk Programs
      11. 11. Running awk and gawk
        1. 11.1. Invoking awk
        2. 11.2. Command-Line Options
        3. 11.3. Other Command-Line Arguments
        4. 11.4. The AWKPATH Environment Variable
        5. 11.5. Obsolete Options and/or Features
        6. 11.6. Known Bugs in gawk
    6. II. Using awk and gawk
      1. 12. A Library of awk Functions
        1. 12.1. Naming Library Function Global Variables
        2. 12.2. General Programming
          1. 12.2.1. Implementing nextfile as a Function
          2. 12.2.2. Assertions
          3. 12.2.3. Rounding Numbers
          4. 12.2.4. The Cliff Random Number Generator
          5. 12.2.5. Translating Between Characters and Numbers
          6. 12.2.6. Merging an Array into a String
          7. 12.2.7. Managing the Time of Day
        3. 12.3. Datafile Management
          1. 12.3.1. Noting Datafile Boundaries
          2. 12.3.2. Rereading the Current File
          3. 12.3.3. Checking for Readable Datafiles
          4. 12.3.4. Treating Assignments as Filenames
        4. 12.4. Processing Command-Line Options
        5. 12.5. Reading the User Database
        6. 12.6. Reading the Group Database
      2. 13. Practical awk Programs
        1. 13.1. Running the Example Programs
        2. 13.2. Reinventing Wheels for Fun and Profit
          1. 13.2.1. Cutting out Fields and Columns
          2. 13.2.2. Searching for Regular Expressions in Files
          3. 13.2.3. Printing out User Information
          4. 13.2.4. Splitting a Large File into Pieces
          5. 13.2.5. Duplicating Output into Multiple Files
          6. 13.2.6. Printing Nonduplicated Lines of Text
          7. 13.2.7. Counting Things
        3. 13.3. A Grab Bag of awk Programs
          1. 13.3.1. Finding Duplicated Words in a Document
          2. 13.3.2. An Alarm Clock Program
          3. 13.3.3. Transliterating Characters
          4. 13.3.4. Printing Mailing Labels
          5. 13.3.5. Generating Word-Usage Counts
          6. 13.3.6. Removing Duplicates from Unsorted Text
          7. 13.3.7. Extracting Programs from Texinfo Source Files
          8. 13.3.8. A Simple Stream Editor
          9. 13.3.9. An Easy Way to Use Library Functions
      3. 14. Internetworking with gawk
        1. 14.1. Networking with gawk
          1. 14.1.1. gawk’s Networking Mechanisms
            1. 14.1.1.1. The fields of the special filename
            2. 14.1.1.2. Comparing protocols
            3. 14.1.1.3. /inet/tcp
            4. 14.1.1.4. /inet/udp
            5. 14.1.1.5. /inet/raw
          2. 14.1.2. Establishing a TCP Connection
          3. 14.1.3. Troubleshooting Connection Problems
          4. 14.1.4. Interacting with a Network Service
          5. 14.1.5. Setting up a Service
          6. 14.1.6. Reading Email
          7. 14.1.7. Reading a Web Page
          8. 14.1.8. A Primitive Web Service
          9. 14.1.9. A Web Service with Interaction
            1. 14.1.9.1. A Simple CGI Library
          10. 14.1.10. A Simple Web Server
          11. 14.1.11. Network Programming Caveats
        2. 14.2. Some Applications and Techniques
          1. 14.2.1. PANIC: An Emergency Web Server
          2. 14.2.2. GETURL: Retrieving Web Pages
          3. 14.2.3. REMCONF: Remote Configuration of Embedded Systems
          4. 14.2.4. URLCHK: Look for Changed Web Pages
          5. 14.2.5. WEBGRAB: Extract Links from a Page
          6. 14.2.6. STATIST: Graphing a Statistical Distribution
          7. 14.2.7. MOBAGWHO: A Simple Mobile Agent
        3. 14.3. Related Links
    7. III. Appendixes
      1. A. The Evolution of the awk Language
        1. A.1. Major Changes Between V7 and SVR3.1
        2. A.2. Changes Between SVR3.1 and SVR4
        3. A.3. Changes Between SVR4 and POSIX awk
        4. A.4. Extensions in the Bell Laboratories awk
        5. A.5. Extensions in gawk Not in POSIX awk
        6. A.6. Major Contributors to gawk
      2. B. Installing gawk
        1. B.1. The gawk Distribution
          1. B.1.1. Getting the gawk Distribution
          2. B.1.2. Extracting the Distribution
          3. B.1.3. Contents of the gawk Distribution
        2. B.2. Compiling and Installing gawk on Unix
          1. B.2.1. Compiling gawk for Unix
          2. B.2.2. Additional Configuration Options
          3. B.2.3. The Configuration Process
        3. B.3. Installation on PC Operating Systems
          1. B.3.1. Installing a Prepared Distribution for PC Systems
          2. B.3.2. Compiling gawk for PC Operating Systems
          3. B.3.3. Using gawk on PC Operating Systems
        4. B.4. Reporting Problems and Bugs
        5. B.5. Other Freely Available awk Implementations
      3. C. Implementation Notes
        1. C.1. Downward Compatibility and Debugging
        2. C.2. Making Additions to gawk
          1. C.2.1. Adding New Features
          2. C.2.2. Porting gawk to a New Operating System
        3. C.3. Adding New Built-in Functions to gawk
          1. C.3.1. A Minimal Introduction to gawk Internals
          2. C.3.2. Directory and File Operation Built-ins
            1. C.3.2.1. Using chdir and stat
            2. C.3.2.2. C code for chdir and stat
            3. C.3.2.3. Integrating the extensions
        4. C.4. Probable Future Extensions
      4. D. Basic Programming Concepts
        1. D.1. What a Program Does
        2. D.2. Data Values in a Computer
        3. D.3. Floating-Point Number Caveats
      5. E. GNU General Public License
        1. E.1. Preamble
        2. E.2. Terms and Conditions for Copying, Distribution, and Modification
        3. E.3. NO WARRANTY
        4. E.4. END OF TERMS AND CONDITIONS
          1. E.4.1. How to Apply These Terms to Your New Programs
      6. F. GNU Free Documentation License
        1. F.1. ADDENDUM: How to Use This License for Your Documents
    8. Glossary
    9. Index
    10. About the Author
    11. Colophon
    12. Copyright

    Product information

    • Title: Effective awk Programming, 3rd Edition
    • Author(s): Arnold Robbins
    • Release date: May 2001
    • Publisher(s): O'Reilly Media, Inc.
    • ISBN: 9780596000707