Program Elements

Like most scripting languages, awk deals with numbers and strings. It provides scalar and array variables to hold data, numeric and string expressions, and a handful of statement types to process data: assignments, comments, conditionals, functions, input, loops, and output. Many features of awk expressions and statements are purposely similar to ones in the C programming language.

Comments and Whitespace

Comments in awk run from sharp (#) to end-of-line, just like comments in the shell. Blank lines are equivalent to empty comments.

Wherever whitespace is permitted in the language, any number of whitespace characters may be used, so blank lines and indentation can be used to improve readability. However, a single statement usually cannot be split across multiple lines unless each line break is immediately preceded by a backslash.
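
As a minimal sketch of these rules, the following shell fragment runs a small awk program that contains a full-line comment, a blank line, a trailing comment, and one statement continued with a backslash. The variable names result and total are illustrative only and do not come from the text.

```shell
# Run a small awk program and capture its output in a shell variable.
result=$(awk '
# A full-line comment: awk ignores everything from # to end of line.

BEGIN {
    total = 1 + 2 + \
            3 + 4        # the backslash above continues the statement
    print total
}')
echo "$result"
```

The backslash must be the last character on its line; anything after it, even a space, would likely prevent it from acting as a continuation.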

Strings and String Expressions

String constants in awk are delimited by quotation marks: "This is a string constant". Character strings may contain any 8-bit character except the control character NUL (character value 0), which serves as a string terminator in the underlying implementation language, C. The GNU implementation, gawk, removes that restriction, so gawk can safely process arbitrary binary files.
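
A short illustration, assuming any POSIX-style awk: the program below assigns a string constant to a variable and prints both the string and its length in characters. The names greeting and s are hypothetical.

```shell
# Assign a quoted string constant to an awk variable, then print it
# and its character count using the built-in length() function.
greeting=$(awk 'BEGIN {
    s = "This is a string constant"   # double quotes delimit the string
    print s
    print length(s)
}')
echo "$greeting"
```

Wrapping the awk program in single quotes on the shell command line keeps the shell from interpreting the double quotes that belong to awk.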

awk strings contain zero or more characters, and there is no limit, other than available memory, on the length of a string. Assignment of a string expression to a variable automatically creates a string, and the memory occupied ...
