awk Programming Language
awk 'pattern { action }' inputfile OR awk -f awkprogram inputfile
Examples: (See also, STRUCTURE)
awk '/chevy/' cars.db # Print any lines in "cars.db" that contain "chevy".
awk '{print $3, $1}' cars.db # Print the 3rd and 1st field of every line.
awk '/chevy/ {print $3, $1}' cars.db # Print the 3rd and 1st field of
# lines that contain "chevy".
awk '$1 ~ /h/' cars.db # Print any lines that contain "h" in the 1st field.
awk '$2 ~ /^[tm]/ {print $2, "$", $5}' cars.db # Print 2nd field and the
# 5th field (preceded by a "$") of lines in which the second field
# starts with either "t" or "m".
awk '$3 ~ /5$/ {print $0}' cars.db # Print the entire line ($0) for any line in
# which the 3rd field ends with a "5".
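To try the examples above, a small stand-in for "cars.db" helps. The file contents below are hypothetical (the original data file is not shown here), and the field layout of make, model, year, mileage, price is an assumption chosen only so the patterns have something to match.

```shell
# Hypothetical sample data; assumed layout: make model year mileage price.
cat > cars.db <<'EOF'
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
EOF

# Lines containing "chevy": print the 3rd and 1st fields.
chevy=$(awk '/chevy/ {print $3, $1}' cars.db)
echo "$chevy"

# 2nd field starts with "t" or "m": print it and the 5th field after a "$".
tm=$(awk '$2 ~ /^[tm]/ {print $2, "$", $5}' cars.db)
echo "$tm"
```

With this sample file, the first command prints "79 chevy" and "65 chevy"; the second prints "mustang $ 17000" and "tercel $ 750".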
ACTIONS (includes expressions with constants, var, asgnmnt, funct calls, etc.)
print EXPRESSION-LIST    printf (FORMAT, EXPRESSION-LIST)
break                    do STATEMENT while (EXPRESSION)
continue                 for (VARIABLE in ARRAY) STATEMENT
next                     for (EXPRESSION; EXPRESSION; EXPRESSION) STATEMENT
exit                     if (EXPRESSION) STATEMENT
exit EXPRESSION          if (EXPRESSION) STATEMENT else STATEMENT
{ STATEMENTS }           while (EXPRESSION) STATEMENT
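A minimal sketch of several of these statements working together (next, for, if/else, printf). The input comes from printf rather than a file, and the variable names (sum, i, out) are illustrative only.

```shell
# Sum the fields of each line; skip blank lines with "next".
out=$(printf '1 2 3\n\n10 20\n' | awk '
NF == 0 { next }                      # nothing to add on an empty line
{
    sum = 0
    for (i = 1; i <= NF; i++)         # C-style for loop over the fields
        sum += $i
    if (sum > 10)
        printf("%d: big %d\n", NR, sum)
    else
        printf("%d: small %d\n", NR, sum)
}')
echo "$out"
```

This prints "1: small 6" and "3: big 30". The blank second line is still counted in NR even though "next" skips its action.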
BUILT-IN ARITHMETIC FUNCTIONS
atan2(y,x)  arctangent of y/x, -pi to pi   rand()    random number r, 0 <= r < 1
cos(x)      cosine of x, x in radians      sin(x)    sine of x, x in radians
exp(x)      exponential function of x      sqrt(x)   square root of x
int(x)      integer part of x              srand(x)  x is new seed for rand()
log(x)      natural logarithm of x
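A quick check of a few of these functions, run entirely in a BEGIN block so no input file is needed. rand() and srand() are left out because their output varies between runs and implementations.

```shell
out=$(awk 'BEGIN {
    printf("%d\n", int(7.9))            # integer part: truncates toward zero
    printf("%d\n", sqrt(144))
    printf("%.4f\n", atan2(0, -1))      # atan2(0,-1) is pi
    printf("%.2f\n", exp(log(10)))      # exp and log are inverses
}')
echo "$out"
```

The four lines printed are 7, 12, 3.1416, and 10.00.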
BUILT-IN STRING FUNCTIONS
gsub(r,s) subs s for r in $0, return # of subs made
gsub(r,s,t) subs s for r in string t, return # of subs made
index(s,t) return first position of string t in s, or 0 if t not present
length(s) number of chars in s
match(s,r) does s contain r? return index or 0, sets RSTART and RLENGTH
split(s,a) split s into array a on FS, return # of fields
split(s,a,fs) split s into array a on field separator fs, return # of fields
sprintf(fmt,expr-list) return expr-list formatted according to format fmt
sub(r,s) subs s for leftmost longest substr of $0 matched by r; give # subs
sub(r,s,t) subs s for leftmost longest substr of t matched by r; give # subs
substr(s,p) return s from position p to the end
substr(s,p,n) return n characters of string s starting at position p
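The string functions are easiest to see on a fixed string. This sketch uses made-up values (s, t, parts) and runs in a BEGIN block.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    print length(s)
    print index(s, "world")
    print substr(s, 7)
    print substr(s, 1, 5)
    n = gsub(/o/, "0", s)               # s becomes "hell0 w0rld"
    print n, s
    if (match(s, /w[0-9a-z]+/))
        print RSTART, RLENGTH           # position and length of the match
    t = "aaa b"
    n = sub(/a+/, "X", t)               # leftmost longest match is "aaa"
    print n, t
    n = split("a:b:c", parts, ":")
    print n, parts[1], parts[3]
    print sprintf("%05.1f", 3.14159)
}')
echo "$out"
```

The output, line by line: 11, 7, world, hello, "2 hell0 w0rld", "7 5", "1 X b", "3 a c", 003.1.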
BUILT-IN VARIABLES (defaults listed in parentheses)
ARGC      command-line args number          OFMT     output number format ("%.6g")
ARGV      command-line args array           OFS      output field separator (" ")
FILENAME  current input file                ORS      output record separator ("\n")
FNR       current file record number        RLENGTH  length of matched string
FS        input field separator (" ")       RS       input record separator ("\n")
NF        number of fields, current record  RSTART   start of matched string
NR        number of records read so far     SUBSEP   subscript separator ("\034")
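A short sketch of how FS, OFS, NR, and NF interact, using colon-separated input invented for the purpose.

```shell
out=$(printf 'a:b:c\nd:e\n' | awk '
BEGIN { FS = ":"; OFS = "-" }         # split input on ":", join output with "-"
      { print NR, NF, $1, $NF }       # record number, field count, first and last field
END   { print "records: " NR }')
echo "$out"
```

This prints "1-3-a-c", "2-2-d-e", and "records: 2". The commas in print insert OFS, while the END line uses concatenation and so is unaffected by it.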
COMMENTS start with # and end at the end of the line.
CONSTANTS
Numeric: integer (1000000), decimal (1000000.00), or scientific (1e6)
String: enclosed in quotes ("") and may contain escape sequences
ESCAPE SEQUENCES
\b  backspace            \t    tab
\f  formfeed             \ddd  octal value ddd (d={0,1,...7})
\n  newline (line feed)  \c    any other char c literally
\r  carriage return            (e.g. \\ for \ and \" for ")
EXPRESSIONS
numeric/string constants, var, fields ($0,$1,...,$NF), funct calls,
array elems, NOTE ($0 is the entire line, $1 is the first field)
assignment                        = += -= *= /= %= (modulus) ^= (exponent)
?:  conditional expr              + - * / % ^  arithmetic
|| (OR)  && (AND)  ! (NOT)        unary + and -
~  !~  matching/not matching      ++  --  increment/decrement
<  <=  ==  !=  >  >=  relational  ()  grouping
concatenation (no explicit op)
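Concatenation is the operator most often missed, since it has no symbol: two expressions written side by side are joined. A small sketch with illustrative variable names (x, y, label, s, i):

```shell
out=$(awk 'BEGIN {
    x = 2 ^ 10                            # exponentiation
    y = 17 % 5                            # remainder
    label = (x > 1000) ? "big" : "small"  # conditional expression
    s = "n=" x ", r=" y                   # concatenation: adjacent expressions
    print s, label
    i = 5; i += 3; i++
    print i
}')
echo "$out"
```

This prints "n=1024, r=2 big" followed by 9.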
FILE INPUT
getline, Retrieves the next line from FILENAME (the current file) and resets $0 with it
getline var, Retrieves the next line from FILENAME and stores it in a variable (var)
getline < file, Retrieves the next line from another file
getline var < inputfile, Retrieves the next line from another file and stores it in var
Example: while ((getline line < "input.txt") > 0)
             print line
         close("input.txt")
next, Retrieves the next line from FILENAME and returns to the first rule of the program
OPERATORS
<   less than              >=  greater than or equal to
<=  less than or equal to  >   greater than
==  equal to               ~   matched by
!=  not equal to           !~  not matched by
PATTERN SUMMARY
BEGIN { statements }       /regular expression/ { statements }
END { statements }         compound pattern { statements }
expression { statements }  rangepttrn1, rangepttrn2 { statements }
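The pattern forms above can be combined in one program. This sketch uses invented input; the marker words "start" and "stop" are arbitrary choices for the range pattern.

```shell
out=$(printf 'one\nstart\ntwo\nthree\nstop\nfour\n' | awk '
BEGIN            { count = 0 }
/start/, /stop/  { count++; print NR ": " $0 }   # range pattern, inclusive
NR > 5 && /four/ { print "compound matched " $0 }
END              { print count " lines in range" }')
echo "$out"
```

The range rule prints lines 2 through 5 ("2: start" to "5: stop"), the compound pattern prints "compound matched four", and END reports "4 lines in range".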
REGULAR EXPRESSIONS, Metacharacters
\    quoting character               |   alternation operator (this|that)
^    at the beginning                ()  grouping, (r1)(r2) matches xy where
$    at the end                          x matches r1 and y matches r2
.    match a single char             *   A* match zero or more A's
[]   char class, ex. [a-z]           +   A+ match one or more A's
[^]  char not in class, ex. [^a-z]   ?   A? match null string or A
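A few of the metacharacters in action on invented input. Each rule is tested against every line, so a line can be reported more than once.

```shell
out=$(printf 'cat\ncaat\nct\ndog\ncot 9\n' | awk '
/^ca+t$/      { print "a+:", $0 }             # one or more "a"
/^ca?t$/      { print "a?:", $0 }             # zero or one "a"
/^(cat|dog)$/ { print "alt:", $0 }            # alternation with grouping
/[^a-z]/      { print "non-lowercase:", $0 }  # any char outside a-z
')
echo "$out"
```

"cat" matches the first three rules, "caat" only a+, "ct" only a?, "dog" only the alternation, and "cot 9" only the negated class (because of the space and the digit).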
STRING-MATCHING PATTERNS   matches if...
/regexpr/                  input line has a substring matched by regexpr
expression ~ /regexpr/     expression has a substring matched by regexpr
expression !~ /regexpr/    expr does not have a substring matched by regexpr
STRUCTURE
Each awk program is a series of pattern-action statements.
pattern { action }
pattern { action }
. . .
The awk program reads the first input line from the file(s) and tests it
against each pattern. For each pattern matched, the corresponding action,
whether single or multiple steps, is performed. After all patterns are
checked, the next line of input is read. The program terminates when the
end-of-file is reached.
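The read-and-test cycle described above can be seen with a program of three rules, the last of which has no pattern and so matches every line. The input is made up for illustration.

```shell
out=$(printf 'apple 3\nbanana 12\n' | awk '
$2 > 10   { print "large:", $1 }
/an/      { print "has an:", $1 }
          { print "seen:", $1 }     # no pattern: runs for every line
')
echo "$out"
```

The first line triggers only the last rule ("seen: apple"); the second line triggers all three in program order ("large: banana", "has an: banana", "seen: banana").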
VARIABLES
User-defined variable names consist of letters, digits, and underscores, and
do not begin with a digit. An uninitialized user-defined variable has the
string value "" (null) and the numeric value 0.
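Because uninitialized variables start out as "" and 0, counters and array elements can be used without any setup. A sketch with illustrative names (count, total, unset_var):

```shell
out=$(printf 'a\nb\na\n' | awk '
{ count[$1]++; total++ }            # neither variable is initialized first
END {
    print total, count["a"], count["b"]
    print "[" unset_var "]", unset_var + 0   # empty as a string, 0 as a number
}')
echo "$out"
```

This prints "3 2 1" and then "[] 0".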
Resources
The AWK Programming Language (includes source for the one true awk)
AWK Manual, for the new implementation of AWK (sometimes called nawk)