Cheatsheet & Examples: awk

Hong

Basic Text Processing and Pattern Matching
Example Usage:
awk '{ print }' file.txt

What it does:
Prints every line of the input file.

Command-line Arguments Explained:

  • { print }: An action with no pattern, so it runs on every line; print with no arguments outputs the whole line ($0).
  • file.txt: The input file to process.

Filtering Lines Based on Patterns
Example Usage:
awk '/error/ { print }' log.txt

What it does:
Prints every line that contains the string "error" anywhere in it. The match is case-sensitive, so "Error" is not matched.

Command-line Arguments Explained:

  • /error/: A pattern to match lines containing "error".
  • { print }: The action to execute for matching lines.
  • log.txt: The input file to search within.

Print Specific Fields from a File
Example Usage:
awk '{ print $1, $3 }' data.txt

What it does:
Displays the first and third fields of each line in the input file.

Command-line Arguments Explained:

  • { print $1, $3 }: Outputs the first and third fields of each line.
  • data.txt: The input file to extract fields from.

Using Field Separators with -F
Example Usage:
awk -F, '{ print $2 }' data.csv

What it does:
Prints the second field of each line in a CSV file, splitting on every comma. Quoted fields that themselves contain commas are not handled.

Command-line Arguments Explained:

  • -F,: Sets the field separator to a comma.
  • { print $2 }: Outputs the second field.
  • data.csv: The input file to process.
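
To try the separator without a real file, the sketch below pipes two made-up CSV rows into the same command. The sample rows and the out variable are assumptions for illustration only.

```shell
# Hypothetical sample rows; the names and values are made up.
out=$(printf 'alice,30,KL\nbob,25,Penang\n' | awk -F, '{ print $2 }')
printf '%s\n' "$out"
# prints:
# 30
# 25
```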

Summing Values in a Column
Example Usage:
awk '{ sum += $1 } END { print sum }' numbers.txt

What it does:
Calculates the sum of values in the first column of the input file and prints it.

Command-line Arguments Explained:

  • { sum += $1 }: Accumulates the value of the first field.
  • END { print sum }: Outputs the final sum after processing all lines.
  • numbers.txt: The input file containing numerical data.
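
As a quick check, the sketch below feeds three made-up numbers through the same script. It also shows one possible way to handle empty input: sum is never assigned in that case, so adding 0 forces a numeric result. The variable names sum and empty are illustrative.

```shell
# Hypothetical input: three numbers, one per line.
sum=$(printf '10\n20\n12\n' | awk '{ sum += $1 } END { print sum }')
echo "$sum"     # prints 42

# With no input lines, sum stays unset; "sum + 0" prints 0 instead of an empty line.
empty=$(printf '' | awk '{ sum += $1 } END { print sum + 0 }')
echo "$empty"   # prints 0
```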

Printing Line Numbers
Example Usage:
awk '{ print NR, $0 }' file.txt

What it does:
Displays each line of the file prepended with its line number.

Command-line Arguments Explained:

  • NR: Built-in variable holding the number of lines (records) read so far, counted across all input files.
  • $0: Represents the entire line.
  • file.txt: The input file to process.

Using BEGIN and END Blocks
Example Usage:
awk 'BEGIN { print "Start" } END { print "End" }' file.txt

What it does:
Prints "Start" before processing the file and "End" after processing completes.

Command-line Arguments Explained:

  • BEGIN { print "Start" }: Executes once at the beginning.
  • END { print "End" }: Executes once at the end.
  • file.txt: The input file being processed.

Formatting Output with printf
Example Usage:
awk '{ printf "%-10s %5d\n", $1, $2 }' data.txt

What it does:
Prints the first field as a left-justified string padded to 10 characters, followed by the second field as an integer right-justified in 5 characters.

Command-line Arguments Explained:

  • printf: Formats output according to the specified pattern.
  • %-10s: Left-justifies the first field in a 10-character width.
  • %5d: Prints the second field as an integer (any fractional part is dropped), right-justified in a 5-character width.
  • data.txt: The input file to process.
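
Padding is hard to see in plain output, so the sketch below uses the same width specifiers but wraps each one in square brackets to make the spacing visible. The brackets and the sample rows are assumptions added for illustration.

```shell
# Same %-10s and %5d specifiers as above; brackets only mark the field edges.
out=$(printf 'apple 3\nkiwi 12\n' | awk '{ printf "[%-10s] [%5d]\n", $1, $2 }')
printf '%s\n' "$out"
# prints:
# [apple     ] [    3]
# [kiwi      ] [   12]
```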

Handling Multiple Files
Example Usage:
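awk 'FNR == 1 { print FILENAME }' file1.txt file2.txt

What it does:
Prints the name of each input file when its first line is read.

Command-line Arguments Explained:

  • FNR == 1: Condition that is true on the first line of each file. FNR restarts at 1 for every file, whereas NR keeps counting across files, so NR == 1 would match only the first line of the first file.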
  • FILENAME: Built-in variable containing the current input file name.
  • file1.txt file2.txt: Multiple input files to process.
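
A quick way to see how the two record counters behave across several files is to print them side by side. The sketch below assumes mktemp -d is available (it is on most Linux and macOS systems) and creates two throwaway files; the file contents are made up.

```shell
# Create two throwaway files in a temporary directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/file1.txt"
printf 'c\n'    > "$dir/file2.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR }' file1.txt file2.txt)
printf '%s\n' "$out"
# prints:
# file1.txt 1 1
# file1.txt 2 2
# file2.txt 3 1

rm -r "$dir"
```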

Using Variables with -v
Example Usage:
awk -v threshold=100 '$1 > threshold { print $0 }' data.txt

What it does:
Filters and prints lines where the first field exceeds the value of threshold.

Command-line Arguments Explained:

  • -v threshold=100: Defines a user variable threshold with the value 100.
  • $1 > threshold: Condition to check if the first field exceeds threshold.
  • data.txt: The input file to process.
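
In practice the value often comes from the shell. The sketch below passes a shell variable through -v; the variable name limit and the sample rows are hypothetical. Because both the field and the -v value look like numbers, awk compares them numerically.

```shell
limit=100   # hypothetical shell variable holding the cutoff
out=$(printf '50 low\n150 high\n100 edge\n' |
      awk -v threshold="$limit" '$1 > threshold { print $0 }')
echo "$out"   # prints: 150 high
```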

Calculating Field Counts
Example Usage:
awk '{ print NF }' file.txt

What it does:
Prints the number of fields in each line of the input file.

Command-line Arguments Explained:

  • NF: Built-in variable indicating the number of fields in the current line.
  • file.txt: The input file to analyze.
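
Because NF is a number, $NF refers to the last field of the line. The sketch below prints both for two made-up lines.

```shell
# NF is the field count; $NF is the value of the last field.
out=$(printf 'a b c\nx y\n' | awk '{ print NF, $NF }')
printf '%s\n' "$out"
# prints:
# 3 c
# 2 y
```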

Using Regular Expressions in Patterns
Example Usage:
awk '/^[A-Z]/ { print }' file.txt

What it does:
Prints lines that start with an uppercase letter.

Command-line Arguments Explained:

  • /^[A-Z]/: A regular expression pattern matching lines starting with an uppercase letter.
  • { print }: Action to output matching lines.
  • file.txt: The input file to process.

Substituting Text in a File
Example Usage:
awk '{ gsub(/old/, "new"); print }' file.txt

What it does:
Replaces every occurrence of "old" with "new" in each line and prints the result. The input file itself is not modified; redirect the output to a new file to keep the changes.

Command-line Arguments Explained:

  • gsub(/old/, "new"): Substitutes "old" with "new" globally in the line.
  • print: Outputs the modified line.
  • file.txt: The input file to process.
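
gsub also returns the number of replacements it made, which can be handy for reporting. The sketch below stores that count in a variable named n (an illustrative name) and prints it in front of each modified line; the sample text is made up.

```shell
# gsub modifies $0 in place and returns how many substitutions were made.
out=$(printf 'old hat, old coat\nnothing here\n' |
      awk '{ n = gsub(/old/, "new"); print n ": " $0 }')
printf '%s\n' "$out"
# prints:
# 2: new hat, new coat
# 0: nothing here
```

Note that /old/ matches anywhere in the line, so a word such as "bold" would become "bnew".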

Counting Occurrences of a Pattern
Example Usage:
awk '/error/ { count++ } END { print "Errors:", count }' log.txt

What it does:
Tallies the number of lines matching "error" and prints the total.

Command-line Arguments Explained:

  • /error/: Pattern to match lines containing "error".
  • count++: Increments a variable for each match.
  • END { print "Errors:", count }: Outputs the final count after processing.
  • log.txt: The input file to analyze.
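
If no line matches, count is never set and the total prints as an empty string. One possible fix, sketched below with made-up log lines, is to print count + 0 so the result is always a number.

```shell
# Two of the four hypothetical log lines contain "error".
some=$(printf 'ok\nerror: disk\nok\nerror: net\n' |
       awk '/error/ { count++ } END { print "Errors:", count + 0 }')
echo "$some"   # prints: Errors: 2

# No matches: count + 0 yields 0 rather than an empty value.
none=$(printf 'ok\n' |
       awk '/error/ { count++ } END { print "Errors:", count + 0 }')
echo "$none"   # prints: Errors: 0
```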

Printing Lines Longer Than a Given Length
Example Usage:
awk 'length($0) > 80 { print }' file.txt

What it does:
Prints lines in the file that are longer than 80 characters.

Command-line Arguments Explained:

  • length($0) > 80: Condition checks if the line length exceeds 80.
  • $0: Represents the entire line.
  • file.txt: The input file to process.

Using for Loops to Iterate Fields
Example Usage:
awk '{ for(i=1; i<=NF; i++) print $i }' file.txt

What it does:
Prints each field of every line on a separate line.

Command-line Arguments Explained:

  • for(i=1; i<=NF; i++): Iterates over all fields in the line.
  • print $i: Outputs each field individually.
  • file.txt: The input file to process.
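
To see which line and position each field came from, the loop can print NR and the loop index alongside the value. This is a small variation on the command above, run on made-up input.

```shell
# Prints "line:position value" for every field.
out=$(printf 'a b\nc\n' | awk '{ for (i = 1; i <= NF; i++) print NR ":" i, $i }')
printf '%s\n' "$out"
# prints:
# 1:1 a
# 1:2 b
# 2:1 c
```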

Using Comparison Conditions for Filtering
Example Usage:
awk '$2 > 50 { print $1 }' data.txt

What it does:
Prints the first field of lines where the second field exceeds 50.

Command-line Arguments Explained:

  • $2 > 50: Condition checks if the second field's value is greater than 50.
  • print $1: Outputs the first field of matching lines.
  • data.txt: The input file to process.

Changing Output Field Separator with OFS
Example Usage:
awk 'BEGIN { OFS="," } { print $1, $2 }' data.txt

What it does:
Prints the first and second fields separated by a comma.

Command-line Arguments Explained:

  • BEGIN { OFS="," }: Sets the output field separator to a comma once, before any input is read.
  • print $1, $2: Outputs the first and second fields with the new separator.
  • data.txt: The input file to process.
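
OFS is only used when awk builds output from separate fields. Printing $0 unchanged keeps the original spacing; assigning to any field (the common idiom is $1 = $1) makes awk rebuild the line with OFS. The sketch below shows both cases on a made-up line, using illustrative variable names a and b.

```shell
# $0 is printed as read: OFS has no effect.
a=$(printf 'x y z\n' | awk 'BEGIN { OFS="," } { print $0 }')
# Assigning a field rebuilds $0 with OFS between the fields.
b=$(printf 'x y z\n' | awk 'BEGIN { OFS="," } { $1 = $1; print $0 }')
echo "$a"   # prints: x y z
echo "$b"   # prints: x,y,z
```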

Using NR to Process Specific Lines
Example Usage:
awk 'NR == 5 { print }' file.txt

What it does:
Prints the fifth line of the input file.

Command-line Arguments Explained:

  • NR == 5: Condition checks if the current line number is 5.
  • print: Outputs the matching line.
  • file.txt: The input file to process.

Using FILENAME in END Block
Example Usage:
awk 'END { print FILENAME }' file.txt

What it does:
Prints the name of the last processed file after finishing.

Command-line Arguments Explained:

  • END { print FILENAME }: Executes after all lines are processed and outputs the filename.
  • file.txt: The input file being processed.

Printing Fields with Custom Separator
Example Usage:
awk 'BEGIN { FS=":"; OFS="," } { print $1, $2 }' data.txt

What it does:
Processes fields separated by colons and outputs them with commas.

Command-line Arguments Explained:

  • BEGIN { FS=":"; OFS="," }: Sets input/output field separators before processing.
  • print $1, $2: Outputs the first and second fields with the new separator.
  • data.txt: The input file to process.
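
The same command can be tried on inline data. The colon-separated rows below are made up for illustration.

```shell
# Input fields are split on ":", output fields are joined with ",".
out=$(printf 'alice:admin:1\nbob:dev:2\n' |
      awk 'BEGIN { FS=":"; OFS="," } { print $1, $2 }')
printf '%s\n' "$out"
# prints:
# alice,admin
# bob,dev
```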