Advanced Linux Commands Explained

`awk`

Why Learn This?

When working as a DevOps engineer, you’ll deal with large log files, CSV data, or system outputs. You often need to extract specific parts, count things, or filter content quickly.

Now you have two choices:

📜 Write a shell script
⚙️ Use a powerful one-liner tool like awk

What is awk?

awk is a text-processing command in Linux. It’s like a mini programming language for working with structured text files like .log, .csv, .tsv, etc.

✨ Key Use Cases:

Extract specific columns from a file
Filter lines based on keywords (like INFO, ERROR)
Apply conditions and loops
Count things (like how many times something occurred)

🍽️ Analogy: Restaurant Waiter vs awk

Just like a waiter serves you specific items from a full menu...

awk serves you specific parts from a full text file 🍲

Basic `awk` syntax:

awk '{ action }' filename

$1, $2, $3 → The first, second, and third columns ($0 is the whole line)
NR → Current line (record) number
print → Displays output
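To make these pieces concrete, here is a minimal sketch that numbers each line and prints its third column. The sample app.log content and its column layout (date, time, level, component, message) are made up for illustration; your real log may look different.

```shell
# Create a throwaway app.log; its contents are hypothetical sample data.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2024-01-15 08:52:10 INFO auth User login ok
2024-01-15 08:53:05 ERROR db Connection timeout
2024-01-15 08:53:40 INFO api Request served
EOF

# NR is the current line number; $3 is the third whitespace-separated column.
numbered=$(awk '{ print NR, $3 }' "$workdir/app.log")
echo "$numbered"

rm -rf "$workdir"
```

Each output line pairs the line number with the log level found in column 3.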

Common Examples

Print the whole content of a file:

awk '{print}' app.log

Print the first column:

awk '{print $1}' app.log

Print multiple columns (1st, 2nd, 3rd, 5th):

awk '{print $1, $2, $3, $5}' app.log

Print only lines that contain the word INFO:

awk '/INFO/ {print $1, $2, $3, $5}' app.log

Save the filtered INFO lines to a new file:

awk '/INFO/ {print $1, $2, $3, $5}' app.log > only_info.log

Count Occurrences

Count how many lines contain INFO:

awk '/INFO/ {count++} END {print count}' app.log

With a custom message:

awk '/INFO/ {count++} END {print "The count of INFO is", count}' app.log
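Two details are worth knowing here: `/INFO/` matches the text anywhere on the line, and `count` prints as an empty string when nothing matched. The sketch below runs on a made-up log where the level is assumed to sit in column 3. It compares the pattern match with an exact column test and uses `count + 0` to force a numeric zero.

```shell
# Hypothetical sample log; line 2 mentions INFO in the message, not as the level.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2024-01-15 08:52:10 INFO auth User login ok
2024-01-15 08:53:05 ERROR db INFO table missing
2024-01-15 08:53:40 INFO api Request served
EOF

# Pattern match: counts every line that contains INFO anywhere.
anywhere=$(awk '/INFO/ { count++ } END { print count + 0 }' "$workdir/app.log")

# Column test: counts only lines whose third column is exactly INFO.
exact=$(awk '$3 == "INFO" { count++ } END { print count + 0 }' "$workdir/app.log")

# No matches: count + 0 prints 0 instead of an empty line.
none=$(awk '$3 == "FATAL" { count++ } END { print count + 0 }' "$workdir/app.log")

echo "anywhere=$anywhere exact=$exact none=$none"
rm -rf "$workdir"
```

If your logs can contain the level name inside the message text, the column test is usually the safer choice.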

Filter by Time (e.g., Between 08:53:00 and 08:53:59)

Let’s say your log’s second column is a timestamp:

awk '$2 >= "08:53:00" && $2 <= "08:53:59" {print $3}' app.log
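This works because zero-padded HH:MM:SS strings sort the same way alphabetically as they do chronologically, so a plain string comparison is enough. A small sketch on hypothetical sample data (timestamp assumed in column 2, level in column 3):

```shell
# Hypothetical sample log with four timestamps around the 08:53 window.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2024-01-15 08:52:10 INFO auth User login ok
2024-01-15 08:53:05 ERROR db Connection timeout
2024-01-15 08:53:40 INFO api Request served
2024-01-15 08:54:02 WARN disk Usage high
EOF

# Keep only lines whose time falls inside 08:53:00 to 08:53:59, print the level.
in_window=$(awk '$2 >= "08:53:00" && $2 <= "08:53:59" { print $3 }' "$workdir/app.log")
echo "$in_window"

rm -rf "$workdir"
```

Note that this trick depends on the timestamps being zero-padded and in 24-hour format.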

Print Lines 2 Through 10

awk 'NR >= 2 && NR <= 10 {print}' app.log

To print only the line numbers instead of the lines themselves:

awk 'NR >= 2 && NR <= 10 {print NR}' app.log

Why Is awk Powerful?

Supports conditions, loops, ranges, and custom logic
Can act like a mini-programming language for text
Works great with formatted data like .csv, .tsv, .log

⚠️ Works best with structured/column-based files
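As a rough illustration of conditions and loops working together, the sketch below rebuilds the message text from column 5 onward with a `for` loop and labels each line with an `if`/`else`. The log layout and the `ALERT:` and `ok:` labels are assumptions made up for this example; `NF` is awk's built-in count of columns on the current line.

```shell
# Hypothetical sample log: date, time, level, component, then a free-text message.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2024-01-15 08:52:10 INFO auth User login ok
2024-01-15 08:53:05 ERROR db Connection timeout
2024-01-15 08:53:40 INFO api Request served
EOF

labelled=$(awk '{
  msg = $5
  # Loop over the remaining columns and join them back into one message.
  for (i = 6; i <= NF; i++) { msg = msg " " $i }
  # Condition: flag ERROR lines, pass everything else through.
  if ($3 == "ERROR") { print "ALERT:", msg } else { print "ok:", msg }
}' "$workdir/app.log")
echo "$labelled"

rm -rf "$workdir"
```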

Difference between `awk` and `sed`

| Feature | awk | sed |
| --- | --- | --- |
| Purpose | Programming-style text processor | Stream editor |
| Best Used For | Parsing and processing structured files (logs, CSV) | Find & replace, line edits, simple filters |
| Limitation | Works best with structured formats | Not good for complex logic |

Summary:

Use awk when you want columns, conditions, counts, loops
Use sed when you want quick replacements or line changes

Real-World DevOps Use Cases

Parse logs to extract IPs, timestamps, or error types
Generate quick reports from .csv files
Filter logs based on date/time or event type
Count occurrences of issues in system logs
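For the CSV report case, a minimal sketch might look like the following. The file name `errors.csv`, its columns, and its values are all hypothetical. Splitting on a plain comma with `-F','` is also an assumption that only holds when no field contains a quoted comma.

```shell
# Hypothetical CSV: a header row, then date, service, and error count.
workdir=$(mktemp -d)
cat > "$workdir/errors.csv" <<'EOF'
date,service,errors
2024-01-15,api,3
2024-01-15,db,1
2024-01-16,api,2
EOF

# Skip the header (NR > 1), sum column 3 per service, then print the totals.
# The for-in order is unspecified in awk, so sort makes the report stable.
report=$(awk -F',' 'NR > 1 { total[$2] += $3 }
                    END { for (svc in total) print svc, total[svc] }' \
         "$workdir/errors.csv" | sort)
echo "$report"

rm -rf "$workdir"
```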

What is sed?

sed stands for Stream Editor. It reads input line by line (as a stream), performs operations (like search, replace, delete), and outputs the result.

💡 Think of sed as a robot that edits your file while reading it — no need to open it in an editor.

`awk` vs `sed`

| Feature | awk | sed |
| --- | --- | --- |
| Use-case | Extracting, printing, filtering structured data | Editing and transforming text |
| Syntax style | Mini-programming language with `{}` | Terse editing commands such as `s`, `p`, `d` |
| Data structure | Works with columns (CSV/TSV/logs) | Works line-by-line, unstructured |
| Ideal for | Reports, CSVs, log filtering | Search & replace, quick edits |

Basic Syntax:

sed 'expression' filename

Let's compare it with awk:

awk '/INFO/' app.log
sed -n '/INFO/p' app.log

sed prints every line by default, so to show only the matched lines you must add the -n flag:

sed -n '/INFO/p' app.log
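To see what -n changes, the sketch below counts the output lines with and without it on a made-up three-line log. Without -n, sed prints every line once automatically and the `p` command prints matching lines a second time.

```shell
# Hypothetical sample log: three lines, two of which contain INFO.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2024-01-15 08:52:10 INFO auth User login ok
2024-01-15 08:53:05 ERROR db Connection timeout
2024-01-15 08:53:40 INFO api Request served
EOF

# Without -n: 3 automatic prints plus 2 extra prints from p.
without_n=$(sed '/INFO/p' "$workdir/app.log" | wc -l | tr -d ' ')

# With -n: only the 2 lines printed by p.
with_n=$(sed -n '/INFO/p' "$workdir/app.log" | wc -l | tr -d ' ')

echo "without -n: $without_n lines, with -n: $with_n lines"
rm -rf "$workdir"
```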

Search & Replace

Replace all instances of INFO with LOG in the entire file:

```bash
sed 's/INFO/LOG/g' app.log
# - s = substitute
# - INFO = the pattern to search
# - LOG = replacement
# - g = globally (all matches on the line)
```
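A quick way to see what the `g` flag does is to run the substitution with and without it on a line that contains the pattern twice. The input line here is made up for the example.

```shell
# Hypothetical input line with two occurrences of INFO.
line="INFO start INFO end"

# Without g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/INFO/LOG/')

# With g, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/INFO/LOG/g')

echo "$first_only"
echo "$all_matches"
```

In both cases sed writes the result to standard output. The input itself is left as it was unless you redirect the output or use in-place editing.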

Show Line Numbers with a Match

Show only line numbers where INFO appears:

sed -n -e '/INFO/=' app.log

# Explanation:
# -n: suppress automatic printing
# -e: execute expression
# /INFO/=: print line numbers where match found

Show both line numbers and matching lines:

sed -n -e '/INFO/=' -e '/INFO/p' app.log
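The sketch below shows the shape of the output on a made-up three-line log. Note that sed's `=` command prints the line number on its own line, ahead of the matching line, rather than as a prefix.

```shell
# Hypothetical sample log: INFO appears on lines 1 and 3.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2024-01-15 08:52:10 INFO auth User login ok
2024-01-15 08:53:05 ERROR db Connection timeout
2024-01-15 08:53:40 INFO api Request served
EOF

# Line numbers only.
only_numbers=$(sed -n -e '/INFO/=' "$workdir/app.log")

# Line number, then the matching line underneath it.
numbers_and_lines=$(sed -n -e '/INFO/=' -e '/INFO/p' "$workdir/app.log")

echo "$only_numbers"
echo "$numbers_and_lines"
rm -rf "$workdir"
```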

Replace Only in a Specific Line Range

Replace INFO with LOG only in the first 10 lines:

sed '1,10 s/INFO/LOG/g' app.log
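As a sanity check, the sketch below generates a hypothetical 12-line file where every line contains INFO, applies the ranged replacement, and counts how many lines changed. Lines 11 and 12 should be left untouched.

```shell
# Generate 12 made-up lines: "INFO line 1" through "INFO line 12".
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 12; i++) print "INFO line " i }' > "$workdir/app.log"

# Replace only within lines 1 to 10; all 12 lines are still printed.
changed=$(sed '1,10 s/INFO/LOG/g' "$workdir/app.log")

log_lines=$(printf '%s\n' "$changed" | grep -c LOG)
info_lines=$(printf '%s\n' "$changed" | grep -c INFO)

echo "replaced: $log_lines, untouched: $info_lines"
rm -rf "$workdir"
```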

Print Only the First 10 Changed Lines

sed -n '1,10 s/INFO/LOG/g; 1,10p; 10q' app.log

# 🧠 What’s going on here?
# -n → Suppress automatic printing, so each line is printed only once (by p)
# 1,10 s/INFO/LOG/g → Replace within lines 1 to 10
# 1,10p → Print lines 1 to 10
# 10q → Quit after line 10 (no extra output)
# ; → Separates multiple commands

Real-World DevOps Use Cases

| Use Case | Example |
| --- | --- |
| Replace config variables | Replace localhost with IP |
| Clean up logs | Remove or mask sensitive data |
| Automation | Update version tags or comments in scripts |
| Pipeline | Use in Dockerfile, CI/CD shell, bash scripts |
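The first two use cases can be sketched in a single sed call. Everything in this example is hypothetical: the `app.conf` file, its keys, the IP address `10.0.0.5`, and the `****` mask are placeholders, not values from a real system.

```shell
# Hypothetical config file with two localhost entries and one secret.
workdir=$(mktemp -d)
cat > "$workdir/app.conf" <<'EOF'
db_host=localhost
db_password=s3cret
cache_host=localhost
EOF

# First expression: swap localhost for an IP address.
# Second expression: keep "password=" (captured as \1) and mask the rest of the line.
updated=$(sed -e 's/localhost/10.0.0.5/g' \
              -e 's/\(password=\).*/\1****/' "$workdir/app.conf")
echo "$updated"

rm -rf "$workdir"
```

The result is written to standard output, so the original file stays unchanged and you can review the result before overwriting anything.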

Summary Cheat Sheet

| Task | sed Command |
| --- | --- |
| Match lines | `sed -n '/INFO/p' app.log` |
| Replace globally | `sed 's/INFO/LOG/g' app.log` |
| Replace in range | `sed '1,10 s/INFO/LOG/g' app.log` |
| Show line numbers with match | `sed -n -e '/INFO/=' app.log` |
| Show both line & content | `sed -n -e '/INFO/=' -e '/INFO/p' app.log` |
| Replace & print only 10 lines | `sed -n '1,10 s/INFO/LOG/g; 1,10p; 10q' app.log` |

What is grep?

grep stands for Global Regular Expression Print.

It’s like a powerful filter — it scans a file or stream, searches for a pattern, and prints only those lines that match.

Real-Life Analogy

Imagine you’re reading a 200-page book 📖, but you’re only interested in the pages that talk about "AWS". Instead of reading the whole book, you just search for the word "AWS" and mark those pages.

That’s what grep does — it finds and highlights the relevant lines in big files like logs, scripts, or processes.

Basic Usage

Find lines containing the word INFO:

grep INFO app.log

Case-insensitive search:

grep -i info app.log
# -i makes the search case-insensitive

Count how many lines match:

grep -i -c info app.log
# -c prints the number of matching lines, not the number of individual matches
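The difference between matching lines and individual matches shows up as soon as a line contains the word twice. The sketch below uses a made-up log. It also uses `grep -o`, which prints each match on its own line and is available in GNU and BSD grep, though it is not part of POSIX.

```shell
# Hypothetical sample log: line 1 contains INFO twice, line 3 has it in lowercase.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
INFO start INFO retry
ERROR failed
info lowercase
EOF

# Lines containing INFO (case-sensitive).
matching_lines=$(grep -c INFO "$workdir/app.log")

# Lines containing info in any letter case.
any_case_lines=$(grep -i -c info "$workdir/app.log")

# Individual occurrences of INFO: one output line per match, then count them.
occurrences=$(grep -o INFO "$workdir/app.log" | wc -l | tr -d ' ')

echo "lines=$matching_lines any_case=$any_case_lines occurrences=$occurrences"
rm -rf "$workdir"
```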

Same Thing in awk?

You can get the same count as `grep -c INFO` in awk like this:

awk '/INFO/ {count++} END {print count}' app.log

So why do we need both?
Because each tool has different strengths.

awk vs sed vs grep – What's the Difference?

| Tool | Best For | Structure Needed | Real Power |
| --- | --- | --- | --- |
| grep | Simple search and match | No structure needed | Fast filtering |
| awk | Column-based logic | Works best with structured data | Logic, conditions, counters |
| sed | Inline editing | Works line-by-line | Search & replace, line edit |

DevOps Use Case Example

See all running processes:

ps aux

Filter only processes run by ubuntu user:

ps aux | grep ubuntu

Now get only the 2nd column (PID):

ps aux | grep ubuntu | awk '{print $2}'
# ps aux shows all processes
# grep ubuntu keeps every line containing the text ubuntu (usually, but not only, that user's processes)
# awk '{print $2}' shows the process ID
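Because `grep ubuntu` matches the text anywhere on the line, it can also pick up processes that merely mention ubuntu in their command. A possible alternative is to let awk test the user column directly. The sketch below uses a canned, made-up stand-in for `ps aux` output with fewer columns than the real thing, so the result is reproducible.

```shell
# Hypothetical, simplified ps-style output; real ps aux has more columns.
ps_sample='USER PID %CPU COMMAND
ubuntu 1201 0.0 sshd
root 1350 0.1 /usr/bin/ubuntu-advantage
ubuntu 1422 0.3 python3 app.py'

# grep matches ubuntu anywhere, including in the root-owned command on line 3.
via_grep=$(printf '%s\n' "$ps_sample" | grep ubuntu | awk '{ print $2 }')

# awk alone: print the PID only when the first column is exactly ubuntu.
via_awk=$(printf '%s\n' "$ps_sample" | awk '$1 == "ubuntu" { print $2 }')

echo "grep + awk: $via_grep"
echo "awk only:   $via_awk"
```

With real data you would replace the `printf` with `ps aux` and keep the same awk condition.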

Bonus: Powerful Grep Flags

| Flag | Meaning | Example |
| --- | --- | --- |
| `-i` | Ignore case | `grep -i info app.log` |
| `-c` | Count matching lines | `grep -c INFO app.log` |
| `-v` | Invert match (exclude pattern) | `grep -v INFO app.log` |
| `-n` | Show line numbers | `grep -n INFO app.log` |
| `-r` | Recursive search in folders | `grep -r "ERROR" /var/log/` |
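Here is a short sketch of `-v` and `-n` in action on a made-up three-line log, to show what their output looks like.

```shell
# Hypothetical sample log: one ERROR line between two INFO lines.
workdir=$(mktemp -d)
cat > "$workdir/app.log" <<'EOF'
2024-01-15 08:52:10 INFO auth User login ok
2024-01-15 08:53:05 ERROR db Connection timeout
2024-01-15 08:53:40 INFO api Request served
EOF

# -v: keep only the lines that do NOT contain INFO.
not_info=$(grep -v INFO "$workdir/app.log")

# -n: prefix each matching line with its line number and a colon.
with_numbers=$(grep -n ERROR "$workdir/app.log")

echo "$not_info"
echo "$with_numbers"
rm -rf "$workdir"
```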

Summary: Choose the Right Tool

| Task | Tool |
| --- | --- |
| Quick search | grep |
| Column-based filtering or counting | awk |
| Inline find & replace | sed |
| Complex report from structured logs | awk |
| Editing config files or scripts in place | sed |

Day 05: Linux Pro Commands

Prashant Gohel
