Text Processing Commands in Linux

Jay Jethawa

Introduction

Text processing commands are used to manipulate and extract information from text files. Three powerful commands that facilitate text processing are AWK, SED, and GREP. Each of these tools has its own strengths and can perform complex text manipulations with simple commands. In this article, we'll explore these commands with examples.

1. AWK: The Versatile Text Processor

AWK is a language designed for text processing, typically used as a data extraction and reporting tool. It scans a file line by line, splits each line into fields (separated by whitespace by default), and performs actions on those fields.

Syntax:

awk 'pattern {action}' filename

Example:

Suppose we have a file data.txt with the following content:

John 25
Jane 30
Doe 22

To print the names and ages where age is greater than 24:

awk '$2 > 24 {print $1, $2}' data.txt

Output:

John 25
Jane 30

Explanation:

  • $2 > 24 - Checks if the second field (age) is greater than 24.

  • {print $1, $2} - Prints the first and second fields (name and age).
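
Building on the example above, here is a minimal sketch of two other common AWK features: the built-in variables NR (current line number) and NF (number of fields on the line), and an END block that runs once after all input has been read. The temporary file created with mktemp stands in for data.txt, and the variable names are assumptions made only to keep the snippet self-contained.

```shell
# Recreate the sample data in a temporary file (a stand-in for data.txt).
data_file=$(mktemp)
printf 'John 25\nJane 30\nDoe 22\n' > "$data_file"

# NR is the current line number; NF is the number of fields on that line.
numbered=$(awk '{print NR ": " $1 " (" NF " fields)"}' "$data_file")
echo "$numbered"

# Add up the second field of every line, then print the average once
# all lines have been read (the END block runs after the last line).
average=$(awk '{sum += $2} END {if (NR > 0) printf "%.1f\n", sum / NR}' "$data_file")
echo "Average age: $average"

# Clean up the temporary file.
rm -f "$data_file"
```

The first command prints one numbered line per record, and the second prints a single summary value, which is the usual pattern for simple reports.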


2. SED: The Stream Editor

SED is a stream editor used to perform basic text transformations on an input stream (a file or input from a pipeline). It can execute complex text manipulations using simple commands.

Syntax:

sed 'command' filename

Example:

Suppose we have a file example.txt with the following content:

Hello world
Hello sed
Hello awk

To replace the word "Hello" with "Hi":

sed 's/Hello/Hi/' example.txt

Output:

Hi world
Hi sed
Hi awk

Explanation:

  • s/Hello/Hi/ - Substitutes the first occurrence of "Hello" with "Hi" on each line. The result is written to standard output; example.txt itself is not modified.
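
To make the "first occurrence" behavior concrete, here is a small sketch that feeds SED a line containing the pattern twice, first without and then with the g (global) flag. The sample line and variable names are made up for illustration, and the input arrives through a pipeline rather than a file.

```shell
# A line where the pattern appears twice (illustrative sample input).
line='Hello world, Hello sed'

# Without a flag, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/Hello/Hi/')
echo "$first_only"    # Hi world, Hello sed

# With the g flag, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/Hello/Hi/g')
echo "$all_matches"   # Hi world, Hi sed
```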

3. GREP: The Text Searcher

GREP is a command-line utility used to search for specific patterns within files. It prints the lines that match the given pattern.

Syntax:

grep 'pattern' filename

Example:

Suppose we have a file logfile.txt with the following content:

Error: File not found
Warning: Disk almost full
Error: Out of memory
Info: Operation completed

To find all lines containing the word "Error":

grep 'Error' logfile.txt

Output:

Error: File not found
Error: Out of memory

Explanation:

  • Error - The pattern to search for in the file. Matching is case-sensitive by default.
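
A few standard options change what GREP reports for the same pattern. The sketch below rebuilds the sample log in a temporary file (a stand-in for logfile.txt, used only so the snippet runs on its own) and tries -i, -c, -n, and -v; the variable names are illustrative.

```shell
# Recreate the sample log in a temporary file (a stand-in for logfile.txt).
log_file=$(mktemp)
printf 'Error: File not found\nWarning: Disk almost full\nError: Out of memory\nInfo: Operation completed\n' > "$log_file"

# -i ignores case, so the lowercase pattern still matches "Error".
ignore_case=$(grep -i 'error' "$log_file")
echo "$ignore_case"

# -c prints the number of matching lines instead of the lines themselves.
match_count=$(grep -c 'Error' "$log_file")
echo "Matching lines: $match_count"

# -n prefixes each matching line with its line number in the file.
with_numbers=$(grep -n 'Error' "$log_file")
echo "$with_numbers"

# -v inverts the match and prints only the lines that do NOT match.
non_errors=$(grep -v 'Error' "$log_file")
echo "$non_errors"

# Clean up the temporary file.
rm -f "$log_file"
```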

Conclusion

Understanding AWK, SED, and GREP can significantly enhance your text processing capabilities in Linux. These commands allow you to filter, transform, and search through text efficiently.
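
The three tools can also be chained in a single pipeline. As a rough sketch, the snippet below uses GREP to keep only the error lines from the sample log, SED to strip the "Error: " prefix, and AWK to number what remains. The temporary file and variable names are assumptions made for illustration.

```shell
# Recreate the sample log in a temporary file (a stand-in for logfile.txt).
log_file=$(mktemp)
printf 'Error: File not found\nWarning: Disk almost full\nError: Out of memory\nInfo: Operation completed\n' > "$log_file"

# grep filters the lines, sed transforms them, awk formats the report.
report=$(grep 'Error' "$log_file" | sed 's/^Error: //' | awk '{print NR ". " $0}')
echo "$report"

# Clean up the temporary file.
rm -f "$log_file"
```

Each stage reads the previous stage's output from standard input, so no intermediate files are needed.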
