Text Processing Tools


1. grep
The grep command is used for searching and filtering text in files based on patterns (regular expressions) or simple strings.
grep stands for "Global Regular Expression Print".
grep can be used directly on files or on the output of other commands, for example to find a file or directory name in a listing.
Syntax
grep [OPTION]... PATTERN [FILE]...
15 Cases of grep command
To ignore upper and lower case while searching
By default grep is case sensitive, so a keyword typed in a different case returns no results.
Use: -i → ignore case
grep -i <keyword> <file-name>
<keyword> can also be a pattern (regular expression).
To print every line except those matching the given pattern/keyword:
grep -v <keyword> <file-name>
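As a rough illustration of -i and -v, the sketch below builds a throwaway file (its contents are invented for this example and are not the file from the original post) and compares a case-sensitive search, a case-insensitive search, and an inverted search.

```shell
# Hypothetical sample file; the contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' \
  'Linux is open source' \
  'linux commands are case sensitive' \
  'Unix came first' \
  'I use LINUX daily' > "$sample"

# Case-sensitive: only the line with lowercase "linux" matches.
sensitive=$(grep 'linux' "$sample")
# -i: every spelling of "linux" matches.
insensitive=$(grep -i 'linux' "$sample")
# -v (combined with -i): lines that do NOT contain "linux" in any case.
inverted=$(grep -iv 'linux' "$sample")

printf 'case-sensitive:\n%s\n\nwith -i:\n%s\n\nwith -iv:\n%s\n' \
  "$sensitive" "$insensitive" "$inverted"
rm -f "$sample"
```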
To count the lines that contain the given keyword
grep -c <keyword> <file-name>
Counts the number of lines in <file-name> that contain <keyword> and prints the count. A line with several occurrences is still counted once.
To search for an exact (whole-word) match of the given keyword in a file
grep -w <keyword> <file-name>
To print the line number of matches of given keyword in a file
grep -n <keyword> <file-name>
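The difference between -c, -w and -n is easiest to see side by side. The following is a minimal sketch on an invented four-line file; the file contents and variable names are assumptions made for the example.

```shell
# Hypothetical sample file; the contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' \
  'error: disk full' \
  'no errors found' \
  'error: timeout' \
  'terror alert' > "$sample"

# -c counts matching LINES; all four lines contain the text "error".
count=$(grep -c 'error' "$sample")
# -w matches "error" only as a whole word, so "errors" and "terror" are skipped.
whole_count=$(grep -cw 'error' "$sample")
# -n prefixes each matching line with its line number.
numbered=$(grep -nw 'error' "$sample")

echo "lines containing error: $count"
echo "lines with the whole word: $whole_count"
echo "$numbered"
rm -f "$sample"
```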
To search for a given keyword in multiple files
grep -n <keyword> <file-name1> <file-name2>
To suppress file names while searching for a given keyword in multiple files
grep -h <keyword> <file-name1> <file-name2>
To search for multiple keywords in a file
grep -e <keyword1> -e <keyword2> <file-name>
To search for multiple keywords in multiple files
grep -e <keyword1> -e <keyword2> <file-name1> <file-name2>
To print only the names of the files that contain the given keyword
grep -l <keyword> <file-name1> <file-name2>
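The multi-file options can be tried out on two small files. This is a sketch only: a.txt, b.txt and their contents are hypothetical names chosen for the example.

```shell
# Two hypothetical files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' > "$dir/a.txt"
printf '%s\n' 'beta' 'gamma' > "$dir/b.txt"

# With several files, grep prefixes each match with the file name.
with_names=$(cd "$dir" && grep 'beta' a.txt b.txt)
# -h suppresses that prefix.
without_names=$(cd "$dir" && grep -h 'beta' a.txt b.txt)
# Several -e options: a line matches if it contains ANY of the keywords.
either=$(cd "$dir" && grep -e 'alpha' -e 'gamma' a.txt b.txt)
# -l prints only the names of files that contain a match.
files_only=$(cd "$dir" && grep -l 'gamma' a.txt b.txt)

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' \
  "$with_names" "$without_names" "$either" "$files_only"
rm -rf "$dir"
```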
To read the keywords/patterns from one file and match them against another file
The -f option lets you specify a file containing the patterns to search for, one per line.
If you need to search for many keywords, create a file with those keywords and pass it with -f instead of typing them all on the terminal.
Syntax
grep -f pattern_file target_file
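A small sketch of -f in use, assuming a pattern file with one keyword per line. The file names patterns.txt and people.txt and their contents are invented for the example.

```shell
dir=$(mktemp -d)
# Hypothetical pattern file: one keyword per line.
printf '%s\n' 'pilot' 'BTech' > "$dir/patterns.txt"
# Hypothetical target file.
printf '%s\n' 'Asha pilot' 'Ravi doctor' 'Meena BTech' > "$dir/people.txt"

# A line is printed if it matches ANY pattern from the pattern file.
matches=$(grep -f "$dir/patterns.txt" "$dir/people.txt")

echo "$matches"
rm -rf "$dir"
```

One thing to watch for: an empty line in the pattern file matches every line of the target file, so it is worth checking the pattern file for stray blank lines.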
To print the lines that start with the given keyword
grep "^keyword" file
"^" (caret) anchors the match to the start of the line.
To print the lines that end with the given keyword
grep "keyword$" file
"$" (dollar) anchors the match to the end of the line.
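To make the two anchors concrete, here is a minimal sketch on an invented three-line file where the keyword appears at the start, at the end, and in the middle of a line.

```shell
# Hypothetical sample file; the contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' \
  'error at start' \
  'ends with error' \
  'an error in the middle' > "$sample"

# ^ anchors the keyword to the beginning of the line.
starts=$(grep '^error' "$sample")
# $ anchors the keyword to the end of the line.
# Single quotes stop the shell from treating $ specially.
ends=$(grep 'error$' "$sample")

echo "starts with: $starts"
echo "ends with:   $ends"
rm -f "$sample"
```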
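Suppose we have 100 files in a directory and need to search for keywords in all of them
grep -R -f <pattern-file> <directory-name>/
-R searches the directory recursively, and -f reads the patterns from <pattern-file>. To search for a single word instead, use grep -R <keyword> <directory-name>/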
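A recursive search can be sketched on a small invented directory tree. The directory and file names below are hypothetical, and -l is added only to keep the output short.

```shell
# Hypothetical directory tree with log files at two levels.
dir=$(mktemp -d)
mkdir -p "$dir/logs/archive"
printf 'timeout on db\n' > "$dir/logs/app.log"
printf 'all good\n' > "$dir/logs/archive/old.log"
printf 'timeout on cache\n' > "$dir/logs/archive/older.log"

# -R descends into subdirectories; -l lists only the matching file names.
hits=$(cd "$dir" && grep -Rl 'timeout' logs | sort)

echo "$hits"
rm -rf "$dir"
```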
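We can use egrep to search for multiple keywords. egrep is equivalent to grep -E (extended regular expressions); recent GNU grep releases mark egrep as obsolescent, so grep -E is the safer spelling.
egrep "key1|key2|key3" <file-name>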
To search without printing anything on the terminal
grep -q "keyword" <file-name>
echo $?
echo $? displays the exit status of the previously executed command or script.
0 - successful (for grep: at least one line matched)
Non-zero - the command did not succeed. For grep, 1 means no line matched and 2 means an error occurred.
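In scripts, grep -q is usually used as a condition rather than followed by echo $?. The sketch below records the three exit statuses grep can return and then branches on one; the log file and its contents are hypothetical.

```shell
# Hypothetical log file.
log=$(mktemp)
printf '%s\n' 'service started' 'service stopped' > "$log"

# Exit status 0: at least one line matched.
match_status=0
grep -q 'started' "$log" || match_status=$?

# Exit status 1: grep ran fine but nothing matched.
nomatch_status=0
grep -q 'crashed' "$log" || nomatch_status=$?

# Exit status 2: a real error, here a file that does not exist.
error_status=0
grep -q 'started' "$log.missing" 2>/dev/null || error_status=$?

# Typical use: branch on the status without printing the matching lines.
if grep -q 'started' "$log"; then
  result='found'
else
  result='not found'
fi

echo "$match_status $nomatch_status $error_status $result"
rm -f "$log"
```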
To suppress error messages about missing or unreadable files, use -s
grep -s <keyword> <file-name>
To filter the output of another command
ls | grep filename
2. awk
awk is particularly useful for processing structured text data, such as tabular data, log files, CSV, or space-separated files.
It is designed to process text line by line and allows you to perform various tasks.
Syntax
awk 'pattern {action}' file
pattern → Defines a condition (optional).
action → Specifies what to do when the pattern matches.
file → Input file to process.
The file that is to be used for this example
Sample file: data
Commands
Print Entire File
awk '{print}' <file-name>
Print a Specific Column
awk '{print $1}' <file-name>
To print multiple columns
awk '{print $1, $3}' <file-name>
Use Case:
ls -l | awk '{print $NF}'
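Since the original sample file is not reproduced here, the sketch below uses an invented stand-in with five space-separated columns (id, name, role, salary, degree). The column layout is an assumption made for the example, not the author's actual data.

```shell
# Hypothetical sample file: id name role salary degree.
data=$(mktemp)
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' \
  '3 Meena engineer 600000 BTech' \
  '4 John teacher 400000 BEd' > "$data"

# $2 is the second whitespace-separated field of each line.
names=$(awk '{print $2}' "$data")
# A comma in print joins the fields with a single space.
name_role=$(awk '{print $2, $3}' "$data")
# $NF is the last field, whatever the number of columns.
last_field=$(awk '{print $NF}' "$data")

printf '%s\n---\n%s\n---\n%s\n' "$names" "$name_role" "$last_field"
rm -f "$data"
```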
Filter Lines Matching a Pattern
awk '/pattern/ {print}' <file-name>
Use Conditions (if statements)
awk '{if($4 >= 600000) print $0}' <file-name>
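A condition can be written either as an if statement inside the action or as the pattern in front of it; both select the same lines. The sketch below assumes a hypothetical file whose fourth column is a salary.

```shell
# Hypothetical sample file: id name role salary degree.
data=$(mktemp)
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' \
  '3 Meena engineer 600000 BTech' \
  '4 John teacher 400000 BEd' > "$data"

# Condition as an if statement inside the action.
with_if=$(awk '{if ($4 >= 600000) print $2}' "$data")
# The same condition written as the pattern part.
with_pattern=$(awk '$4 >= 600000 {print $2}' "$data")
# A regular-expression pattern selects lines by text instead of by value.
btech=$(awk '/BTech/ {print $2}' "$data")

printf '%s\n---\n%s\n---\n%s\n' "$with_if" "$with_pattern" "$btech"
rm -f "$data"
```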
Count Lines in a File
awk 'END {print NR}' <file-name>
NR (Number of Records) → A built-in awk variable that keeps track of the current line number.
END → Ensures the action is performed after reading all lines in the file.
To get last column
awk '{print $NF}' <file-name>
NF is a built-in awk variable that stands for Number of Fields in a line. It represents the total number of columns (fields) in the current line.
$NF → Refers to the last field of each line.
Using delimiter
awk '{print $2 ":" $NF}' <file-name>
Sum a Column
awk '{sum += $4} END {print "Total salary:", sum}' <file-name>
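The counting and summing commands can be checked on a small invented file. In this sketch the fourth column is assumed to hold the salary, and the four rows are made up so the total is easy to verify by hand.

```shell
# Hypothetical sample file: id name role salary degree.
data=$(mktemp)
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' \
  '3 Meena engineer 600000 BTech' \
  '4 John teacher 400000 BEd' > "$data"

# In the END block, NR holds the number of lines read.
rows=$(awk 'END {print NR}' "$data")
# sum starts out empty (treated as 0) and accumulates column 4 line by line.
total=$(awk '{sum += $4} END {print "Total salary:", sum}' "$data")
# A custom separator between two fields.
joined=$(awk 'NR==1 {print $2 ":" $NF}' "$data")

echo "$rows"
echo "$total"
echo "$joined"
rm -f "$data"
```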
To get a specific row
awk 'NR==3 {print}' <file-name>
Range of lines
awk 'NR==4, NR==5 {print}' <file-name>
Print the line numbers of empty lines.
awk 'NF==0 {print NR}' file_name
NF==0 → Checks if the line has zero fields (i.e., an empty line, or a line containing only whitespace).
Updating a file
The changes made by the awk command will not be permanent in the file. The awk command reads the file and makes modifications in memory but does not alter the original file.
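One common way to make an awk change stick is to write the output to a temporary file and then move it over the original. The following is a sketch of that pattern, not something the commands above do by themselves; the file contents and the raise of 50000 are invented for the example.

```shell
# Hypothetical sample file: id name role salary degree.
data=$(mktemp)
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' > "$data"

# awk writes the modified text to standard output; the file itself is untouched.
awk '{$4 = $4 + 50000; print}' "$data" > /dev/null
unchanged=$(awk 'NR==1 {print $4}' "$data")

# To keep the change, write to a temporary file and replace the original.
tmp=$(mktemp)
awk '{$4 = $4 + 50000; print}' "$data" > "$tmp" && mv "$tmp" "$data"
updated=$(awk 'NR==1 {print $4}' "$data")

echo "before: $unchanged, after: $updated"
rm -f "$data"
```

Redirecting straight back into the same file (awk ... file > file) would truncate it before awk reads it, which is why the temporary file is used.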
-F option
The -F option in the awk command is used to specify the field separator (delimiter) in the input data. It tells awk how to split each line of input into fields.
awk -F, '{print $2 " " $5}' <file-name>
-F, means you're specifying a comma (,) as the field separator. This is often used when working with CSV (Comma-Separated Values) files, where each field is separated by a comma
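A quick sketch of -F on a small invented CSV file with the same five columns as the earlier examples. Note that this simple splitting assumes no field contains a comma inside quotes; plain -F, does not understand CSV quoting.

```shell
# Hypothetical CSV file: id,name,role,salary,degree.
csv=$(mktemp)
printf '%s\n' \
  '1,Asha,pilot,700000,BTech' \
  '2,Ravi,doctor,550000,MBBS' > "$csv"

# Without -F the whole line is one field, because it contains no spaces.
default_split=$(awk 'NR==1 {print NF}' "$csv")
# With -F, each comma separates two fields.
comma_split=$(awk -F, 'NR==1 {print NF}' "$csv")
name_degree=$(awk -F, '{print $2 " " $5}' "$csv")

echo "fields without -F: $default_split, with -F,: $comma_split"
echo "$name_degree"
rm -f "$csv"
```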
How to use for loop in AWK command?
awk 'BEGIN {for(i=0;i<=10;i++) print i;}'
How to use while loop in AWK command?
awk 'BEGIN {while(i<10){ i++; print "Num is " i;}}'
3. sed
The sed (Stream Editor) command is used for searching, replacing, deleting, and modifying text in a file or input stream.
To print a specific line
sed -n '3p' file-name
To print the last line
sed -n '$p' file-name
'$p': Prints only the last ($) line of the file.
To print a range of lines
sed -n '1,4p' file-name
To print the lines that contain a specific word or pattern
sed -n '/pattern/p' file-name
To print only the specified lines.
sed -n -e '3p' -e '5p' file-name
To print only the lines that contain specified patterns
sed -n -e '/pilot/p' -e '/BTech/p' file-name
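The printing commands can be compared on one small file. This is a sketch with invented data; it also shows a detail that is easy to miss: when a line matches two -e expressions, each p prints it, so the line appears twice.

```shell
# Hypothetical sample file: id name role salary degree.
data=$(mktemp)
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' \
  '3 Meena engineer 600000 BTech' \
  '4 John teacher 400000 BEd' > "$data"

third=$(sed -n '3p' "$data")             # one specific line
last=$(sed -n '$p' "$data")              # the last line
range=$(sed -n '1,2p' "$data")           # a range of lines
picked=$(sed -n -e '2p' -e '4p' "$data") # several specific lines
# Line 1 contains both "pilot" and "BTech", so it is printed twice.
both=$(sed -n -e '/pilot/p' -e '/BTech/p' "$data")

printf '%s\n---\n%s\n---\n%s\n' "$third" "$last" "$both"
rm -f "$data"
```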
To print a specified line and the next few lines from the given file.
sed -n '3,+3p' file-name
To print every alternate line starting from the given line.
sed -n '3~2p' file-name
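The 3,+3 and 3~2 address forms are GNU sed extensions, so the sketch below assumes GNU sed (the default on most Linux distributions); other sed implementations may reject them. The numbers 1 to 10 stand in for the lines of a file.

```shell
# Ten numbered lines standing in for a file.
numbers=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

# addr,+N : the addressed line plus the next N lines (GNU extension).
next_few=$(printf '%s\n' "$numbers" | sed -n '3,+3p' | paste -sd' ' -)
# first~step : every step-th line starting at first (GNU extension).
alternate=$(printf '%s\n' "$numbers" | sed -n '3~2p' | paste -sd' ' -)

echo "3,+3p prints: $next_few"
echo "3~2p prints:  $alternate"
```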
To read sed expressions from an external file (here ext.txt), one expression per line
sed -n -f ext.txt file-name
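As a sketch of what such a script file might contain, the example below writes two print commands into a hypothetical ext.txt and runs them against an invented data file. The contents of both files are assumptions for illustration.

```shell
dir=$(mktemp -d)
# Hypothetical data file: id name role salary degree.
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' \
  '3 Meena engineer 600000 BTech' \
  '4 John teacher 400000 BEd' > "$dir/data"

# Hypothetical script file: one sed command per line.
printf '%s\n' '2p' '/BEd/p' > "$dir/ext.txt"

# -f reads the commands from the script file instead of the command line.
out=$(sed -n -f "$dir/ext.txt" "$dir/data")

echo "$out"
rm -rf "$dir"
```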
To replace a word in a file and print the result
sed 's/old-string/new-string/g' file-name
To replace all occurrences of <string_to_change> with <new_string> only on the specified line of file_name, leaving other lines unchanged
sed '3s/BTech/MTech/g' file-name
To replace on every line except the specified one (! negates the address)
sed '7! s/BTech/MBA/g' data
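The three substitution forms differ only in which lines they apply to. The sketch below runs them on an invented file where lines 1 and 3 contain BTech, and also checks that the file itself is left unchanged, since sed prints the result instead of editing the file.

```shell
# Hypothetical sample file: id name role salary degree.
data=$(mktemp)
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' \
  '3 Meena engineer 600000 BTech' > "$data"

# No address: substitute on every line.
all=$(sed 's/BTech/MTech/g' "$data" | grep -c 'MTech')
# Address 3: substitute only on line 3; line 1 keeps BTech.
only3=$(sed '3s/BTech/MTech/g' "$data")
# Address 3!: substitute on every line EXCEPT line 3.
except3=$(sed '3!s/BTech/MBA/g' "$data")
# The original file still has BTech on two lines.
still_btech=$(grep -c 'BTech' "$data")

printf '%s\n---\n%s\n' "$only3" "$except3"
rm -f "$data"
```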
To delete a line
sed '1d' file-name
Deleting a range of lines
sed '1,3d' file-name
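As with substitution, d only removes lines from what sed prints. A minimal sketch on an invented four-line file:

```shell
# Hypothetical sample file: id name role salary degree.
data=$(mktemp)
printf '%s\n' \
  '1 Asha pilot 700000 BTech' \
  '2 Ravi doctor 550000 MBBS' \
  '3 Meena engineer 600000 BTech' \
  '4 John teacher 400000 BEd' > "$data"

# Delete the first line from the output.
without_first=$(sed '1d' "$data" | wc -l | tr -d ' ')
# Delete lines 1 to 3 from the output; only line 4 remains.
remaining=$(sed '1,3d' "$data")
# The file on disk still has all four lines.
on_disk=$(wc -l < "$data" | tr -d ' ')

echo "lines after 1d: $without_first"
echo "after 1,3d: $remaining"
echo "lines still in the file: $on_disk"
rm -f "$data"
```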
Written by

Swati Verma