Word Count in Linux for understanding its usage in practical scenarios.

Mihir Suratwala
4 min read

Last Blog Review

In the last blog, we learned how to use the find command in Linux. Because the Linux filesystem is an inverted tree with a large number of files and directories, locating a particular file by hand is difficult. The find command solves this by letting you apply many filters, so you can get to the desired files and directories quickly.

What exactly is Word Count?

Word count (wc) is a command that counts the number of lines, words, and bytes or characters in the files given as arguments. It may not look like an attractive feature, but it is a lifesaver when you deal with lots of files and need to count how often specific words appear (in combination with a tool such as grep), or when a script has to check whether a word limit has been exceeded.
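
As a minimal sketch of the default behaviour, assuming a throwaway file named sample.txt (a hypothetical name used only for illustration), running wc without options prints the line, word, and byte counts together:

```shell
# Create a small throwaway file; sample.txt is a hypothetical name for illustration
tmpdir=$(mktemp -d)
printf 'hello world\nthis is wc\n' > "$tmpdir/sample.txt"

# With no options, wc prints lines, words, and bytes, then the file name
wc "$tmpdir/sample.txt"

# Reading from standard input leaves the file name out, which is handy in scripts
counts=$(wc < "$tmpdir/sample.txt")
echo "$counts"

# Clean up the temporary directory
rm -rf "$tmpdir"
```

The three numbers appear in the order lines, words, bytes. The exact column padding differs between wc implementations.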

Let’s understand it practically →

A. To count the number of lines

wc -l <file.name>

Ex. wc -l text.txt
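
One detail worth knowing: wc -l counts newline characters rather than visual lines, so a last line without a trailing newline is not counted. A small sketch, using printf to build the input instead of a real file:

```shell
# Two lines, both terminated by a newline
with_newline=$(printf 'first\nsecond\n' | wc -l)

# Same text, but the last line has no trailing newline
without_newline=$(printf 'first\nsecond' | wc -l)

echo "terminated: $with_newline, unterminated: $without_newline"
```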

B. To count the number of words

wc -w <file.name>

Ex. wc -w text.txt

C. To count the number of bytes

wc -c <file.name>

Ex. wc -c text.txt
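
Bytes and characters are only the same for single-byte text. To count characters, wc offers -m, whose result depends on the active locale. The sketch below feeds wc the letter "é" as its two UTF-8 bytes, so only the byte count is guaranteed:

```shell
# The octal escapes \303\251 are the two UTF-8 bytes of the letter "é"
bytes=$(printf '\303\251\n' | wc -c)   # counts bytes: 2 for the letter + 1 newline
chars=$(printf '\303\251\n' | wc -m)   # counts characters according to the current locale

echo "bytes: $bytes, characters: $chars"
```

In a UTF-8 locale the character count is typically 2, while in the C locale it matches the byte count.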

D. To print the length of the longest line in a file

wc -L <file.name>

Ex. wc -L text.txt
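
Note that -L is an extension (available in GNU coreutils wc, not required by POSIX), so this sketch assumes a typical Linux system. It reports the length of the longest line in the input:

```shell
# Three lines of different lengths; the longest one ("abcdef") has 6 characters
longest=$(printf 'ab\nabcdef\nabc\n' | wc -L)

echo "longest line: $longest"
```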

E. To count the number of files and folders in a directory

ls <directory> | wc -l

Ex. ls log | wc -l
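
Plain ls usually skips hidden entries (names starting with a dot), so the count above may be lower than expected. A hedged sketch, using a temporary directory with made-up file names, that compares it with ls -A:

```shell
# Build a throwaway directory with two visible entries and one hidden file
# (all names here are hypothetical)
dir=$(mktemp -d)
touch "$dir/app.log" "$dir/.hidden"
mkdir "$dir/archive"

# Plain ls normally leaves out names that start with a dot
visible=$(ls "$dir" | wc -l)

# ls -A includes hidden entries but still leaves out . and ..
all=$(ls -A "$dir" | wc -l)

echo "visible: $visible, including hidden: $all"

# Clean up the temporary directory
rm -rf "$dir"
```

Both variants count lines of ls output, so a file name that itself contains a newline would be counted more than once.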

Practical use cases →

  1. Suppose I have a log.txt file that contains some error entries, and I want to count how many times the word "error" occurs.

     root@ubuntu-host ~ ➜  cat log.txt
     2023-01-24 10:01:02 - INFO - Application started successfully
     2023-01-24 10:01:05 - WARN - Database connection attempt took longer than expected
     2023-01-24 10:01:10 - ERROR - Unable to fetch user data from database: [Errno 110] Connection timed out
     2023-01-24 10:01:12 - INFO - User 'john.doe' successfully logged in
     2023-01-24 10:01:15 - ERROR - Invalid input provided for parameter 'amount' in function 'calculate_total'
     2023-01-24 10:01:18 - INFO - File 'report.txt' generated successfully
     2023-01-24 10:01:20 - WARN - Network latency detected on request to external API
     2023-01-24 10:01:23 - INFO - Processing user request successfully
     2023-01-24 10:01:25 - ERROR -  Error processing image: Image format not supported
     2023-01-24 10:01:28 - INFO - System health check passed
     2023-01-24 10:01:30 - WARN - Cache miss for key 'recent_products'
    
     root@ubuntu-host ~ ➜  grep -o -i "error" log.txt | wc -w
     4
    
     root@ubuntu-host ~ ➜
    
  2. While web scraping, to find the number of words on an HTML page

     root@ubuntu-host ~ ➜  curl -s https://google.com 
     <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
     <TITLE>301 Moved</TITLE></HEAD><BODY>
     <H1>301 Moved</H1>
     The document has moved
     <A HREF="https://www.google.com/">here</A>.
     </BODY></HTML>
    
     root@ubuntu-host ~ ➜  curl -s https://google.com | wc -w
     14
    
  3. Before backing up a file, we may want to check that it contains a reasonable number of words, and a script can use the wc command to make that check

     root@ubuntu-host ~ ➜  cat log.txt
     2023-01-24 10:01:02 - INFO - Application started successfully
     2023-01-24 10:01:05 - WARN - Database connection attempt took longer than expected
     2023-01-24 10:01:10 - ERROR - Unable to fetch user data from database: [Errno 110] Connection timed out
     2023-01-24 10:01:12 - INFO - User 'john.doe' successfully logged in
     2023-01-24 10:01:15 - ERROR - Invalid input provided for parameter 'amount' in function 'calculate_total'
     2023-01-24 10:01:18 - INFO - File 'report.txt' generated successfully
     2023-01-24 10:01:20 - WARN - Network latency detected on request to external API
     2023-01-24 10:01:23 - INFO - Processing user request successfully
     2023-01-24 10:01:25 - ERROR -  Error processing image: Image format not supported
     2023-01-24 10:01:28 - INFO - System health check passed
     2023-01-24 10:01:30 - WARN - Cache miss for key 'recent_products'
    
     root@ubuntu-host ~ ➜  cat wordcountshell.sh
     #!/bin/bash
    
     # Count the words in log.txt (read via stdin so wc prints only the number)
     if [ "$(wc -w < log.txt)" -lt 500 ]; then
       echo "File has too few words to back up"
     else
       echo "File is good, you can take a backup"
     fi
    
     root@ubuntu-host ~ ➜  sh wordcountshell.sh
     File has too few words to back up
    
     root@ubuntu-host ~ ➜
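
In the first use case, the result of 4 is the number of occurrences of "error", not the number of log lines, because one line contains both "ERROR" and "Error". As a rough sketch of that difference, using a three-line excerpt of the same log written to a temporary file:

```shell
# A three-line excerpt of the log.txt shown in the first use case
log=$(mktemp)
cat > "$log" <<'EOF'
2023-01-24 10:01:10 - ERROR - Unable to fetch user data from database: [Errno 110] Connection timed out
2023-01-24 10:01:12 - INFO - User 'john.doe' successfully logged in
2023-01-24 10:01:25 - ERROR -  Error processing image: Image format not supported
EOF

# grep -c counts the lines that contain at least one match
matching_lines=$(grep -c -i "error" "$log")

# grep -o prints each match on its own line, so wc -l counts occurrences
occurrences=$(grep -o -i "error" "$log" | wc -l)

echo "lines: $matching_lines, occurrences: $occurrences"

# Clean up the temporary file
rm -f "$log"
```

Which number is the right one depends on the question: use the line count for "how many log entries mention an error" and the occurrence count for "how many times does the word appear".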
    

Conclusion →

So, here we saw that the word count command is very simple to use, yet it helps in many ways when analyzing a file or folder. It counts lines, words, characters, and bytes, and it can be used in shell scripts to automate such checks as well.

💡
That’s a wrap for today’s post! I hope this has given you some valuable insights. Be sure to explore more articles on our blog for further tips and advice. See you in the next post!
