Python File Analysis: Counting Lines and Words
Introduction
Python is a versatile language that excels in various file manipulation and analysis tasks. In this article, we'll explore four Python scripts that allow you to count lines and words in text files. We'll discuss each script's functionality, explain how it works, and provide practical examples to demonstrate their usage.
Let's get started by creating the folder and files we will be working with. The following Python script creates a directory and a set of empty text files inside it:
Script 0: Create Files in Folder
#!/usr/bin/python3
# create_files.py
import os
# Specify the directory path
directory_path = 'training'
# List of file names
file_names = [
    'lightyear.txt',
    'mavka.txt',
    'moana.txt',
    'mummies.txt',
    'puss-in-boots.txt',
    'alita.txt',
    'encanto.txt',
    'seabeasts.txt',
    'spider-man.txt',
    'the-magicians-elephant.txt',
    'vivo.txt'
]
# Create the directory if it doesn't exist
if not os.path.exists(directory_path):
    os.makedirs(directory_path)
# Create the files in the directory
for file_name in file_names:
    file_path = os.path.join(directory_path, file_name)
    open(file_path, 'w').close()
print(f'Files created in the "{directory_path}" directory.')
After these files have been created, populate them with either soundtrack theme songs or cast information. Scripts 1 to 4 open their files relative to the current directory ('.'), so run them from inside the training folder.
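If you would rather fill a file from Python than by hand, the step can be sketched as follows. This is only an illustration: the helper name populate_file and the placeholder lines are assumptions, not part of the original scripts, and the demo writes to a temporary directory so it does not touch your training folder.

```python
import os
import tempfile


def populate_file(path, lines):
    """Write each string in `lines` to `path`, one per line (hypothetical helper)."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write('\n'.join(lines) + '\n')


# Placeholder content only; replace with real lyrics or cast names.
placeholder_lines = ['placeholder line one', 'placeholder line two']

# Demo in a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'vivo.txt')
    populate_file(target, placeholder_lines)
    with open(target, 'r', encoding='utf-8') as f:
        written = f.read()

print(written)
```

To use it on the real files, you would pass a path such as os.path.join('training', 'vivo.txt') along with your own list of lines.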
Script 1: Counting Lines in a File
#!/usr/bin/python3
import os
file = os.path.join('.', 'vivo.txt')
with open(file, 'r') as f:
    lines = f.readlines()
line_count = len(lines)
print(line_count)
This script is designed to count the number of lines in a specified text file. Here's how it works:
It imports the os module to access file-related functions.
It defines the file variable, indicating the path to the target text file.
It uses a context manager (with open(file, 'r') as f) to open the file in read mode and create a file object f.
It reads all the lines from the file using f.readlines() and stores them in the lines list.
It calculates the line count by finding the length of the lines list using len(lines).
Finally, it prints the line count to the console.
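Because f.readlines() loads every line into a list, memory use grows with the size of the file. A minimal sketch of an alternative is shown below, assuming a hypothetical helper named count_lines that is not part of the original script. It iterates over the file object and counts lines one at a time. The demo writes a small temporary file so that it runs anywhere.

```python
import os
import tempfile


def count_lines(path):
    """Count lines by iterating over the file instead of loading it all."""
    with open(path, 'r', encoding='utf-8') as f:
        # A file object yields one line per iteration, so only one line
        # needs to be held in memory at a time.
        return sum(1 for _ in f)


# Demo on a throwaway file so the sketch does not depend on vivo.txt.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'sample.txt')
    with open(sample, 'w', encoding='utf-8') as f:
        f.write('first line\nsecond line\nthird line\n')
    demo_count = count_lines(sample)

print(demo_count)  # 3
```

For small files like the ones in this article, both approaches give the same result, and the readlines() version is perfectly adequate.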
Script 2: Counting Occurrences of Words in a File
#!/usr/bin/python3
import os
import re
from collections import Counter
file = os.path.join('.', 'seabeasts.txt')
with open(file, 'r') as f:
    text = f.read()
words = re.findall(r'\b\w+\b', text.lower())
word_counts = Counter(words)
for word, count in word_counts.items():
    print(f'{word}: {count}')
This script counts how many times each unique word occurs in a specified text file. Here's a breakdown of how it works:
It imports the os module to work with files, the re module for regular expressions, and the Counter class from the collections module.
It defines the file variable, indicating the path to the target text file.
Using a context manager, it opens the file in read mode ('r') and creates a file object f.
It reads the entire text content from the file using f.read().
It converts the text to lowercase with text.lower() to ensure case-insensitive counting, then uses the re.findall() function to extract all words with the regular expression pattern \b\w+\b.
It creates a word count dictionary using the Counter class to count the occurrences of each word.
Finally, it iterates through the dictionary and prints each word along with its count to the console.
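To make the regular expression and the Counter steps concrete, here is a small sketch on an in-memory string rather than a file. The sample sentence is made up for illustration. It also uses Counter.most_common(), which the original script does not call, to list the most frequent words first.

```python
import re
from collections import Counter

# Made-up sample text; any string works here.
sample_text = "The sea, the sea! The wide open sea and the sky."

# Lowercase first so "The" and "the" are counted together.
words = re.findall(r'\b\w+\b', sample_text.lower())
word_counts = Counter(words)

# most_common(n) returns the n highest counts, largest first.
top_two = word_counts.most_common(2)
print(top_two)  # [('the', 4), ('sea', 3)]

# \w does not match an apostrophe, so a contraction is split in two.
contraction_tokens = re.findall(r'\b\w+\b', "don't")
print(contraction_tokens)  # ['don', 't']
```

The second print is worth keeping in mind: with this pattern, contractions such as "don't" are counted as two separate tokens.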
Script 3: Counting Words in a File
#!/usr/bin/python3
import os
file = os.path.join('.', 'puss-in-boots.txt')
word_counts = {}
with open(file, 'r') as f:
    text = f.read()
word_count = len(text.split())
word_counts = word_count
print(word_counts)
This script counts the total number of words in a specified text file. Here's how it works:
It imports the os module to access file-related functions.
It defines the file variable, indicating the path to the target text file.
It initializes word_counts as an empty dictionary. This initial value is never used, because word_counts is later reassigned to an integer.
Using a context manager, it opens the file in read mode ('r') and creates a file object f.
It reads the entire text content from the file using f.read().
It splits the text into words using text.split() and calculates the word count by finding the length of the resulting list.
It assigns the word count to the word_counts variable and prints it to the console.
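Note that text.split() and the regular expression from Script 2 do not always agree on what counts as a word. The sketch below compares the two on a made-up string. It is an illustration of the difference, not a change to the script above.

```python
import re

# Made-up sample with punctuation and a blank line.
sample_text = "one, two -- three\n\nfour"

# str.split() with no arguments splits on any run of whitespace,
# so punctuation stays attached and a lone "--" counts as a word.
split_tokens = sample_text.split()

# The pattern from Script 2 keeps only runs of word characters.
regex_tokens = re.findall(r'\b\w+\b', sample_text)

print(len(split_tokens), split_tokens)  # 5 ['one,', 'two', '--', 'three', 'four']
print(len(regex_tokens), regex_tokens)  # 4 ['one', 'two', 'three', 'four']
```

Which count is the right one depends on your definition of a word. split() is simpler and matches what most word-count tools report for whitespace-separated tokens.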
Script 4: Counting Words in Multiple Files
#!/usr/bin/python3
import os
import re
from collections import Counter
word_counts = Counter()
for file in os.listdir('.'):
    if file.endswith('.txt'):
        filename = os.path.join('.', file)
        with open(filename, 'r') as f:
            text = f.read()
        words = re.findall(r'\b\w+\b', text.lower())
        word_counts.update(words)
for word, count in word_counts.items():
    print(f'{word}: {count}')
This script counts the occurrences of words in multiple text files within a folder. Here's how it works:
It imports the os module to work with files, the re module for regular expressions, and the Counter class from the collections module.
It initializes a Counter object word_counts to store the word counts.
It uses a for loop to iterate through all entries in the current directory (os.listdir('.')).
For each file name ending with ".txt", it constructs the full file path using os.path.join() and opens the file in read mode ('r') using a context manager.
It reads the entire text content from the file, converts it to lowercase, and uses a regular expression to extract all words.
It adds the words from each file to the running totals in word_counts using word_counts.update(words).
Finally, it iterates through the Counter and prints each word along with its combined count to the console.
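The same logic can be wrapped in a function that takes the directory as a parameter, which makes it easier to reuse and to try out. The sketch below is one possible arrangement. The function name count_words_in_dir and the demo file names are assumptions made for illustration, and the demo runs in a temporary directory.

```python
import os
import re
import tempfile
from collections import Counter


def count_words_in_dir(directory):
    """Return a Counter of words across all .txt files in `directory`."""
    totals = Counter()
    for name in sorted(os.listdir(directory)):
        if not name.endswith('.txt'):
            continue  # skip anything that is not a .txt file
        path = os.path.join(directory, name)
        with open(path, 'r', encoding='utf-8') as f:
            # update() with a list adds 1 per list element (per word).
            totals.update(re.findall(r'\b\w+\b', f.read().lower()))
    return totals


# Demo with made-up files in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_files = [('a.txt', 'red blue'), ('b.txt', 'Blue green'), ('notes.md', 'blue')]
    for name, body in demo_files:
        with open(os.path.join(tmp, name), 'w', encoding='utf-8') as f:
            f.write(body)
    demo_totals = count_words_in_dir(tmp)

print(demo_totals['blue'])  # 2 (notes.md is ignored)
```

One detail to watch: Counter.update() counts the elements of whatever iterable it receives, so passing the raw text string instead of the list of words would count individual characters.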
Conclusion
These Python scripts provide essential tools for analyzing text files by counting lines and words. Whether you need to determine the number of lines in a file, count occurrences of specific words, calculate the total word count, or analyze multiple files simultaneously, these scripts demonstrate how Python can simplify such tasks. By understanding and adapting these scripts, you can perform file analysis efficiently in your Python projects.
Feel free to access the codebase by cloning the Git repository, which is available at the following GitHub URL
Written by
Omini Okoi
I am a web developer with experience in back-end engineering and cloud computing. I enjoy mentoring and training others in coding and technology. I’m always open to collaboration and building stuff that can make a positive impact.