10 Useful yet Rarely Used OS Functions in Python

Sachin Pal

You must have used functions provided by the os module in Python several times in your projects. These could be used to create a file, walk down a directory, get info on the current directory, perform path operations, and more.

In this article, we’ll discuss the functions that are as useful as any function in the os module but are rarely used.

os.path.commonpath()

When working with multiple files that share a common directory structure, you might want to find the longest shared path. os.path.commonpath() does just that. This can be helpful when organizing files or dealing with different paths across environments.

Here’s an example:

import os

paths = ['/user/data/project1/file1.txt', '/user/data/project2/file2.txt']
common_path = os.path.commonpath(paths)
print("Common Path:", common_path)

This code will give us the common path shared by these two paths.

Common Path: /user/data

os.path.commonpath() takes a sequence of path names, and writing them all out by hand quickly becomes impractical.

In that case, it is better to walk the directory tree, collect the file paths you care about, and then look for their common path.

import os

def get_file_paths(directory, file_extension=None):
    # Collect all file paths in the directory (and subdirectories, if any)
    file_paths = []
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file_extension is None or file.endswith(file_extension):
                file_paths.append(os.path.join(root, file))
    return file_paths


# Specify the root directory to start from
directory_path = 'D:/SACHIN/Pycharm/Flask-Tutorial'

# If you want to filter by file extension
file_paths = get_file_paths(directory_path, file_extension='.html')

# Find the common path among all files
if file_paths:
    common_path = os.path.commonpath(file_paths)
    print("Common Path:", common_path)
else:
    print("No files found in the specified directory.")

In this example, the function get_file_paths() traverses a directory from top to bottom and appends every file path it finds to the file_paths list. It optionally takes a file extension if we only want to look for specific files.

Now we can easily find the common path of any directory.

Common Path: D:\SACHIN\Pycharm\Flask-Tutorial\templates
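
One thing to watch for: os.path.commonpath() raises ValueError when the sequence is empty or when it mixes absolute and relative paths (or, on Windows, paths on different drives). Here is a minimal sketch of a wrapper that handles this. Note that safe_commonpath() is a hypothetical helper name used for illustration, not part of the os module.

```python
import os


def safe_commonpath(paths):
    """Return the common path, or None if the paths cannot be compared.

    safe_commonpath is a hypothetical helper, not an os function.
    """
    try:
        return os.path.commonpath(paths)
    except ValueError:
        # Raised for an empty sequence, or for a mix of absolute
        # and relative paths (or different drives on Windows).
        return None


print(safe_commonpath(['a/b/c.txt', 'a/b/d.txt']))
print(safe_commonpath(['/user/data', 'user/data']))
print(safe_commonpath([]))
```

Returning None is only one possible design choice; depending on the project, re-raising with a clearer message may be the better option.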

os.scandir()

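If you’re using os.listdir() to get the contents of a directory, consider using os.scandir() instead. Rather than a plain list of names, it returns an iterator of DirEntry objects that carry each entry's name, path, and type (file, directory, or symlink). Because that type information is usually cached, os.scandir() is often faster than calling os.listdir() and then checking every entry separately. Details such as size and permissions are available through entry.stat().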

Here’s an example:

import os

with os.scandir('D:/SACHIN/Pycharm/osfunctions') as entries:
    for entry in entries:
        print(f"{entry.name} : \n"
              f">>>> Is File: {entry.is_file()} \n"
              f">>>> Is Directory: {entry.is_dir()}")

In this example, we passed a directory to os.scandir(), iterated over its entries, and printed the name and type of each one.

.idea : 
>>>> Is File: False 
>>>> Is Directory: True
main.py : 
>>>> Is File: True 
>>>> Is Directory: False
sample.py : 
>>>> Is File: True 
>>>> Is Directory: False
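
DirEntry objects also give access to file metadata through entry.stat(). The sketch below collects the size of every regular file directly inside a directory. The helper name file_sizes() is an assumption made for this example.

```python
import os


def file_sizes(directory):
    """Map each regular file name in `directory` to its size in bytes.

    file_sizes is a hypothetical helper name used for illustration.
    """
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip directories and do not follow symbolic links
            if entry.is_file(follow_symlinks=False):
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


# Sizes of the files in the current working directory
print(file_sizes('.'))
```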

os.path.splitext()

Let’s say you’re working with files and need to check their extensions. The os.path.splitext() function can help: it splits a file path into the root and the extension, which you can use to determine the file type.

import os

filename = 'report.csv'
root, ext = os.path.splitext(filename)
print(f"Root: {root} \n"
      f"Extension: {ext}")

Output

Root: report 
Extension: .csv

Now let's look at how os.path.splitext() behaves with some less obvious names: a leading dot, no extension at all, and multiple dots.

import os

filenames = ['.report', 'report', 'report.case.txt', 'report.csv.zip']
for idx, name in enumerate(filenames):
    # Only the last dot-separated part counts as the extension
    root, ext = os.path.splitext(name)
    print(f"{idx} - {name}\n"
          f"Root: {root} | Extension: {ext}")

Output

0 - .report
Root: .report | Extension: 
1 - report
Root: report | Extension: 
2 - report.case.txt
Root: report.case | Extension: .txt
3 - report.csv.zip
Root: report.csv | Extension: .zip
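
As the output shows, a leading dot is not treated as an extension and only the last suffix is split off. A small sketch of an extension check built on that behaviour follows. The has_extension() helper and its default list of extensions are assumptions for this example.

```python
import os


def has_extension(path, allowed=('.csv', '.txt')):
    """Return True if the path ends with one of the allowed extensions.

    has_extension and its default `allowed` tuple are hypothetical.
    """
    _, ext = os.path.splitext(path)
    # Compare case-insensitively so 'REPORT.CSV' also matches
    return ext.lower() in allowed


print(has_extension('REPORT.CSV'))
print(has_extension('.csv'))
print(has_extension('archive.tar.gz'))
```

Unlike a plain str.endswith() check, this does not mistake a hidden file named .csv for a CSV file.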

os.makedirs()

os.mkdir() is the function most of us reach for when we need to create a directory. But what about when you need nested directories?

Creating nested directories can be a hassle with os.mkdir() since it only makes one directory at a time. os.makedirs() allows you to create multiple nested directories in one go, and the exist_ok=True argument makes sure it doesn’t throw an error if the directory already exists.

import os

os.makedirs('project/data/files', exist_ok=True)
print("Nested directories created!")

When we run this program, it creates the specified directory along with any missing parent directories.

Nested directories created!

If we run the above program again, it won’t throw an error due to exist_ok=True.
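
To see what exist_ok actually changes, here is a small sketch that runs inside a temporary directory (so it leaves nothing behind) and calls os.makedirs() on the same path with and without the flag. The variable names are chosen for this example only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, 'project', 'data', 'files')

    # First call creates all three levels
    os.makedirs(nested)

    # Second call without exist_ok fails because the directory exists
    try:
        os.makedirs(nested)
        raised = False
    except FileExistsError:
        raised = True

    # With exist_ok=True the same call is a no-op
    os.makedirs(nested, exist_ok=True)
    created = os.path.isdir(nested)

print("Raised without exist_ok:", raised)
print("Directory exists:", created)
```

Note that exist_ok=True only tolerates an existing directory. If the path already exists as a regular file, os.makedirs() still raises FileExistsError.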

os.replace()

Similar to os.rename(), os.replace() moves a file to a new location, but it safely overwrites any existing file at the destination. This is helpful for tasks where you’re updating or backing up files and want to ensure that old files are safely replaced.

import os

os.replace(src='main.py', dst='new_main.py')
print("File replaced successfully!")

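In this code, main.py is renamed to new_main.py, just as os.rename() would do. The difference is that os.replace() overwrites an existing new_main.py on every platform, whereas os.rename() raises FileExistsError on Windows when the destination already exists. When it succeeds, the rename is atomic (a POSIX requirement): the replacement happens in a single, indivisible step, so either the entire operation succeeds or nothing changes at all.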

File replaced successfully!
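
A common way to use this all-or-nothing behaviour is to write new content to a temporary file first and then swap it into place. The following is a minimal sketch of that pattern, not a hardened implementation. The atomic_write_text() helper name and the settings.json file are assumptions for this example.

```python
import os
import tempfile


def atomic_write_text(path, text):
    """Write `text` to `path` so readers never see a half-written file.

    atomic_write_text is a hypothetical helper, not part of the os module.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so both are on
    # the same filesystem, which os.replace() needs
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as tmp:
            tmp.write(text)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Demo inside a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'settings.json')
    atomic_write_text(target, '{"debug": true}')
    atomic_write_text(target, '{"debug": false}')
    with open(target, encoding='utf-8') as f:
        final_content = f.read()

print(final_content)
```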

os.urandom()

For cryptographic purposes, you need a secure source of random data. os.urandom() generates random bytes suitable for things like generating random IDs, tokens, or passwords. It’s more secure than the random module for sensitive data.

os.urandom() draws on the randomness that the operating system collects from various sources, which makes the returned bytes unpredictable.

On Windows, recent Python versions use BCryptGenRandom() to generate the random bytes.

import os

secure_token = os.urandom(16)  # 16 bytes of random data
print("Secure Token:", secure_token)
# Make it human-readable
print("Secure Token:", secure_token.hex())

Output

Secure Token: b'\x84\xd6\x1c\x1bKB\x7f\xcd\xf6\xb7\xc4D\x92z\xe3{'
Secure Token: 84d61c1b4b427fcdf6b7c444927ae37b
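
If the token has to travel in a URL or a header, hex is only one possible encoding. The sketch below encodes the random bytes as URL-safe Base64 instead. The make_token() helper name and the 32-byte default are assumptions for this example.

```python
import base64
import os


def make_token(num_bytes=32):
    """Return a URL-safe text token built from os.urandom() bytes.

    make_token is a hypothetical helper name used for illustration.
    """
    raw = os.urandom(num_bytes)
    # URL-safe Base64 uses only letters, digits, '-' and '_';
    # the trailing '=' padding is stripped for readability
    return base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')


print("URL-safe token:", make_token())
```

Python's standard library also ships a secrets module with ready-made helpers for this kind of task, which may be the more convenient choice in application code.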

os.path.samefile()

The os.path.samefile() function in Python is used to check if two paths refer to the same file or directory on the filesystem. It’s particularly helpful in scenarios where multiple paths might point to the same physical file, such as when dealing with symbolic links, hard links, or different absolute and relative paths to the same location.

import os

is_same = os.path.samefile('/path/to/file1.txt', '/different/path/to/symlink_file1.txt')
print("Are they the same file?", is_same)

os.path.samefile() is designed to return True only if both paths reference the same file on disk, such as a file that’s hard-linked or symlinked to the same data on the filesystem.
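
The paths in the example above are placeholders, so here is a small self-contained sketch that creates a real file in a temporary directory and compares two different spellings of its path. It also shows that os.path.samefile() raises an error when one of the paths does not exist. The file names are chosen for this example only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    original = os.path.join(base, 'file1.txt')
    with open(original, 'w') as f:
        f.write('hello')

    # Same file, reached through a redundant '.' component
    alias = os.path.join(base, '.', 'file1.txt')

    same_spelling = (original == alias)            # compares the strings
    same_file = os.path.samefile(original, alias)  # compares the actual files

    # Both paths must exist, otherwise the underlying os.stat() call fails
    try:
        os.path.samefile(original, os.path.join(base, 'missing.txt'))
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True

print("Same string?", same_spelling)
print("Same file?", same_file)
print("Missing path raises?", missing_raises)
```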

os.path.relpath()

os.path.relpath() computes the relative path from a starting directory to a target path. This is particularly useful when building file paths dynamically or when working out where one file sits relative to another.

Consider the following example:

import os

# Target file path
target_path = "D:/SACHIN/Pycharm/osfunctions/project/engine/log.py"
# Starting point
start_path = "D:/SACHIN/Pycharm/osfunctions/project/interface/character/specific.py"

relative_path = os.path.relpath(target_path, start=start_path)
print(relative_path)

In this example, target_path is the path we want to reach, and start_path is the location from which the relative path is calculated.

When we run this, we get the following output.

..\..\..\engine\log.py

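os.path.relpath() treats start as a directory, so specific.py itself counts as one level here: we go up three levels (specific.py, character, and interface) and then down to engine/log.py. Relative to the folder that contains specific.py, the path would be one level shorter.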
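
When the starting point is a file rather than a folder, one option is to pass its containing directory as start. The sketch below does that with os.path.dirname() and uses shortened paths assumed for this example in place of the full ones above.

```python
import os

# Shortened, hypothetical versions of the paths used above
target_path = 'project/engine/log.py'
start_file = 'project/interface/character/specific.py'

# Use the folder containing the start file as the starting point
start_dir = os.path.dirname(start_file)
relative = os.path.relpath(target_path, start=start_dir)
print(relative)

# Joining the result back onto the start directory leads to the target
round_trip = os.path.normpath(os.path.join(start_dir, relative))
print(round_trip == os.path.normpath(target_path))
```

From project/interface/character, the result climbs two levels and then descends into engine/log.py.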

os.fsync()

When we write to a file with file.write(), the data isn’t saved to disk instantly. It first sits in buffers (Python's own and the operating system's), and if something unexpected happens before those buffers are written out, such as a crash or a power failure, the data is lost.

os.fsync() forces the data to be written, ensuring data integrity. It’s especially useful in logging or when writing critical data that must not be lost.

import os

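with open('data.txt', 'w') as f:
    f.write("gibberish!")
    f.flush()             # Push Python's internal buffer to the OS
    os.fsync(f.fileno())  # Ask the OS to write its buffers to disk

f.flush() first hands Python's own buffer over to the operating system, and os.fsync(f.fileno()) then asks the operating system to write the data to the disk instead of leaving it in its buffer. Without the flush, the text could still be sitting in Python's buffer when os.fsync() runs.

os.fsync() takes a file descriptor, which is why we passed f.fileno(): it returns the integer that the operating system uses to identify the open file.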
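
For the logging use case mentioned earlier, the same flush-then-sync steps can be wrapped in a small helper. This is a minimal sketch; the append_durably() name and the app.log file are assumptions for this example.

```python
import os
import tempfile


def append_durably(path, line):
    """Append one line to `path` and push it all the way to disk.

    append_durably is a hypothetical helper name used for illustration.
    """
    with open(path, 'a', encoding='utf-8') as f:
        f.write(line + '\n')
        f.flush()             # Python buffer -> operating system
        os.fsync(f.fileno())  # operating system buffer -> disk


# Demo inside a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as base:
    log_path = os.path.join(base, 'app.log')
    append_durably(log_path, 'service started')
    append_durably(log_path, 'service stopped')
    with open(log_path, encoding='utf-8') as f:
        logged = f.read().splitlines()

print(logged)
```

Syncing after every line costs performance, so this is best kept for records that really must survive a crash.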

os.get_terminal_size()

If you’re creating CLI tools, formatting the output to fit the terminal width can make the output cleaner. os.get_terminal_size() gives you the current terminal width and height, making it easy to dynamically format content.

import os

size = os.get_terminal_size()
print(f"Terminal Width: {size.columns}, Terminal Height: {size.lines}")

When we run this code in the terminal, we get the size of the terminal on which we are running this script.

PS > py sample.py
Terminal Width: 158, Terminal Height: 12

Note: You may get an error (OSError) when running the script directly from an IDE, where the program isn’t attached to a real terminal.
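
One way to cope with that case is to catch the error and fall back to a default width. Below is a minimal sketch, assuming a fallback of 80 columns. The terminal_width() helper name is hypothetical.

```python
import os


def terminal_width(default=80):
    """Return the terminal width, or `default` when there is no terminal.

    terminal_width is a hypothetical helper; 80 is an assumed fallback.
    """
    try:
        return os.get_terminal_size().columns
    except OSError:
        # Raised when output is not attached to a terminal,
        # for example inside some IDEs or when output is piped
        return default


# Draw a separator line that spans the available width
print('-' * terminal_width())
```

The standard library's shutil.get_terminal_size() offers a similar fallback out of the box and is worth considering for real CLI tools.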


That’s all for now.

Keep Coding✌✌.
