Parsing Messy Log Files in Python


Logs are often messy, but they hold valuable data. Whether you are dealing with server logs, application traces, or error dumps, knowing how to parse them and extract information with Python is a useful skill.
In this post, we'll build a Python script that:
Reads a messy log file line by line
Extracts meaningful info using re (regular expressions)
Outputs structured data for analysis
💥 The Problem
Here’s a sample log file:
[2025-07-21 14:02:15] ERROR User login failed for user: admin IP: 192.168.0.5
[2025-07-21 14:03:12] INFO User logged in successfully: user: guest IP: 192.168.0.3
[2025-07-21 14:05:44] WARNING Disk usage high on /dev/sda1
[2025-07-21 14:06:01] ERROR File not found: /etc/passwd
Let’s say you want to extract all timestamps, log levels, messages, and IPs (if available) into structured records.
🔍 Step 1: Define the Regex Pattern
import re

log_pattern = re.compile(
    r"\[(?P<timestamp>.*?)\]\s+(?P<level>ERROR|INFO|WARNING)\s+(?P<message>.*?)(IP:\s*(?P<ip>\d{1,3}(?:\.\d{1,3}){3}))?$"
)
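This pattern matches lines with or without an IP address. When no IP is present, the ip group is None. Lines whose level is not ERROR, INFO, or WARNING do not match at all.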
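To check the pattern against the sample lines, a quick sketch like the following can help. The pattern is the one from Step 1, split across two string literals for readability; the sample_lines list and the loop are illustrative additions.

```python
import re

log_pattern = re.compile(
    r"\[(?P<timestamp>.*?)\]\s+(?P<level>ERROR|INFO|WARNING)\s+(?P<message>.*?)"
    r"(IP:\s*(?P<ip>\d{1,3}(?:\.\d{1,3}){3}))?$"
)

# Two of the sample lines: one with an IP, one without.
sample_lines = [
    "[2025-07-21 14:02:15] ERROR User login failed for user: admin IP: 192.168.0.5",
    "[2025-07-21 14:05:44] WARNING Disk usage high on /dev/sda1",
]

for line in sample_lines:
    match = log_pattern.search(line)
    # Named groups that did not take part in the match come back as None.
    print(match.group("level"), repr(match.group("message")), match.group("ip"))
```

The repr() call makes any whitespace left at the end of the message visible, which is easy to miss otherwise.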
📂 Step 2: Parse the Log File
def parse_log(file_path):
    """Return one dict per line that matches log_pattern."""
    results = []
    with open(file_path, 'r') as f:
        for line in f:
            match = log_pattern.search(line)
            if match:  # lines that do not match are skipped
                entry = match.groupdict()
                results.append(entry)
    return results
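The function above silently drops lines that do not match. In a genuinely messy file it can be useful to know what was skipped. Here is a minimal sketch, assuming a hypothetical helper named parse_lines that accepts any iterable of lines (an open file works too). It repeats the Step 1 pattern so it runs on its own.

```python
import re

log_pattern = re.compile(
    r"\[(?P<timestamp>.*?)\]\s+(?P<level>ERROR|INFO|WARNING)\s+(?P<message>.*?)"
    r"(IP:\s*(?P<ip>\d{1,3}(?:\.\d{1,3}){3}))?$"
)

def parse_lines(lines):
    """Return (entries, skipped) for an iterable of log lines."""
    entries, skipped = [], []
    for line_number, line in enumerate(lines, start=1):
        match = log_pattern.search(line)
        if match:
            entries.append(match.groupdict())
        elif line.strip():
            # Keep the line number so the unparsed line is easy to find later.
            skipped.append((line_number, line.rstrip("\n")))
    return entries, skipped

# Illustrative input: one valid line and one that does not fit the format.
entries, skipped = parse_lines([
    "[2025-07-21 14:06:01] ERROR File not found: /etc/passwd\n",
    "this line is not in the expected format\n",
])
print(len(entries), skipped)
```

Blank lines are ignored rather than reported, on the assumption that they carry no information.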
📊 Step 3: Analyze or Export the Data
Once parsed, the data can be printed, analyzed, or exported to JSON or CSV:
import json

logs = parse_log("server.log")

with open("logs.json", "w") as f:
    json.dump(logs, f, indent=2)

# Or pretty print
for log in logs:
    print(log)
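For CSV, one possible approach is csv.DictWriter from the standard library. The sketch below uses a hypothetical write_csv helper and a small hand-written logs list in place of real parser output. It writes to an in-memory buffer so it can run without touching the disk.

```python
import csv
import io

FIELDS = ["timestamp", "level", "message", "ip"]

def write_csv(entries, out):
    """Write parsed entries to an open text file object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(entries)  # None values are written as empty cells

# Stand-in for the list returned by parse_log().
logs = [
    {"timestamp": "2025-07-21 14:02:15", "level": "ERROR",
     "message": "User login failed for user: admin ", "ip": "192.168.0.5"},
    {"timestamp": "2025-07-21 14:05:44", "level": "WARNING",
     "message": "Disk usage high on /dev/sda1", "ip": None},
]

buffer = io.StringIO()
write_csv(logs, buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with open("logs.csv", "w", newline="") instead of the buffer. The csv module documentation recommends newline="" so that line endings are not translated twice.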
📦 Output Example
{
  "timestamp": "2025-07-21 14:02:15",
  "level": "ERROR",
  "message": "User login failed for user: admin ",
  "ip": "192.168.0.5"
}
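Note the trailing space in the message and that the timestamp is still a plain string. If cleaner records are needed, a small post-processing step is one option. The normalize function below is a hypothetical helper, and the timestamp format string is an assumption based on the sample log.

```python
from datetime import datetime

def normalize(entry):
    """Return a copy with a trimmed message and a datetime timestamp."""
    cleaned = dict(entry)
    cleaned["message"] = entry["message"].strip()
    # Format assumed from the sample log: YYYY-MM-DD HH:MM:SS
    cleaned["timestamp"] = datetime.strptime(entry["timestamp"], "%Y-%m-%d %H:%M:%S")
    return cleaned

entry = {
    "timestamp": "2025-07-21 14:02:15",
    "level": "ERROR",
    "message": "User login failed for user: admin ",
    "ip": "192.168.0.5",
}

record = normalize(entry)
print(repr(record["message"]), record["timestamp"].isoformat())
```

datetime objects are not JSON serializable by default, so when exporting normalized records either convert them back with isoformat() or pass default=str to json.dump.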
🔁 Bonus: Filter by Log Level
errors_only = [entry for entry in logs if entry['level'] == 'ERROR']
Now you can export all errors, generate reports, or trigger alerts.
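As a starting point for a report, collections.Counter can summarize the parsed entries. This is a sketch only: the logs list is hand-written to mirror what the parser would produce for the sample file, and the names level_counts and error_ips are illustrative.

```python
from collections import Counter

# Stand-in for the parsed sample file.
logs = [
    {"timestamp": "2025-07-21 14:02:15", "level": "ERROR",
     "message": "User login failed for user: admin ", "ip": "192.168.0.5"},
    {"timestamp": "2025-07-21 14:03:12", "level": "INFO",
     "message": "User logged in successfully: user: guest ", "ip": "192.168.0.3"},
    {"timestamp": "2025-07-21 14:05:44", "level": "WARNING",
     "message": "Disk usage high on /dev/sda1", "ip": None},
    {"timestamp": "2025-07-21 14:06:01", "level": "ERROR",
     "message": "File not found: /etc/passwd", "ip": None},
]

# How many entries of each level were seen.
level_counts = Counter(entry["level"] for entry in logs)

# Which IPs appear on ERROR lines (entries without an IP are left out).
error_ips = Counter(
    entry["ip"] for entry in logs if entry["level"] == "ERROR" and entry["ip"]
)

print(level_counts.most_common())
print(error_ips.most_common())
```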
🧵 Wrapping Up
Whether you're debugging a system or building a monitoring tool, parsing logs with Python can save you hours. Regex gives you control. Wrap it in a function, and you’ve got a reusable tool.
Have your own log formats from Nginx, Flask, or custom apps? Try adapting the pattern and share your results!
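As one example of adapting the pattern, here is a rough sketch for Nginx access logs. It assumes the default "combined" log format; a custom log_format directive would need a different pattern. The sample line is made up for illustration, and nginx_pattern is a hypothetical name.

```python
import re

# Assumes Nginx's default "combined" format:
# remote_addr - remote_user [time_local] "request" status bytes "referer" "user_agent"
nginx_pattern = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Made-up sample line in the combined format.
sample = (
    '203.0.113.7 - - [21/Jul/2025:14:02:15 +0000] '
    '"GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"'
)

match = nginx_pattern.match(sample)
if match:
    print(match.groupdict())
```

The rest of the script (reading the file, collecting groupdict() results, exporting) can stay the same; only the pattern changes.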
Written by

Ashraful Islam Leon
Passionate Software Developer | Crafting clean code and elegant solutions