Parsing Messy Log Files in Python

Logs are often messy, but they hold valuable data. Whether you're dealing with server logs, application traces, or error dumps, knowing how to parse them and extract information with Python is a useful skill.

In this post, we'll build a Python script that:

  • Reads a messy log file line by line

  • Extracts meaningful info using re (regular expressions)

  • Outputs structured data for analysis


💥 The Problem

Here’s a sample log file:

[2025-07-21 14:02:15] ERROR User login failed for user: admin IP: 192.168.0.5
[2025-07-21 14:03:12] INFO User logged in successfully: user: guest IP: 192.168.0.3
[2025-07-21 14:05:44] WARNING Disk usage high on /dev/sda1
[2025-07-21 14:06:01] ERROR File not found: /etc/passwd

Let’s say you want to extract all timestamps, log levels, messages, and IPs (if available) into structured records.


🔍 Step 1: Define the Regex Pattern

import re

log_pattern = re.compile(
    r"\[(?P<timestamp>.*?)\]\s+(?P<level>ERROR|INFO|WARNING)\s+(?P<message>.*?)(IP:\s*(?P<ip>\d{1,3}(?:\.\d{1,3}){3}))?$"
)

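The named groups capture the timestamp, level, and message. The trailing IP group is optional, so the pattern matches lines with or without an IP; when no IP is present, the ip group is None.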
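To see the optional IP group in action, here is a small check against two of the sample lines. It is only a sketch: the pattern is repeated so the snippet runs on its own, and sample_lines is a name introduced just for this illustration.

```python
import re

# Same pattern as above, repeated so this snippet runs on its own.
log_pattern = re.compile(
    r"\[(?P<timestamp>.*?)\]\s+(?P<level>ERROR|INFO|WARNING)\s+(?P<message>.*?)(IP:\s*(?P<ip>\d{1,3}(?:\.\d{1,3}){3}))?$"
)

# Hypothetical test input: two lines copied from the sample log.
sample_lines = [
    "[2025-07-21 14:02:15] ERROR User login failed for user: admin IP: 192.168.0.5",
    "[2025-07-21 14:05:44] WARNING Disk usage high on /dev/sda1",
]

# search() returns a match object; groupdict() turns the named groups into a dict.
parsed = [log_pattern.search(line).groupdict() for line in sample_lines]

for entry in parsed:
    # repr() makes any leading or trailing whitespace in the message visible.
    print(entry["level"], repr(entry["message"]), entry["ip"])
```

Note that the message of the first line keeps the space that sat in front of "IP:", so call .strip() on it if that matters for your analysis.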


📂 Step 2: Parse the Log File

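def parse_log(file_path):
    """Return a list of dicts, one per line that matches log_pattern."""
    results = []
    # errors="replace" keeps a stray non-UTF-8 byte from aborting the whole parse
    with open(file_path, "r", encoding="utf-8", errors="replace") as f:
        for line in f:
            match = log_pattern.search(line)
            if match:
                # groupdict() maps each named group to its text (None if absent)
                results.append(match.groupdict())
            # lines that do not match the pattern are skipped silently
    return results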
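With truly messy logs, it often helps to know which lines the pattern did not match. The variant below is one possible sketch, not part of the original script: parse_lines and the demo input are hypothetical names, and it accepts any iterable of lines so that an open file object can be passed in directly.

```python
import re

# Same pattern as in Step 1, repeated so this snippet runs on its own.
log_pattern = re.compile(
    r"\[(?P<timestamp>.*?)\]\s+(?P<level>ERROR|INFO|WARNING)\s+(?P<message>.*?)(IP:\s*(?P<ip>\d{1,3}(?:\.\d{1,3}){3}))?$"
)

def parse_lines(lines):
    """Split an iterable of log lines into parsed entries and unmatched lines."""
    parsed, unmatched = [], []
    for line_number, line in enumerate(lines, start=1):
        match = log_pattern.search(line)
        if match:
            parsed.append(match.groupdict())
        elif line.strip():
            # Keep the line number so the odd line is easy to find later.
            # Blank lines are ignored rather than reported.
            unmatched.append((line_number, line.rstrip("\n")))
    return parsed, unmatched

# Hypothetical input: one valid line, one stray traceback line, one blank line.
demo = [
    "[2025-07-21 14:06:01] ERROR File not found: /etc/passwd\n",
    "Traceback (most recent call last):\n",
    "\n",
]
entries, skipped = parse_lines(demo)
print(len(entries), "parsed;", len(skipped), "unmatched")
```

Reviewing the unmatched list is a quick way to decide whether the pattern needs another log level or a second format.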

📊 Step 3: Analyze or Export the Data

You can do anything with the parsed data — print it, analyze it, or export to JSON/CSV:

import json

logs = parse_log("server.log")

with open("logs.json", "w") as f:
    json.dump(logs, f, indent=2)

# Or pretty print
for log in logs:
    print(log)
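The CSV route mentioned above can be sketched with csv.DictWriter from the standard library. The records here are hypothetical stand-ins shaped like the output of parse_log, and write_csv is a helper name made up for this example; it writes to an in-memory buffer so the snippet runs without touching the disk.

```python
import csv
import io

# Hypothetical records shaped like the dicts returned by parse_log().
logs = [
    {"timestamp": "2025-07-21 14:02:15", "level": "ERROR",
     "message": "User login failed for user: admin ", "ip": "192.168.0.5"},
    {"timestamp": "2025-07-21 14:05:44", "level": "WARNING",
     "message": "Disk usage high on /dev/sda1", "ip": None},
]

def write_csv(entries, stream):
    """Write entries to an open text stream, one row per log record."""
    writer = csv.DictWriter(stream, fieldnames=["timestamp", "level", "message", "ip"])
    writer.writeheader()
    writer.writerows(entries)  # a value of None is written as an empty field

buffer = io.StringIO()
write_csv(logs, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead, open it with newline="" as the csv module recommends, for example open("logs.csv", "w", newline=""), and pass that file object as the stream.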

📦 Output Example

{
  "timestamp": "2025-07-21 14:02:15",
  "level": "ERROR",
  "message": "User login failed for user: admin ",
  "ip": "192.168.0.5"
}
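Everything in a parsed record is still a string, and the message above carries a trailing space. If you plan to sort or filter by time, a small post-processing step can help. The sketch below assumes the timestamp format shown in the sample log; normalize is a hypothetical helper, not part of the original script.

```python
from datetime import datetime

def normalize(entry):
    """Return a copy of entry with a real datetime and a stripped message."""
    cleaned = dict(entry)  # copy so the original record is left untouched
    cleaned["timestamp"] = datetime.strptime(entry["timestamp"], "%Y-%m-%d %H:%M:%S")
    cleaned["message"] = entry["message"].strip()
    return cleaned

# Hypothetical record matching the output example.
record = {
    "timestamp": "2025-07-21 14:02:15",
    "level": "ERROR",
    "message": "User login failed for user: admin ",
    "ip": "192.168.0.5",
}

result = normalize(record)
print(result["timestamp"].isoformat(), repr(result["message"]))
```

Keep in mind that json.dump cannot serialize datetime objects by default, so convert them back with isoformat() before exporting to JSON.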

🔁 Bonus: Filter by Log Level

errors_only = [entry for entry in logs if entry['level'] == 'ERROR']

Now you can export all errors, generate reports, or trigger alerts.
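As one example of a simple report, you could count entries per log level with collections.Counter. This is a minimal sketch: the logs list is a hypothetical stand-in for the parsed sample file, reduced to the level field.

```python
from collections import Counter

# Hypothetical stand-in for the parsed sample log (only the level is needed here).
logs = [
    {"level": "ERROR"},
    {"level": "INFO"},
    {"level": "WARNING"},
    {"level": "ERROR"},
]

# Counter tallies how many times each level appears.
level_counts = Counter(entry["level"] for entry in logs)

# most_common() yields (level, count) pairs, highest count first.
for level, count in level_counts.most_common():
    print(f"{level}: {count}")
```

A Counter returns 0 for levels it has never seen, so a check such as level_counts["ERROR"] > 10 is safe to use as an alert condition even on a quiet day.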


🧵 Wrapping Up

Whether you're debugging a system or building a monitoring tool, parsing logs with Python can save you hours. Regex gives you control. Wrap it in a function, and you’ve got a reusable tool.

Have your own log formats from Nginx, Flask, or custom apps? Try adapting the pattern and share your results!

Written by

Ashraful Islam Leon

Passionate Software Developer | Crafting clean code and elegant solutions