Real-Time Kubernetes Monitoring with OpenObserve, Prometheus, and Python

🧩 Introduction
In today’s cloud-native ecosystem, observability is a must. While tools like Grafana and Loki are widely used, I recently explored a different approach using OpenObserve, a high-performance, low-cost observability platform.
In this post, I’ll walk you through how we monitored Kubernetes pods (capturing pod status, logs, age, and log counts) using OpenObserve, Prometheus, and Python scripts, with the optional addition of Vector for log forwarding.
🛠️ Tech Stack
Kubernetes (Minikube running on an AWS EC2 instance)
Prometheus – for scraping application and pod-level metrics
Python – for collecting Kubernetes metadata and logs
Vector (optional) – for forwarding logs
OpenObserve – for storing and visualizing logs and metrics
📌 Use Case Overview
Our observability goals included:
Monitoring pod lifecycle: status, restarts, and age
Collecting and centralizing pod logs from all namespaces
Dynamically tracking pod age and restart patterns
Generating custom request count metrics using Prometheus actuator endpoints
Creating dashboards and alerts in OpenObserve for real-time insights
🧭 Step-by-Step Procedure
Here’s a structured breakdown of how we implemented the observability stack:
🔹 Step 1: Set Up Kubernetes Environment
We began by creating a Minikube cluster on an AWS EC2 instance.
Installed Minikube on an Amazon Linux 2 EC2 instance
Installed and configured kubectl
Deployed sample applications and services to simulate workloads
Ensured network access and DNS resolution for external observability tools
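To confirm the cluster was reachable before wiring up the rest of the stack, a quick check along these lines can help. This is a minimal sketch assuming the official kubernetes Python client and the kubeconfig written by Minikube; the original setup may simply have used kubectl commands for the same check:

```python
# Quick sanity check that the cluster is reachable before adding observability.
# Assumes the official `kubernetes` Python client and a working kubeconfig
# (e.g. the one Minikube writes to ~/.kube/config).
from kubernetes import client, config

config.load_kube_config()   # inside a pod you would use load_incluster_config()
v1 = client.CoreV1Api()

# List nodes and their kubelet versions.
for node in v1.list_node().items:
    print(f"node={node.metadata.name} kubelet={node.status.node_info.kubelet_version}")

# List all pods across namespaces with their current phase.
for pod in v1.list_pod_for_all_namespaces().items:
    print(f"{pod.metadata.namespace}/{pod.metadata.name}: {pod.status.phase}")
```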
🔹 Step 2: Deploy and Configure OpenObserve (External to Kubernetes)
Instead of deploying OpenObserve inside the Kubernetes cluster, we ran it on a separate EC2 instance using Docker.
Deployment Highlights:
Docker-based setup: Pulled and ran OpenObserve as a container
Public access: The container’s default port (5080) was exposed
Security groups were adjusted to allow ingress traffic
Accessed the web UI via http://<ec2-public-ip>:5080
Ingestion:
Logs were pushed via OpenObserve’s HTTP API using Python scripts (or optionally Vector)
Logs were tagged with metadata like namespace, pod name, and timestamp
Retention and indexing policies were configured for performance and cost control
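As a quick smoke test of the ingestion path, a single JSON record can be pushed to OpenObserve's HTTP API. The snippet below is a sketch: the org name (default), stream name (k8s_logs), and root credentials are placeholders for whatever your instance was configured with.

```python
# Minimal smoke test for the external OpenObserve instance: push one JSON log
# record over the HTTP ingestion API (/api/<org>/<stream>/_json).
# Org, stream, URL, and credentials below are placeholders -- substitute your own.
import requests
from datetime import datetime, timezone

OPENOBSERVE_URL = "http://<ec2-public-ip>:5080"
ORG, STREAM = "default", "k8s_logs"
AUTH = ("root@example.com", "Complexpass#123")  # basic auth: the root user created at setup

record = [{
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "namespace": "demo",
    "pod": "sample-app-1234",
    "level": "info",
    "message": "hello from the ingestion smoke test",
}]

resp = requests.post(f"{OPENOBSERVE_URL}/api/{ORG}/{STREAM}/_json", json=record, auth=AUTH)
resp.raise_for_status()
print(resp.json())
```

If the request succeeds, the record should be searchable under the chosen stream in the OpenObserve UI within a few seconds.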
🔹 Step 3: Configure Prometheus to Scrape Metrics
We configured Prometheus to collect metrics from the applications:
Enabled the /actuator/prometheus endpoint (for Spring Boot apps)
Exposed relevant custom metrics (like log count or request volume)
Configured Prometheus to auto-discover application pods via Kubernetes service discovery
Ensured Prometheus was running either inside Minikube or externally, as required
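To verify that service discovery and scraping were actually working, the Prometheus HTTP API can be queried directly. The sketch below assumes a reachable Prometheus at a placeholder URL and the standard Micrometer metric name exposed by Spring Boot's actuator; adjust both to your environment.

```python
# A small check (separate from the scrape config itself) that Prometheus is
# discovering and scraping the application pods: query its HTTP API for the
# `up` series and for an application metric.
import requests

PROMETHEUS_URL = "http://<prometheus-host>:9090"  # placeholder

def query(promql: str):
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": promql})
    resp.raise_for_status()
    return resp.json()["data"]["result"]

# up == 1 means the target was scraped successfully on the last attempt.
for series in query("up"):
    print(series["metric"].get("instance"), "up =", series["value"][1])

# Example: total HTTP requests as exposed by Spring Boot's actuator (Micrometer).
for series in query("sum(http_server_requests_seconds_count) by (uri)"):
    print(series["metric"].get("uri"), series["value"][1])
```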
🔹 Step 4: Collect Pod Metadata Using Python
A custom Python script was developed to interface with the Kubernetes API and extract:
Pod status: Running, Pending, CrashLoopBackOff, etc.
Pod age: Calculated as the time elapsed since the pod’s creation timestamp
Restart count: To identify instability in applications
Other metadata: Such as pod name, namespace, node name, etc.
This script helped us build a dynamic and real-time picture of pod health.
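A condensed sketch of such a collector is shown below, using the official kubernetes Python client. The exact fields and schema in our script differed slightly, so treat this as illustrative:

```python
# Sketch of a pod-metadata collector: status, age, restarts, and basic labels.
from datetime import datetime, timezone
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

def collect_pod_metadata():
    now = datetime.now(timezone.utc)
    records = []
    for pod in v1.list_pod_for_all_namespaces().items:
        statuses = pod.status.container_statuses or []
        restarts = sum(cs.restart_count for cs in statuses)
        # CrashLoopBackOff appears as a container waiting reason, not a pod phase.
        waiting = [cs.state.waiting.reason for cs in statuses if cs.state.waiting]
        age_seconds = (now - pod.metadata.creation_timestamp).total_seconds()
        records.append({
            "namespace": pod.metadata.namespace,
            "pod": pod.metadata.name,
            "node": pod.spec.node_name,
            "status": pod.status.phase,        # Running, Pending, Succeeded, Failed...
            "waiting_reasons": waiting,
            "restarts": restarts,
            "age_minutes": round(age_seconds / 60, 1),
        })
    return records

if __name__ == "__main__":
    for record in collect_pod_metadata():
        print(record)
```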
🔹 Step 5: Push Logs to OpenObserve
Using the same Python script (or optionally Vector):
Logs were collected from each pod using the Kubernetes logs API
Each log line was enriched with custom fields (e.g., pod name, level, timestamp)
Logs were pushed to OpenObserve’s ingestion endpoint over HTTP/HTTPS
This enabled centralized log search and visualization
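The forwarding path can be sketched roughly as follows, combining the Kubernetes logs API with the same ingestion endpoint used in Step 2. URL, org/stream, and credentials are placeholders, and multi-container pods would also need a container argument:

```python
# Sketch of the log-forwarding path: read recent log lines per pod through the
# Kubernetes logs API, tag them, and push them to OpenObserve's JSON endpoint.
import requests
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

OPENOBSERVE_URL = "http://<ec2-public-ip>:5080"   # placeholder
ORG, STREAM = "default", "k8s_logs"               # placeholders
AUTH = ("root@example.com", "Complexpass#123")    # placeholder credentials

def ship_pod_logs(tail_lines: int = 100):
    for pod in v1.list_pod_for_all_namespaces().items:
        if pod.status.phase != "Running":
            continue
        raw = v1.read_namespaced_pod_log(
            name=pod.metadata.name,
            namespace=pod.metadata.namespace,
            tail_lines=tail_lines,
        )
        batch = [{
            "namespace": pod.metadata.namespace,
            "pod": pod.metadata.name,
            "level": "error" if "ERROR" in line else "info",  # naive severity tag
            "message": line,
        } for line in raw.splitlines() if line.strip()]
        if batch:
            requests.post(f"{OPENOBSERVE_URL}/api/{ORG}/{STREAM}/_json",
                          json=batch, auth=AUTH).raise_for_status()

if __name__ == "__main__":
    ship_pod_logs()
```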
🔹 Step 6: Generate Custom Metrics
We also tracked metrics derived from log data:
Log volume per pod (count of logs over time)
Breakdown by log severity: info, warn, error
Application-level metrics like total requests, error rate, etc.
These metrics were exposed via HTTP endpoints and scraped by Prometheus.
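One way to expose such log-derived metrics is with the prometheus_client library, as in the sketch below; the metric name and labels are illustrative rather than the exact ones from our setup.

```python
# Expose a log-line counter (per pod and severity) on /metrics for Prometheus.
import time
from prometheus_client import Counter, start_http_server

LOG_LINES = Counter(
    "app_log_lines_total",
    "Log lines seen per pod and severity",
    ["namespace", "pod", "level"],
)

def record_log_line(namespace: str, pod: str, level: str):
    LOG_LINES.labels(namespace=namespace, pod=pod, level=level).inc()

if __name__ == "__main__":
    start_http_server(8000)   # Prometheus scrapes http://<host>:8000/metrics
    # Simulate a trickle of log lines so the counter moves.
    while True:
        record_log_line("demo", "sample-app-1234", "info")
        time.sleep(5)
```

The log collector from Step 5 would call record_log_line for each line it processes, and Prometheus would scrape the resulting endpoint alongside the application metrics.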
🔹 Step 7: Visualize Data in OpenObserve
Once data was flowing into OpenObserve, we created dashboards to bring insights to life:
Time-series graphs for log counts and error spikes
Health dashboards for pod status, age, and restart frequency
Search filters to isolate logs per namespace, pod, or severity
Alerts with thresholds (e.g., error rate per minute)
Architecture Overview: workloads run in Minikube on an EC2 instance; Prometheus scrapes application and pod-level metrics, while Python scripts (or Vector) collect pod metadata and logs and push everything to OpenObserve running in Docker on a separate EC2 instance.
✅ Conclusion
This observability setup using OpenObserve, Prometheus, and Python automation proved to be:
Lightweight: No heavy agents or resource-intensive components
Flexible: Easily customizable for any Kubernetes setup
Cost-effective: Fully open-source and easy to scale
DevOps-friendly: Integrates seamlessly into modern pipelines