Real-Time Kubernetes Monitoring with OpenObserve, Prometheus, and Python


🧩 Introduction

In today’s cloud-native ecosystem, observability is a must. While tools like Grafana and Loki are widely used, I recently explored a different approach using OpenObserve, a high-performance, low-cost observability platform.

In this post, I’ll walk you through how we monitored Kubernetes pods (capturing pod status, logs, age, and log counts) using OpenObserve, Prometheus, and Python scripts, with Vector as an optional add-on for log forwarding.

🛠️ Tech Stack

  • Kubernetes (Minikube running on an AWS EC2 instance)

  • Prometheus – for scraping application and pod-level metrics

  • Python – for collecting Kubernetes metadata and logs

  • Vector (optional) – for forwarding logs

  • OpenObserve – for storing and visualizing logs and metrics

📌 Use Case Overview

Our observability goals included:

  • Monitoring pod lifecycle: status, restarts, and age

  • Collecting and centralizing pod logs from all namespaces

  • Dynamically tracking pod age and restart patterns

  • Generating custom request count metrics using Prometheus actuator endpoints

  • Creating dashboards and alerts in OpenObserve for real-time insights

🧭 Step-by-Step Procedure

Here’s a structured breakdown of how we implemented the observability stack:

🔹 Step 1: Set Up Kubernetes Environment

We began by creating a Minikube cluster on an AWS EC2 instance.

  • Installed Minikube on an Amazon Linux 2 EC2 instance

  • Installed and configured kubectl

  • Deployed sample applications and services to simulate workloads

  • Ensured network access and DNS resolution for external observability tools

🔹 Step 2: Deploy and Configure OpenObserve (External to Kubernetes)

Instead of deploying OpenObserve inside the Kubernetes cluster, we ran it on a separate EC2 instance using Docker.

Deployment Highlights:

  • Docker-based setup: Pulled and ran OpenObserve as a container

  • Public access: The container’s default port (5080) was exposed

  • Security groups were adjusted to allow ingress traffic

  • Accessed the web UI via http://<ec2-public-ip>:5080

Ingestion:

  • Logs were pushed via OpenObserve’s HTTP API using Python scripts (or optionally Vector); see the sketch after this list

  • Logs were tagged with metadata like namespace, pod name, and timestamp

  • Retention and indexing policies were configured for performance and cost control
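Here’s a minimal sketch of that HTTP push, assuming OpenObserve’s JSON ingestion endpoint (/api/<org>/<stream>/_json) with basic auth; the stream name, environment variables, and record fields below are illustrative placeholders rather than the exact ones we used:

```python
import os
import requests

# Assumed connection details; adjust to match your OpenObserve deployment.
OPENOBSERVE_URL = os.environ.get("OPENOBSERVE_URL", "http://<ec2-public-ip>:5080")
ORG = "default"          # organization name in OpenObserve
STREAM = "k8s_pod_logs"  # hypothetical stream name for this example
AUTH = (os.environ["OPENOBSERVE_USER"], os.environ["OPENOBSERVE_PASSWORD"])


def push_logs(records):
    """Send a batch of log records (a list of dicts) to OpenObserve's JSON ingest API."""
    url = f"{OPENOBSERVE_URL}/api/{ORG}/{STREAM}/_json"
    resp = requests.post(url, auth=AUTH, json=records, timeout=10)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    # Each record carries the metadata fields mentioned above.
    sample = [{
        "namespace": "default",
        "pod_name": "demo-app-5c7d9f",
        "level": "info",
        "message": "application started",
        "timestamp": "2024-01-01T00:00:00Z",
    }]
    print(push_logs(sample))
```

Running this once should return an ingestion summary from OpenObserve, and the records then become searchable in the chosen stream.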

🔹 Step 3: Configure Prometheus to Scrape Metrics

We configured Prometheus to collect metrics from the applications:

  • Enabled /actuator/prometheus endpoint (for Spring Boot apps)

  • Exposed relevant custom metrics (like log count or request volume)

  • Configured Prometheus to auto-discover application pods via Kubernetes service discovery

  • Ensured Prometheus was running either inside Minikube or externally, as required (see the check sketched below)
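To verify that scraping was actually working, a short script like the following can query Prometheus’s /api/v1/targets endpoint and print each target’s health; the Prometheus address is a placeholder:

```python
import requests

# Assumed Prometheus address; replace with your in-cluster or external endpoint.
PROMETHEUS_URL = "http://<prometheus-host>:9090"


def list_scrape_targets():
    """Print each active scrape target and its health via Prometheus's /api/v1/targets API."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/targets", timeout=10)
    resp.raise_for_status()
    for target in resp.json()["data"]["activeTargets"]:
        job = target["labels"].get("job", "unknown")
        print(f"{job:30} {target['scrapeUrl']:60} {target['health']}")


if __name__ == "__main__":
    list_scrape_targets()
```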

🔹 Step 4: Collect Pod Metadata Using Python

A custom Python script was developed to interface with the Kubernetes API and extract:

  • Pod status: Running, Pending, CrashLoopBackOff, etc.

  • Pod age: By calculating the time since creation timestamp

  • Restart count: To identify instability in applications

  • Other metadata: Such as pod name, namespace, node name, etc.

This script helped us build a dynamic and real-time picture of pod health.
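A simplified version of that collector, using the official kubernetes Python client, might look like the sketch below; the function and field names are illustrative rather than our exact implementation:

```python
from datetime import datetime, timezone
from kubernetes import client, config


def collect_pod_metadata():
    """Return status, age, and restart counts for every pod across all namespaces."""
    # Use load_kube_config() when running outside the cluster, load_incluster_config() inside.
    config.load_kube_config()
    v1 = client.CoreV1Api()

    records = []
    for pod in v1.list_pod_for_all_namespaces().items:
        age_seconds = (datetime.now(timezone.utc)
                       - pod.metadata.creation_timestamp).total_seconds()
        restarts = sum(cs.restart_count for cs in (pod.status.container_statuses or []))
        records.append({
            "namespace": pod.metadata.namespace,
            "pod_name": pod.metadata.name,
            "node": pod.spec.node_name,
            "status": pod.status.phase,          # Running, Pending, Failed, ...
            "age_minutes": round(age_seconds / 60, 1),
            "restart_count": restarts,
        })
    return records


if __name__ == "__main__":
    for rec in collect_pod_metadata():
        print(rec)
```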

🔹 Step 5: Push Logs to OpenObserve

Using the same Python script (or optionally Vector):

  • Logs were collected from each pod using the Kubernetes logs API

  • Each log line was enriched with custom fields (e.g., pod name, level, timestamp)

  • Logs were pushed to OpenObserve’s ingestion endpoint over HTTP/HTTPS

  • This enabled centralized log search and visualization
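A sketch of that collection-and-forwarding loop is shown below, again using the kubernetes Python client; it reuses a push_logs() helper like the one sketched in Step 2, and the field names and tail size are illustrative:

```python
from datetime import datetime, timezone
from kubernetes import client, config


def collect_and_ship_logs(push_logs, tail_lines=100):
    """Read recent log lines for every running pod, enrich them, and forward them."""
    config.load_kube_config()
    v1 = client.CoreV1Api()

    for pod in v1.list_pod_for_all_namespaces().items:
        if pod.status.phase != "Running":
            continue
        # Fetch the last N lines for this pod via the Kubernetes logs API.
        raw = v1.read_namespaced_pod_log(
            name=pod.metadata.name,
            namespace=pod.metadata.namespace,
            tail_lines=tail_lines,
        )
        records = [{
            "namespace": pod.metadata.namespace,
            "pod_name": pod.metadata.name,
            "message": line,
            "collected_at": datetime.now(timezone.utc).isoformat(),
        } for line in raw.splitlines() if line.strip()]
        if records:
            push_logs(records)  # e.g. the push_logs() helper sketched in Step 2
```

For multi-container pods, read_namespaced_pod_log also accepts a container argument; the sketch keeps to the default container for brevity.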

🔹 Step 6: Generate Custom Metrics

We also tracked metrics derived from log data:

  • Log volume per pod (count of logs over time)

  • Breakdown by log severity: info, warn, error

  • Application-level metrics like total requests, error rate, etc.

These metrics were exposed via HTTP endpoints and scraped by Prometheus.
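One way to expose such counters is a small exporter built on the prometheus_client library, as sketched below; the metric and label names are illustrative, not the exact ones we used:

```python
import time

from prometheus_client import Counter, start_http_server

# Hypothetical counters for the metrics described above.
LOG_LINES = Counter("pod_log_lines_total", "Log lines collected per pod",
                    ["namespace", "pod", "level"])
HTTP_REQUESTS = Counter("app_requests_total", "Total requests observed", ["app"])


def record_log_line(namespace, pod, level):
    """Increment the per-pod log counter, labelled by severity."""
    LOG_LINES.labels(namespace=namespace, pod=pod, level=level).inc()


if __name__ == "__main__":
    # Expose /metrics on port 8000 for Prometheus to scrape.
    start_http_server(8000)
    while True:
        # In the real collector, record_log_line() is called as log lines arrive.
        record_log_line("default", "demo-app-5c7d9f", "info")
        time.sleep(5)
```

Prometheus then scrapes the /metrics endpoint on port 8000 at its normal interval.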

🔹 Step 7: Visualize Data in OpenObserve

Once data was flowing into OpenObserve, we created dashboards to bring insights to life:

  • Time-series graphs for log counts and error spikes

  • Health dashboards for pod status, age, and restart frequency

  • Search filters to isolate logs per namespace, pod, or severity

  • Alerts and thresholds (e.g., error rate per minute)

Architecture overview: the Minikube cluster on EC2 ships pod logs and metadata (via the Python scripts or Vector) to OpenObserve running on a separate EC2 instance, while Prometheus scrapes application and custom metrics.

✅ Conclusion

This observability setup using OpenObserve, Prometheus, and Python automation proved to be:

  • Lightweight: No heavy agents or resource-intensive components

  • Flexible: Easily customizable for any Kubernetes setup

  • Cost-effective: Fully open-source and easy to scale

  • DevOps-friendly: Integrates seamlessly into modern pipelines
