Observability Practices in Action: Real-Time Monitoring with Prometheus and Grafana

Table of contents
- Introduction
- What is Observability?
- Real-World Scenario: Monitoring a Python Application
- Step 1: Instrumenting the Application with Prometheus Client
- Step 2: Setting Up Prometheus
- Step 3: Visualizing Metrics in Grafana Cloud
- Step 4: Creating Beautiful Dashboards
- Step 5: Automating Observability Checks with GitHub Actions
- Why Is This Powerful?
- Source Code
- Real Demo
- Conclusion

Introduction
In today’s fast-paced, cloud-driven world, understanding what happens inside your applications and infrastructure is more critical than ever. Observability is not just a buzzword—it's a necessity for delivering reliable, performant, and user-friendly software. But what does observability look like in practice? In this article, I’ll walk you through a hands-on, real-world example of implementing observability using two of the most popular open-source platforms: Prometheus and Grafana.
Whether you’re a backend developer, DevOps engineer, or just getting started with site reliability engineering (SRE), this article will help you understand how to instrument your code, collect metrics, and visualize them in beautiful dashboards that deliver actionable insights.
What is Observability?
Observability refers to how well you can understand the internal state of a system based on the data it produces—typically logs, metrics, and traces. A highly observable system lets you quickly detect, diagnose, and resolve issues, even those you didn’t anticipate.
Key components:
Metrics: Numeric values that represent the health or performance of your system (e.g., request count, latency, error rates).
Logs: Text records of discrete events.
Traces: Information about the flow of requests through various components.
In this article, we’ll focus on metrics using Prometheus and Grafana.
Real-World Scenario: Monitoring a Python Application
Imagine you have a Python web application serving users. You want to know:
How many requests are being processed?
How long do they take?
What’s the current temperature in your server room (or any custom metric)?
Let’s make your app observable!
Step 1: Instrumenting the Application with Prometheus Client
We'll use the prometheus_client library to expose metrics. Here's a minimal example:
from prometheus_client import start_http_server, Summary, Counter, Gauge
import random
import time

# Metrics
REQUEST_COUNT = Counter('request_count_total', 'Number of requests processed')
REQUEST_TIME = Summary('request_processing_seconds', 'Request processing time')
ROOM_TEMP = Gauge('room_temperature_celsius', 'Room temperature in Celsius')

def process_request():
    """Simulate request processing"""
    REQUEST_COUNT.inc()
    with REQUEST_TIME.time():
        ROOM_TEMP.set(20 + random.random() * 5)
        time.sleep(random.random())

if __name__ == "__main__":
    # Expose the metrics endpoint and keep generating activity
    start_http_server(8000)
    print("Prometheus metrics available on http://localhost:8000/metrics")
    while True:
        process_request()
        time.sleep(1)
This snippet exposes /metrics on port 8000, which Prometheus can scrape.
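Before wiring up Prometheus, you can sanity-check the endpoint from Python. This is just a throwaway sketch, not part of the application itself, and it assumes the app above is already running on localhost:8000:

from urllib.request import urlopen

# Fetch the exposition-format text and print only our custom series
with urlopen("http://localhost:8000/metrics") as resp:
    body = resp.read().decode("utf-8")

for line in body.splitlines():
    if line.startswith(("request_count", "request_processing", "room_temperature")):
        print(line)

You should see the request_count_total, request_processing_seconds and room_temperature_celsius series with their current values.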
Step 2: Setting Up Prometheus
You can run Prometheus locally by downloading it from prometheus.io.
Your prometheus.yml should include:
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: 'python_app'
    static_configs:
      - targets: ['localhost:8000']
Start Prometheus and visit http://localhost:9090.
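If you want to confirm the scrape is actually happening, you can query Prometheus's standard HTTP API (/api/v1/query) directly. A small sketch, assuming Prometheus is running locally on its default port 9090:

import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Ask Prometheus whether the python_app scrape target is currently up
params = urlencode({"query": 'up{job="python_app"}'})
with urlopen(f"http://localhost:9090/api/v1/query?{params}") as resp:
    data = json.load(resp)

for result in data["data"]["result"]:
    print(result["metric"].get("instance"), "up =", result["value"][1])

A value of 1 for the up series means Prometheus can reach the app's /metrics endpoint; 0 means the scrape is failing.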
Step 3: Visualizing Metrics in Grafana Cloud
Instead of running Grafana locally, let’s use Grafana Cloud for simplicity and scalability.
Set up a Prometheus data source in Grafana Cloud using the remote_write configuration. In your local prometheus.yml, add:
remote_write:
  - url: "<your-grafana-cloud-prometheus-remote-write-url>"
    basic_auth:
      username: "<your-username>"
      password: "<your-api-key>"
Now, metrics from your local Prometheus will flow to Grafana Cloud!
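A quick way to confirm the data is arriving is to open Explore in Grafana Cloud, select your Prometheus data source, and run a simple query such as up{job="python_app"} or room_temperature_celsius; if those series show up, your remote_write pipeline is healthy.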
Step 4: Creating Beautiful Dashboards
In Grafana Cloud:
Go to Dashboards > New Dashboard.
Add a panel for your metric, e.g., request_count_total.
Try visualizing the rate of requests rather than the raw counter value (see the example queries below).
You can now monitor your app's health, performance, and even custom business metrics in real time!
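A few starting queries you might adapt for panels (standard PromQL, using the metric names from the app above; treat them as a sketch rather than finished dashboards):
rate(request_count_total[1m]) gives requests per second.
rate(request_processing_seconds_sum[5m]) / rate(request_processing_seconds_count[5m]) approximates average request latency.
room_temperature_celsius shows the current gauge value.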
Step 5: Automating Observability Checks with GitHub Actions
Monitoring is only valuable if you can trust that your observability pipeline is always working. In a manual workflow, you would have to:
Install dependencies
Run code linting to check for style errors
Launch your application
Test that the /metrics endpoint is live and returns the expected metrics
Doing this manually every time you make a change is tedious and error-prone. This is where automation saves the day.
What gets automated?
With GitHub Actions, you can automate the entire validation process. Every time you push code or open a pull request, GitHub Actions will:
Check out the latest version of your code
Set up the Python environment
Install all dependencies from requirements.txt
Run linting checks with flake8
Launch your application and make a request to /metrics to ensure metrics are exposed correctly
If any step fails, you get immediate feedback, ensuring that your codebase always remains observable.
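For this example, requirements.txt can stay minimal: assuming the app only uses the Prometheus client shown above, a single line with prometheus_client is enough, since flake8 is installed inside the workflow itself.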
Example: GitHub Actions Workflow
Here’s a sample workflow you can add to .github/workflows/ci.yml
in your repository:
name: CI - Python Metrics App

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Lint code
        run: |
          pip install flake8
          flake8 app.py

      - name: Run app and test /metrics endpoint
        run: |
          nohup python app.py &
          sleep 5
          curl http://localhost:8000/metrics
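As written, the last step only checks that the endpoint responds at all. If you also want the job to fail when a specific metric is missing, one option (a small tweak, not part of the workflow above) is to change that step's run block to:

        run: |
          nohup python app.py &
          sleep 5
          curl -sf http://localhost:8000/metrics | grep request_count_total

Here curl -sf turns HTTP errors into a failing command, and grep exits non-zero if the metric name never appears, which fails the step.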
Why automate this?
No manual validation needed: Every code change is automatically validated.
Prevents regressions: If someone breaks the /metrics endpoint, the workflow fails and you know right away.
Ensures code quality: Linting is enforced as part of the pipeline.
Builds confidence: Your observability solution remains reliable as your code evolves.
Why Is This Powerful?
Proactive Monitoring: Spot slowdowns or errors before users complain.
Custom Metrics: Track what actually matters to your business.
Open-Source and Cloud-Ready: Start locally, scale globally.
Source Code
You can find the complete source code for this project on GitHub:
observability-python-prometheus-grafana (GitHub Repo)
Real Demo
Here’s a YouTube video where I walk through this entire process, from code to dashboard!
Conclusion
Observability isn’t just for big tech companies—anyone can start today with open-source tools like Prometheus and Grafana. By instrumenting your code and visualizing metrics, you gain deep insights into your system, improve reliability, and deliver better user experiences.
Ready to level up your monitoring game?
Try out this example, and let me know your thoughts or questions in the comments!