Day 39 of 90 Days of DevOps Challenge: What is Prometheus?


Yesterday, I deployed Prometheus and Grafana using Helm charts, exposed both tools via LoadBalancer services, and successfully logged into Grafana to explore pre-built Kubernetes monitoring dashboards.
After deploying Prometheus and Grafana for Kubernetes, I realized it’s crucial not just to use these tools, but also to understand how they work under the hood. Today, I’m diving deeper into Prometheus by understanding its detailed architecture, exploring how it collects data through its scraping mechanism, and examining key components such as the Time-Series Database (TSDB), service discovery, and the pull-based model. I’ll also walk through the essential configurations defined in the prometheus.yml file.
What is Prometheus?
Prometheus is a leading open-source monitoring system originally developed at SoundCloud. It excels at collecting and storing time-series data by periodically scraping metrics from configured targets.
Its robust architecture and flexible querying language (PromQL) make it ideal for cloud native and containerized environments like Kubernetes.
Prometheus Architecture Breakdown
Here's a high-level breakdown of how Prometheus works:
1. Time-Series Database (TSDB)
Prometheus stores all metrics as time-series data with a timestamp, value, and optional labels (key-value pairs).
TSDB supports fast querying and efficient storage using WAL (Write-Ahead Logs) and chunked storage.
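To make the data model concrete, here is a small sketch of my own (not Prometheus's actual TSDB code): each sample is a (timestamp, value) pair attached to a series identified by a metric name plus label key-value pairs. The function and variable names are hypothetical, purely for illustration.

```python
# Illustration of the Prometheus data model: a series is identified by
# its metric name plus sorted labels; samples are (timestamp, value) pairs.
def series_key(metric, labels):
    """Build a canonical key like http_requests_total{method="GET",status="200"}."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{metric}{{{label_str}}}"

storage = {}  # series key -> list of (timestamp, value) samples

def append_sample(metric, labels, timestamp, value):
    storage.setdefault(series_key(metric, labels), []).append((timestamp, value))

# Two scrapes of the same counter, 15 seconds apart:
append_sample("http_requests_total", {"method": "GET", "status": "200"}, 1700000000, 42.0)
append_sample("http_requests_total", {"method": "GET", "status": "200"}, 1700000015, 45.0)
```

The real TSDB adds compression, chunking, and the WAL on top of this basic shape, but the identity of a series (name + labels) and the append-only stream of samples is the core idea.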
2. Data Collection via Scraping
Prometheus pulls metrics from targets by scraping HTTP endpoints (usually /metrics). Each target must expose metrics in a Prometheus-compatible format.
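Here's a rough sketch of what that scraped payload looks like and how it can be parsed. This is my own toy parser for the simple "name value" and "name{labels} value" cases of the Prometheus text exposition format; the real scrape loop uses Prometheus's own parser, which handles far more.

```python
import re

# Matches lines like: metric_name{label="value"} 123.4  or: metric_name 123.4
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)

def parse_metrics(text):
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip # HELP / # TYPE comments
            continue
        m = SAMPLE_RE.match(line)
        if m:
            samples.append((m.group("name"), m.group("labels") or "", float(m.group("value"))))
    return samples

# Example payload as a node might expose it on /metrics:
payload = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
up 1
"""
print(parse_metrics(payload))
```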
3. Service Discovery
Prometheus can dynamically discover scrape targets in Kubernetes via built-in support for K8s APIs.
This ensures Prometheus stays up-to-date with newly created pods/services without manual changes.
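Besides the node role used later in this post, kubernetes_sd_configs supports several other discovery roles. A hedged sketch (check your Prometheus version's documentation for the full list and exact semantics):

```yaml
# Illustrative only: other discovery roles kubernetes_sd_configs supports.
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod        # discover every pod IP and exposed port
  - job_name: 'kubernetes-endpoints'
    kubernetes_sd_configs:
      - role: endpoints  # discover the endpoints backing each service
```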
4. Prometheus Server
The main components of the Prometheus server include:
Scrape Manager: Handles periodic scraping from targets.
Storage Layer: Stores data in TSDB.
Query Engine: Parses and runs PromQL queries.
Web UI/API: Exposes real-time data, alerts, and allows manual queries.
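As a back-of-the-envelope illustration of what the query engine does, here is what a PromQL function like rate() conceptually computes over a counter: the per-second increase across a time window. This is a simplified stand-in of my own, not Prometheus source code (real rate() also handles counter resets and extrapolation).

```python
# Simplified analogue of PromQL's rate() over a counter:
# per-second increase between the first and last sample in the window.
def simple_rate(samples):
    """samples: list of (timestamp_seconds, counter_value), oldest first."""
    (t0, v0), (tn, vn) = samples[0], samples[-1]
    return (vn - v0) / (tn - t0)

# A counter scraped every 15s, growing by 30 requests per interval:
window = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 190.0)]
print(simple_rate(window))  # 2.0 requests/second
```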
5. Alertmanager (External Component)
Although not part of the core Prometheus server, it integrates seamlessly.
Sends alert notifications via email, Slack, PagerDuty, etc.
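Wiring Prometheus to Alertmanager happens in prometheus.yml as well. A minimal sketch, assuming Alertmanager runs as a service named alertmanager in a monitoring namespace (adjust the target to your own deployment):

```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager.monitoring.svc:9093']
```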
Configuration Basics: prometheus.yml
Here’s a basic configuration snippet to scrape metrics from Kubernetes nodes:
```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'kubernetes-nodes'
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        target_label: __address__
        replacement: '${1}:9100'
```
scrape_interval: Frequency of data collection
job_name: Logical group name for a set of targets
kubernetes_sd_configs: Enables dynamic service discovery via the Kubernetes API
relabel_configs: Modifies target metadata before scraping
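To see what the relabel rule in the snippet above actually does, here is a hand-rolled imitation of its regex/replacement semantics: it rewrites a node's kubelet address (port 10250) to the node-exporter port 9100 before scraping. Prometheus does this internally with `${1}`-style references; Python's equivalent is `\1`.

```python
import re

def relabel_address(address, regex=r'(.*):10250', replacement=r'\1:9100'):
    """Mimic a relabel_configs rule: rewrite the address if the regex matches."""
    m = re.fullmatch(regex, address)
    return m.expand(replacement) if m else address

print(relabel_address("10.0.1.23:10250"))  # 10.0.1.23:9100
print(relabel_address("10.0.1.23:8080"))   # unchanged: 10.0.1.23:8080
```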
How Prometheus Collects Metrics in Kubernetes
Each Kubernetes component (node, pod, service) exposes a /metrics endpoint. Prometheus uses Kubernetes service discovery to fetch a list of targets.
It pulls the metrics at regular intervals and stores them in TSDB.
These metrics are then visualized via PromQL in Grafana.
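The steps above can be sketched as a single pull loop. This is a toy outline of my own to show the pull model (Prometheus fetches; targets never push), with stubbed-in fetch and store functions, not real Prometheus code:

```python
import time

def scrape_cycle(targets, fetch, store):
    """One scrape pass: pull /metrics from each discovered target and store it."""
    now = time.time()
    for target in targets:
        text = fetch(f"http://{target}/metrics")  # Prometheus pulls; targets don't push
        store(target, now, text)

# Usage with stubbed-in fetch/store to keep the example self-contained:
collected = {}
scrape_cycle(
    ["10.0.1.23:9100"],
    fetch=lambda url: "up 1",
    store=lambda target, ts, text: collected.setdefault(target, []).append(text),
)
print(collected)
```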
Final Thoughts
Understanding the internal architecture of Prometheus has helped me appreciate the why behind its pull-based model and time-series storage engine. It's fascinating to see how Prometheus orchestrates dynamic scraping, efficient data storage, and real-time querying, all while being cloud-native and Kubernetes-friendly.
With this foundation, I'm now ready to explore Prometheus Exporters tomorrow, which serve as the bridge between non-instrumented services and Prometheus.
Stay tuned for Day 40, where we’ll deep-dive into popular exporters like node_exporter and cAdvisor, and how to expose custom metrics!
Written by Vaishnavi D