Day 3: Observability Series - Mastering PromQL in Prometheus


Welcome to Day 3 of the Observability Series! In this installment, we'll focus on PromQL (Prometheus Query Language), the tool that makes Prometheus a powerful monitoring solution. If you're diving into Prometheus, PromQL is your gateway to querying, analyzing, and gaining insights into your system's metrics.
What is PromQL?
PromQL is a flexible and powerful query language designed to work with time-series data stored in Prometheus. It allows you to:
Retrieve data from specific metrics.
Perform mathematical operations for analysis.
Aggregate and manipulate data based on labels or dimensions.
Build complex queries to monitor system behavior effectively.
Structure of a PromQL Query
A PromQL query typically includes:
Metric Name: The specific measurement (e.g., http_requests_total).
Labels: Filters for narrowing down results (e.g., {method="POST", status="500"}).
Range Selectors: Time ranges for fetching historical data (e.g., [10m]).
Functions and Operators: Built-in operations to process data (e.g., the rate() function and the sum() aggregation operator).
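To make the four parts concrete, here is a minimal Python sketch that assembles a query string from them. The helper name build_query is hypothetical and only illustrates how the pieces nest; it does not escape label values or validate anything.

```python
def build_query(metric, labels=None, range_selector=None, function=None):
    """Assemble a PromQL expression from its building blocks (illustrative only)."""
    expr = metric
    if labels:
        # Label matchers go inside braces, e.g. {method="POST", status="500"}.
        # Values are inserted as-is; a real tool would need to escape quotes.
        matchers = ", ".join(f'{name}="{value}"' for name, value in labels.items())
        expr += "{" + matchers + "}"
    if range_selector:
        # The range selector follows the label matchers, e.g. [10m].
        expr += f"[{range_selector}]"
    if function:
        # The function wraps everything built so far.
        expr = f"{function}({expr})"
    return expr


query = build_query(
    "http_requests_total",
    labels={"method": "POST", "status": "500"},
    range_selector="10m",
    function="rate",
)
print(query)
```

The printed expression combines the same metric name, labels, range selector, and function used as examples in the list above.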
Basic PromQL Commands
Single Metric Query
http_requests_total
Returns the current value of every time series for the metric http_requests_total.
Label Filtering
http_requests_total{method="GET", status="200"}
Retrieves time series data for successful GET requests (status 200).
Time Range Query
http_requests_total{status="404"}[5m]
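Fetches the raw samples recorded in the last 5 minutes for every series with status 404. The result is a range vector, which is usually passed to a function such as rate() and cannot be graphed directly.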
Aggregation in PromQL
Aggregation combines multiple time series into meaningful summaries.
Summing Time Series
sum(rate(container_cpu_usage_seconds_total[5m]))
Calculates the total CPU usage across all containers, in CPU cores, averaged over the past 5 minutes.
Grouping by Labels
avg(node_memory_Active_bytes) by (instance)
Returns the average active memory usage grouped by instance.
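If the grouping step feels abstract, the following Python sketch mimics what sum(...) by (instance) and avg(...) by (instance) do at a single point in time. The sample values and the shard label are made up for illustration, and aggregate_by is a hypothetical helper, not part of Prometheus.

```python
from collections import defaultdict


def aggregate_by(samples, label, reducer):
    """Group (labels, value) samples by one label and reduce each group."""
    groups = defaultdict(list)
    for labels, value in samples:
        groups[labels[label]].append(value)
    return {key: reducer(values) for key, values in groups.items()}


def mean(values):
    return sum(values) / len(values)


# Hypothetical samples: each series has a distinct label set.
samples = [
    ({"instance": "node-a", "shard": "0"}, 4.0),
    ({"instance": "node-a", "shard": "1"}, 6.0),
    ({"instance": "node-b", "shard": "0"}, 10.0),
]

print(aggregate_by(samples, "instance", sum))   # like sum(...) by (instance)
print(aggregate_by(samples, "instance", mean))  # like avg(...) by (instance)
```

Every label other than the one named in by (...) is dropped from the result, which is why the output is keyed by instance alone.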
Maximum and Minimum
max_over_time(node_memory_MemAvailable_bytes[1h])
min_over_time(node_memory_MemAvailable_bytes[1h])
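Finds the maximum and minimum available memory of each series over the last hour. Unlike sum() or avg(), these functions aggregate over time within a single series, not across series.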
Advanced PromQL Functions
PromQL's advanced functions enable deep analysis of metrics.
Rate
rate(http_requests_total[1m])
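Computes the per-second average rate of increase of http_requests_total over the last minute. The window must contain at least two samples, so choose a range that spans several scrape intervals.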
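The idea behind rate() can be sketched in a few lines of Python. This is a simplified model, not Prometheus's actual implementation: it handles counter resets the same way in spirit, but it skips the extrapolation to the window boundaries that the real function performs, so real results will differ slightly. The function name simple_rate and the sample data are assumptions for illustration.

```python
def simple_rate(samples):
    """Per-second increase of a counter across time-sorted (timestamp, value) pairs."""
    if len(samples) < 2:
        return None  # a rate needs at least two samples in the window
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # The counter dropped, so treat it as a restart from zero.
            increase += value
        else:
            increase += value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Hypothetical scrapes every 15 seconds, with a counter reset before t=45.
window = [(0, 100), (15, 130), (30, 160), (45, 30), (60, 60)]
print(simple_rate(window))
```

Here the counter grows by 30 in each interval, including the one that contains the reset, so the total increase is 120 over 60 seconds.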
Increase
increase(kube_pod_container_status_restarts_total[1h])
Calculates the number of container restarts in the past hour. Because increase() extrapolates to the edges of the window, the result can be a non-integer.
Histogram Quantile
histogram_quantile(0.90, sum(rate(request_duration_seconds_bucket[5m])) by (le))
Estimates the 90th percentile of request durations from the histogram buckets.
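It helps to see why the result is an estimate. The Python sketch below follows the general approach of histogram_quantile() for classic histograms: find the bucket that contains the target rank, then interpolate linearly inside it. It is a simplified model that assumes positive bucket bounds and 0 < q < 1; bucket_quantile and the bucket counts are made up for illustration.

```python
import math


def bucket_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets [(upper_bound, count), ...].

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q < 1:
        raise ValueError("this sketch only handles 0 < q < 1")
    if len(buckets) < 2:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # The quantile lies in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower, below = buckets[index - 1] if index > 0 else (0.0, 0.0)
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Hypothetical cumulative counts per "le" bound, in seconds.
buckets = [(0.1, 50), (0.5, 80), (1.0, 95), (math.inf, 100)]
print(bucket_quantile(0.90, buckets))
```

With these counts the 90th percentile falls in the bucket between 0.5 and 1.0 seconds, and the answer is interpolated within that range. Wider buckets therefore mean a coarser estimate.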
Predict Linear
predict_linear(node_network_receive_bytes_total[30m], 3600)
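Forecasts the value of the received-bytes counter one hour (3600 seconds) ahead, based on a linear fit to the last 30 minutes. Note that predict_linear() is intended for gauges; a counter reset inside the window will distort the fit.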
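The forecast is a simple least-squares line extended into the future. Here is a rough Python sketch of that idea. It is not Prometheus's code: as a simplification it projects forward from the last sample's timestamp, whereas Prometheus projects from the query evaluation time. The name linear_forecast and the sample data are assumptions.

```python
def linear_forecast(samples, seconds_ahead):
    """Fit a least-squares line to (timestamp, value) pairs and extend it forward."""
    n = len(samples)
    if n < 2:
        return None  # a line needs at least two points
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = covariance / variance  # assumes timestamps are not all identical
    intercept = mean_v - slope * mean_t
    target_time = samples[-1][0] + seconds_ahead
    return intercept + slope * target_time


# Hypothetical gauge-like samples over 30 minutes, growing by 1 per second.
history = [(0, 1000), (600, 1600), (1200, 2200), (1800, 2800)]
print(linear_forecast(history, 3600))
```

Because the fit is a straight line, the forecast is only as good as the assumption that the recent trend continues unchanged.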
Additional Commands for Real-World Use Cases
Error Analysis
rate(http_requests_total{status=~"5.."}[10m])
Tracks the rate of server errors (5xx) over the last 10 minutes.
Top Resource Consumers
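topk(3, avg_over_time(container_memory_usage_bytes[5m]))
Finds the 3 containers with the highest average memory usage over the last 5 minutes. container_memory_usage_bytes is a gauge, so avg_over_time() is used here; rate() is only meaningful for counters.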
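Conceptually, topk() just keeps the k series with the largest values at the moment the query is evaluated. A small Python sketch of that selection follows; the container names and byte values are invented for illustration.

```python
import heapq


def topk(k, series):
    """Return the k (name, value) pairs with the largest values, largest first."""
    return heapq.nlargest(k, series, key=lambda item: item[1])


# Hypothetical memory usage per container, in bytes.
usage = [
    ("api", 512e6),
    ("db", 2.1e9),
    ("cache", 900e6),
    ("worker", 1.4e9),
]

for name, value in topk(3, usage):
    print(name, value)
```

In a graph, the selection is repeated at every evaluation step, so the set of series shown can change over time.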
Disk Usage Trends
delta(node_filesystem_free_bytes[1h])
Calculates the change in free disk space over the last hour; a negative value means the disk is filling up.
PromQL in Action: Monitoring and Alerting
Kubernetes Pod Metrics
sum(rate(container_cpu_usage_seconds_total{namespace="prod"}[1m])) by (pod)
Aggregates CPU usage per pod in the prod namespace.
Service Latency Analysis
avg_over_time(http_request_duration_seconds{job="web"}[10m])
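Calculates the average response time for a web service over 10 minutes. This assumes http_request_duration_seconds is exposed as a gauge; if it is a histogram or summary, divide rate(http_request_duration_seconds_sum[10m]) by rate(http_request_duration_seconds_count[10m]) instead.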
Alert for High Memory Usage
container_memory_usage_bytes > 1e+09
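Returns every container whose memory usage exceeds 1 GB (10^9 bytes). Used as the expression of an alerting rule, it triggers an alert for each such container.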
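A comparison like this acts as a filter: series that fail the condition simply disappear from the result. The Python sketch below shows that behavior on made-up data; above_threshold and the container names are hypothetical.

```python
THRESHOLD_BYTES = 1e9    # 1 GB in decimal units, matching 1e+09 in the query
ONE_GIB_BYTES = 1024**3  # 1 GiB, for comparison: 1073741824 bytes


def above_threshold(series, threshold):
    """Keep only the series whose current value is strictly above the threshold."""
    return {name: value for name, value in series.items() if value > threshold}


# Hypothetical current memory usage per container, in bytes.
usage = {"api": 7.5e8, "db": 2.1e9, "cache": 1.0e9}

print(above_threshold(usage, THRESHOLD_BYTES))
```

Note that a container sitting at exactly the threshold is not returned, and that 1e+09 bytes is slightly less than 1 GiB, so decide which unit you actually mean when picking the number.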
Tips for Writing Effective PromQL Queries
Start Simple: Begin with basic queries to understand the metrics.
Layer Functions: Combine functions like rate() and sum() for deeper insights.
Test and Iterate: Use the Prometheus UI or Grafana to validate your queries.
Optimize Filters: Leverage labels to fine-tune queries and reduce unnecessary data retrieval.
Conclusion
PromQL is a game-changer for monitoring and observability, transforming raw metrics into actionable insights. By mastering its commands and functions, you can monitor complex systems effectively, analyze trends, and set up meaningful alerts.
As part of this Observability Series, we've explored PromQL fundamentals and advanced queries. Stay tuned for Day 4, where we'll dive into setting up Grafana dashboards for Prometheus metrics!
What's your favorite PromQL query? Share it in the comments below!
Written by Navya A