Working with PromQL

Vidushi Bansal

In today’s cloud-native world, monitoring is essential to keeping our applications healthy and running smoothly. Prometheus is a powerful open-source monitoring tool, and at its heart lies PromQL, the Prometheus Query Language, which allows us to explore, query, and visualize the metrics collected by Prometheus.

In this post, we will learn how to use PromQL to query metrics and read the results for real-time monitoring.

PromQL (Prometheus Query Language) is the language used to query data stored in Prometheus. With PromQL, you can extract and manipulate time series data to understand the state of your infrastructure or application. Whether you want to get a simple count of HTTP requests or perform advanced analysis like rate calculations, PromQL enables you to pull the right metrics efficiently.

A PromQL expression evaluates metrics over time, allowing you to:

  • Fetch raw data

  • Aggregate or filter data based on conditions

  • Perform calculations (e.g., sum, average, min/max, rates)

  • Visualize trends, patterns, or sudden spikes in metrics

A PromQL expression evaluates to one of four value types:

String: a simple string value.

Scalar: a simple numeric floating-point value.

Instant Vector: a set of time series containing a single sample for each time series, all sharing the same timestamp.

Range Vector: a set of time series containing a range of data points over time for each time series.
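To picture the difference between these types, here is a small Python sketch of how they could be modelled in memory. It is only an illustration under my own assumptions: the helper names (series_id, shares_one_timestamp), the timestamps, and the values are made up, and this is not how Prometheus stores data internally.

```python
# Hypothetical in-memory model of the four PromQL value types.
# Illustration only; names and numbers are invented for this example.

def series_id(labels):
    """Turn a label dict into a hashable, order-independent identifier."""
    return tuple(sorted(labels.items()))

string_value = "up"   # String: a single string
scalar_value = 0.5    # Scalar: a single floating-point number

ok = series_id({"__name__": "http_requests_total", "status": "200"})
err = series_id({"__name__": "http_requests_total", "status": "500"})

# Instant vector: one (timestamp, value) sample per series, same timestamp.
instant_vector = {
    ok: (1700000000, 1027.0),
    err: (1700000000, 3.0),
}

# Range vector: a list of (timestamp, value) samples per series.
range_vector = {
    ok: [(1699999970, 1010.0), (1699999985, 1019.0), (1700000000, 1027.0)],
    err: [(1699999970, 3.0), (1699999985, 3.0), (1700000000, 3.0)],
}

def shares_one_timestamp(vector):
    """True when every sample in an instant vector has the same timestamp."""
    return len({ts for ts, _ in vector.values()}) == 1
```

The point to take away is the shape: an instant vector holds exactly one sample per series, while a range vector holds a window of samples per series.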

Basic PromQL Queries

1. Fetching Raw Metrics

To retrieve raw data for a specific metric, you simply need to reference the metric by its name. For example:

http_requests_total

This query returns the most recent value of every time series named http_requests_total, one per label combination. Since the metric is a counter, each value is the cumulative number of HTTP requests counted so far for that series.

2. Filtering by Labels

To filter a metric by its labels, you can use label selectors. Let’s say you want to get only the requests that were successful (i.e., those with a 200 status code):

http_requests_total{status="200"}

This query will return only the time series where the status label equals "200".
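If it helps to see label selection as plain code, the following Python sketch mimics how a selector picks series. PromQL supports four matcher operators (=, !=, =~, !~), and its regular expressions are fully anchored, which is why the sketch uses re.fullmatch. The function names and the sample series are assumptions made for this example, and Python's regex syntax is only an approximation of the one Prometheus uses.

```python
import re

def matches(labels, matcher):
    """Check one (label, operator, value) matcher against a series' labels."""
    name, op, value = matcher
    actual = labels.get(name, "")  # a missing label is treated as an empty string
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown operator: {op}")

def select(series, matchers):
    """Keep only the series that satisfy every matcher."""
    return [s for s in series if all(matches(s, m) for m in matchers)]

# Made-up series for illustration.
series = [
    {"__name__": "http_requests_total", "status": "200", "instance": "a:9100"},
    {"__name__": "http_requests_total", "status": "404", "instance": "a:9100"},
    {"__name__": "http_requests_total", "status": "500", "instance": "b:9100"},
]

# Roughly http_requests_total{status="200"}
successful = select(series, [("status", "=", "200")])

# Roughly http_requests_total{status=~"4..|5.."}
failed = select(series, [("status", "=~", "4..|5..")])
```

Every matcher has to hold for a series to be selected, so adding matchers can only narrow the result.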

3. Using Arithmetic Operators

PromQL supports arithmetic operations between time series or between a time series and a constant. The operation is applied to every sample in the result. For example, dividing by 100 rescales each value so that it is expressed in hundreds of requests:

http_requests_total / 100

4. Aggregating Data

One of the key features of PromQL is its ability to aggregate time series data. If you want to sum up all HTTP requests across all instances:

sum(http_requests_total)

This query adds up the values of all http_requests_total series, across every instance and label combination, and returns a single total.
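As a rough mental model, aggregation groups the samples and adds up each group. The Python sketch below shows a plain sum and a grouped variant that corresponds to PromQL's sum by (...) form. The function name sum_by and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

def sum_by(vector, by=()):
    """Sum sample values, grouped by the given label names.

    With no label names, everything falls into one group, like sum(metric).
    """
    totals = defaultdict(float)
    for labels, value in vector:
        key = tuple((name, labels.get(name, "")) for name in by)
        totals[key] += value
    return dict(totals)

# Made-up instant vector: (labels, value) pairs.
vector = [
    ({"instance": "a:9100", "status": "200"}, 120.0),
    ({"instance": "b:9100", "status": "200"}, 80.0),
    ({"instance": "a:9100", "status": "500"}, 5.0),
]

total = sum_by(vector)                        # like sum(http_requests_total)
per_status = sum_by(vector, by=("status",))   # like sum by (status) (http_requests_total)
```

Here total holds one entry for all three series, while per_status keeps one entry per status value.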

5. Calculating Rates

In Prometheus, the rate function is often used to calculate how fast a counter metric is increasing. For instance, to calculate the rate of HTTP requests per second over the last 5 minutes:

rate(http_requests_total[5m])

This query returns the average number of HTTP requests per second over the past 5 minutes for each series, which is useful for understanding traffic patterns.
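To make the arithmetic behind rate concrete, here is a simplified Python sketch. It adds up the increases between consecutive samples, treats a drop in value as a counter reset, and divides by the elapsed time. This is a simplification: the real rate() also extrapolates towards the edges of the window, which this sketch does not attempt. The function name simple_rate and the samples are made up for illustration.

```python
def simple_rate(samples):
    """Approximate per-second increase of a counter.

    samples is a list of (timestamp_seconds, value) pairs in time order.
    Simplified: no extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed to compute a rate
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # Counter reset: the process restarted and counted up from zero.
            increase += value
        else:
            increase += value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed

# Made-up samples, one per minute, with a counter reset before t=180.
window = [(0, 100.0), (60, 160.0), (120, 220.0), (180, 20.0), (240, 80.0), (300, 140.0)]
requests_per_second = simple_rate(window)
```

Handling resets is the reason to use rate on counters rather than subtracting the first value from the last one by hand.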

Examples

Let's check the disk information of the two monitored nodes:

node_disk_info

Filtering by labels: we will use the instance label (IP address and port) to narrow the result down to a specific instance.

node_disk_info{instance="34.203.243.197:9100"}

To match on more than one label, separate the matchers with a comma. For example,

node_disk_info{instance="34.203.243.197:9100",device="xvda"}

To get the metrics over a period of time

Fetch the node_memory_MemFree_bytes samples recorded during the last 1 minute for the nodes job by appending a range in square brackets:

node_memory_MemFree_bytes{job="nodes"}[1m]

To get the value of a metric from some time ago, for example 10 minutes ago, use the offset modifier.

node_memory_MemFree_bytes{job="nodes"} offset 10m

Operators

Greater Than/Less Than

Return the time series for all nodes where the free memory (node_memory_MemFree_bytes) is greater than 162 million bytes (about 154 MiB).

node_memory_MemFree_bytes{job="nodes"} > 162000000

Boolean Value

Check whether the free memory on each node is greater than 162 million bytes. With the bool modifier, the query returns 1 (true) or 0 (false) for every series instead of filtering the series and returning their actual values.

node_memory_MemFree_bytes{job="nodes"} > bool 162000000
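The difference between the two comparison forms can be sketched in a few lines of Python. Without bool, the comparison acts as a filter and keeps the original values. With bool, every series stays in the result and its value becomes 1 or 0. The function name greater_than and the sample values are assumptions made for this example.

```python
def greater_than(vector, threshold, bool_modifier=False):
    """Compare each sample with a threshold, with or without the bool modifier."""
    result = []
    for labels, value in vector:
        if bool_modifier:
            # Keep every series; the value becomes 1.0 (true) or 0.0 (false).
            result.append((labels, 1.0 if value > threshold else 0.0))
        elif value > threshold:
            # Filter mode: keep only matching series with their original value.
            result.append((labels, value))
    return result

# Made-up free-memory samples in bytes.
free_memory = [
    ({"instance": "node1:9100"}, 170000000.0),
    ({"instance": "node2:9100"}, 150000000.0),
]

filtered = greater_than(free_memory, 162000000)
as_bool = greater_than(free_memory, 162000000, bool_modifier=True)
```

In this sample data, filtered contains only node1 with its real value, while as_bool contains both nodes with 1.0 and 0.0.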

Logical Operators in PromQL

PromQL has three logical (set) operators, which work only between instant vectors:

  1. OR

  2. AND

  3. UNLESS

OR Operator

Return every time series where the total CPU seconds is greater than 30, plus any "user" mode series that the first condition did not already return.

node_cpu_seconds_total{job="nodes"} > 30 or node_cpu_seconds_total{job="nodes",mode="user"}

AND Operator

Return the time series for nodes where the free memory is between 162 million and 172 million bytes (i.e., between approximately 154 MiB and 164 MiB).

node_memory_MemFree_bytes{job="nodes"} > 162000000 and node_memory_MemFree_bytes{job="nodes"} < 172000000

UNLESS Operator

Return the time series where node_cpu_seconds_total is greater than 30, excluding the series where the CPU time is in the "steal" mode.

node_cpu_seconds_total{job="nodes"} > 30 unless node_cpu_seconds_total{job="nodes",mode="steal"}
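All three operators behave like set operations on label sets: two series match when their labels are identical. The Python sketch below models instant vectors as dictionaries keyed by label set to show the idea. The function names and the sample data are assumptions made for illustration, and the keys here are simplified to (instance, mode) pairs.

```python
def vec_and(left, right):
    """Keep left-hand series whose label set also appears on the right."""
    return {key: value for key, value in left.items() if key in right}

def vec_or(left, right):
    """Keep all left-hand series, plus right-hand series not already present."""
    result = dict(left)
    for key, value in right.items():
        result.setdefault(key, value)
    return result

def vec_unless(left, right):
    """Keep left-hand series whose label set does not appear on the right."""
    return {key: value for key, value in left.items() if key not in right}

# Made-up data: CPU seconds above 30, keyed by (instance, mode).
above_30 = {("node1", "user"): 45.0, ("node1", "steal"): 31.0}

# Made-up data: all series in "steal" mode.
steal = {("node1", "steal"): 31.0, ("node2", "steal"): 2.0}

both = vec_and(above_30, steal)
either = vec_or(above_30, steal)
without_steal = vec_unless(above_30, steal)
```

Note that the values always come from the left-hand side when a label set appears on both sides; the right-hand side only decides which series are kept or added.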

With its support for functions, operators, and aggregations, PromQL gives you the tools to understand what's happening in your system at any point in time, making it an essential part of any modern monitoring stack.

By learning PromQL, you can unlock the full potential of Prometheus and make your metrics truly actionable.
