Publishing Custom Metrics from Multi-Pod, Multi-Threaded Python Applications to Prometheus and Grafana with prometheus_client
I encountered an issue where incorrect metrics were being logged in Prometheus within a Kubernetes cluster. The metrics were exposed from a Python-based REST API service, built with the FastAPI framework and running in a multi-pod environment.
This tutorial shows how to handle Gauges and Histograms in such a multi-pod environment, where a single metric ends up publishing wrong values over time.
What are Custom Metrics?
Custom metrics are the metrics you expose from your own application for analytics purposes, viewed in visualization tools like Grafana.
They do not include system metrics such as API request counts or CPU usage.
They are usually metrics related to the product's business performance.
Publishing metrics
Publishing metrics is straightforward: a service in your Kubernetes application exposes a Prometheus endpoint for Prometheus to scrape.
The service I was working on was written in Python. It used prometheus_fastapi_instrumentator to expose the metrics endpoint, and the metrics themselves, Gauges and Histograms, were defined with prometheus_client.
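For context, a minimal sketch of how such a service might wire this up (the app object and setup here are illustrative, not the original service's code):

from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator

app = FastAPI()

# Instrument the app and expose a /metrics endpoint for Prometheus to scrape
Instrumentator().instrument(app).expose(app)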
When does the problem appear?
The problem appears when the service is multi-pod and multi-threaded. Prometheus scrapes every pod, and each pod may run a multi-threaded server such as Gunicorn with FastAPI.
Each copy ends up exposing the same Prometheus metric, so the value on the Prometheus server becomes roughly (actual data) × (number of pods) × (number of threads). This is hard to trace, because as a developer you only exposed the metric once in your service's code; by that arithmetic, a gauge meant to read 10 on a service with 3 pods and 4 threads per pod can show up as 120 once the series are aggregated.
You can detect this in the Prometheus UI by filtering for your metric and hovering over the values: the scrape adds a "pod" label, and each thread in a pod (if present) keeps adding its own values onto the series with that same "pod" label.
How to resolve this?
Using a custom service that is single-pod and single-threaded
A dedicated service whose only job is to expose metrics to Prometheus. It usually has no other responsibilities in the system, since it won't be configured to scale. A minimal sketch follows.
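A minimal sketch of such a dedicated exporter, assuming a hypothetical data_from_somewhere() helper and port 9100 (both assumptions, not from the original service):

import time

import prometheus_client as prom

GAUGE = prom.Gauge("MetricNameGauge", "Description", labelnames=["SomeLabel"])

def update_metrics():
    # Hypothetical source of the business data (DB, cache, etc.)
    data = data_from_somewhere()
    for item in data:
        GAUGE.labels(SomeLabel=item.get("label")).set(item.get("someCount"))

if __name__ == "__main__":
    # Expose /metrics on port 9100 for Prometheus to scrape
    prom.start_http_server(9100)
    while True:
        update_metrics()
        time.sleep(3600)  # update metrics every hour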
Altering the scraping mechanism on the existing service and the Grafana queries
The way to manage the metrics within the same service is to alter how the metric values are set, so that every scrape sees an absolute value rather than an accumulated one.
Prometheus is a time-series database that collects and timestamps metrics from services. In prometheus_client, Histograms record values with the observe() method, while Gauges are typically updated with inc(). The challenge in multi-threaded environments is that Prometheus gathers values that were set by multiple threads across pods. Calling remove() before observe() and inc() might seem helpful, but it often doesn't work because of the concurrent nature of multi-threaded pods.
set() is the way to get the values into shape without resorting to remove(), since the main goal is simply to expose a straightforward custom metric. Custom metrics usually aren't system generated; they come from a database or cache and are reshaped in code as required.
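To illustrate the difference, a minimal sketch (not the service's actual code): inc() accumulates across repeated updates, while set() always exposes the latest absolute value, so repeated update loops keep the series at the same number.

import prometheus_client as prom

DEMO = prom.Gauge("DemoGauge", "Illustration only", labelnames=["kind"])

# inc() adds to whatever is already there: after two update cycles this series reads 10
DEMO.labels(kind="inc").inc(5)
DEMO.labels(kind="inc").inc(5)

# set() overwrites: after two update cycles this series still reads 5
DEMO.labels(kind="set").set(5)
DEMO.labels(kind="set").set(5)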
Prometheus scraping changes in the service
The Histogram code used to look like this. It fails to expose the metrics properly:
import math
import time

import prometheus_client as prom

# Buckets would look like 0, 1, 2, 3-5, 6-10, 11-20, 21-50, 51-100, 100+
buckets_req = (0, 1, 2, 5, 10, 20, 50, 100, math.inf)

HISTOGRAM = prom.Histogram(
    "MetricNameHistogram",
    "Description",
    labelnames=["SomeLabel"],
    buckets=buckets_req,
)

# setting data into the metrics
def update_metrics():
    data = data_from_somewhere  # DB or Cache or Somewhere specific
    try:
        HISTOGRAM.remove("SomeLabel")
    except KeyError:
        log("Failed clearing")
    try:
        for item in data:
            count = item.get("someCount")
            HISTOGRAM.labels(SomeLabel="CustomLabel").observe(count)
    except Exception:
        log("Failed adding metrics data to Histogram")

def update_prometheus_metrics():
    while True:
        update_metrics()
        time.sleep(3600)  # update metrics every hour
I converted it to the following, which solves the issue.
The set() function only exists on Gauges, so the Histogram has to be converted into a Gauge and the buckets simulated with labels. (Note that these simulated buckets hold per-range counts, not cumulative counts like a native Prometheus histogram's le buckets.)
import math
import time

import prometheus_client as prom

# Buckets would look like 0, 1, 2, 3-5, 6-10, 11-20, 21-50, 51-100, 100+
buckets_req = (0, 1, 2, 5, 10, 20, 50, 100, math.inf)

labels_list = (
    "label1",
    "label2",
    "label3",
)

def return_bucket(val):
    # Return the smallest bucket boundary the value fits under
    for bucket in buckets_req:
        if val <= bucket:
            return bucket

# Converted from Histogram
HISTOGRAM = prom.Gauge(
    "MetricNameHistogram",
    "Description",
    labelnames=["SomeLabel", "le"],
)

# Workaround for making a Histogram out of a Gauge
# The alphabet prefixes are for Grafana to sort the buckets
# and will be removed as part of the transform in Grafana
BUCKET_SLOT = {
    0: "a 0",
    1: "b 1",
    2: "c 2",
    5: "d 3 - 5",
    10: "e 6 - 10",
    20: "f 11 - 20",
    50: "g 21 - 50",
    100: "h 51 - 100",
    math.inf: "i 100+",
}

# setting data into the metrics
def update_metrics():
    data = data_from_somewhere  # DB or Cache or Somewhere specific
    map_buckets = {}
    # Need all buckets present for every label, even if they come out 0
    for label in labels_list:
        map_buckets[label] = {}
        for bucket in buckets_req:
            map_buckets[label][bucket] = 0
    for item in data:
        count = item.get("someCount")
        label = item.get("label")
        bucket = return_bucket(count)
        map_buckets[label][bucket] += 1
    try:
        for label in labels_list:
            for bucket in map_buckets[label]:
                bucket_str = BUCKET_SLOT.get(bucket)
                HISTOGRAM.labels(
                    SomeLabel=label,
                    le=bucket_str,
                ).set(map_buckets[label][bucket])
    except Exception:
        log("Failed adding metrics data to Histogram")

def update_prometheus_metrics():
    while True:
        update_metrics()
        time.sleep(3600)  # update metrics every hour
The Gauge code looked like this. It also fails to expose the metrics properly:
import time

import prometheus_client as prom

GAUGE = prom.Gauge(
    "MetricNameGauge",
    "Description",
    labelnames=["SomeLabel"],
)

# setting data into the metrics
def update_metrics():
    data = data_from_somewhere  # DB or Cache or Somewhere specific
    try:
        GAUGE.remove("SomeLabel")
    except KeyError:
        log("Failed clearing")
    try:
        for item in data:
            count = item.get("someCount")
            label = item.get("label")
            GAUGE.labels(SomeLabel=label).inc(count)
    except Exception:
        log("Failed adding metrics data to Gauge")

def update_prometheus_metrics():
    while True:
        update_metrics()
        time.sleep(3600)  # update metrics every hour
Convert it to this, which sets absolute values instead of clearing and incrementing:
import time

import prometheus_client as prom

GAUGE = prom.Gauge(
    "MetricNameGauge",
    "Description",
    labelnames=["SomeLabel"],
)

# setting data into the metrics
def update_metrics():
    data = data_from_somewhere  # DB or Cache or Somewhere specific
    try:
        for item in data:
            count = item.get("someCount")
            label = item.get("label")
            GAUGE.labels(SomeLabel=label).set(count)
    except Exception:
        log("Failed adding metrics data to Gauge")

def update_prometheus_metrics():
    while True:
        update_metrics()
        time.sleep(3600)  # update metrics every hour
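The article doesn't show how update_prometheus_metrics() gets started; one way (a sketch, assuming the same FastAPI app object as above) is to launch it on a daemon thread when the service starts:

import threading

from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
def start_metrics_loop():
    # Run the hourly update loop in the background so it never blocks the API workers
    threading.Thread(target=update_prometheus_metrics, daemon=True).start()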
Use collect() on the metric and log the result to view the actual data taken into the metric; this is useful for debugging.
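A minimal sketch of that kind of debug logging (log() is the same placeholder used above):

for metric in GAUGE.collect():
    for sample in metric.samples:
        # Each sample carries the metric name, its label values, and the current value
        log(f"{sample.name} {sample.labels} {sample.value}")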
Grafana Changes
For Histogram
The query expression for the Histogram should be:
max by (le, SomeLabel) (MetricNameHistogram{SomeLabel="label1"})
The format should be:
"format": "time_series"
The transformation for the Histogram should be:
"transformations": [
{
"options": {
"renameByName": {
"a 0": "0",
"b 1": "1",
"c 2": "2",
"d 3 - 5": "3 - 5",
"e 6 - 10": "6 - 10",
"f: 11 - 20": "11 - 20",
"g 21 - 50": "21 - 50",
"h 51 - 100": "51 - 100",
"i 100+": "100+"
}
}
}
]
For Gauges
Only the query changes:
max by (SomeLabel) (MetricNameGauge{SomeLabel="label1"})
The max aggregation is essential for this to work: multiple pods expose the same metric, and a special label identifies the pod each sample came from. Since all of those series carry the same value, max simply picks one of the identical values.
Note: It's always better to make changes in Grafana with the Edit option in the UI and then export the dashboard YAML, which can be added to your Grafana config map.
Conclusion
In conclusion, this is a workaround for custom metrics in Prometheus and Grafana in multi-pod environments, where such metrics are usually taken from a DB or cache.
It is very much a workaround that keeps the existing setup intact. If you have other approaches, or want to discuss cloud-native applications or technology in general, please leave a comment or contact me on Twitter or LinkedIn.