Node Level, Cluster Level, and Application-Level Metrics in Prometheus

Saurabh Adhau

Introduction

Prometheus is widely used for monitoring and alerting in cloud-native environments, particularly in Kubernetes. When collecting metrics, they can generally be categorized into three levels:

  • Node-Level Metrics (System-level metrics)

  • Cluster-Level Metrics (Kubernetes-level metrics)

  • Application-Level Metrics (Application-specific metrics)

Understanding these levels helps in designing an efficient monitoring setup that provides insights into infrastructure health, cluster performance, and application behavior.

1. Node-Level Metrics

What Are Node-Level Metrics?

Node-level metrics provide insights into the health and performance of the underlying infrastructure (VMs, physical machines, or Kubernetes nodes). These metrics are essential for understanding system resource utilization and detecting performance bottlenecks.

Examples of Node-Level Metrics:

| Metric | Description | PromQL Query Example |
| --- | --- | --- |
| `node_cpu_seconds_total` | CPU time spent per mode (user/system/idle) | `rate(node_cpu_seconds_total[5m])` |
| `node_memory_MemAvailable_bytes` | Available memory in bytes | `node_memory_MemAvailable_bytes` |
| `node_disk_io_time_seconds_total` | Time spent on disk I/O | `rate(node_disk_io_time_seconds_total[5m])` |
| `node_network_receive_bytes_total` | Network bytes received | `rate(node_network_receive_bytes_total[5m])` |
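These raw counters are usually turned into rates and utilization figures in PromQL. A common, illustrative query for per-node CPU utilization derived from `node_cpu_seconds_total` (not taken from this article's tables):

```promql
# Percentage of CPU time spent in non-idle modes, averaged per node.
# Subtracting the idle rate from 1 turns the counter into a "busy" ratio.
100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))
```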

How to Collect Node-Level Metrics?

The most commonly used exporter for collecting node-level metrics is Node Exporter.

Install Node Exporter on a Linux Node:

```shell
# Release tarballs are versioned; replace 1.8.2 with the latest release
VERSION=1.8.2
wget https://github.com/prometheus/node_exporter/releases/download/v${VERSION}/node_exporter-${VERSION}.linux-amd64.tar.gz
tar -xvzf node_exporter-${VERSION}.linux-amd64.tar.gz
cd node_exporter-${VERSION}.linux-amd64
./node_exporter &
```

By default, Node Exporter exposes metrics at http://localhost:9100/metrics, which Prometheus can scrape.

Configure Prometheus to Scrape Node Exporter:

```yaml
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]
```

2. Cluster-Level Metrics

What Are Cluster-Level Metrics?

Cluster-level metrics provide insights into Kubernetes resource utilization. These metrics help in monitoring the health and performance of the Kubernetes cluster, including nodes, pods, deployments, and API server interactions.

Examples of Cluster-Level Metrics:

| Metric | Description | PromQL Query Example |
| --- | --- | --- |
| `kube_node_status_condition` | Node condition status (e.g., Ready) | `kube_node_status_condition{condition="Ready",status="true"}` |
| `kube_pod_status_phase` | Current phase of each pod (one series per phase) | `count(kube_pod_status_phase{phase="Running"})` |
| `kube_deployment_status_replicas_available` | Available replicas per deployment | `kube_deployment_status_replicas_available` |
| `kubelet_volume_stats_used_bytes` | Storage volume usage per pod | `kubelet_volume_stats_used_bytes` |
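Cluster-level metrics are a natural basis for alerting. A sketch of a Prometheus alerting rule that fires when a node has not reported Ready for five minutes (the rule name, threshold, and labels are illustrative, not from this article):

```yaml
# Illustrative alerting rule; adjust names and thresholds for your cluster
groups:
  - name: cluster-health
    rules:
      - alert: NodeNotReady        # fires when the Ready condition is false
        expr: kube_node_status_condition{condition="Ready",status="true"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.node }} is not Ready"
```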

How to Collect Cluster-Level Metrics?

For Kubernetes environments, the Kube State Metrics exporter is commonly used to gather cluster-level metrics.

Deploy Kube State Metrics in Kubernetes:

```shell
git clone https://github.com/kubernetes/kube-state-metrics.git
kubectl apply -f kube-state-metrics/examples/standard/
```

This deploys kube-state-metrics (by default into the kube-system namespace) and exposes Kubernetes resource metrics at /metrics on port 8080, which Prometheus can scrape.

Configure Prometheus to Scrape Kube State Metrics:

```yaml
scrape_configs:
  - job_name: "kube-state-metrics"
    static_configs:
      - targets: ["kube-state-metrics.kube-system.svc.cluster.local:8080"]
```

3. Application-Level Metrics

What Are Application-Level Metrics?

Application-level metrics provide insights into the performance and behavior of individual applications running inside the cluster. These metrics can include HTTP request counts, response times, error rates, database queries, and business-specific KPIs.

Examples of Application-Level Metrics:

| Metric | Description | PromQL Query Example |
| --- | --- | --- |
| `http_requests_total` | Total number of HTTP requests received | `rate(http_requests_total[5m])` |
| `http_request_duration_seconds` | Duration of HTTP requests (histogram) | `histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))` |
| `db_query_duration_seconds` | Average database query duration (assuming a histogram or summary) | `rate(db_query_duration_seconds_sum[5m]) / rate(db_query_duration_seconds_count[5m])` |
| `redis_commands_total` | Number of Redis commands executed | `rate(redis_commands_total[5m])` |
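The latency query above relies on histogram buckets. When requests carry labels, the bucket series are usually aggregated before taking the quantile; an illustrative variant (the `handler` label is an assumption about your instrumentation, not something this article defines):

```promql
# 95th-percentile request latency per handler over 5 minutes.
# sum by (..., le) preserves the bucket boundaries histogram_quantile needs.
histogram_quantile(
  0.95,
  sum by (handler, le) (rate(http_request_duration_seconds_bucket[5m]))
)
```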

How to Collect Application-Level Metrics?

Applications must expose their own metrics using instrumentation libraries or exporters.

Example: Exposing Metrics in a Python Flask App

```python
from flask import Flask
from prometheus_flask_exporter import PrometheusMetrics

app = Flask(__name__)
# Registers default request metrics and serves them at /metrics
metrics = PrometheusMetrics(app)

@app.route('/')
def home():
    return "Hello, World!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
```

This exposes metrics at http://localhost:8080/metrics.
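What the /metrics endpoint returns is plain text in the Prometheus exposition format. As a rough, dependency-free illustration of what an exporter generates for you, here is a sketch that renders a single counter (the function, metric name, labels, and value are all examples, not part of any library API):

```python
# Minimal sketch of the Prometheus text exposition format.
# In practice an instrumentation library (e.g. prometheus_flask_exporter)
# produces this output for you; this only shows the wire format.

def render_counter(name, help_text, value, labels=None):
    """Render one counter metric in Prometheus text exposition format."""
    label_str = ""
    if labels:
        # Labels are rendered as key="value" pairs inside braces
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} counter\n"
        f"{name}{label_str} {value}\n"
    )

print(render_counter("http_requests_total",
                     "Total number of HTTP requests received.",
                     42,
                     {"method": "GET", "status": "200"}))
```

Prometheus scrapes exactly this kind of text, one `# HELP`/`# TYPE` header per metric followed by one line per labeled series.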

Configure Prometheus to Scrape the Application Metrics:

```yaml
scrape_configs:
  - job_name: "flask-app"
    static_configs:
      - targets: ["localhost:8080"]
```

Summary

| Level | Description | Example Metrics | Exporters/Tools |
| --- | --- | --- | --- |
| Node Level | System-level metrics (CPU, memory, disk, network) | `node_cpu_seconds_total`, `node_memory_MemAvailable_bytes` | Node Exporter |
| Cluster Level | Kubernetes resource-level metrics | `kube_node_status_condition`, `kube_pod_status_phase` | Kube State Metrics |
| Application Level | Application-specific performance metrics | `http_requests_total`, `http_request_duration_seconds` | Client libraries (e.g., prometheus_flask_exporter), JMX Exporter, custom instrumentation |

Conclusion

Understanding Node-Level, Cluster-Level, and Application-Level Metrics is crucial for effectively monitoring a Kubernetes-based system.

  • Node-level metrics help in tracking infrastructure health.

  • Cluster-level metrics give insights into Kubernetes resources and workloads.

  • Application-level metrics monitor app performance and business KPIs.

Each level requires the right exporters and Prometheus configurations to ensure a comprehensive monitoring setup.
