Discovering and Using Existing Prometheus Setup on AKS

sharath chandra
3 min read

When adding monitoring for your applications on AKS, you often don't need to start from scratch. Most AKS clusters already have monitoring infrastructure in place. Let's walk through the steps for discovering and leveraging an existing Prometheus setup.

Discovery

First, check whether the Prometheus Operator is installed by looking for its custom resource definitions (CRDs):

kubectl get crds | grep "prometheus"

If the operator is installed, the command lists its CRDs:

prometheusagents.monitoring.coreos.com            2024-03-25T15:25:22Z
prometheuses.monitoring.coreos.com                2024-03-25T15:25:24Z
prometheusrules.monitoring.coreos.com             2024-03-25T15:25:25Z
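
The examples below assume the stack lives in the monitoring namespace. If you are not sure where it is installed, listing the Prometheus custom resources across all namespaces is a quick way to find out:

# Find which namespace the Prometheus instance(s) run in
kubectl get prometheus --all-namespaces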

Check what monitoring components are already running in your cluster:

# List all pods in the monitoring namespace
kubectl get pods -n monitoring

You might see output like this:

NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-stack-kube-prom-alertmanager-0   2/2     Running   0          2d6h
prometheus-prometheus-stack-kube-prom-prometheus-0       2/2     Running   0          28h
prometheus-stack-grafana-dsajdksja-dksla                 3/3     Running   0          2d1h
prometheus-stack-kube-prom-operator-xyxxxxxxx-pqrst      1/1     Running   0          2d22h
prometheus-stack-kube-state-metrics-xyxxxxxxx-pqrst      1/1     Running   0          3d3h
prometheus-stack-prometheus-node-exporter-xyxxx          1/1     Running   0          2d5h
prometheus-stack-prometheus-node-exporter-xyxxx          1/1     Running   0          3d15h

This output indicates the presence of:

  • A Prometheus server (prometheus-stack-kube-prom-prometheus-0)

  • Alertmanager (alertmanager-prometheus-stack-kube-prom-alertmanager-0)

  • Grafana (prometheus-stack-grafana-...)

  • Prometheus Operator (prometheus-stack-kube-prom-operator-...)

  • Kube State Metrics (prometheus-stack-kube-state-metrics-...)

  • Node Exporter (prometheus-stack-prometheus-node-exporter-...)
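
It is also worth listing the services in the namespace, since those are what you will port-forward to in the next step; exact service names depend on how the chart was installed:

# List the monitoring services (Prometheus, Grafana, Alertmanager, ...)
kubectl get svc -n monitoring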

Checking Existing Rules and Alerts

# Port-forward to Prometheus UI (temporary access)
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090

Then open http://localhost:9090/rules in your browser to view the currently loaded rules.
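
If you prefer the terminal, the same information is available from Prometheus's HTTP API while the port-forward is running (jq is optional, used here only for readability):

# List the names of all loaded rule groups
curl -s http://localhost:9090/api/v1/rules | jq '.data.groups[].name'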

Adding New Alert Rules

To add new rules, we first need to identify the selectors that the installed Prometheus uses to discover PrometheusRule resources.

kubectl get prometheus -n monitoring -o yaml

In the output, look for the rule selectors in the spec:

ruleNamespaceSelector: {}  # Which namespaces to look for rules in
ruleSelector:              # Which rules to pick up
  matchLabels:
    release: prometheus-stack

This tells you that Prometheus picks up PrometheusRule resources labelled release: prometheus-stack. The empty ruleNamespaceSelector ({}) matches every namespace, which is why the rule in the next section can live in the application's own namespace.
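
If you only want the selectors rather than the full object, a jsonpath query keeps the output short (the field paths assume the spec shown above):

# Print just the rule selector(s) from the Prometheus spec
kubectl get prometheus -n monitoring -o jsonpath='{.items[*].spec.ruleSelector}'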

Add new custom rules as a Kubernetes resource of kind PrometheusRule:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: airflow-health-alerts
  namespace: airflow  # Your application namespace
  labels:
    release: prometheus-stack  # Match the ruleSelector!
spec:
  groups:
  - name: airflow-health.rules
    rules:
    - alert: AirflowWebserverUnhealthy
      expr: kube_pod_container_status_ready{namespace="airflow", container="webserver"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Airflow webserver is not ready"
        description: "Airflow webserver pod has failed readiness probe for more than 5 minutes."

Apply the rule with kubectl apply -f prometheusrule.yaml and wait for it to be picked up.
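
A quick way to confirm the resource was actually created, before checking Prometheus itself (the name and namespace match the example manifest above):

kubectl get prometheusrule airflow-health-alerts -n airflow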

Verifying Your New Rules

  1. Wait a few moments for Prometheus to reload rules (usually < 1 minute)

  2. Visit the Prometheus rules page again

  3. Look for your rule name under the rule groups

The expression kube_pod_container_status_ready{namespace="airflow", container="webserver"} == 0 will monitor the readiness state of your Airflow webserver pods.
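
Before relying on the alert, you can check that the expression actually returns series for your pods. This reuses the port-forward from earlier and queries Prometheus's HTTP API directly:

# Run the alert expression as an instant query
curl -s http://localhost:9090/api/v1/query \
  --data-urlencode 'query=kube_pod_container_status_ready{namespace="airflow", container="webserver"}'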

Testing Your Alerts

You can test if your alerts work by:

  1. Temporarily modifying a deployment to fail its readiness checks (see the sketch after this list)

  2. Waiting for the alert duration (e.g., 5 minutes)

  3. Checking if the alert fires in Prometheus UI and AlertManager
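
For step 1, one way to force the readiness probe to fail is to point it at a path that doesn't exist. The deployment name airflow-webserver and container index 0 below are assumptions, so adjust them to your setup:

# Break the readiness probe on purpose (hypothetical deployment/container names)
kubectl patch deployment airflow-webserver -n airflow --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/readinessProbe/httpGet/path", "value": "/does-not-exist"}]'

# Once you have seen the alert fire, roll the change back
kubectl rollout undo deployment/airflow-webserver -n airflow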

Managing Prometheus Rules in GitOps

For proper operations, store your rules in your application's Git repository and manage them with your CI/CD or GitOps tool (like Argo CD):

my-application/
├── deployment.yaml
├── service.yaml
└── monitoring/
    └── prometheusrule.yaml

This way, your application and its monitoring configuration stay in sync.
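
If you manage the repository with Kustomize (a common pattern with Argo CD, though not required), a minimal kustomization.yaml wiring in the layout above could look like this:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - monitoring/prometheusrule.yaml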
