Prometheus and Grafana with Cilium

Umair Khan
  • Cilium and Hubble can both be configured to serve Prometheus metrics.

  • Cilium metrics show how the Cilium components are operating, while Hubble metrics provide information about network flows and network performance.


Metrics Types

Cilium Operator Metrics

  • Cilium operator metrics can be enabled using Helm and provide observability into the Cilium operator.

  • Operator metrics are enabled as part of the Cilium install process, and the Cilium operator pods are annotated to aid in Prometheus endpoint discovery.

  • Operator metrics include information concerning the state of the Cilium operator.

  • These metrics are prefixed with cilium_operator_.

Cilium Agent Metrics

  • The Cilium agent metrics cover most of the operational aspects of Cilium.

  • Cilium agent pods will be annotated to aid in Prometheus endpoint discovery.

  • The Cilium agent metrics can be grouped into several categories:

    • Cluster Health

    • Node Connectivity

    • Cluster Mesh

    • Datapath

    • IPsec

    • eBPF

    • Drops/Forwards (L3/L4)

    • Policy

    • Policy L7 (HTTP/Kafka)

    • Identity

    • Kubernetes

    • IPAM

Hubble Metrics

  • Hubble metrics are based on network flow information and as such are most relevant to understanding traffic flows, rather than the operational performance of Cilium itself.

  • Hubble metrics are categorized into:

    • DNS

    • Drops

    • Flow

    • HTTP

    • TCP

    • ICMP

    • Port Distribution

  • When Hubble metrics are enabled on installation, an annotated headless Kubernetes service named hubble-metrics is created to aid in Prometheus endpoint discovery inside the cluster.

  • No Hubble metrics are enabled by default. You have to explicitly configure the Hubble metrics you are interested in.

Hubble Metrics Context Options

  • Flows are so rich in context that it's impossible to map all of that information to the Prometheus metrics format in a useful way without running into high-cardinality problems. However, we can choose which information to include as labels.

  • Hubble metrics support the use of configurable source and destination labels that you can fill with a piece of flow context that best represents the source and destination.

  • We can set sourceContext and destinationContext options to one of the supported flow attributes.

  • Hubble metrics can also be configured to include additional labels populated from flow information using the labelsContext option.
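
To get a rough feel for the cardinality cost of extra labels, one option is to count how many time series each metric exposes before and after changing the context options. The sketch below does this for Prometheus text-format output read from stdin. `series_per_metric` is a hypothetical helper name, and the sample lines are made-up illustrative data, not output from a real cluster.

```shell
# Hypothetical helper: count exposed series per metric name.
# Reads Prometheus text exposition format on stdin.
series_per_metric() {
  grep -v -e '^#' -e '^$' |   # skip HELP/TYPE comments and blank lines
    sed -e 's/[{ ].*//' |     # keep only the metric name
    sort | uniq -c |          # count identical names
    awk '{print $2, $1}'      # print "name count"
}

# Made-up sample input, for illustration only.
sample='# HELP hubble_drop_total Number of drops
# TYPE hubble_drop_total counter
hubble_drop_total{protocol="TCP",reason="POLICY_DENIED"} 3
hubble_drop_total{protocol="UDP",reason="POLICY_DENIED"} 1
hubble_tcp_flags_total{family="IPv4",flag="SYN"} 12'

series=$(printf '%s\n' "$sample" | series_per_metric)
echo "$series"
```

Against a live cluster, the same function could presumably be fed from a curl against the Hubble metrics port, for example `kubectl exec tiefighter -- curl -s http://<node-ip>:9965/metrics | series_per_metric`, where `<node-ip>` is a placeholder.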


Cilium Helm Chart Options

Cilium Operator Metrics Options

  • operator.prometheus.port - Set the operator metrics tcp port (default is 9963)

  • operator.prometheus.enabled - true/false (default is false)

Cilium Agent Metrics Options

  • prometheus.proxy.port - Set the proxy metrics tcp port (default is 9964)

  • prometheus.port - Set the agent metrics tcp port (default is 9962)

  • prometheus.metrics - Space-delimited string indicating which Cilium agent metrics to enable (+) or disable (-)

    • Ex: -cilium_node_connectivity_status +cilium_bpf_map_pressure

  • prometheus.enabled - true/false (default is false)

Hubble Metrics Options

  • hubble.metrics.enableOpenMetrics - true/false (default is false)

  • hubble.metrics.port - Set the Hubble metrics tcp port (default is 9965)

  • hubble.metrics.enabled - Comma-delimited list of metrics to enable; each metric can carry its own semicolon-delimited list of options. At least one metric must be provided to enable the Hubble metrics server.
    Ex: {first-metric:metric-option1;metric-option2, second-metric, third-metric}

  • prometheus.enabled - true/false (default is false)
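
The brace-and-comma syntax of hubble.metrics.enabled is easy to mistype, so it can help to assemble the value from one argument per metric. The following is a minimal sketch. `build_hubble_metrics` is a hypothetical helper, not part of Helm or Cilium, and it only joins strings without validating metric names or options.

```shell
# Hypothetical helper: join one "metric[:opt1;opt2]" argument per metric
# into the {a,b,c} list form used for hubble.metrics.enabled.
build_hubble_metrics() {
  local IFS=,              # "$*" joins arguments with the first IFS character
  printf '{%s}\n' "$*"
}

metrics=$(build_hubble_metrics \
  "dns" \
  "drop:sourceContext=pod;destinationContext=pod" \
  "tcp")
echo "$metrics"

# The result could then be passed to Helm, e.g.:
#   helm upgrade cilium cilium/cilium --namespace kube-system --reuse-values \
#     --set hubble.metrics.enabled="$metrics"
```

Keeping one metric per argument makes it simpler to add or drop a metric, or to change its options, without rebalancing braces and separators by hand.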


Cilium Metrics Example

  • For this example we will use a Helm-based Cilium installation so that we can modify the Cilium configuration directly.

  • Install the Helm CLI and add the Cilium repo.
    helm repo add cilium https://helm.cilium.io

  • Uninstall the existing Cilium installation from the cluster.
    cilium uninstall

  • Now install Cilium using Helm.
    helm install cilium cilium/cilium --version 1.13.1 --namespace kube-system --set operator.replicas=1

  • Check Cilium status.
    cilium status

  • Delete all Network Policies.
    kubectl delete --all CiliumNetworkPolicies

  • Let's redeploy the Death Star API and its CiliumNetworkPolicy.
    kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.13/examples/minikube/http-sw-app.yaml
    kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.13/examples/minikube/sw_l3_l4_l7_policy.yaml

  • Now let's enable the Cilium operator and agent metrics using Helm.
    helm upgrade cilium cilium/cilium --version 1.13.1 --namespace=kube-system --reuse-values --set prometheus.enabled=true --set operator.prometheus.enabled=true

  • Cilium pods should now have annotations.
    kubectl get -n kube-system pod/cilium-59bf7 -o json | jq .metadata.annotations

    {  
      "prometheus.io/port": "9962",  
      "prometheus.io/scrape": "true"  
    }

  • Now look up the IP of the Cilium agent pod that runs on the same node as tiefighter, and curl it from tiefighter to get the metrics.
    kubectl get -n kube-system pod/cilium-59bf7 -o json | jq .status.podIP
    kubectl exec -ti pod/tiefighter -- curl http://10.89.0.8:9962/metrics

  • Now let's enable Hubble metrics.
    helm upgrade cilium cilium/cilium --version 1.13.1 --namespace kube-system --reuse-values --set hubble.enabled=true --set hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,httpV2}"

  • Restart the Cilium DaemonSet.
    kubectl rollout restart daemonset/cilium -n kube-system

  • This will create a new headless service named hubble-metrics.
    kubectl -n kube-system get services
    kubectl -n kube-system describe service/hubble-metrics

  • To get Hubble metrics.
    kubectl exec -it tiefighter -- curl http://172.19.0.2:9965/metrics | grep hubble_drop
    kubectl exec -it tiefighter -- curl http://172.19.0.2:9965/metrics | grep hubble_tcp

  • The current Hubble metrics are generic: there is no additional label context to distinguish one flow from another. To fix this, we can add source and destination labels.
    helm upgrade cilium cilium/cilium --version 1.13.1 --namespace kube-system --reuse-values --set hubble.enabled=true --set hubble.metrics.enabled="{dns,drop:sourceContext=pod;destinationContext=pod,tcp,flow,port-distribution,httpV2}"
    kubectl rollout restart daemonset/cilium -n kube-system

  • Now we can see the source and destination in the Hubble drop metrics.
    kubectl exec -it pod/xwing -- curl http://172.19.0.4:9965/metrics | grep hubble_drop

  • Now add Prometheus and Grafana for Hubble metrics visualization.
    kubectl apply -f https://raw.githubusercontent.com/cilium/cilium/v1.13/examples/kubernetes/addons/prometheus/monitoring-example.yaml

  • Set up a local port forward of the Grafana service.
    kubectl -n cilium-monitoring port-forward service/grafana --address 0.0.0.0 --address :: 3000:3000

  • The Grafana dashboard is now available at http://localhost:3000/
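
For a quick check from the command line without opening Grafana, the drop counters fetched with curl earlier can also be totalled per source. This is a minimal sketch that assumes the `source` label added by `sourceContext=pod`. `drops_by_source` is a hypothetical helper, and the sample lines (pod names and counts) are invented for illustration.

```shell
# Hypothetical helper: sum hubble_drop_total samples per "source" label.
# Reads Prometheus text exposition format on stdin.
drops_by_source() {
  awk '
    /^hubble_drop_total[{]/ {
      src = "unknown"
      # Extract the value of the source="..." label, if present.
      if (match($0, /source="[^"]*"/))
        src = substr($0, RSTART + 8, RLENGTH - 9)
      total[src] += $NF      # last field is the sample value
    }
    END { for (s in total) print s, total[s] }
  ' | sort
}

# Invented sample input, for illustration only.
sample='hubble_drop_total{destination="default/deathstar",protocol="TCP",reason="POLICY_DENIED",source="default/xwing"} 4
hubble_drop_total{destination="default/deathstar",protocol="TCP",reason="POLICY_DENIED",source="default/tiefighter"} 1
hubble_drop_total{destination="kube-system/coredns",protocol="UDP",reason="POLICY_DENIED",source="default/xwing"} 2'

drops=$(printf '%s\n' "$sample" | drops_by_source)
echo "$drops"
```

With a running cluster, the output of the `curl ... :9965/metrics` command could be piped into `drops_by_source` in place of the sample text.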

