Set up your own Kubernetes Monitoring Stack with Prometheus and Grafana on Minikube

Hey everyone! In this blog we're going to set up a local Kubernetes cluster with monitoring using Helm charts, Prometheus, and Grafana. Kubernetes clusters must be monitored to ensure health, performance, and reliability. Monitoring lets you track performance metrics, audit changes, and retrieve the logs that help debug application crashes; troubleshooting Kubernetes without a monitoring solution is tedious and error-prone. In this stack, Prometheus serves as the time-series metrics database and alert engine, while Grafana provides a dashboard UI to query and visualize metrics. Prometheus scrapes metrics exposed by various exporters (components that publish data in Prometheus format), and Alertmanager handles any triggered alerts.

πŸš€ Why Monitoring?

In Kubernetes environments, monitoring allows you to:

  • Track CPU, memory, and network usage of pods and nodes.

  • Identify bottlenecks and failures in real time.

  • Trigger alerts based on specific conditions.

  • Visualize metrics to make data-driven decisions.

Architecture Overview

The monitoring stack runs on a single-node Kubernetes cluster (Minikube). In Minikube, the control plane (API server, scheduler, etcd) runs on your laptop. The kubelet is a node agent that runs on each node and ensures pods are running: it takes a set of PodSpecs and ensures that the containers they describe are running and healthy. We deploy sample workloads that generate measurable resource usage: a CPU stress pod (cpu-stress), a memory-burn Job (memory-hog), and an nginx web app (nginx-demo).

A Node Exporter runs on each node (as a DaemonSet) and exposes OS-level metrics (CPU, memory, disk, network, etc.) that Prometheus scrapes. Another key exporter is kube-state-metrics, which listens to the Kubernetes API and emits metrics about the state of objects (Deployments, Pods, Nodes, etc.). Prometheus periodically scrapes these exporters and the Kubernetes API server, storing the collected metrics in its time-series database. Prometheus also evaluates alerting rules: when a metric crosses a defined threshold, Prometheus sends an alert to Alertmanager.

The Alertmanager component receives alerts from Prometheus, then groups, deduplicates, and routes them to notification channels according to its config. Meanwhile, Grafana connects to Prometheus as a data source and lets users build dashboards to visualize metrics. Grafana provides a user-friendly interface for visualizing metrics through customizable dashboards and integrates seamlessly with Prometheus. Together this gives full visibility: Prometheus and Alertmanager handle metric collection and alerting, while Grafana handles dashboards and queries.
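To make the alerting flow concrete, here is a minimal sketch of a PrometheusRule that fires when a container's working-set memory stays high. The rule name, threshold, and the release: prometheus label (which the kube-prometheus-stack chart uses by default to discover rules) are illustrative assumptions, not part of this project's repository:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: demo-memory-alert          # hypothetical name for this example
  labels:
    release: prometheus            # assumed: matches the chart's default rule selector
spec:
  groups:
    - name: demo.rules
      rules:
        - alert: HighPodMemory
          # fires when any container's working-set memory exceeds ~200 MiB for 5 minutes
          expr: container_memory_working_set_bytes{container!=""} > 200 * 1024 * 1024
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} memory is high"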

Installation

Ensure this file structure after cloning the project

(https://github.com/Prianshu-git/Prometheus-Grafana-K8-monitoring-stack)

minikube-prometheus-grafana-demo/
β”œβ”€β”€ charts/
β”œβ”€β”€ manifests/
β”‚   β”œβ”€β”€ cpu-stress.yaml
β”‚   β”œβ”€β”€ nginx.yaml
β”‚   └── memory-hog.yaml
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ start.sh
β”‚   β”œβ”€β”€ deploy-demo.sh
β”‚   β”œβ”€β”€ access.sh
β”‚   └── cleanup.sh
β”œβ”€β”€ README.md
└── .gitignore

cri-dockerd bridges Kubernetes with Docker so that the pods defined in this project can run on the Docker runtime. Install it by cloning the repository and building the binary:

git clone https://github.com/Mirantis/cri-dockerd
cd cri-dockerd
go build -o bin/cri-dockerd
sudo install -o root -g root -m 0755 bin/cri-dockerd /usr/local/bin/cri-dockerd

To start the service

sudo cp packaging/systemd/* /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable cri-docker.service
sudo systemctl start cri-docker.service
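Before starting Minikube you can sanity-check that the shim is up (the unit name comes from the packaging files copied above; the --version flag is a quick way to confirm the binary is on your PATH):

systemctl status cri-docker.service --no-pager
cri-dockerd --version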

The files inside manifests/ create Kubernetes pods with containers such as nginx, stress, and busybox, along with the Services that expose them. Each pod may run one or more containers.

cpu-stress.yaml generates consistent CPU load for demonstrating CPU graphs in Grafana.

nginx.yaml creates a basic NGINX web server for showing network I/O and pod restarts.

memory-hog.yaml creates a memory-intensive Job that allocates memory and then sleeps, simulating memory spikes for testing alert thresholds.
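For reference, here is a minimal sketch of what cpu-stress.yaml can look like. The image and arguments are illustrative assumptions (the actual manifest lives in the repository); polinux/stress is a commonly used stress-testing image:

apiVersion: v1
kind: Pod
metadata:
  name: cpu-stress
  labels:
    app: cpu-stress
spec:
  containers:
    - name: stress
      image: polinux/stress                       # assumed image; the repo's manifest may differ
      command: ["stress"]
      args: ["--cpu", "1", "--timeout", "600s"]   # burn one CPU core for ten minutes
      resources:
        limits:
          cpu: "500m"                             # cap usage so the node stays responsive
          memory: "128Mi"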

The files in scripts/ are regular shell scripts, each named after its function.

scripts/start.sh

minikube start --cpus=4 --memory=8192
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack

Minikube starts a local Kubernetes cluster with the specified resources. Helm provides a simple, declarative way to install and manage Prometheus and Grafana on Kubernetes. Helm charts are packaged applications that bundle templated Kubernetes manifests with default configuration values; the kube-prometheus-stack chart covers monitoring components, resource definitions, cluster object state as metrics, and more.
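If you want to customize the stack (retention, alert routes, dashboards), the usual Helm workflow applies: export the chart's default values, edit them, and install from your copy. This is generic Helm usage, not something specific to this repository:

helm show values prometheus-community/kube-prometheus-stack > values.yaml
# edit values.yaml as needed, then:
helm install prometheus prometheus-community/kube-prometheus-stack -f values.yaml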

scripts/deploy-demo.sh

kubectl apply -f manifests/cpu-stress.yaml
kubectl apply -f manifests/nginx.yaml
kubectl apply -f manifests/memory-hog.yaml

It applies all the Kubernetes manifest files using kubectl. When we deploy the test workloads (cpu-stress, nginx, and memory-hog) with kubectl apply, Kubernetes schedules these pods and delegates container execution to Docker via the cri-dockerd shim, keeping Docker usable as a runtime even after built-in Docker support was removed from Kubernetes. Meanwhile, the CNI plugin assigns each pod a virtual IP from an internal subnet and sets up the virtual networking that allows pod-to-pod and service-to-pod communication. This lets Prometheus, installed via Helm, discover and scrape metrics from the pods through ServiceMonitor definitions that resolve service DNS and route traffic over the CNI-managed network.
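As an illustration of that discovery mechanism, here is a minimal ServiceMonitor sketch for the nginx demo. The name, label selector, and port name are assumptions and must match whatever the Service in nginx.yaml actually defines; the release: prometheus label is what the kube-prometheus-stack chart matches on by default:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nginx-demo                 # hypothetical name for this example
  labels:
    release: prometheus            # assumed: matches the chart's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: nginx-demo              # assumed: must match the Service's labels
  endpoints:
    - port: http                   # assumed: the Service's named port
      interval: 30s

Note that a stock nginx container does not expose Prometheus metrics on its own; in practice you would pair it with an exporter sidecar (for example nginx-prometheus-exporter) to serve a /metrics endpoint.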

scripts/access.sh

kubectl port-forward svc/prometheus-grafana 3000:80
kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090:9090

The kubectl port-forward commands work by exposing a service or pod running inside your Kubernetes cluster to your local machine, making it accessible via localhost. svc/prometheus-grafana refers to the Kubernetes Service that routes to the pods running Grafana, and port 80 is the Service's port inside the cluster. 3000:80 forwards traffic from localhost:3000 → the Service's port 80 → the Grafana pod.

You can now access your Grafana dashboard at http://localhost:3000 (username: admin, password: prom-operator).
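prom-operator is the chart's default admin password. If it has been overridden via chart values, you can read the actual password from the Grafana secret; the secret name below assumes the Helm release is named prometheus, as in start.sh:

kubectl get secret prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 -d; echo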

Additionally, you can forward localhost:9090 → Service: prometheus-kube-prometheus-prometheus to reach the Prometheus UI. When only a single port number is given, kubectl maps the same port locally and remotely (9090 here):

kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090

The Prometheus UI is now available at http://localhost:9090.
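From here you can run PromQL in the Prometheus UI, or query the HTTP API directly from your terminal. As an illustration, this generic PromQL expression (an assumption for demo purposes, not from the repository) sums per-pod CPU usage rate over the last five minutes:

curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod)'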

scripts/cleanup.sh

Tears everything down

helm uninstall prometheus
kubectl delete -f manifests/
minikube delete

Handy Debugs

| Command Category | Command | Description |
| --- | --- | --- |
| Pod CPU & Memory Usage | `kubectl top pod -A` | Shows real-time CPU/memory of all pods (needs metrics-server) |
| Node CPU & Memory Usage | `kubectl top node` | Shows CPU/memory usage per node |
| Live Watch on Resource Usage | `watch -n 2 kubectl top pod -A` | Live-updating resource metrics |
| Pod Restarts / Failures | `kubectl get pods -A --sort-by='.status.containerStatuses[0].restartCount'` | Lists pods with highest restart counts |
| Detailed Pod Debug | `kubectl describe pod <pod> -n <ns>` | Shows events, container status, reasons for failures |
| View Pod Logs | `kubectl logs <pod>` or `kubectl logs -f <pod>` | Get logs (static or streaming) from a container |
| Multi-container Pods | `kubectl logs <pod> -c <container>` | Get logs from a specific container |
| Shell Into Pod | `kubectl exec -it <pod> -- /bin/sh` | Debug from inside the container |
| All Events (Chronological) | `kubectl get events --sort-by='.metadata.creationTimestamp'` | Inspect warnings, OOMs, scheduling failures |
| DNS Resolution Check | `kubectl exec <pod> -- nslookup kubernetes.default` | Tests that internal DNS is working |
| Test Network Reachability | `kubectl exec <pod> -- curl <service>.<ns>.svc.cluster.local:<port>` | Verifies service connectivity from inside the cluster |
| List Network Policies | `kubectl get networkpolicy -A` | Shows active network restrictions |
| All Services (Debug Routing) | `kubectl get svc -A`, `kubectl describe svc <svc>` | Validate routing, ports, selectors |
| All Ingress Routes | `kubectl get ingress -A` | Check external HTTP(S) entry points |
| Pod IPs & Node Mapping | `kubectl get pods -o wide` | Shows IP, node, container info |
| Port Forward Service | `kubectl port-forward svc/<svc> <local>:<svc-port>` | Access a service locally via localhost:<port> |
| Port Forward Pod | `kubectl port-forward pod/<pod> <local>:<container-port>` | Same as above, but direct to a pod |
| Node Metrics via Node Exporter | `kubectl port-forward svc/prometheus-node-exporter 9100:9100 -n monitoring` | Access raw node metrics at localhost:9100/metrics |
| View Prometheus Targets | `kubectl port-forward svc/prometheus-kube-prometheus-prometheus 9090` | Use the Prometheus UI to debug scraping issues |
| Grafana Dashboard Access | `kubectl port-forward svc/prometheus-grafana 3000:80` | Access monitoring dashboards on localhost:3000 |
| SSH into Minikube VM | `minikube ssh` | Inspect system-level stats like disk and processes |
| Disk Usage (inside node) | `df -h` | View partition usage |
| Memory Usage (inside node) | `free -m` or `top` | See available RAM and active processes |
| Network Interfaces (inside node) | `ip a` or `ifconfig` | View pod bridges and node IPs |
| Cluster Component Status | `kubectl get componentstatuses` | Status of etcd, scheduler, controller-manager |
| List All Namespaces | `kubectl get ns` | Validate that workloads exist in the correct namespaces |
| All Deployments & ReplicaSets | `kubectl get deploy,rs -A` | Monitor rollout status, scaling, replicas |
| All Nodes & Conditions | `kubectl get nodes -o wide`, `kubectl describe node <node>` | View allocatable resources, taints, disk pressure, etc. |

This setup is great for learning, testing, and development. For production, you’d deploy on multi-node clusters with persistent volumes, external dashboards, and secure routing.
