My Pod Got OOMKilled

Introduction: What Triggered This Post?
I was working with Kubernetes and encountered a real-world issue:
🚨 My pod kept crashing with an OOMKilled error even though the node had free memory.
So, I decided to debug the problem step-by-step — and what I found helped me better understand Requests, Limits, and Metrics Server in Kubernetes.
🔍 What Are Requests and Limits in K8s?
Request = Minimum resources a pod is guaranteed.
Limit = Maximum resources a pod can use.
If a pod uses more than its memory limit, it's terminated with OOMKilled. This protects the node from crashing by isolating the faulty pod.
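As a quick illustration, here is a minimal pod spec sketch with both values set (the pod name and image are placeholders, not from my actual setup):
apiVersion: v1
kind: Pod
metadata:
  name: demo-app        # placeholder name
spec:
  containers:
  - name: app
    image: nginx        # placeholder image
    resources:
      requests:
        memory: "64Mi"  # scheduler guarantees at least this
        cpu: "250m"
      limits:
        memory: "128Mi" # kubelet terminates the container beyond this
        cpu: "500m"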
❗ Real-World Scenarios I Faced
✅ Pod couldn't be scheduled due to insufficient resources
I had 2 nodes completely full. When I deployed a new pod, Kubernetes couldn’t schedule it.
It showed: 0/2 nodes are available: Insufficient cpu.
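If you hit the same thing, the scheduler's reason shows up in the pod's events (pod name below is a placeholder):
kubectl describe pod <pod-name>
# look for a FailedScheduling warning in the Events section, e.g.:
#   Warning  FailedScheduling  ...  0/2 nodes are available: Insufficient cpu.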
✅ Pod crashed when memory exceeded
A pod used more memory than its limit. Result:
OOMKilled – Container was terminated because it tried to use more than allowed.
The node stayed healthy, only the pod failed.
✅ Request vs Limit Difference
Request: Tells scheduler “I need at least this much”.
Limit: Tells kubelet “I can’t cross this much”.
➡️ Going beyond limit = crash
➡️ Going beyond request but within limit = allowed
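The scheduler side of this is visible per node (generic kubectl, nothing specific to my cluster):
kubectl describe node <node-name>
# the "Allocated resources" section shows how much of the node's
# allocatable CPU/memory is already claimed by pod requests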
✅ Why only the pod is killed, not the node
Kubernetes enforces this so one misbehaving pod doesn't bring down the entire node.
Smart design 💡
📊 Metrics Server & Monitoring
✅ What is Metrics Server?
A lightweight component that gives live resource usage of pods/nodes. Used with kubectl top and autoscaling.
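For example, the Horizontal Pod Autoscaler relies on these metrics; a quick way to try it (the deployment name and thresholds here are just examples):
kubectl autoscale deployment my-app --cpu-percent=70 --min=1 --max=5
kubectl get hpa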
✅ How I Installed It
kubectl apply -f metrics-server.yaml
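If you don't have the manifest saved locally, the metrics-server project also publishes a release manifest that can be applied directly (this is the upstream default, not necessarily what I used):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml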
✅ Check if it’s running
kubectl get po -n kube-system
✅ Check Node Metrics
kubectl top node
🧪 Memory Stress Testing – Live Experiment
I wanted to test what happens if I push a pod beyond its memory limit.
✅ Step 1: Create a namespace
kubectl create ns mem-example
✅ Step 2: Deploy a pod with memory request/limit
In my YAML:
resources:
  requests:
    memory: "50Mi"
  limits:
    memory: "100Mi"
Then I ran a container that tried to consume 250Mi.
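For reference, a pod that does this can be built around a stress tool; this sketch follows the pattern from the Kubernetes docs (the polinux/stress image and the exact args are assumptions, not necessarily what I used):
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"
    command: ["stress"]
    # try to allocate 250Mi, well past the 100Mi limit
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]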
✅ Step 3: Watch behavior
kubectl get po -n mem-example
kubectl top pod memory-demo -n mem-example
✅ Result: OOMKilled
The pod was terminated. Reason: OOMKilled.
Node stayed healthy. That’s exactly how Kubernetes should behave.
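To confirm the reason, the container's last state can be checked in the pod description (output trimmed to the relevant part):
kubectl describe pod memory-demo -n mem-example
# Last State:  Terminated
#   Reason:    OOMKilled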
📌 Lesson Learned
Don’t leave limits undefined — risky!
Use metrics-server for live debugging.
Always monitor behavior after deploying resource-intensive apps.
💡 My Thought
This was a small failure, but a huge learning.
I’ll keep facing and documenting these bugs as I work toward becoming a better cloud-native engineer ☁️👨‍💻