🛠️ 15 Common Kubernetes Errors and How to Fix Them


Kubernetes is powerful… but also tricky.
If you’ve worked with it in production, you’ve probably seen pods restarting endlessly, images failing to pull, or volumes stuck in Pending.
In this guide, I’ll share 15 of the most common Kubernetes errors, why they happen, and how to fix them. I’ve added YAML snippets, commands, and diagrams so you can debug faster.
🔄 Pod Lifecycle (Quick Reference)
flowchart LR
A[Pending] --> B[ContainerCreating]
B --> C[Running]
C -->|Crash| D[CrashLoopBackOff]
C -->|Exit 0| E[Succeeded]
C -->|Exit 1| F[Failed]
1. ❌ CrashLoopBackOff
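Meaning: The container starts, crashes, and is restarted in a loop, with a growing back-off delay between attempts.
Fix Steps:
kubectl logs <pod-name>
# logs from the instance that crashed, not the fresh restart:
kubectl logs <pod-name> --previous
kubectl describe pod <pod-name>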
Example YAML (fix missing env vars & probes):
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: app
      image: myapp:v1
      env:
        - name: DB_HOST
          value: "db-service"
      livenessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
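To script the two checks above, here is a minimal sketch of a shell helper. The function name crashloop_report is my own (hypothetical), and it assumes kubectl is already pointed at the right cluster. It prints why the previous container instance terminated and then the tail of that instance's logs, which is usually where the real error is.

```shell
# Sketch only: crashloop_report is a hypothetical helper, not a kubectl feature.
# $1 = pod name, $2 = namespace (defaults to "default")
crashloop_report() {
  [ -n "${1:-}" ] || { echo "usage: crashloop_report <pod> [namespace]" >&2; return 2; }
  # Reason the last container instance stopped (for example Error or OOMKilled).
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
  echo
  # Logs from the previous (crashed) instance rather than the fresh restart.
  kubectl logs "$1" -n "${2:-default}" --previous --tail=50
}
```

Usage: crashloop_report myapp, or crashloop_report myapp staging for a pod in another namespace.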
2. ❌ ImagePullBackOff
Meaning: Kubernetes cannot pull the container image.
Fix:
Check image name & tag
Add secret for private registry
spec:
  containers:
    - name: app
      image: myregistry.com/team/app:v2
  imagePullSecrets:
    - name: regcred
kubectl create secret docker-registry regcred \
  --docker-server=myregistry.com \
  --docker-username=user \
  --docker-password=pass
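Before recreating the secret, it can help to confirm what the cluster actually has. The sketch below is a hypothetical helper (check_pull_secret is my name for it) that assumes the secret and the pod live in the same namespace. It prints the secret's type, which should be kubernetes.io/dockerconfigjson for a secret made with kubectl create secret docker-registry, and then the pod's recent events so the exact pull error is visible.

```shell
# Sketch only: check_pull_secret is a hypothetical helper name.
# $1 = secret name, $2 = pod name, $3 = namespace (defaults to "default")
check_pull_secret() {
  [ -n "${2:-}" ] || { echo "usage: check_pull_secret <secret> <pod> [namespace]" >&2; return 2; }
  # A registry secret should report kubernetes.io/dockerconfigjson here.
  kubectl get secret "$1" -n "${3:-default}" -o jsonpath='{.type}'
  echo
  # Events for the pod, oldest first; pull failures quote the registry's error.
  kubectl get events -n "${3:-default}" \
    --field-selector "involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}
```

Usage: check_pull_secret regcred myapp.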
3. ❌ ImagePullPolicy = Never
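Meaning: With imagePullPolicy: Never the kubelet never pulls the image, so the container fails with ErrImageNeverPull unless the image already exists on the node.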
Fix:
imagePullPolicy: IfNotPresent
4. ❌ Node NotReady
Meaning: Worker node is unhealthy (often a stopped kubelet, resource pressure, or a network problem).
Fix:
kubectl describe node <node-name>
# on the affected node:
sudo systemctl restart kubelet
5. ❌ Pod Stuck in Pending
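Meaning: The scheduler can’t find a node that fits the pod.
Fix:
Check resource requests, node taints, and pod tolerations. The Events section of describe shows why scheduling failed: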
kubectl describe pod <pod>
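If describe is too noisy, the same information can be pulled out field by field. This is a rough sketch under my own naming (pending_report is hypothetical). It reads the pod's PodScheduled condition, which carries the scheduler's explanation, then the pod's resource requests, then the taint keys on each node so you can compare them with the pod's tolerations.

```shell
# Sketch only: pending_report is a hypothetical helper name.
# $1 = pod name, $2 = namespace (defaults to "default")
pending_report() {
  [ -n "${1:-}" ] || { echo "usage: pending_report <pod> [namespace]" >&2; return 2; }
  # The PodScheduled condition message is the scheduler's own explanation.
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")].message}'
  echo
  # What the pod asks for.
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.containers[*].resources.requests}'
  echo
  # Taint keys per node, to compare against the pod's tolerations.
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,TAINTS:.spec.taints[*].key'
}
```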
6. ❌ OOMKilled
Meaning: Container killed for using more memory than its limit (or more than the node had available).
Fix (set requests and limits that match real usage):
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"
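Before raising limits, it is worth confirming that the last restart really was an OOM kill. A minimal sketch, assuming a helper I have called oom_check (hypothetical): it reads the last termination reason from the pod status and, only if that reason is OOMKilled, prints the memory limits currently configured.

```shell
# Sketch only: oom_check is a hypothetical helper name.
# $1 = pod name, $2 = namespace (defaults to "default")
oom_check() {
  [ -n "${1:-}" ] || { echo "usage: oom_check <pod> [namespace]" >&2; return 2; }
  # Note: last_reason is a plain (global) shell variable in this sketch.
  last_reason=$(kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}')
  case "$last_reason" in
    *OOMKilled*)
      echo "last termination: OOMKilled; configured memory limits:"
      kubectl get pod "$1" -n "${2:-default}" \
        -o jsonpath='{.spec.containers[*].resources.limits.memory}'
      echo
      ;;
    *)
      echo "no OOMKilled termination recorded"
      ;;
  esac
}
```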
7. ❌ Pod Stuck in Terminating
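Meaning: Pod won’t delete, usually because a finalizer is pending or its node is unreachable.
Fix (last resort: this removes the pod from the API without waiting for the container to stop):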
kubectl delete pod <pod> --grace-period=0 --force
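Since a force delete skips graceful shutdown, I would look at what is holding the pod first. The sketch below uses a hypothetical helper name (terminating_report). It prints any finalizers on the pod, then when deletion was requested and which node the pod is on, so you can tell a stuck finalizer from an unreachable node.

```shell
# Sketch only: terminating_report is a hypothetical helper name.
# $1 = pod name, $2 = namespace (defaults to "default")
terminating_report() {
  [ -n "${1:-}" ] || { echo "usage: terminating_report <pod> [namespace]" >&2; return 2; }
  # Finalizers block deletion until their controller removes them.
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.metadata.finalizers}'
  echo
  # When deletion was requested, and the node the pod is bound to.
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.metadata.deletionTimestamp}{" on node "}{.spec.nodeName}'
  echo
}
```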
8. ❌ Service Not Accessible
Meaning: Service is misconfigured or has no endpoints (its selector matches no ready pods).
Fix:
kubectl describe service <svc>
kubectl get endpoints <svc>
Diagram:
flowchart LR
Client --> Service --> Endpoints --> Pod
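To walk the chain in the diagram from left to right, a short sketch like this can help. The helper name service_report is hypothetical. It prints the Service's selector, the endpoints it resolved to, and the labels on the pods in the namespace, so a selector that matches nothing is easy to spot.

```shell
# Sketch only: service_report is a hypothetical helper name.
# $1 = service name, $2 = namespace (defaults to "default")
service_report() {
  [ -n "${1:-}" ] || { echo "usage: service_report <service> [namespace]" >&2; return 2; }
  # The label selector the Service uses to pick pods.
  kubectl get service "$1" -n "${2:-default}" -o jsonpath='{.spec.selector}'
  echo
  # The pod IPs the Service currently routes to; empty means no match.
  kubectl get endpoints "$1" -n "${2:-default}"
  # Pod labels, to compare by eye against the selector above.
  kubectl get pods -n "${2:-default}" --show-labels
}
```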
9. ❌ DNS Resolution Failure
Meaning: CoreDNS not resolving service names.
Fix:
kubectl logs -n kube-system <coredns-pod>
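Here is a rough sketch that finds the CoreDNS pods for you and then tests resolution from inside the cluster. It rests on a few assumptions of mine: the helper name dns_report is hypothetical, the CoreDNS pods carry the label k8s-app=kube-dns (true on kubeadm and many managed clusters, but check yours), and the busybox:1.36 image is pullable from your nodes.

```shell
# Sketch only: dns_report is a hypothetical helper name.
# $1 = name to resolve (defaults to kubernetes.default)
dns_report() {
  # Assumes CoreDNS pods are labelled k8s-app=kube-dns.
  kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
  kubectl logs -n kube-system -l k8s-app=kube-dns --tail=20
  # Throwaway pod that runs one lookup and is removed afterwards.
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.36 \
    -- nslookup "${1:-kubernetes.default}"
}
```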
10. ❌ RBAC Forbidden
Meaning: User doesn’t have permission.
Fix (RoleBinding; the pod-reader Role must already exist in the dev namespace):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-access
  namespace: dev
subjects:
  - kind: User
    name: dev-user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
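After applying a binding, kubectl auth can-i lets you verify it without logging in as the other user. The sketch below is a hypothetical helper (rbac_check). It assumes the pod-reader Role is meant to grant get, list, and watch on pods, which the manifest here does not show, and that your own account is allowed to impersonate users with --as.

```shell
# Sketch only: rbac_check is a hypothetical helper name.
# $1 = user to impersonate, $2 = namespace
rbac_check() {
  [ -n "${2:-}" ] || { echo "usage: rbac_check <user> <namespace>" >&2; return 2; }
  # Assumed verbs for a "pod-reader" style Role; adjust to match yours.
  for verb in get list watch; do
    printf '%s pods: ' "$verb"
    # can-i exits non-zero on "no"; keep going so every verb is reported.
    kubectl auth can-i "$verb" pods --as "$1" -n "$2" || true
  done
}
```

Usage: rbac_check dev-user dev prints one yes or no per verb.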
11. ❌ Unauthorized (Kubeconfig Issue)
Meaning: Wrong or expired kubeconfig credentials.
Fix: Update ~/.kube/config or re-download it from the cluster.
12. ❌ PVC Pending
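Meaning: PVC not binding because no PersistentVolume or StorageClass satisfies it.
Fix (provide a PersistentVolume whose storageClassName, capacity, and access modes match the claim):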
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/pv"
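To see why a claim is not matching any volume, it helps to put the claim next to what the cluster offers. A minimal sketch, using a hypothetical helper name (pvc_report): it describes the claim (the Events at the bottom usually state the reason), lists every PersistentVolume with its class, size, access modes, and phase, and lists the available StorageClasses.

```shell
# Sketch only: pvc_report is a hypothetical helper name.
# $1 = PVC name, $2 = namespace (defaults to "default")
pvc_report() {
  [ -n "${1:-}" ] || { echo "usage: pvc_report <pvc> [namespace]" >&2; return 2; }
  # Requested class, size, and modes, plus binding events.
  kubectl describe pvc "$1" -n "${2:-default}"
  # Every PV with the fields that must line up with the claim.
  kubectl get pv \
    -o custom-columns='NAME:.metadata.name,CLASS:.spec.storageClassName,SIZE:.spec.capacity.storage,MODES:.spec.accessModes[*],PHASE:.status.phase'
  # StorageClasses available for dynamic provisioning.
  kubectl get storageclass
}
```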
13. ❌ ImagePullSecret Missing
Meaning: The pod references an image pull secret that doesn’t exist in its namespace.
Fix: Ensure the name under imagePullSecrets matches a secret in the same namespace as the pod.
14. ❌ Read-Only Filesystem
Meaning: Volume mounted as read-only.
Fix:
volumeMounts:
  - name: data
    mountPath: /app/data
    readOnly: false
15. ❌ Kube-API Server Down
Meaning: Control plane unresponsive.
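Fix (on a control-plane node where the API server runs as a systemd service):
sudo systemctl restart kube-apiserver
On kubeadm clusters the API server is a static pod defined in /etc/kubernetes/manifests/kube-apiserver.yaml, so check that manifest and restart the kubelet instead.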
⚡ Key Takeaways
Always start with:
kubectl describe pod <pod>
kubectl logs <pod>
Define resource limits, probes, and secrets properly.
Use monitoring (Prometheus, Grafana, CloudWatch) to catch issues early.
💡 Have you faced any of these errors in production? Share your experience in the comments — let’s debug together!
Written by Rohit Jangra