🧠 Supercharge Kubernetes Troubleshooting with K8sGPT + AI: A Step-by-Step Guide 🚀

Kubernetes is a powerful container orchestration tool — but troubleshooting can sometimes feel like finding a needle in a YAML stack. Enter K8sGPT, an open-source AI-powered diagnostic tool that simplifies the debugging of your Kubernetes clusters with natural language explanations powered by OpenAI or local LLMs like Ollama.
In this guide, I’ll walk you through setting up K8sGPT end-to-end, integrating it with OpenAI or Llama3 via Ollama, and applying it to real production use cases.
Let’s dive in!
🤖 What is K8sGPT?
K8sGPT is a tool that scans your Kubernetes cluster, analyzes the state of your resources (like pods, nodes, deployments), and provides AI-generated explanations and suggestions for resolving issues.
It supports:
✅ Kubernetes diagnostics
🔍 Natural language output via OpenAI, Azure OpenAI, Ollama (local LLMs)
📦 Helm installation
🧪 Works with K3s, EKS, GKE, AKS, etc.
🚧 Prerequisites
A running Kubernetes cluster (kubectl configured)
Helm installed
Optional: OpenAI API key OR Ollama installed for local inference
🛠️ Step 1: Install K8sGPT CLI
Install the K8sGPT CLI on your local machine:
curl -s https://raw.githubusercontent.com/k8sgpt-ai/k8sgpt/main/install.sh | bash
Verify it:
k8sgpt version
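If you prefer a package manager, the project also publishes a Homebrew formula; a minimal sketch, assuming Homebrew is already installed (check the K8sGPT docs for other platforms and package managers):
brew tap k8sgpt-ai/k8sgpt
brew install k8sgpt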
☁️ Step 2: Install K8sGPT in Your Cluster Using Helm
helm repo add k8sgpt https://charts.k8sgpt.ai
helm repo update
helm install k8sgpt k8sgpt/k8sgpt \
  --namespace k8sgpt --create-namespace \
  --set config.backend=openai \
  --set config.apiKey="sk-xxxxxxxxxxxxxxxxxxxxxxx" \
  --set config.model=gpt-4
Replace apiKey with your actual OpenAI key.
You can also use ollama as the backend if you want offline LLM capability (more on this later).
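One quick way to confirm the release came up cleanly, using the k8sgpt namespace created above:
kubectl get pods -n k8sgpt
helm status k8sgpt -n k8sgpt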
🔍 Step 3: Scan Your Cluster for Issues
k8sgpt analyze
This command scans your entire cluster and outputs human-friendly diagnostics like:
❌ Deployment nginx-deployment is not progressing.
Reason: ImagePullBackOff
Explanation: The container image could not be pulled. Possible reasons include incorrect image name, lack of access, or network issues.
Suggestion: Verify the image name and ensure it's accessible.
Pretty neat, right?
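In day-to-day use you’ll often narrow the scan and ask for AI-backed explanations explicitly. The --explain, --filter, and --namespace flags are part of the k8sgpt CLI; the namespace below is just an example:
k8sgpt analyze --explain --filter Pod --namespace default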
🧠 Step 4: Run K8sGPT with Ollama + Llama3 (Local AI)
If you don’t want to rely on OpenAI, you can run K8sGPT entirely offline using Ollama, a local LLM runner.
Install Ollama:
curl -fsSL https://ollama.com/install.sh | sh
Pull and Run Llama3:
ollama pull llama3
ollama run llama3
Now Ollama is listening on http://localhost:11434.
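A quick sanity check that the Ollama API is reachable (this endpoint lists the models you’ve pulled locally):
curl http://localhost:11434/api/tags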
Configure K8sGPT with Ollama:
k8sgpt auth add \
  --backend ollama \
  --model llama3 \
  --baseurl http://localhost:11434
Then run:
k8sgpt analyze --explain --backend ollama
K8sGPT will use Llama3 running locally to generate the explanations. No internet required.
🧪 Bonus: Run K8sGPT as a CronJob for Continuous Health Checks
Here’s a simple CronJob YAML to run k8sgpt analyze every 10 minutes and log to a file or push to Slack/Discord:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: k8sgpt-analyze
  namespace: k8sgpt
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: k8sgpt
              image: ghcr.io/k8sgpt-ai/k8sgpt:latest
              command: ["k8sgpt", "analyze"]
              env:
                - name: K8SGPT_APIKEY
                  valueFrom:
                    secretKeyRef:
                      name: openai-secret
                      key: apiKey
          restartPolicy: OnFailure
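The CronJob references an openai-secret Secret for the API key; here’s a minimal sketch for creating it (the secret name and key simply match the secretKeyRef fields above):
kubectl create secret generic openai-secret \
  --namespace k8sgpt \
  --from-literal=apiKey="sk-xxxxxxxxxxxxxxxxxxxxxxx"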
📊 Visualize Results in Slack or Discord
Integrate AI alerts by piping output into webhooks:
k8sgpt analyze --silent | curl -H "Content-Type: application/json" \
-d '{"content":"'"$(cat -)"'"}' \
https://discord.com/api/webhooks/your-webhook-url
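The inline $(cat -) substitution breaks as soon as the output contains newlines or quotes; a slightly more robust sketch (assuming jq is installed, same placeholder webhook URL) lets jq do the JSON escaping:
k8sgpt analyze \
  | jq -Rs '{content: .}' \
  | curl -H "Content-Type: application/json" -d @- \
    https://discord.com/api/webhooks/your-webhook-url
Keep in mind Discord caps message content at 2,000 characters, so you may want to trim long reports first.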
🛡️ Real-World Use Cases
Detecting ImagePullBackOff, CrashLoopBackOff, and OOMKilled errors
Finding misconfigured services, unready probes, or bad PVC mounts
Quickly debugging Helm charts gone wrong
Alerting on degraded apps during CI/CD rollouts
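For the last use case (alerting during CI/CD rollouts), machine-readable output is handy: the --output json flag is part of the k8sgpt CLI, and a pipeline step can parse or gate on the result (the jq pretty-print here is just illustrative):
k8sgpt analyze --filter Deployment --output json | jq .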
📌 Summary
| Feature | Traditional K8s | With K8sGPT + AI |
| --- | --- | --- |
| Troubleshooting | Manual kubectl + logs | AI-generated root causes |
| Multi-resource diagnosis | Requires expert effort | 1 command |
| Offline support | ❌ | ✅ via Ollama + Llama3 |
| Integration | Partial (Prometheus) | Alerting, Cron, Dashboards |
🚀 Wrapping Up
K8sGPT is a powerful tool that brings the magic of AI directly into your Kubernetes cluster. Whether you’re debugging broken pods or hunting for cryptic errors, it translates raw K8s chaos into natural language recommendations.
✅ Works with OpenAI or Ollama
✅ Deployable via Helm
✅ Super easy to integrate with CI/CD + Observability
Follow me for more on Kubernetes + AI integrations, DevOps automation, and real-world infra setups!
If this helped, give it a 👏 or drop a comment below. Happy debugging!
#DevOps #AWS #Docker #Kubernetes #Security #AI #PromptOps