🧠 Supercharge Kubernetes Troubleshooting with K8sGPT + AI: A Step-by-Step Guide 🚀

Ankit Asthana

Kubernetes is a powerful container orchestration tool, but troubleshooting can feel like finding a needle in a YAML stack. Enter K8sGPT, an open-source, AI-powered diagnostic tool that simplifies debugging Kubernetes clusters with natural-language explanations generated by OpenAI or by local LLMs served through Ollama.

In this guide, I’ll walk you through setting up K8sGPT end to end, integrating it with OpenAI or with Llama3 via Ollama, and applying it to real production use cases.

Let’s dive in!

🤖 What is K8sGPT?

K8sGPT is a tool that scans your Kubernetes cluster, analyzes the state of your resources (like pods, nodes, deployments), and provides AI-generated explanations and suggestions for resolving issues.

It supports:

  • ✅ Kubernetes diagnostics

  • 🔍 Natural language output via OpenAI, Azure OpenAI, Ollama (local LLMs)

  • 📦 Helm installation

  • 🧪 Works with K3s, EKS, GKE, AKS, etc.

🚧 Prerequisites

  • A running Kubernetes cluster (kubectl configured)

  • Helm installed

  • Optional: OpenAI API key OR Ollama installed for local inference
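Before going further, it can help to confirm the prerequisites are actually on your PATH. Here is a minimal sketch, assuming a POSIX shell; check_tools is a hypothetical helper name, not part of K8sGPT:

```shell
#!/bin/sh
# check_tools: report which of the given commands are missing from PATH.
# Returns 0 when every command is found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok:      $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_tools kubectl helm
```

Add ollama to the argument list if you plan to use local inference.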

🛠️ Step 1: Install K8sGPT CLI

Install the K8sGPT CLI on your local machine:

curl -s https://raw.githubusercontent.com/k8sgpt-ai/k8sgpt/main/install.sh | bash

Verify it:

k8sgpt version

☁️ Step 2: Install K8sGPT in Your Cluster Using Helm

helm repo add k8sgpt https://charts.k8sgpt.ai
helm repo update
helm install k8sgpt k8sgpt/k8sgpt \
  --namespace k8sgpt --create-namespace \
  --set config.backend=openai \
  --set config.apiKey="sk-xxxxxxxxxxxxxxxxxxxxxxx" \
  --set config.model=gpt-4

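Replace the apiKey value with your actual OpenAI key. Chart names and value keys can change between releases, so confirm them with helm search repo k8sgpt and helm show values before installing.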

You can also use ollama as backend if you want offline LLM capability (more on this later).
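Passing the key with --set leaves it in your shell history. One alternative is to write the values to a private temporary file and hand that to Helm with -f. This is a sketch only: write_values_file is a hypothetical helper, and the config.* keys simply mirror the --set flags used in this guide, so check them against the chart you install.

```shell
#!/bin/sh
# write_values_file: write Helm values to a temp file and print its path.
# Assumption: the config.* keys mirror the --set flags shown in this guide.
write_values_file() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  # mktemp creates the file with owner-only permissions.
  values_file=$(mktemp) || return 1
  cat >"$values_file" <<EOF
config:
  backend: openai
  apiKey: "${OPENAI_API_KEY}"
  model: gpt-4
EOF
  echo "$values_file"
}

# Example (not run here):
#   values=$(write_values_file) &&
#     helm install k8sgpt k8sgpt/k8sgpt \
#       --namespace k8sgpt --create-namespace -f "$values"
#   rm -f "$values"
```

Deleting the file after the install keeps the key from lingering on disk.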

🔍 Step 3: Scan Your Cluster for Issues

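k8sgpt analyze --explain

This command scans your entire cluster. The --explain flag sends each finding to the AI backend the CLI has registered (via k8sgpt auth add); without it, you get only the raw analyzer findings. The result is human-friendly diagnostics like: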

❌ Deployment nginx-deployment is not progressing.
Reason: ImagePullBackOff
Explanation: The container image could not be pulled. Possible reasons include incorrect image name, lack of access, or network issues.
Suggestion: Verify the image name and ensure it's accessible.

Pretty neat, right?
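A cluster-wide scan can be noisy on a busy cluster. The sketch below narrows the scan to one namespace and one resource kind using the CLI's --namespace and --filter flags. analyze_scope and K8SGPT_BIN are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# K8SGPT_BIN lets you point at a different binary (or a stub when testing).
K8SGPT_BIN="${K8SGPT_BIN:-k8sgpt}"

# analyze_scope NAMESPACE KIND
# Runs an explained analysis limited to one namespace and one analyzer.
analyze_scope() {
  ns=$1
  kind=$2
  "$K8SGPT_BIN" analyze --explain --namespace="$ns" --filter="$kind"
}

# Example: analyze_scope default Pod
```

Running k8sgpt filters list shows which analyzer names your installed version accepts.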

🧠 Step 4: Run K8sGPT with Ollama + Llama3 (Local AI)

If you don’t want to rely on OpenAI, you can run K8sGPT entirely offline using Ollama, a local LLM runner.

Install Ollama:

curl -fsSL https://ollama.com/install.sh | sh

Pull and Run Llama3:

ollama pull llama3
ollama run llama3

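The Ollama server (started by the installer as a background service, or manually with ollama serve) listens on http://localhost:11434. The ollama run command opens an interactive prompt you can use to confirm the model responds; exit it with /bye.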
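Before pointing K8sGPT at Ollama, it is worth confirming the server answers. Here is a minimal sketch, assuming curl is installed and using Ollama's /api/tags endpoint, which lists locally pulled models. ollama_ready, wait_for_ollama, and OLLAMA_URL are illustrative names.

```shell
#!/bin/sh
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# ollama_ready: succeeds if the Ollama HTTP API responds within 3 seconds.
ollama_ready() {
  curl --silent --fail --max-time 3 "$OLLAMA_URL/api/tags" >/dev/null
}

# wait_for_ollama [ATTEMPTS]: poll once per second, default 10 attempts.
wait_for_ollama() {
  attempts=${1:-10}
  while [ "$attempts" -gt 0 ]; do
    ollama_ready && return 0
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}

# Example: wait_for_ollama 30 || echo "Ollama is not reachable" >&2
```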

Configure K8sGPT with Ollama:

k8sgpt auth add \
  --backend ollama \
  --model llama3 \
  --baseurl http://localhost:11434

Then run:

k8sgpt analyze --explain --backend ollama

It will use Llama3 running locally to generate insights. Once the model has been pulled, no internet connection is required.

🧪 Bonus: Run K8sGPT as a CronJob for Continuous Health Checks

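Here’s a simple CronJob YAML to run k8sgpt analyze every 10 minutes. The output lands in the Job's pod logs, from where you can ship it to a file or push it to Slack/Discord. The pod also needs a ServiceAccount with read access to the resources being analyzed, and the openai-secret Secret must exist in the same namespace: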

apiVersion: batch/v1
kind: CronJob
metadata:
  name: k8sgpt-analyze
  namespace: k8sgpt
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: k8sgpt
              image: ghcr.io/k8sgpt-ai/k8sgpt:latest
              command: ["k8sgpt", "analyze"]
              env:
                - name: K8SGPT_APIKEY
                  valueFrom:
                    secretKeyRef:
                      name: openai-secret
                      key: apiKey
          restartPolicy: OnFailure
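The CronJob reads its key from a Secret named openai-secret, which has to be created beforehand. Here is a sketch using kubectl create secret generic, assuming the key is exported as OPENAI_API_KEY. create_openai_secret and KUBECTL_BIN are hypothetical names.

```shell
#!/bin/sh
# KUBECTL_BIN lets you substitute a different binary (or a stub when testing).
KUBECTL_BIN="${KUBECTL_BIN:-kubectl}"

# create_openai_secret: create the Secret the CronJob's secretKeyRef expects
# (name openai-secret, key apiKey, namespace k8sgpt).
create_openai_secret() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  "$KUBECTL_BIN" create secret generic openai-secret \
    --namespace k8sgpt \
    --from-literal=apiKey="$OPENAI_API_KEY"
}

# Example: OPENAI_API_KEY=... create_openai_secret
```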

📊 Send Results to Slack or Discord

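Integrate AI alerts by piping output into webhooks. The output contains quotes and newlines, so build the JSON payload with jq rather than by string concatenation. The content key is what Discord expects; Slack incoming webhooks use text instead.

k8sgpt analyze --explain | jq -Rs '{content: .}' | curl -H "Content-Type: application/json" \
  --data-binary @- \
  https://discord.com/api/webhooks/your-webhook-url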
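Discord rejects webhook messages whose content exceeds 2000 characters, and a full cluster report can easily be longer. The sketch below trims the text before it is sent. clip_for_discord is a hypothetical helper, and because head -c counts bytes rather than characters, the default leaves some margin below the limit.

```shell
#!/bin/sh
# clip_for_discord [LIMIT]: pass stdin through, keeping at most LIMIT bytes.
# Default of 1900 stays under Discord's 2000-character content limit.
clip_for_discord() {
  limit=${1:-1900}
  head -c "$limit"
}

# Example: some_report_command | clip_for_discord | <build payload and send>
```

For long reports, splitting the output into several messages loses less information than trimming it.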

🛡️ Real-World Use Cases

  • Detecting ImagePullBackOff, CrashLoopBackOff, and OOMKilled errors

  • Finding misconfigured services, unready probes, or bad PVC mounts

  • Quickly debugging Helm charts gone wrong

  • Alerting on degraded apps during CI/CD rollouts

📌 Summary

| Feature | Traditional K8s | With K8sGPT + AI |
|---|---|---|
| Troubleshooting | Manual kubectl + logs | AI-generated root causes |
| Multi-resource diagnosis | Requires expert effort | 1 command |
| Offline support | ❌ | ✅ via Ollama + Llama3 |
| Integration | Partial (Prometheus) | Alerting, Cron, Dashboards |

🚀 Wrapping Up

K8sGPT is a powerful tool that brings the magic of AI directly into your Kubernetes cluster. Whether you’re debugging broken pods or hunting for cryptic errors, it translates raw K8s chaos into natural language recommendations.

✅ Works with OpenAI or Ollama
✅ Deployable via Helm
✅ Super easy to integrate with CI/CD + Observability

Follow me for more on Kubernetes + AI integrations, DevOps automation, and real-world infra setups!

If this helped, give it a 👏 or drop a comment below. Happy debugging!

#DevOps #AWS #Docker #Kubernetes #Security #AI #PromptOps
