
k8sgpt: Kubernetes AI-powered troubleshooting and diagnostics

AI-powered Kubernetes cluster scanner that diagnoses issues and provides actionable insights by combining SRE expertise with AI enrichment.

Step 1

What is k8sgpt?

k8sgpt scans Kubernetes clusters, diagnoses issues, and triages problems in plain English. SRE experience is codified into its analyzers, and an optional AI backend pulls out the most relevant findings and enriches them with explanations.

The tool supports out-of-the-box integration with multiple AI providers including OpenAI, Azure, Cohere, Amazon Bedrock, Google Gemini, Anthropic (Claude), IBM WatsonX, HuggingFace, and local models via Ollama.

k8sgpt provides:
- Built-in analyzers for common Kubernetes issues
- AI-powered explanations and recommendations
- Model Context Protocol (MCP) server for AI assistants
- Custom analyzer support
- Remote caching capabilities
- Configuration management

Step 2

Technology stack

k8sgpt is built using Go (Golang) with a comprehensive set of libraries for Kubernetes integration, AI/ML backends, and cloud provider APIs:

Core Language & Build:
- Go 1.24.1 (with toolchain go1.24.11)
- Go modules for dependency management
- CGO disabled for static builds

Kubernetes Client Libraries:
- k8s.io/api v0.32.3
- k8s.io/apimachinery v0.32.3
- k8s.io/client-go v0.32.3
- k8s.io/kubectl v0.32.2
- sigs.k8s.io/controller-runtime v0.19.3
- sigs.k8s.io/gateway-api v1.2.1

CLI Framework:
- github.com/spf13/cobra v1.8.1 (CLI framework)
- github.com/spf13/viper v1.19.0 (configuration management)
- github.com/pterm/pterm v0.12.80 (terminal UI)

AI/ML Integrations:
- github.com/sashabaranov/go-openai v1.36.0 (OpenAI)
- github.com/anthropics/anthropic-sdk-go v1.44.0 (Claude)
- github.com/cohere-ai/cohere-go/v2 v2.12.2 (Cohere)
- github.com/ollama/ollama v0.17.1 (Local Ollama)
- github.com/hupe1980/go-huggingface v0.0.15 (HuggingFace)
- github.com/google/generative-ai-go v0.19.0 (Google Gemini)
- github.com/IBM/watsonx-go v1.0.1 (IBM WatsonX)

Cloud Provider SDKs:
- AWS: github.com/aws/aws-sdk-go-v2 (Bedrock, various services)
- Azure: github.com/Azure/azure-sdk-for-go (Auth, Storage)
- Google Cloud: cloud.google.com/go (Storage, Vertex AI)
- Oracle: github.com/oracle/oci-go-sdk/v65

Helm & Package Management:
- helm.sh/helm/v3 v3.17.4
- github.com/mittwald/go-helm-client v0.12.14

Monitoring & Integration:
- github.com/prometheus/prometheus v0.306.0
- github.com/kedacore/keda/v2 v2.16.0
- github.com/kyverno/policy-reporter-kyverno-plugin v1.6.4

Model Context Protocol (MCP):
- github.com/mark3labs/mcp-go v0.36.0
- gRPC and Protocol Buffers for the k8sgpt serve API (MCP itself exchanges JSON-RPC 2.0 messages)

Utilities & Testing:
- github.com/stretchr/testify
- github.com/fatih/color
- github.com/olekukonko/tablewriter
- golang.org/x/term
```
Core Stack:
├── Go 1.24.1
├── Cobra CLI Framework
├── Viper Configuration
└── Pterm Terminal UI

Kubernetes:
├── k8s.io/api v0.32.3
├── k8s.io/client-go v0.32.3
├── controller-runtime v0.19.3
└── gateway-api v1.2.1

AI Backends:
├── OpenAI (go-openai)
├── Anthropic Claude
├── Cohere
├── Google Gemini
├── AWS Bedrock
├── IBM WatsonX
├── HuggingFace
└── Ollama (local)

Integrations:
├── Prometheus
├── KEDA
├── Kyverno
├── Helm
└── MCP Server
```

Step 3

Installation methods

k8sgpt can be installed via multiple methods:

Homebrew (macOS/Linux):

brew install k8sgpt

Debian package (Linux x86_64, installed with dpkg):

curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.4.33/k8sgpt_amd64.deb
sudo dpkg -i k8sgpt_amd64.deb

Docker:

docker run -it -v ~/.kube/config:/root/.kube/config ghcr.io/k8sgpt-ai/k8sgpt:latest analyze

Helm (Kubernetes Deployment):

helm repo add k8sgpt https://k8sgpt-ai.github.io/k8sgpt
helm install k8sgpt k8sgpt/k8sgpt --namespace k8sgpt --create-namespace

Build from Source:

git clone https://github.com/k8sgpt-ai/k8sgpt.git
cd k8sgpt
make build
./bin/k8sgpt --help

Step 4

AI provider configuration

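k8sgpt supports multiple AI providers for enhanced analysis and explanations. Register a backend once with k8sgpt auth add, then select it at analysis time with --backend (short form -b). Backend identifiers do not always match the provider's product name; k8sgpt auth list prints the ones your build supports.

OpenAI:

export OPENAI_API_KEY="sk-..."
k8sgpt auth add --backend openai --password "$OPENAI_API_KEY"
k8sgpt analyze --backend openai --explain

AWS Bedrock:

export AWS_REGION="us-east-1"
k8sgpt auth add --backend amazonbedrock --model <model-id> --providerRegion "$AWS_REGION"
k8sgpt analyze --backend amazonbedrock --explain

Google Gemini:

export GOOGLE_API_KEY="your-google-api-key"
k8sgpt auth add --backend google --model <model-name> --password "$GOOGLE_API_KEY"
k8sgpt analyze --backend google --explain

Local Ollama:

k8sgpt auth add --backend ollama --model <model-name> --baseurl "http://localhost:11434"
k8sgpt analyze --backend ollama --explain

Anthropic Claude:

export ANTHROPIC_API_KEY="sk-ant-..."
k8sgpt auth add --backend anthropic --model <model-name> --password "$ANTHROPIC_API_KEY"
k8sgpt analyze --backend anthropic --explain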
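
Because a missing credential usually surfaces only after the analysis has started, a small preflight check can be useful in scripts. The following is a minimal sketch, not part of k8sgpt: the requiredEnv table and the missingEnv helper are hypothetical, and the map keys are just the provider headings used above. The variable names are the ones exported in this step.

```go
package main

import (
	"fmt"
	"os"
)

// requiredEnv maps a provider heading from this guide to the environment
// variables exported for it above. The keys are labels for this sketch,
// not k8sgpt backend identifiers.
var requiredEnv = map[string][]string{
	"OpenAI":           {"OPENAI_API_KEY"},
	"AWS Bedrock":      {"AWS_REGION"},
	"Google Gemini":    {"GOOGLE_API_KEY"},
	"Anthropic Claude": {"ANTHROPIC_API_KEY"},
}

// missingEnv returns the variables that are unset or empty for provider.
// lookup has the same shape as os.LookupEnv so a fake can be supplied.
func missingEnv(provider string, lookup func(string) (string, bool)) ([]string, error) {
	vars, ok := requiredEnv[provider]
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", provider)
	}
	var missing []string
	for _, name := range vars {
		if v, set := lookup(name); !set || v == "" {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

func main() {
	for _, p := range []string{"OpenAI", "AWS Bedrock", "Google Gemini", "Anthropic Claude"} {
		missing, err := missingEnv(p, os.LookupEnv)
		if err != nil {
			fmt.Println(err)
			continue
		}
		if len(missing) == 0 {
			fmt.Printf("%s: ready\n", p)
		} else {
			fmt.Printf("%s: missing %v\n", p, missing)
		}
	}
}
```

Passing the lookup function as a parameter keeps the check independent of the real process environment, which makes it easy to exercise with fake values.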

Step 5

Kubernetes cluster analysis

The primary function of k8sgpt is to analyze your Kubernetes cluster for issues:

Basic Analysis:

k8sgpt analyze

Analysis with AI Explanations:

k8sgpt analyze --explain

Analyze specific namespace:

k8sgpt analyze --namespace production

Use specific analyzers only:

k8sgpt analyze --filter Deployment,Pod,Node

Output to JSON file:

k8sgpt analyze --output json > results.json
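
Once the results are in a JSON file, they can be post-processed by other tooling. The sketch below tallies findings per resource kind. The field names (status, problems, results, kind, name, error, Text) are an assumption about the report layout, and the embedded sample is hand-written illustrative data, not captured k8sgpt output; compare both with your own results.json before relying on them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Report mirrors the fields this sketch assumes in results.json.
type Report struct {
	Status   string   `json:"status"`
	Problems int      `json:"problems"`
	Results  []Result `json:"results"`
}

// Result is one analyzed object together with its findings.
type Result struct {
	Kind  string    `json:"kind"`
	Name  string    `json:"name"`
	Error []Failure `json:"error"`
}

// Failure holds the text of a single finding.
type Failure struct {
	Text string `json:"Text"`
}

// countByKind decodes a report and counts findings per resource kind.
func countByKind(data []byte) (map[string]int, error) {
	var r Report
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, fmt.Errorf("decode report: %w", err)
	}
	counts := make(map[string]int)
	for _, res := range r.Results {
		counts[res.Kind] += len(res.Error)
	}
	return counts, nil
}

// sample is illustrative data written for this sketch.
const sample = `{
  "status": "ProblemDetected",
  "problems": 2,
  "results": [
    {"kind": "Pod", "name": "production/api-7c9d", "error": [{"Text": "Back-off pulling image"}]},
    {"kind": "Service", "name": "production/api", "error": [{"Text": "Service has no endpoints"}]}
  ]
}`

func main() {
	counts, err := countByKind([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for kind, n := range counts {
		fmt.Printf("%s: %d finding(s)\n", kind, n)
	}
}
```

In a real pipeline the sample constant would be replaced by the bytes read from results.json, for example with os.ReadFile.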

Step 6

Built-in analyzers

k8sgpt includes 30+ built-in analyzers:

Workload Analyzers:
- Pod: Detects pods in error, pending, or crashlooping states
- Deployment: Checks for deployment failures
- StatefulSet: Monitors StatefulSet health
- DaemonSet: Validates DaemonSet node scheduling
- Job: Detects failed or incomplete jobs

Infrastructure:
- Node: Checks node health
- PVC: Detects pending volume claims
- Service: Validates service endpoints

Network:
- Ingress: Validates ingress configuration
- NetworkPolicy: Checks network policies

Operators (OLM):
- ClusterServiceVersion
- Subscription
- CatalogSource

Monitoring:
- Prometheus
- KEDA
- Kyverno
```
Available Analyzers:

Workloads:
├── Pod
├── Deployment
├── StatefulSet
├── DaemonSet
├── Job
└── CronJob

Infrastructure:
├── Node
├── PVC
├── Service
└── ConfigMap

Network:
├── Ingress
├── NetworkPolicy
└── Gateway

Monitoring:
├── Prometheus
├── KEDA
└── Kyverno
```
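
To make the idea of an analyzer concrete, here is a minimal sketch of a rule-based Pod check plus filter-style selection by kind. It is a hypothetical illustration, not k8sgpt's actual code: the PodStatus, Finding, and Analyzer types and the selectAnalyzers helper are invented for this example, and a real analyzer would read pod state from the Kubernetes API through client-go.

```go
package main

import "fmt"

// PodStatus is a simplified, hypothetical stand-in for the pod state an
// analyzer would read from the Kubernetes API.
type PodStatus struct {
	Namespace string
	Name      string
	Phase     string // e.g. "Running", "Pending", "Failed"
	Waiting   string // container waiting reason, e.g. "CrashLoopBackOff"
}

// Finding is one issue reported by an analyzer.
type Finding struct {
	Kind    string
	Name    string
	Message string
}

// Analyzer is a hypothetical interface for this sketch.
type Analyzer interface {
	Kind() string
	Analyze(pods []PodStatus) []Finding
}

type podAnalyzer struct{}

func (podAnalyzer) Kind() string { return "Pod" }

// Analyze flags crash-looping, pending, and failed pods.
func (podAnalyzer) Analyze(pods []PodStatus) []Finding {
	var out []Finding
	for _, p := range pods {
		name := p.Namespace + "/" + p.Name
		switch {
		case p.Waiting == "CrashLoopBackOff":
			out = append(out, Finding{Kind: "Pod", Name: name, Message: "container is crash-looping"})
		case p.Phase == "Pending":
			out = append(out, Finding{Kind: "Pod", Name: name, Message: "pod is pending"})
		case p.Phase == "Failed":
			out = append(out, Finding{Kind: "Pod", Name: name, Message: "pod has failed"})
		}
	}
	return out
}

// selectAnalyzers keeps analyzers whose kind appears in filters.
// An empty filter list keeps every analyzer.
func selectAnalyzers(all []Analyzer, filters []string) []Analyzer {
	if len(filters) == 0 {
		return all
	}
	want := make(map[string]bool, len(filters))
	for _, f := range filters {
		want[f] = true
	}
	var out []Analyzer
	for _, a := range all {
		if want[a.Kind()] {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	pods := []PodStatus{
		{Namespace: "production", Name: "api", Phase: "Running", Waiting: "CrashLoopBackOff"},
		{Namespace: "production", Name: "db", Phase: "Pending"},
		{Namespace: "production", Name: "web", Phase: "Running"},
	}
	for _, a := range selectAnalyzers([]Analyzer{podAnalyzer{}}, []string{"Pod"}) {
		for _, f := range a.Analyze(pods) {
			fmt.Printf("%s %s: %s\n", f.Kind, f.Name, f.Message)
		}
	}
}
```

The deterministic rules produce the findings; in k8sgpt's design the AI backend is then asked only to explain them, which is why analysis still works without --explain.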

Step 7

Model Context Protocol (MCP) server

k8sgpt provides an MCP server for AI assistants like Claude Desktop:

Start MCP server (stdio mode):

k8sgpt serve --mcp

Start MCP server (HTTP mode):

k8sgpt serve --mcp --mcp-http --mcp-port 8089

Configure in Claude Desktop:

{
  "mcpServers": {
    "k8sgpt": {
      "command": "k8sgpt",
      "args": ["serve", "--mcp"]
    }
  }
}

Test the HTTP endpoint with curl:

curl -X POST http://localhost:8089/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
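
The curl payload is a JSON-RPC 2.0 request, and the same message can be built and its reply decoded from Go. The sketch below is a minimal illustration that does not contact a server: the newToolsListRequest and toolNames helpers are hypothetical, the response shape follows the MCP tools/list result (a tools array with name and description), and the tool name in the sample reply is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request without parameters.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
}

// rpcResponse covers the parts of a tools/list reply used here.
type rpcResponse struct {
	Result struct {
		Tools []struct {
			Name        string `json:"name"`
			Description string `json:"description"`
		} `json:"tools"`
	} `json:"result"`
	Error *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// newToolsListRequest encodes the same body the curl example sends.
func newToolsListRequest(id int) ([]byte, error) {
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"})
}

// toolNames extracts tool names from a reply, or returns the RPC error.
func toolNames(data []byte) ([]string, error) {
	var resp rpcResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	if resp.Error != nil {
		return nil, fmt.Errorf("rpc error %d: %s", resp.Error.Code, resp.Error.Message)
	}
	names := make([]string, 0, len(resp.Result.Tools))
	for _, t := range resp.Result.Tools {
		names = append(names, t.Name)
	}
	return names, nil
}

func main() {
	req, err := newToolsListRequest(1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(req))

	// Illustrative reply; the tool name is invented for this sketch.
	reply := `{"jsonrpc":"2.0","id":1,"result":{"tools":[{"name":"example-tool","description":"placeholder"}]}}`
	names, err := toolNames([]byte(reply))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```

To talk to a running server, the request bytes could be sent with http.Post to the /mcp URL from the curl example, using the application/json content type.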

Step 8

Resources

Official website: https://k8sgpt.ai
GitHub repository: https://github.com/k8sgpt-ai/k8sgpt
Documentation: https://docs.k8sgpt.ai
Container image (GitHub Container Registry): https://ghcr.io/k8sgpt-ai/k8sgpt
Helm Chart: https://k8sgpt-ai.github.io/k8sgpt
License: Apache License 2.0
CNCF Sandbox Project: Part of the CNCF ecosystem
