
Kubernetes: Production-Grade Container Orchestration Platform Setup

Complete setup guide for Kubernetes - an open-source container orchestration platform for automating deployment, scaling, and management of containerized applications. Originally designed by Google, now maintained by the Cloud Native Computing Foundation (CNCF). Includes cluster setup, core concepts, deployment patterns, and production best practices.

Step 1

Understanding Kubernetes Architecture

Kubernetes is a portable, extensible platform for automating deployment, scaling, and operation of containerized applications. It consists of a control plane (master components) and worker nodes where your applications run. Key concepts include Pods (smallest deployable units), Services (network abstraction), Deployments (declarative updates), and Namespaces (virtual clusters).

Control Plane Components:
- kube-apiserver: API server and frontend to cluster data
- etcd: Consistent and highly-available key-value store
- kube-scheduler: Assigns pods to nodes
- kube-controller-manager: Runs controller processes
- cloud-controller-manager: Cloud provider integration

Worker Node Components:
- kubelet: Node agent managing pods
- kube-proxy: Network proxy maintaining rules
- containerd: Container runtime
- CNI plugins: Network interface configuration

Step 2

System Prerequisites

Kubernetes requires a cluster of machines (physical or virtual) running Linux (Ubuntu, Debian, CentOS); Windows Server is supported for worker nodes only. Minimum per node: 2 GB RAM, 2 CPU cores, 20 GB disk space. For production, use dedicated nodes sized for your workloads. You need root/sudo access, an installed container runtime, swap disabled (the kubelet's default configuration refuses to start with swap on), and network connectivity between nodes.

# Check system requirements
uname -a  # Linux kernel 3.10+ recommended
grep MemTotal /proc/meminfo  # 2GB+ RAM
grep processor /proc/cpuinfo | wc -l  # 2+ cores
df -h /  # 20GB+ available

# Verify container runtime availability
docker --version  # Docker 20.10+ (Kubernetes 1.24+ also needs the cri-dockerd adapter)
# OR
containerd --version  # containerd 1.6+

# Check network requirements
ping -c 3 8.8.8.8  # Internet connectivity
systemctl status firewalld  # Firewall management
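
The individual checks above can be wrapped in one script. This is a minimal sketch, assuming a Linux host: the function names check_min and preflight are hypothetical, and the 1700 MB memory floor is an assumption that leaves headroom for the memory a nominal 2 GB machine reserves for firmware and the kernel.

```shell
# Hypothetical helper: compare a measured value against a minimum.
check_min() {
  local label="$1" actual="$2" required="$3"
  if [ "$actual" -ge "$required" ]; then
    echo "OK   $label: $actual (need >= $required)"
  else
    echo "FAIL $label: $actual (need >= $required)"
    return 1
  fi
}

# Hypothetical wrapper: measure this machine and check the minimums listed above.
preflight() {
  local mem_mb cpus disk_gb failed=0
  mem_mb=$(awk '/MemTotal/ {print int($2 / 1024)}' /proc/meminfo)
  cpus=$(nproc)
  disk_gb=$(df -Pk / | awk 'NR == 2 {print int($4 / 1024 / 1024)}')
  check_min "memory_mb" "$mem_mb" 1700 || failed=1
  check_min "cpu_cores" "$cpus" 2 || failed=1
  check_min "disk_gb" "$disk_gb" 20 || failed=1
  # The kubelet's default configuration refuses to start while swap is enabled.
  if [ -n "$(swapon --show --noheadings 2>/dev/null)" ]; then
    echo "FAIL swap: enabled (disable with: sudo swapoff -a)"
    failed=1
  fi
  return "$failed"
}

# Usage: run on every node before installing anything
# preflight
```

Running preflight prints one OK or FAIL line per check and returns non-zero if any check fails, so it can gate an install script.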

Step 3

Installation via kubeadm (Recommended for Learning/Testing)

kubeadm is the official Kubernetes bootstrap tool for creating clusters. It handles control plane setup and initial worker node joining, and is suitable for development, testing, and some production workloads. Choose your container runtime first: containerd, CRI-O, or Docker Engine with the cri-dockerd adapter.

# Install container runtime (containerd recommended)
sudo apt-get update
sudo apt-get install -y containerd
sudo systemctl enable --now containerd

# Optional: install the latest kubectl CLI manually (skip this if you install kubectl from apt below)
curl -LO "https://dl.k8s.io/release/$(curl -fsSL https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/

# Add the Kubernetes apt repository (replace v1.29 with the minor version you want)
sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
sudo chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list > /dev/null

# Install kubelet, kubeadm, and kubectl, then pin their versions
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Verify installations
kubeadm version
kubectl version --client
kubelet --version

⚠ Heads up: A single-control-plane kubeadm cluster like the one built here is best suited to learning and testing. For production, consider a highly available kubeadm topology, managed services (EKS, GKE, AKS), or distributions like RKE, K3s, or OpenShift.

Step 4

Initialize Control Plane

Create the control plane on your first node. kubeadm generates configuration, downloads necessary components, and starts the control plane. The command outputs a kubeadm join token for worker nodes. Control plane nodes are tainted (no regular workloads by default).

# Initialize control plane (as root or with sudo)
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Output shows:
# - Kubernetes control plane initialized
# - kubeadm join command with token
# - Instructions for copying /etc/kubernetes/admin.conf to ~/.kube/config

# Copy kubeconfig for current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Verify cluster status
kubectl get nodes  # Should show 1 node with NotReady status
kubectl get pods --namespace=kube-system  # Control plane pods

⚠ Heads up: The --pod-network-cidr flag is required when using CNI plugins like Flannel, and the range must match what the plugin expects. Default ranges differ per plugin (Flannel: 10.244.0.0/16, Calico: 192.168.0.0/16, Weave: 10.32.0.0/12).
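
The prefix length of the pod CIDR matters because the range is carved into one subnet per node. The sketch below is a hypothetical helper (node_subnets is not a kubeadm command) that assumes each node receives a /24, which is the usual IPv4 default for node CIDR allocation; adjust the second argument if your cluster uses a different node mask.

```shell
# Hypothetical helper: how many per-node subnets fit in a pod CIDR.
# Arguments: pod CIDR prefix length, node subnet prefix length (assumed 24).
node_subnets() {
  local pod_prefix="$1" node_prefix="${2:-24}"
  if [ "$pod_prefix" -gt "$node_prefix" ]; then
    # The pod CIDR is smaller than a single node subnet.
    echo 0
    return 0
  fi
  echo $(( 1 << (node_prefix - pod_prefix) ))
}

node_subnets 16   # a /16 pod CIDR leaves room for 256 nodes
node_subnets 24   # a /24 pod CIDR leaves room for exactly 1 node
```

Under this assumption a /24 pod network would be exhausted by the first node, which is why the plugin defaults use much wider ranges.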

Step 5

Deploy Pod Network (CNI)

Kubernetes requires a Container Network Interface (CNI) plugin for pod-to-pod communication across nodes. Popular options: Flannel (simple), Calico (advanced networking/security), Weave (easy setup), Cilium (eBPF-based). Without a CNI, nodes remain NotReady.

# Deploy Flannel (simple, reliable)
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

# Alternative: Calico (advanced features; install only one CNI plugin per cluster)
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/custom-resources.yaml

# Alternative: Weave Net
kubectl apply -f "https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s.yaml"

# Wait for CNI to be ready
watch kubectl get pods -n kube-system

# Verify all nodes are Ready
kubectl get nodes
# STATUS should show Ready for all nodes
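
Instead of watching the output by hand, the readiness check can be scripted. A minimal sketch, assuming kubectl is already configured for the cluster; count_not_ready and wait_for_nodes are hypothetical names, and the 5-second poll interval is arbitrary.

```shell
# Hypothetical helper: count nodes whose STATUS column does not start with "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin.
count_not_ready() {
  awk '$2 !~ /^Ready/ { n++ } END { print n + 0 }'
}

# Hypothetical polling loop: retry every 5 seconds until every node is Ready.
wait_for_nodes() {
  local tries="${1:-60}" out
  while [ "$tries" -gt 0 ]; do
    out=$(kubectl get nodes --no-headers 2>/dev/null)
    if [ -n "$out" ] && [ "$(printf '%s\n' "$out" | count_not_ready)" -eq 0 ]; then
      echo "all nodes Ready"
      return 0
    fi
    tries=$((tries - 1))
    sleep 5
  done
  echo "timed out waiting for nodes" >&2
  return 1
}

# Usage:
# wait_for_nodes 60
```

kubectl also has a built-in equivalent, kubectl wait --for=condition=Ready nodes --all --timeout=300s, which is usually the simpler choice when no custom logic is needed.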

Step 6

Add Worker Nodes

Join worker nodes to the cluster using the kubeadm join command from the init output. Each worker needs the bootstrap token and the CA certificate hash, which it uses to verify the control plane during discovery. Tokens expire after 24 hours by default.

# On each worker node (run the command from kubeadm init output)
sudo kubeadm join <control-plane-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>

# Verify worker joined
kubectl get nodes
# Should show new node with Ready status

# List all nodes with details
kubectl get nodes -o wide

# Check node labels
kubectl describe node <worker-node-name>

# If token expired, generate new one
kubeadm token create --print-join-command

# Create a token that never expires (avoid in production)
kubeadm token create --ttl 0

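⚠ Heads up: Worker nodes need network connectivity to the control plane on port 6443 (API server). The control plane must reach each worker on port 10250 (kubelet API), and NodePort Services use ports 30000-32767. Ports 2379-2380 (etcd) only need to be open between control plane nodes. Ensure firewall rules allow this traffic.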
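
A common source of failed joins is a mistyped token or hash. The sketch below is a hypothetical pre-check (valid_join_args is not part of kubeadm) that only validates the shape of the two values: a bootstrap token is six lowercase alphanumerics, a dot, then sixteen more, and the discovery hash is sha256: followed by 64 hex characters. It does not verify that the values are correct for your cluster.

```shell
# Hypothetical helper: check the format of kubeadm join arguments before use.
valid_join_args() {
  local token="$1" ca_hash="$2"
  if ! printf '%s' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'; then
    echo "bad token format (expected something like abcdef.0123456789abcdef)" >&2
    return 1
  fi
  if ! printf '%s' "$ca_hash" | grep -Eq '^sha256:[0-9a-f]{64}$'; then
    echo "bad CA hash format (expected sha256: plus 64 hex characters)" >&2
    return 1
  fi
  return 0
}

# Usage:
# valid_join_args "$TOKEN" "$CA_HASH" && \
#   sudo kubeadm join <control-plane-ip>:6443 --token "$TOKEN" \
#     --discovery-token-ca-cert-hash "$CA_HASH"
```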

Step 7

Verify Cluster Health

Confirm your cluster is fully operational by checking node status, system pods, and running a simple test application. All kube-system pods should be running, and nodes should be Ready.

# Check all nodes are Ready
kubectl get nodes
# All should show STATUS=Ready

# Check control plane pods
kubectl get pods -n kube-system
# All should be Running (ContainerCreating is normal only briefly after startup)

# Describe specific node for details
kubectl describe node <node-name>

# Test with nginx deployment
kubectl create deployment nginx --image=nginx:latest
kubectl expose deployment nginx --port=80 --type=NodePort

# Get service details
kubectl get services
kubectl describe service nginx

# Access via node IP and port
# curl http://<node-ip>:<nodeport>

# Clean up test
kubectl delete deployment nginx
kubectl delete service nginx

Step 8

Create Your First Deployment

Deployments are the primary way to run applications in Kubernetes. They provide declarative updates for Pods and ReplicaSets. A Deployment ensures the specified number of pod replicas are running at all times.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  labels:
    app: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: nginx
        image: nginx:1.24
        ports:
        - containerPort: 80
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
          requests:
            memory: "64Mi"
            cpu: "250m"

Step 9

Apply Deployment and Expose Service

Save the deployment YAML to a file and apply it. Services provide stable networking for your pods with load balancing. NodePort exposes the service externally on each node's IP.

# Save deployment.yaml (from previous step)

# Apply deployment
kubectl apply -f deployment.yaml

# Verify deployment
kubectl get deployments
kubectl get pods -l app=web-app

# Create service to expose deployment
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  type: NodePort
  selector:
    app: web-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30007
EOF

# Get service details
kubectl get service web-app-service

# Access application on any node
curl http://<any-node-ip>:30007

# Check pod logs
kubectl logs -l app=web-app

Step 10

Scaling Applications

Kubernetes makes horizontal scaling trivial. Scale deployments up or down with a single command. For automatic scaling based on metrics, use HorizontalPodAutoscaler (HPA) with metrics-server.

# Scale deployment manually
kubectl scale deployment web-app --replicas=5

# Scale down
kubectl scale deployment web-app --replicas=2

# Scale by patching the deployment spec
kubectl patch deployment web-app -p '{"spec":{"replicas":3}}'

# Check scaled pods
kubectl get pods -l app=web-app -w

# Install metrics-server for HPA
curl -LO https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl apply -f components.yaml

# Create HorizontalPodAutoscaler
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
EOF

# Verify HPA
kubectl get hpa web-app-hpa
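
The HPA above targets 70% average CPU utilization. To a first approximation, the replica count it aims for is ceil(currentReplicas * currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. The sketch below is a hypothetical helper (hpa_desired is not a kubectl command) that reproduces that arithmetic; it deliberately ignores the controller's tolerance band and stabilization behavior.

```shell
# Hypothetical helper: approximate the HPA replica calculation.
# Arguments: current replicas, current utilization %, target utilization %, min, max.
hpa_desired() {
  local current="$1" metric="$2" target="$3" min="$4" max="$5"
  # Integer ceiling of current * metric / target.
  local desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

hpa_desired 3 90 70 2 10    # 3 pods at 90% CPU against a 70% target: prints 4
hpa_desired 2 10 70 2 10    # light load: clamped to minReplicas, prints 2
hpa_desired 5 400 70 2 10   # heavy load: clamped to maxReplicas, prints 10
```

Utilization is measured against the CPU requests in the pod spec, so the HPA only works for containers that declare requests.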

Step 11

Configure Secrets and ConfigMaps

Store sensitive data in Secrets (base64-encoded) and non-sensitive configuration in ConfigMaps. Both can be exposed to pods as environment variables or mounted as volumes. Never commit Secret manifests or plain-text credentials to version control.

# Create ConfigMap from literals
kubectl create configmap web-config \
  --from-literal=APP_ENV=production \
  --from-literal=LOG_LEVEL=info

# Create ConfigMap from file (uses a separate name because web-config already exists)
kubectl create configmap web-config-file \
  --from-file=config.properties=./app/config.properties

# Create Secret from literals
kubectl create secret generic db-credentials \
  --from-literal=DB_USERNAME=admin \
  --from-literal=DB_PASSWORD=secret123

# Create Secret from file
kubectl create secret generic tls-secret \
  --from-file=tls.crt=./certs/tls.crt \
  --from-file=tls.key=./certs/tls.key

# View secrets (base64 encoded)
kubectl get secret db-credentials -o yaml

# Decode secret value
kubectl get secret db-credentials -o jsonpath='{.data.DB_PASSWORD}' | base64 --decode

# Use in deployment (update your deployment.yaml)
# spec.containers[0].env or spec.containers[0].envFrom

⚠ Heads up: Kubernetes Secrets are base64 encoded, NOT encrypted. For production, use encryption at rest, external secret managers (Vault, AWS Secrets Manager), or sealed-secrets.
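
The point about encoding versus encryption is easy to demonstrate without a cluster. In this minimal sketch, encode_secret and decode_secret are hypothetical helper names; they do the same transformation that appears in the data field of a Secret manifest.

```shell
# Hypothetical helpers: the same base64 round trip Kubernetes applies to Secret data.
encode_secret() {
  printf '%s' "$1" | base64
}

decode_secret() {
  printf '%s' "$1" | base64 --decode
}

encode_secret 'secret123'      # prints c2VjcmV0MTIz
decode_secret 'c2VjcmV0MTIz'   # prints secret123
```

No key is involved at any point, so anyone who can read the Secret object can recover the value.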

Step 12

Persistent Volumes for Data Storage

Pods are ephemeral; use PersistentVolumes (PV) and PersistentVolumeClaims (PVC) for durable storage. PVs are cluster resources, PVCs are requests for storage. StorageClasses enable dynamic provisioning.

# StorageClass for statically provisioned local volumes (no-provisioner: matching PVs must be created manually)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

---
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-storage

---
# Pod using PVC
apiVersion: v1
kind: Pod
metadata:
  name: database
spec:
  containers:
  - name: postgres
    image: postgres:15
    env:
    - name: POSTGRES_DB
      value: myapp
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: DB_PASSWORD
    ports:
    - containerPort: 5432
    volumeMounts:
    - name: db-storage
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: db-storage
    persistentVolumeClaim:
      claimName: database-storage

Step 13

Namespaces for Multi-Tenancy

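Namespaces scope resource names and let you apply quotas and access policies per group within a single cluster. Use them for environments (dev, staging, prod), teams, or projects. Namespaces do not isolate network traffic by themselves: pods can reach pods in other namespaces unless a NetworkPolicy restricts them.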

# Create namespace
kubectl create namespace development
kubectl create namespace production

# List namespaces
kubectl get namespaces

# Set default namespace
kubectl config set-context --current --namespace=development

# Create resources in specific namespace
kubectl run test-pod --image=nginx -n development

# View resources by namespace
kubectl get pods -A  # All namespaces
kubectl get pods -n production

# Apply manifest to a specific namespace (the namespace must already exist)
kubectl apply -f deployment.yaml -n production

# Resource quotas per namespace
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: development
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
EOF

# Check quota usage
kubectl describe quota compute-quota -n development
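
It helps to work out in advance how many pods a quota admits. The sketch below is a hypothetical helper (max_pods_under_quota is not a kubectl command) that takes integer quantities already converted to millicores and Mi, and returns the tightest of the CPU, memory, and pod-count limits. The example calls assume the per-pod requests and limits from the web-app Deployment in Step 8.

```shell
# Hypothetical helper: how many identical pods fit under a ResourceQuota.
# Arguments: quota CPU (millicores), quota memory (Mi), quota pod count,
#            per-pod CPU (millicores), per-pod memory (Mi).
max_pods_under_quota() {
  local quota_cpu_m="$1" quota_mem_mi="$2" quota_pods="$3"
  local pod_cpu_m="$4" pod_mem_mi="$5"
  local by_cpu=$(( quota_cpu_m / pod_cpu_m ))
  local by_mem=$(( quota_mem_mi / pod_mem_mi ))
  local result="$quota_pods"
  if [ "$by_cpu" -lt "$result" ]; then result="$by_cpu"; fi
  if [ "$by_mem" -lt "$result" ]; then result="$by_mem"; fi
  echo "$result"
}

# requests: 4 CPU and 8Gi quota, pods requesting 250m and 64Mi
max_pods_under_quota 4000 8192 20 250 64     # prints 16
# limits: 8 CPU and 16Gi quota, pods limited to 500m and 128Mi
max_pods_under_quota 8000 16384 20 500 128   # prints 16
```

In this example CPU is the binding constraint: the quota allows 20 pods by count, but only 16 fit by CPU.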

Step 14

Monitoring and Observability

Monitor cluster health with kube-state-metrics, node exporter, and Prometheus. Use kubectl describe and kubectl logs for debugging. Aggregate logs with the EFK stack (Elasticsearch, Fluentd, Kibana) or Loki.

# Check node status
kubectl describe node <node-name>

# Check pod events and status
kubectl describe pod <pod-name>

# View pod logs
kubectl logs <pod-name>
kubectl logs <pod-name> --previous  # Previous container instance

# Follow logs in real-time
kubectl logs -f <pod-name>

# Multi-container pod logs
kubectl logs <pod-name> -c <container-name>

# Exec into running container
kubectl exec -it <pod-name> -- /bin/bash

# Run temporary debugging pod
kubectl run debug --image=nicolaka/netshoot -it --rm

# Check events across cluster
kubectl get events --all-namespaces --sort-by='.lastTimestamp'

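# Install Prometheus (kube-prometheus-stack) with Helm; Helm installation is covered in the Helm step
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace

# Check resource usage (requires metrics-server)
kubectl top nodes
kubectl top pods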

Step 15

Rolling Updates and Rollbacks

Kubernetes enables zero-downtime deployments through rolling updates. Configure update strategies for controlled deployments. Rollbacks are simple with kubectl's built-in revision history.

# Update deployment with new image
kubectl set image deployment/web-app nginx=nginx:1.25

# Watch rolling update progress
kubectl rollout status deployment/web-app

# Check rollout history
kubectl rollout history deployment/web-app

# View specific revision
kubectl rollout history deployment/web-app --revision=2

# Rollback to previous version
kubectl rollout undo deployment/web-app

# Rollback to specific revision
kubectl rollout undo deployment/web-app --to-revision=1

# Pause rollout for manual intervention
kubectl rollout pause deployment/web-app

# Resume rollout
kubectl rollout resume deployment/web-app

# Configure update strategy in deployment.yaml
# spec.strategy:
#   type: RollingUpdate
#   rollingUpdate:
#     maxUnavailable: 25%
#     maxSurge: 25%
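
The percentages in the strategy translate to absolute pod counts: maxSurge is rounded up and maxUnavailable is rounded down. The sketch below is a hypothetical helper (rollout_bounds is not a kubectl command) that shows the resulting bounds during a rollout.

```shell
# Hypothetical helper: pod-count bounds during a rolling update.
# Arguments: replicas, maxSurge percent, maxUnavailable percent.
rollout_bounds() {
  local replicas="$1" surge_pct="$2" unavail_pct="$3"
  local surge=$(( (replicas * surge_pct + 99) / 100 ))   # rounded up
  local unavailable=$(( replicas * unavail_pct / 100 ))  # rounded down
  echo "max_pods=$((replicas + surge)) min_available=$((replicas - unavailable))"
}

rollout_bounds 3 25 25    # prints max_pods=4 min_available=3
rollout_bounds 10 25 25   # prints max_pods=13 min_available=8
```

With 3 replicas and 25% for both settings, the rollout adds one new pod at a time and never drops below 3 available pods.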

Step 16

Network Policies for Security

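Network Policies define how pods communicate with each other and with external networks. They provide allowlist-based security for pod traffic: once a policy selects a pod, only traffic that some policy explicitly allows is permitted in that direction. They require a network plugin that enforces NetworkPolicy (Calico, Cilium, Weave); Flannel on its own does not.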

# Allow only ingress from specific pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80

---
# Deny all ingress (default deny)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress

---
# Allow DNS egress to the kube-system namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
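
A default-deny policy applies to only one namespace, so clusters with several namespaces need one copy each. The sketch below is a hypothetical helper (default_deny_manifest is an illustrative name) that prints the default-deny-ingress manifest shown above for a given namespace so it can be piped to kubectl.

```shell
# Hypothetical helper: print a default-deny-ingress NetworkPolicy for one namespace.
default_deny_manifest() {
  local namespace="$1"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ${namespace}
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF
}

# Usage (requires cluster access and existing namespaces):
# for ns in development production; do
#   default_deny_manifest "$ns" | kubectl apply -f -
# done
```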

Step 17

Helm for Package Management

Helm is Kubernetes' package manager. It manages Charts (collections of Kubernetes resources) and provides templating, versioning, and rollbacks. Essential for deploying complex applications.

# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

# Verify installation
helm version

# Add chart repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Search for charts
helm search repo nginx
helm search hub postgres

# Install from chart
helm install my-nginx bitnami/nginx

# Install with custom values (example password only; use a values file or secret store for real credentials)
helm install my-postgres bitnami/postgresql \
  --set auth.postgresPassword=admin123 \
  --set primary.persistence.size=10Gi \
  --namespace databases --create-namespace

# List releases
helm list
helm list -A  # All namespaces

# Get release info
helm status my-nginx

# Upgrade release
helm upgrade my-nginx bitnami/nginx --set image.tag=1.26.0

# Rollback release
helm rollback my-nginx 1

# Uninstall release
helm uninstall my-nginx

# Generate manifests without installing
helm template my-nginx bitnami/nginx > nginx.yaml

Step 18

Troubleshooting Common Issues

Common Kubernetes issues include pods stuck in CrashLoopBackOff, ImagePullBackOff, pending states, and network connectivity problems. Use kubectl describe and logs extensively.

# Pod stuck in CrashLoopBackOff
kubectl describe pod <pod-name>  # Check events and container state
kubectl logs <pod-name> --previous  # Check previous instance logs

# ImagePullBackOff
kubectl describe pod <pod-name>  # Check image pull errors
# Verify image exists: docker pull <image>
# Check imagePullSecrets in deployment

# Pod in Pending state
kubectl describe pod <pod-name>  # Check events for scheduling issues
kubectl describe node <node-name>  # Check node capacity
kubectl get events --sort-by='.lastTimestamp'

# Insufficient resources
kubectl describe node <node-name>  # Check allocatable resources
kubectl get resourcequota -A  # Check quotas

# Network issues
kubectl run -it --rm debug --image=nicolaka/netshoot -- /bin/bash
# Inside pod:
# ping <pod-ip>
# curl -v <service-name>
# nslookup <service-name>

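# Check DNS resolution (busybox:1.28 is pinned because nslookup in newer busybox images is unreliable)
kubectl run -it --rm dns-test --image=busybox:1.28 -- /bin/sh
# Inside pod:
# cat /etc/resolv.conf
# nslookup kubernetes.default

# Certificate issues (kubeadm clusters; run on a control plane node)
sudo kubeadm certs check-expiration
# Renew if certificates are close to expiry:
# sudo kubeadm certs renew all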
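
The triage steps above follow a simple pattern: the pod's STATUS column decides which command to run next. The sketch below is a hypothetical helper (suggest_next_step is an illustrative name) that reads kubectl get pods --no-headers output and prints a suggested command for each unhealthy pod. It covers only the three states discussed in this step.

```shell
# Hypothetical helper: map pod STATUS values to a suggested next command.
# Reads lines in the form "NAME READY STATUS RESTARTS AGE" on stdin.
suggest_next_step() {
  local name ready status rest
  while read -r name ready status rest; do
    case "$status" in
      CrashLoopBackOff)
        echo "$name: kubectl logs $name --previous" ;;
      ImagePullBackOff|ErrImagePull)
        echo "$name: kubectl describe pod $name" ;;
      Pending)
        echo "$name: kubectl describe pod $name" ;;
    esac
  done
}

# Usage:
# kubectl get pods --no-headers | suggest_next_step
```

Pods in Running or Completed state produce no output, so an empty result means none of these three failure modes is present.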

Step 19

Cluster Upgrade and Maintenance

Kubernetes clusters need regular upgrades for security patches and new features. Follow the upgrade path (one minor version at a time). Upgrade control plane first, then worker nodes one by one.

# Check current version
kubectl get nodes -o wide
kubectl version

# Check available versions
curl -L -s https://dl.k8s.io/release/stable.txt
curl -L -s https://dl.k8s.io/release/ci/latest.txt

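# Upgrade kubeadm on the control plane node
# (with pkgs.k8s.io repositories, point the apt source at the target minor version first)
sudo apt-mark unhold kubeadm
sudo apt-get update
sudo apt-get install -y kubeadm='1.29.0-*'
sudo apt-mark hold kubeadm

# Plan upgrade
sudo kubeadm upgrade plan

# Upgrade control plane
sudo kubeadm upgrade apply v1.29.0

# Upgrade kubelet and kubectl on the control plane node
kubectl drain <control-plane-node> --ignore-daemonsets
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet='1.29.0-*' kubectl='1.29.0-*'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
kubectl uncordon <control-plane-node>

# Upgrade each worker node, one at a time
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data  # run where kubectl is configured
sudo apt-mark unhold kubeadm kubelet                                  # remaining commands run on the worker
sudo apt-get update
sudo apt-get install -y kubeadm='1.29.0-*'
sudo kubeadm upgrade node
sudo apt-get install -y kubelet='1.29.0-*'
sudo apt-mark hold kubeadm kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet
kubectl uncordon <node-name>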

# Verify cluster health
kubectl get nodes
kubectl get pods -A

⚠ Heads up: Always back up etcd before upgrades, for example with etcdctl snapshot save <backup-file> (plus the --endpoints, --cacert, --cert, and --key flags for your cluster). Test upgrades in staging first, and check component compatibility before upgrading.
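
The one-minor-version-at-a-time rule can be checked before touching any node. The sketch below is a hypothetical helper (can_upgrade is not a kubeadm command) that compares two version strings and rejects major-version changes, skipped minor versions, and downgrades.

```shell
# Hypothetical helper: allow only same-major upgrades of at most one minor version.
# Arguments: current version, target version (a leading "v" is optional).
can_upgrade() {
  local cur="${1#v}" target="${2#v}"
  local cur_major="${cur%%.*}" tgt_major="${target%%.*}"
  local cur_rest="${cur#*.}" tgt_rest="${target#*.}"
  local cur_minor="${cur_rest%%.*}" tgt_minor="${tgt_rest%%.*}"
  if [ "$cur_major" -ne "$tgt_major" ]; then
    echo "unsupported: $cur -> $target changes the major version"
    return 1
  fi
  local step=$(( tgt_minor - cur_minor ))
  if [ "$step" -lt 0 ] || [ "$step" -gt 1 ]; then
    echo "unsupported: $cur -> $target skips or reverses a minor version"
    return 1
  fi
  echo "ok: $cur -> $target"
}

# Usage:
# can_upgrade 1.28.4 v1.29.0 && sudo kubeadm upgrade plan
```

For a jump such as 1.27 to 1.29, run the full upgrade procedure twice: first to 1.28, then to 1.29.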

Step 20

Alternative Installation Methods

For different use cases, consider alternative installation methods: minikube/kind for local development, K3s for lightweight clusters, managed services (EKS/GKE/AKS) for production, or RKE for Rancher integration.

# Minikube for local development
# https://minikube.sigs.k8s.io/docs/start/
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
minikube start
minikube status

# Kind (Kubernetes in Docker)
# https://kind.sigs.k8s.io/
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.22.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
kind create cluster
kind get clusters

# K3s for lightweight/single-node
# https://k3s.io/
curl -sfL https://get.k3s.io | sh -
sudo k3s kubectl get nodes  # The install script already starts the k3s server as a service

# MicroK8s for Ubuntu
# https://microk8s.io/
sudo snap install microk8s --classic
microk8s status --wait-ready
microk8s enable dns hostpath-storage

# For managed services:
# - AWS EKS: eksctl create cluster
# - GCP GKE: gcloud container clusters create
# - Azure AKS: az aks create

Step 21

Next Steps and Resources

Continue learning with official Kubernetes documentation, CNCF certifications (CKA, CKAD), and hands-on practice. Explore advanced topics: operators, service mesh (Istio), GitOps (ArgoCD), and multi-cluster management.

Recommended Resources:
- Official Docs: https://kubernetes.io/docs/
- CNCF Curriculum: https://www.cncf.io/certifications/
- Kubernetes the Hard Way: https://github.com/kelseyhightower/kubernetes-the-hard-way
- Book: Kubernetes Up & Running (O'Reilly)
- Practice: interactive tutorials at https://kubernetes.io/docs/tutorials/ (Katacoda shut down in 2022)

Key Concepts to Master:
- Controllers: Deployment, StatefulSet, DaemonSet, Job, CronJob
- Services and routing: ClusterIP, NodePort, LoadBalancer, Ingress
- Security: RBAC, Service Accounts, Network Policies, Pod Security Standards
- Storage: PV, PVC, StorageClass, CSI drivers
- Advanced: Operators, Custom Resources (CRDs), Helm Charts

Certification Paths:
- CKA: Certified Kubernetes Administrator
- CKAD: Certified Kubernetes Application Developer
- CKS: Certified Kubernetes Security Specialist

Join the Community:
- Kubernetes Slack: https://slack.k8s.io/
- Kubernetes Forums: https://discuss.kubernetes.io/
- Local Meetups: https://www.meetup.com/topics/kubernetes/