The Problem
Containers solve packaging, but not fleet-level operations.
Why Containers Aren't Enough
Containers solve packaging — your app, dependencies, and config travel together. But what happens when things go wrong? Click each scenario to find out.
Desired State: Declare, Don't Script
Kubernetes flips the operational model. Instead of writing scripts to start, stop, and restart containers, you declare what you want: “I need 3 copies of this app, always running.” Kubernetes continuously compares desired state (stored in etcd) to actual state and reconciles any drift — forever.
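That declaration maps directly onto a Deployment manifest. A minimal sketch (the name `my-app` and image tag are placeholders):

```yaml
# Declares desired state: 3 replicas of my-app, reconciled forever.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # placeholder name
spec:
  replicas: 3                 # "I need 3 copies, always running"
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: web
          image: my-app:1.0   # placeholder image
```

You never start or stop these pods by hand; changing `replicas` and re-applying is the entire operational interface.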
Try It: Watch the Reconciliation Loop
Desired Replicas
Running Pods
Pod Instances
Controller Log
How the reconciliation loop works
Kubernetes controllers run a continuous observe → diff → act loop. Every few seconds the ReplicaSet controller compares desired state (what you declared in YAML) against actual state (what is alive in the cluster). If they diverge, it creates or deletes pods one at a time until they match — including after crashes. You never imperatively command "start a pod"; you just update a number and the control plane makes it true.
Cluster Basics
Understand the control plane and node responsibilities.
Anatomy of a Cluster
A Kubernetes cluster has two halves: the control plane (the brain) and worker nodes (the muscle). Click each component to learn what it does.
Explore: Cluster Components
Cluster Anatomy
Click any component to explore it, or trace a request end-to-end.
Select any component above to learn what it does — or press Trace a request to watch a deployment flow through the cluster.
Essential Cluster Commands
Click to copy. These commands orient you when you first connect to a cluster.
Every kubectl command goes through the API server, which validates it, writes to etcd, and lets controllers react.
Pods, Deployments, and Workloads
Pick the right controller for each workload shape.
Choosing the Right Controller
You rarely create individual pods. Instead, you tell a controller what you need, and it manages pods for you. Each controller type handles a different workload shape. Pick the wrong one and you fight the system; pick the right one and Kubernetes handles the hard parts. Click each card to learn when to use it.
Try It: Match Workloads to Controllers
Pick the Right Workload
Run a stateless web API with rolling updates and 3 replicas
Working with Deployments
Click any command to copy it. Deployments are the most common workload type—here is the daily workflow.
A Pod is the smallest unit Kubernetes manages. Controllers like Deployments ensure the right number are always running.
Services and Traffic Flow
Expose Pods safely and understand request routing.
Stable Networking for Ephemeral Pods
Pods come and go—they crash, scale up, get rescheduled. If you hardcode a pod's IP, it'll break within hours. A Service provides a stable virtual IP and DNS name that automatically routes to healthy pods using label selectors. Think of it as a load balancer that Kubernetes manages for you. Click each type to learn when to use it.
Explore: Service Routing
Service Routing Simulator
Select a Service type and send a request to visualize the packet path.
Packet path
Service (ClusterIP)
kube-proxy / IPVS rules
Service spec
apiVersion: v1
kind: Service
metadata:
name: my-app
spec:
selector:
app: my-app
ports:
- port: 80
targetPort: 8080
type: ClusterIP  # default
How ClusterIP works
ClusterIP assigns a stable virtual IP inside the cluster. kube-proxy programs iptables / IPVS rules so any Pod can reach the Service IP — which then load-balances across healthy endpoints.
Controlling Traffic with NetworkPolicy
By default, every pod can talk to every other pod—no restrictions. A NetworkPolicy lets you define which pods can communicate. Once any policy selects a pod, all non-allowed traffic is denied. Think of it as a firewall rule for pod-to-pod traffic.
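As a sketch, a policy like the following (pod labels are illustrative) allows only pods labeled `app: frontend` to reach pods labeled `app: api` on port 8080; because the policy selects the api pods, all other ingress to them is denied:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend
spec:
  podSelector:              # the pods this policy protects
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:      # only frontend pods may connect
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```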
Try It: Apply a Network Policy
NetworkPolicy Simulator
Ingress: HTTP Routing to Services
For HTTP traffic from outside the cluster, Ingress provides path-based and host-based routing to backend Services. Click any command to copy it.
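A sketch of both routing styles in one Ingress (the hostname and Service names are assumptions, and a controller such as ingress-nginx must be installed for any Ingress to take effect):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: example.com           # host-based routing (illustrative domain)
      http:
        paths:
          - path: /api            # path-based: /api/* goes to the API Service
            pathType: Prefix
            backend:
              service:
                name: api-svc     # assumed Service name
                port:
                  number: 80
          - path: /               # everything else goes to the web Service
            pathType: Prefix
            backend:
              service:
                name: web-svc     # assumed Service name
                port:
                  number: 80
```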
Services decouple pod identity from network identity. Pods change; the Service endpoint stays the same.
Config, Secrets, and Access
Separate config from secrets and bind least-privilege access.
Separating Config from Code
Hardcoding database URLs or API keys into container images is a recipe for pain. Kubernetes separates configuration from code using two objects: ConfigMaps for non-sensitive data and Secrets for sensitive data. Both can be mounted as files or injected as environment variables. Click each card to understand the difference.
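A minimal sketch of the split, with illustrative names and values — the non-sensitive setting lives in a ConfigMap, the credential in a Secret, and a container consumes both as environment variables:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"            # non-sensitive, safe to inspect
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
stringData:
  DB_PASSWORD: "s3cr3t"        # illustrative; stored base64-encoded in etcd
---
# Container fragment consuming both as environment variables
# env:
#   - name: LOG_LEVEL
#     valueFrom:
#       configMapKeyRef: { name: app-config, key: LOG_LEVEL }
#   - name: DB_PASSWORD
#     valueFrom:
#       secretKeyRef: { name: app-secret, key: DB_PASSWORD }
```

The same objects can instead be mounted as files under a volume when your app reads config from disk.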
Access Control with RBAC
Not every user or service should be able to do everything. RBAC (Role-Based Access Control) lets you define who can do what. A Role lists permitted actions (verbs like get, create, delete on resources like pods, deployments). A RoleBinding attaches that Role to a user or ServiceAccount.
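A sketch of that pairing — a read-only Role bound to a ServiceAccount (the `dev` namespace and `ci-bot` account are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev                      # assumed namespace
rules:
  - apiGroups: [""]                   # "" = the core API group (pods live here)
    resources: ["pods"]
    verbs: ["get", "list", "watch"]   # read-only; no create/delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: dev
subjects:
  - kind: ServiceAccount
    name: ci-bot                      # assumed ServiceAccount
    namespace: dev
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```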
Explore: Config and Access Boundaries
Config & RBAC Playground
Namespaces and Resource Quotas
Namespaces partition a cluster into virtual sub-clusters. Combined with ResourceQuota, they prevent one team from consuming all resources. Click any command to copy it.
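For example, a namespace with a quota capping a team's total footprint might look like this (names and numbers are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a                 # illustrative team namespace
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"          # sum of CPU requests across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"                 # hard cap on pod count
```

Once the quota exists, pods in that namespace must declare resource requests, or the API server rejects them.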
Keep secrets out of images, scope permissions narrowly, and use namespaces to isolate teams.
Storage and Stateful Data
Learn why ephemeral container filesystems need persistent volumes.
Why Pods Need Persistent Storage
Container filesystems are ephemeral—when a pod restarts, everything written inside the container vanishes. For databases, file uploads, or any data that must survive restarts, you need storage that lives outside the pod lifecycle. Kubernetes solves this with PersistentVolumes. Click each object to understand its role.
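A minimal sketch of the pattern — a PersistentVolumeClaim plus a pod that mounts it (the claim size, image, and mount path are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce            # mountable read-write by a single node
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: db
      image: postgres:16       # illustrative image
      volumeMounts:
        - name: data
          mountPath: /var/lib/data   # writes here outlive the pod
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc
```

Everything written outside the mount path still vanishes on restart; only the mounted volume persists.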
Try It: See What Survives a Restart
Storage Persistence
Access Modes and Lifecycle
Access modes control how many nodes can mount a volume simultaneously. The reclaim policy decides what happens to a PV after its PVC is deleted. Click any line to copy it.
If your data matters, mount a PVC. Container filesystems are scratch space.
Scheduling and Resources
Requests, limits, and placement choices control runtime behavior.
Requests, Limits, and Placement
When you create a pod, you tell Kubernetes how much CPU and memory it needs (requests) and the maximum it may use (limits). The scheduler uses requests to find a node with enough room. If a container exceeds its memory limit, the kernel kills it (OOMKilled). CPU over-limit is throttled, not killed. Click each concept to understand the difference.
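In a container spec, that distinction is two small blocks (the numbers here are illustrative, not recommendations):

```yaml
# Container spec fragment: the scheduler places by requests;
# the kernel enforces limits at runtime.
resources:
  requests:
    cpu: "250m"        # quarter core reserved when choosing a node
    memory: 256Mi
  limits:
    cpu: "500m"        # usage above this is throttled
    memory: 512Mi      # usage above this gets the container OOMKilled
```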
Try It: Schedule Pods onto Nodes
Scheduler Playground
Set your pod's resource requests and see which node the scheduler picks.
Autoscaling with HPA
The HorizontalPodAutoscaler watches a metric—usually CPU utilization—and adjusts replica count automatically. When load rises above target, HPA adds pods. When it drops, HPA removes them. It respects min and max bounds so you never scale to zero or infinity.
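A sketch of an HPA targeting the Deployment from earlier (the name and thresholds are illustrative; the cluster needs a metrics source such as metrics-server for CPU-based scaling):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # assumed Deployment name
  minReplicas: 2                   # never scale below 2
  maxReplicas: 10                  # never scale above 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU exceeds 70%
```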
Try It: Watch HPA React to Load
HPA Autoscaler
Taints, Tolerations, and Affinity
Beyond resources, Kubernetes offers fine-grained placement control. Click any command to copy it.
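As a sketch, a pod spec fragment combining both mechanisms (the `gpu` taint and `disktype` node label are illustrative and would have to exist on your nodes):

```yaml
# Pod spec fragment
tolerations:
  - key: "gpu"                # allowed onto nodes tainted gpu=true:NoSchedule
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: disktype   # only schedule onto nodes labeled disktype=ssd
              operator: In
              values: ["ssd"]
```

Tolerations permit placement on tainted nodes; affinity requires it on matching ones. They are often combined to dedicate a node pool to one workload class.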
Set requests honestly, limits conservatively. Let HPA handle traffic spikes so you don't.
Reliability and Rollouts
Use probes, rollout strategy, and rollback to ship safely.
Health Checks: Probes
Kubernetes does not just start your container and hope for the best. It continuously checks whether your app is healthy using probes. A liveness probe restarts stuck containers. A readiness probe gates traffic—failing it removes the pod from Service endpoints without killing it. A startup probe protects slow-starting apps. Click each probe to understand when to use it.
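All three probes in one container fragment (the endpoints and timings are illustrative, assuming the app serves HTTP health endpoints on port 8080):

```yaml
# Container spec fragment
startupProbe:
  httpGet: { path: /healthz, port: 8080 }
  failureThreshold: 30        # tolerate up to 30 x 10s of slow startup
  periodSeconds: 10           # liveness/readiness wait until this succeeds
livenessProbe:
  httpGet: { path: /healthz, port: 8080 }
  periodSeconds: 10           # failing restarts the container
readinessProbe:
  httpGet: { path: /ready, port: 8080 }
  periodSeconds: 5            # failing removes the pod from Service endpoints
```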
Try It: Rolling Update and Rollback
Deployment: my-app • Strategy: RollingUpdate
Pods (4)
Zero-downtime insight: With maxUnavailable: 0, Kubernetes never terminates an old pod until a new pod has passed its readiness probe. Traffic is only routed to pods that are Ready, so end users see no service interruption even as pods are replaced one by one.
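That behavior comes from the Deployment's update strategy. A sketch of the relevant fragment:

```yaml
# Deployment spec fragment for zero-downtime rollouts
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0   # never drop below the desired replica count
    maxSurge: 1         # create at most one extra pod during the rollout
```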
Troubleshooting: A Systematic Approach
When something breaks, resist the urge to delete and redeploy. Instead, follow a diagnostic sequence: start with kubectl get pods to see the status, then kubectl logs for application errors, then kubectl describe pod for events and conditions, then cluster-wide events. Each command reveals a different layer of the problem.
Explore: Debug in the Right Order
Troubleshooting Flow Simulator
Practice the systematic order for debugging Kubernetes pod issues. Select a scenario, then click the commands in the correct diagnostic order.
Pod stuck in Pending
A pod has been submitted but never transitions to Running. Something is preventing it from being scheduled.
Rollout Management
Rolling updates replace pods incrementally. If something goes wrong, rollback instantly. Click any command to copy it.
Probes keep your app honest. Rolling updates keep deployments safe. When things break, diagnose before you delete.
Control Plane Deep Mental Model
Trace manifests through API server, etcd, scheduler, and kubelet.
The Lifecycle of kubectl apply
When you run kubectl apply -f deployment.yaml, a chain of asynchronous events unfolds. The manifest hits the API server, which validates and stores it in etcd. The Deployment controller notices the new desired state and creates a ReplicaSet. The scheduler assigns pods to nodes. The kubelet on each node pulls images and starts containers. No single component orchestrates the full sequence—each watches, reacts, and hands off. Click each component to understand its role.
Explore: Trace the Request Path
Control Plane Journey
Trace a Pod from kubectl to Running
kubectl apply: kubectl serializes your YAML manifest and sends an authenticated HTTP POST to the API Server.
How it works: The API Server is the single entry point — all components communicate through it. etcd is the only stateful component. The Scheduler and Kubelet are stateless reconcilers.
Watching the System Work
These commands let you observe the control plane in action. Click any command to copy it.
Kubernetes is not one big program—it is a set of independent controllers watching and reacting. Understanding this model is the key to debugging anything.