The Problem

Containers solve packaging, but not fleet-level operations.

Why Containers Aren't Enough

Containers solve packaging — your app, dependencies, and config travel together. But what happens when things go wrong? Click each scenario to find out.


Desired State: Declare, Don't Script

Kubernetes flips the operational model. Instead of writing scripts to start, stop, and restart containers, you declare what you want: “I need 3 copies of this app, always running.” Kubernetes continuously compares desired state (stored in etcd) to actual state and reconciles any drift — forever.

1. You declare: "3 replicas, always"
2. Kubernetes stores it in etcd (the source of truth)
3. Kubernetes reconciles desired → actual, forever

Try It: Watch the Reconciliation Loop

The simulator shows a ReplicaSet controller's watch loop: with desired replicas set to 3 and 0 pods running, the controller reconciles by creating 3 pods, logging each step in the controller log as it goes.

How the reconciliation loop works

Kubernetes controllers run a continuous observe → diff → act loop. Every few seconds the ReplicaSet controller compares desired state (what you declared in YAML) against actual state (what is alive in the cluster). If they diverge it creates or deletes pods one at a time until they match — including after crashes. You never imperatively command “start a pod”; you just update a number and the control plane makes it true.
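The declared state described above can be sketched as a minimal Deployment manifest — names and image tag here are illustrative, not from the lesson:

```yaml
# deployment.yaml — hypothetical app; replicas, selector, and template are the essentials
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3              # desired state: "3 copies, always"
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0   # illustrative image tag
```

Change `replicas` and re-apply: the ReplicaSet controller reconciles toward the new number. You never start or stop pods yourself.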

Cluster Basics

Understand the control plane and node responsibilities.

Anatomy of a Cluster

A Kubernetes cluster has two halves: the control plane (the brain) and worker nodes (the muscle). Click each component to learn what it does.


Explore: Cluster Components

Cluster Anatomy

Click any component to explore it, or trace a request end-to-end.

kubectl (your machine) → Control Plane → worker nodes (node-1, node-2, node-3), each running pods

Select any component above to learn what it does — or press Trace a request to watch a deployment flow through the cluster.

Essential Cluster Commands

Click to copy. These commands orient you when you first connect to a cluster.

Every kubectl command goes through the API server, which validates it, writes to etcd, and lets controllers react.

Pods, Deployments, and Workloads

Pick the right controller for each workload shape.

Choosing the Right Controller

You rarely create individual pods. Instead, you tell a controller what you need, and it manages pods for you. Each controller type handles a different workload shape. Pick the wrong one and you fight the system; pick the right one and Kubernetes handles the hard parts. Click each card to learn when to use it.


Try It: Match Workloads to Controllers

Pick the Right Workload

Scenario 1 of 6: run a stateless web API with rolling updates and 3 replicas.

Key insight: The right workload type depends on whether your app is stateless or stateful, needs to run on every node, or is a one-time task. Most web apps use Deployments.
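As a contrast to a long-running Deployment, a one-time task fits a Job. A hedged sketch with illustrative names and command:

```yaml
# job.yaml — runs a container to completion once, then stops (illustrative names)
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3            # retry up to 3 times on failure
  template:
    spec:
      restartPolicy: Never   # must be Never or OnFailure for Jobs
      containers:
        - name: migrate
          image: my-app:1.0
          command: ["./migrate", "--up"]   # hypothetical migration command
```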

Working with Deployments

Click any command to copy it. Deployments are the most common workload type—here is the daily workflow.

A Pod is the smallest unit Kubernetes manages. Controllers like Deployments ensure the right number are always running.

Services and Traffic Flow

Expose Pods safely and understand request routing.

Stable Networking for Ephemeral Pods

Pods come and go—they crash, scale up, get rescheduled. If you hardcode a pod's IP, it'll break within hours. A Service provides a stable virtual IP and DNS name that automatically routes to healthy pods using label selectors. Think of it as a load balancer that Kubernetes manages for you. Click each type to learn when to use it.


Explore: Service Routing

Service Routing Simulator

Select a Service type and send a request to visualize the packet path.

ClusterIP is internal-only: reachable solely by Pods within the cluster.

Packet path

source pod → Service (ClusterIP:80) → kube-proxy / IPVS rules → endpoint pod (pod-A, pod-B, or pod-C)

Service spec

service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP   # default

How ClusterIP works

ClusterIP assigns a stable virtual IP inside the cluster. kube-proxy programs iptables / IPVS rules so any Pod can reach the Service IP — which then load-balances across healthy endpoints.

Controlling Traffic with NetworkPolicy

By default, every pod can talk to every other pod—no restrictions. A NetworkPolicy lets you define which pods can communicate. Once any policy selects a pod, all non-allowed traffic is denied. Think of it as a firewall rule for pod-to-pod traffic.
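The simulator's scenario — restricting ingress on the database pod — could be expressed roughly like this, reusing the simulator's role= labels (the port is an assumed Postgres port):

```yaml
# Allow only pods labeled role=api to reach role=database pods on port 5432
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-ingress
spec:
  podSelector:
    matchLabels:
      role: database       # this policy selects the database pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: api    # only api pods may connect
      ports:
        - protocol: TCP
          port: 5432       # assumed database port
```

Once this policy selects the database pod, frontend traffic to it is denied — only the allowed api pods get through.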

Try It: Apply a Network Policy

NetworkPolicy Simulator

Pods: frontend (role=frontend), api (role=api), database (role=database)
No policies. Kubernetes allows all pod-to-pod traffic by default. Every pod can reach every other pod. Apply a NetworkPolicy to restrict ingress on the database pod.

Ingress: HTTP Routing to Services

For HTTP traffic from outside the cluster, Ingress provides path-based and host-based routing to backend Services. Click any command to copy it.
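A path-based Ingress might look like this sketch — host, paths, and backend Service names are assumptions for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  rules:
    - host: example.com            # illustrative host
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api          # routes /api/* to the api Service
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend     # everything else to the frontend
                port:
                  number: 80
```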

Services decouple pod identity from network identity. Pods change; the Service endpoint stays the same.

Config, Secrets, and Access

Separate config from secrets and bind least-privilege access.

Separating Config from Code

Hardcoding database URLs or API keys into container images is a recipe for pain. Kubernetes separates configuration from code using two objects: ConfigMaps for non-sensitive data and Secrets for sensitive data. Both can be mounted as files or injected as environment variables. Click each card to understand the difference.
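A minimal sketch of both objects — the keys mirror the playground later in this section, and the secret value is a placeholder:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.env: production
  log.level: info
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:                # stringData lets you write plain text; the API stores it base64-encoded
  db-password: changeme    # placeholder — never commit real secrets
```

A pod can consume either object through `envFrom` (environment variables) or a volume mount (files).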


Access Control with RBAC

Not every user or service should be able to do everything. RBAC (Role-Based Access Control) lets you define who can do what. A Role lists permitted actions (verbs like get, create, delete on resources like pods, deployments). A RoleBinding attaches that Role to a user or ServiceAccount.
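A read-only Role and its binding might be sketched as follows — the ServiceAccount name is illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
  - apiGroups: [""]                # "" is the core API group, where pods live
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
  - kind: ServiceAccount
    name: ci-bot                   # hypothetical ServiceAccount
    namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```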

Explore: Config and Access Boundaries

Config & RBAC Playground

ConfigMap: app-config
app.env: production
log.level: info
max.connections: 100
ServiceAccount Role — verbs: get, list, watch
Key insight: ConfigMaps hold plain config; Secrets hold sensitive values (base64-encoded, not encrypted). RBAC controls who can read, create, or delete either. Always use Secrets for passwords and API keys, and scope RBAC roles to least privilege.

Namespaces and Resource Quotas

Namespaces partition a cluster into virtual sub-clusters. Combined with ResourceQuota, they prevent one team from consuming all resources. Click any command to copy it.
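A quota scoped to one team's namespace could look like this — the namespace name and numbers are illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a          # assumed team namespace
spec:
  hard:
    requests.cpu: "10"       # total CPU requests across all pods in the namespace
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"               # cap on pod count
```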

Keep secrets out of images, scope permissions narrowly, and use namespaces to isolate teams.

Storage and Stateful Data

Learn why ephemeral container filesystems need persistent volumes.

Why Pods Need Persistent Storage

Container filesystems are ephemeral—when a pod restarts, everything written inside the container vanishes. For databases, file uploads, or any data that must survive restarts, you need storage that lives outside the pod lifecycle. Kubernetes solves this with PersistentVolumes. Click each object to understand its role.
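A PVC and a pod mounting it, as a hedged sketch — size, image, and mount path are assumptions:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce          # one node may mount it read-write
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: db
      image: postgres:16     # illustrative
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc  # data here survives pod restarts
```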


Try It: See What Survives a Restart

Storage Persistence

Pod Running, with a PersistentVolumeClaim mounted. Ephemeral filesystem (tmpfs): app.log (2.1 KB) — lost when the pod restarts. PVC volume: data.db (48 MB) and uploads/photo.jpg (3.2 MB) — persists across restarts.
Key insight: Pod filesystems are ephemeral by default — a restart wipes everything. PersistentVolumeClaims decouple data from pod lifecycle, letting databases, uploads, and logs survive restarts and rescheduling.

Access Modes and Lifecycle

Access modes control how many nodes can mount a volume simultaneously. The reclaim policy decides what happens to a PV after its PVC is deleted. Click any line to copy it.
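Both settings live on the PV spec; a sketch using a hostPath volume (fine for a dev cluster only — path and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-retain
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce                      # a single node may mount read-write
  persistentVolumeReclaimPolicy: Retain  # keep the data after the PVC is deleted
  hostPath:
    path: /mnt/data                      # illustrative dev-only backing store
```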

If your data matters, mount a PVC. Container filesystems are scratch space.

Scheduling and Resources

Requests, limits, and placement choices control runtime behavior.

Requests, Limits, and Placement

When you create a pod, you tell Kubernetes how much CPU and memory it needs (requests) and the maximum it may use (limits). The scheduler uses requests to find a node with enough room. If a container exceeds its memory limit, the kernel kills it (OOMKilled). CPU over-limit is throttled, not killed. Click each concept to understand the difference.
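In a container spec, requests and limits sit side by side — the numbers here are illustrative:

```yaml
# Fragment of a pod or Deployment container spec
containers:
  - name: my-app
    image: my-app:1.0
    resources:
      requests:
        cpu: 250m        # the scheduler places the pod using these
        memory: 256Mi
      limits:
        cpu: 500m        # exceeding this: throttled
        memory: 512Mi    # exceeding this: OOMKilled
```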


Try It: Schedule Pods onto Nodes

Scheduler Playground

Set your pod's resource requests and see which node the scheduler picks.

Pod request: 1 vCPU / 2 Gi memory.
node-1 — CPU: 3 / 4 free, memory: 5 / 8 Gi free
node-2 — CPU: 1 / 2 free, memory: 2 / 4 Gi free
node-3 — CPU: 2 / 8 free, memory: 4 / 16 Gi free, taint: gpu=true:NoSchedule
Key insight: The scheduler filters nodes that can't fit the pod (insufficient resources or taints), then scores remaining candidates. Resource requests are guarantees — set them accurately so the scheduler makes good decisions.

Autoscaling with HPA

The HorizontalPodAutoscaler watches a metric—usually CPU utilization—and adjusts replica count automatically. When load rises above target, HPA adds pods. When it drops, HPA removes them. It respects min and max bounds so you never scale to zero or infinity.
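The simulator's settings below (target 50% CPU, min 1, max 6) map to an HPA manifest roughly like this sketch:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # the workload being scaled
  minReplicas: 1
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # percentage of the pods' CPU *requests*
```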

Try It: Watch HPA React to Load

HPA Autoscaler

Target: 50% CPU • Min: 1 • Max: 6. Average CPU utilization: 30%. Replicas: 2 / 6 (P1, P2 running).
Key insight: The HPA checks metrics every 15s (default) and adjusts replicas to keep CPU near target. Scaling up spreads load, scaling down saves resources. Always set resource requests accurately — HPA compares actual usage against requested amounts.

Taints, Tolerations, and Affinity

Beyond resources, Kubernetes offers fine-grained placement control. Click any command to copy it.
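A pod tolerating the gpu taint from the scheduler playground, plus a node-affinity rule, might be sketched as follows — the accelerator label is a hypothetical example:

```yaml
# Pod spec fragment — tolerates a gpu=true:NoSchedule taint
spec:
  tolerations:
    - key: gpu
      operator: Equal
      value: "true"
      effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: accelerator          # hypothetical node label
                operator: In
                values: ["nvidia-gpu"]
```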

Set requests honestly, limits conservatively. Let HPA handle traffic spikes so you don't.

Reliability and Rollouts

Use probes, rollout strategy, and rollback to ship safely.

Health Checks: Probes

Kubernetes does not just start your container and hope for the best. It continuously checks whether your app is healthy using probes. A liveness probe restarts stuck containers. A readiness probe gates traffic—failing it removes the pod from Service endpoints without killing it. A startup probe protects slow-starting apps. Click each probe to understand when to use it.
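All three probes attach to a container spec; a sketch with assumed health endpoints and timings:

```yaml
containers:
  - name: my-app
    image: my-app:1.0
    startupProbe:              # gives slow starters up to 30 × 5s = 150s
      httpGet: { path: /healthz, port: 8080 }
      failureThreshold: 30
      periodSeconds: 5
    livenessProbe:             # restarts the container if it gets stuck
      httpGet: { path: /healthz, port: 8080 }
      periodSeconds: 10
    readinessProbe:            # gates traffic without killing the container
      httpGet: { path: /ready, port: 8080 }
      periodSeconds: 5
```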


Try It: Rolling Update and Rollback

Rolling Update Simulator

Deployment: my-app — strategy: RollingUpdate (maxSurge: 1, maxUnavailable: 0). Four v1 pods Running: my-app-v1-t8soa, my-app-v1-z00mw, my-app-v1-p5mvh, my-app-v1-bfwxq. Click Deploy v2 to watch new pods move through Creating and old ones through Terminating in the event log.

Zero-downtime insight: With maxUnavailable: 0, Kubernetes never terminates an old pod until a new pod has passed its readiness probe. Traffic is routed only to pods that are Ready, so end users see no service interruption even as pods are replaced one by one.
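The simulator's settings correspond to this Deployment strategy fragment:

```yaml
# Deployment spec fragment
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod during the rollout
      maxUnavailable: 0    # never drop below 4 ready pods
```

If the new version misbehaves, rollback is one command: `kubectl rollout undo deployment/my-app`.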

Troubleshooting: A Systematic Approach

When something breaks, resist the urge to delete and redeploy. Instead, follow a diagnostic sequence: start with kubectl get pods to see the status, then kubectl logs for application errors, then kubectl describe pod for events and conditions, then cluster-wide events. Each command reveals a different layer of the problem.

Explore: Debug in the Right Order

Troubleshooting Flow Simulator

Practice the systematic order for debugging Kubernetes pod issues. Select a scenario, then click the commands in the correct diagnostic order.

Pod stuck in Pending

A pod has been submitted but never transitions to Running. Something is preventing it from being scheduled.

Rollout Management

Rolling updates replace pods incrementally. If something goes wrong, rollback instantly. Click any command to copy it.

Probes keep your app honest. Rolling updates keep deployments safe. When things break, diagnose before you delete.

Control Plane Deep Mental Model

Trace manifests through API server, etcd, scheduler, and kubelet.

The Lifecycle of kubectl apply

When you run kubectl apply -f deployment.yaml, a chain of asynchronous events unfolds. The manifest hits the API server, which validates and stores it in etcd. The Deployment controller notices the new desired state and creates a ReplicaSet. The scheduler assigns pods to nodes. The kubelet on each node pulls images and starts containers. No single component orchestrates the full sequence—each watches, reacts, and hands off. Click each component to understand its role.


Explore: Trace the Request Path

Control Plane Journey

Trace a Pod from kubectl to Running

Step 1 of 6 — kubectl apply: kubectl serializes your YAML manifest and sends an authenticated HTTP POST to the API Server.


How it works: The API Server is the single entry point — all components communicate through it. etcd is the only stateful component. The Scheduler and Kubelet are stateless reconcilers.

Watching the System Work

These commands let you observe the control plane in action. Click any command to copy it.

Kubernetes is not one big program—it is a set of independent controllers watching and reacting. Understanding this model is the key to debugging anything.