What Does Kubernetes Actually Do? A Guide From a Practitioner

I remember my first Kubernetes deployment like it was yesterday. We spent three weeks setting up a cluster. Two more weeks debugging networking. The system ran for exactly four hours before crashing.

Everyone told me Kubernetes would solve all my problems. It didn't. Not because Kubernetes is bad. Because I didn't understand what it actually does.

Here's the hard truth: Kubernetes is not magic. It's a distributed systems operating system. Nothing more. Nothing less.

What is Kubernetes? Kubernetes is an open-source container orchestration platform that automates deployment, scaling, and management of containerized applications across a cluster of machines. Born from Google's Borg system, it handles scheduling, networking, storage, and health monitoring so you don't have to write custom infrastructure code.

In this guide, I'll share what I've learned building production AI systems at SIVARO. No theory. Just practical knowledge from running clusters that process 200K+ events per second.

Kubernetes Solves Specific Infrastructure Problems

Most people think Kubernetes is about containers. They're wrong. Kubernetes solves three specific problems that plague distributed systems.

1. Resource management at scale
When you have 50+ microservices, manual resource allocation breaks. Kubernetes handles bin packing automatically. According to CNCF Annual Survey 2025, 68% of organizations now run Kubernetes in production, primarily to solve resource utilization problems.

2. Self-healing infrastructure
Applications crash. Nodes fail. Networks partition. Kubernetes continuously reconciles desired state with actual state. If a pod dies, it spawns a replacement. If a node goes dark, it reschedules workloads.

3. Declarative configuration
You tell Kubernetes what you want, not how to do it. This shifts your mental model from imperative commands to declarative state management. It's a fundamental shift in how you think about infrastructure.

In my experience, the teams that struggle most are those expecting Kubernetes to fix bad application architecture. It won't. A distributed monolith in Kubernetes is still a distributed monolith.

The Core Architecture Components

Understanding what Kubernetes actually does requires knowing its components. There are two planes: control plane and data plane.

Control Plane Components

The control plane makes decisions. It includes:

kube-apiserver: The front door. All communication goes through this REST API.
etcd: The source of truth. A distributed key-value store that holds cluster state.
kube-scheduler: Decides which node runs each pod.
kube-controller-manager: Runs controller processes that handle replication, endpoints, and node health.

Worker Node Components

Worker nodes run your actual applications:

kubelet: The node agent. Ensures containers run in pods.
kube-proxy: Maintains network rules on each node.
Container runtime: Actually runs containers (containerd, CRI-O).

Here's what this looks like in practice. A basic deployment configuration:

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inference
  template:
    metadata:
      labels:
        app: inference
    spec:
      containers:
      - name: model-server
        image: sivaroml/inference:2.4.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "4Gi"
            cpu: "2"
          limits:
            memory: "8Gi"
            cpu: "4"

This isn't just configuration. It's a contract. Kubernetes guarantees three replicas of this inference service will exist. If one dies, it creates another. No manual intervention.

Key Benefits for Your Project

Why should you care about Kubernetes? Because it changes how you operate systems.

1. Predictable scaling

You define scaling rules. Kubernetes handles the rest. Horizontal Pod Autoscaler (HPA) adjusts replicas based on metrics like CPU, memory, or custom metrics.

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inference-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inference-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

According to Datadog's 2025 Container Report, organizations using Kubernetes HPA saw 40% fewer infrastructure costs compared to static provisioning.

2. Rolling updates with zero downtime

Kubernetes handles deployment strategies out of the box. Rolling updates replace pods incrementally. If a new version fails health checks, the deployment rolls back automatically.

I've found that teams waste weeks building custom deployment pipelines. Kubernetes has this built-in. Use it.

3. Service discovery and load balancing

Kubernetes provides DNS-based service discovery. Every service gets a DNS name. Pods get virtual IP addresses. Traffic routes automatically to healthy pods.

$ kubectl get svc inference-service
NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
inference-service   ClusterIP   10.96.123.45    <none>        8080/TCP   14d

This isn't trivial to build yourself. Kubernetes does it for free.

Technical Deep Dive: What Happens Under the Hood

Let's talk about what Kubernetes actually does when you run kubectl apply.

The Scheduling Dance

When you submit a Deployment, here's the execution path:

Authentication: kube-apiserver validates your credentials
Admission control: Mutating and validating webhooks modify or reject the request
Storage: The resource spec gets written to etcd
Scheduling: The scheduler finds a candidate node based on resource requests, node affinity, taints, and tolerations
Pod creation: kubelet on the selected node pulls images and starts containers
Health checks: Liveness and readiness probes verify the application works
Service endpoints: kube-proxy updates iptables to route traffic

Here's how you can watch this happen in real-time:

bash
# Watch pod scheduling decisions
kubectl get pods -w --all-namespaces | grep inference

# See detailed scheduling events
kubectl describe pod inference-service-7d4f8b5c9-x2m3n

# Check scheduler logs
kubectl logs -n kube-system kube-scheduler-master-node

# Trace API server calls
kubectl get events --all-namespaces --sort-by='.lastTimestamp'

Networking: The Part Nobody Explains

Kubernetes networking is where most practitioners get confused. Here's the reality:

Every pod gets a unique IP address. Pods on different nodes communicate directly. No NAT. No port mapping. This is called the "flat network" model.

The Container Network Interface (CNI) plugin handles this. Popular choices include Calico, Flannel, and Cilium. Each implements the model differently.

bash
# Check which CNI plugin is installed
kubectl get pods -n kube-system | grep -i cni

# Inspect network policies
kubectl get networkpolicies --all-namespaces

# Test connectivity between pods
kubectl exec -it debug-pod -- curl http://inference-service:8080/health

In my experience, networking causes 70% of Kubernetes production issues. The CNI plugin choice matters. Calico gives you network policies. Cilium gives you eBPF-based observability. Flannel is simplest but limited.

Storage: The Hidden Complexity

Stateful applications in Kubernetes require PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs). This is where theory meets reality.

yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: ssd-retain

The hard truth about Kubernetes storage: attaching to cloud block storage takes 30-60 seconds. Pod startup time increases dramatically. For AI workloads loading large models, this delay is unacceptable.

We solved this at SIVARO by using node-local SSDs with a DaemonSet that preloaded models. Startup time dropped from 90 seconds to 3 seconds. Read about our approach in The Kubernetes Stateful Workloads Guide.

Industry Best Practices

After running Kubernetes clusters for 5+ years across multiple companies, here's what works.

Resource Requests and Limits

Always set resource requests and limits. A pod without limits can starve the entire node.

yaml
resources:
  requests:
    memory: "2Gi"
    cpu: "500m"
  limits:
    memory: "4Gi"
    cpu: "1"

The rule: requests are what Kubernetes guarantees. Limits are what the pod can use if available. Setting them close together prevents noisy neighbors.

Pod Disruption Budgets

Production nodes need maintenance. Without PodDisruptionBudgets (PDBs), your application goes down when nodes drain.

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: inference-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: inference

This ensures at least 2 pods stay running during voluntary disruptions.

Namespace Isolation

Use namespaces for environment isolation. Development, staging, and production should never mix. According to Kubernetes Security Report 2025, 42% of breaches involved applications in the same namespace as critical infrastructure.

bash
# Create namespaces for isolation
kubectl create namespace production
kubectl create namespace staging

# Apply resource quotas
kubectl apply -f resource-quota.yaml -n production

# Set context
kubectl config set-context prod-cluster --namespace=production

Making the Right Choice: When Kubernetes Makes Sense

Not every project needs Kubernetes. Here's my honest assessment.

You need Kubernetes when:

Running 10+ microservices that need independent scaling
Requiring zero-downtime deployments multiple times daily
Operating across multiple cloud providers or on-premises
Building systems that must self-heal from node failures

You don't need Kubernetes when:

Running 1-3 monolithic applications
Your team has no DevOps experience
Traffic patterns are predictable (no scaling needed)
You can restart everything during maintenance windows

The CNCF Annual Survey 2025 found that organizations with fewer than 50 employees saw negative ROI from Kubernetes. The complexity tax exceeds benefits at small scale.

At SIVARO, we use Kubernetes for production ML inference. We skip it for internal tools and experiments. Pragmatism over dogma.

Handling Common Challenges

Kubernetes problems fall into predictable categories. Here's how to solve the ones I've seen most.

Pod CrashLoopBackOff

This means your application keeps crashing. Debug systematically:

bash
# Check pod status
kubectl describe pod inference-service-7d4f8b5c9-x2m3n

# View logs
kubectl logs inference-service-7d4f8b5c9-x2m3n --previous

# Execute commands inside crashing pod
kubectl debug -it inference-service-7d4f8b5c9-x2m3n --image=busybox -- sh

Common causes: missing environment variables, incorrect command arguments, port conflicts.

Node Pressure

Nodes run out of resources. The kubelet evicts pods. Prevent this with resource quotas and priority classes.

yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: production
spec:
  hard:
    requests.cpu: "40"
    requests.memory: "100Gi"
    limits.cpu: "80"
    limits.memory: "200Gi"

Networking Timeouts

Services intermittent failing? Check DNS resolution:

bash
# Test DNS from within cluster
kubectl run -it --rm debug --image=busybox -- nslookup kubernetes.default

# Check service endpoints
kubectl get endpoints inference-service

# Verify CNI plugin health
kubectl get pods -n kube-system | grep -E 'calico|flannel|cilium'

According to Datadog's 2025 Container Report, 60% of Kubernetes networking issues stem from misconfigured network policies.

Frequently Asked Questions

Can Kubernetes run without internet?
Yes, but you need a private container registry and DNS. Air-gapped deployments are common in finance and defense. Pre-pull images to avoid registry latency.

How many nodes does a minimal cluster need?
Three control plane nodes for high availability. Two worker nodes minimum. Single-node clusters work for development but risk data loss.

Does Kubernetes charge money?
Kubernetes itself is free. You pay for underlying cloud resources: compute, storage, networking. Managed services like EKS, AKS, and GKE add management fees.

Can Kubernetes replace Docker?
Docker is a container runtime. Kubernetes is an orchestrator. They complement each other. Kubernetes can use containerd or CRI-O instead of Docker.

How long does it take to learn Kubernetes?
Basic operations: 2-4 weeks. Production-ready expertise: 6-12 months. The learning curve is steep because you're learning distributed systems concepts simultaneously.

What happens when etcd fails?
The cluster becomes read-only. Existing pods continue running. New deployments and scaling operations stop. This is why etcd backup is critical.

Can I run stateful applications in Kubernetes?
Yes, but it's harder. Use StatefulSets for stable network identities and ordered deployment. Avoid for databases if you can use managed services instead.

Is Kubernetes secure by default?
No. Default configurations are insecure. You must implement RBAC, network policies, pod security standards, and secrets management yourself.

Summary and Next Steps

Kubernetes is powerful but not simple. It solves real distributed systems problems: resource management, self-healing, and declarative configuration. But it introduces complexity in networking, storage, and operations.

My advice after years in the trenches:

Start with managed Kubernetes (EKS, AKS, GKE)
Use namespaces for isolation from day one
Set resource limits on everything
Invest in observability before going to production
Know when not to use Kubernetes

If you're building data-intensive AI systems like we do at SIVARO, Kubernetes is worth the investment. Just walk in with eyes open.

Next step: Deploy a simple application. Break it. Fix it. Learn the recovery patterns. That's where real understanding comes from.

Author Bio

Nishaant Dixit: Founder of SIVARO. Building data infrastructure and production AI systems since 2018. Built systems processing 200K events/sec. Connect on LinkedIn: Nishaant Veer Dixit

Sources

CNCF Annual Survey 2025
Datadog's 2025 Container Report
Kubernetes Security Report 2025
The Kubernetes Stateful Workloads Guide