What Exactly Is Kubernetes Used For? A Practitioner's Guide
Let me tell you a story. Back in 2019, I was running a data pipeline that processed about 50,000 events per second. We had three microservices, a handful of Docker containers, and a bash script that restarted things when they failed. It worked. For a while.
Then we hit 200,000 events per second. The bash script choked. Services died in weird cascading patterns. Debugging took hours. I spent a weekend manually SSHing into boxes, tailing logs, and restarting things like some kind of digital janitor. That weekend cost us about $12,000 in compute waste and lost data.
I knew I needed Kubernetes. But I also knew it wasn't a magic wand.
Most people think Kubernetes is "Docker for production." They're wrong. Others think it's a cloud management tool. Also wrong. So what exactly is Kubernetes used for? Let me show you what I've learned running production systems at SIVARO for the last six years.
What Kubernetes Actually Does
Here's the shortest honest answer: Kubernetes is a distributed system for running containerized applications at scale. It handles scheduling, scaling, networking, and fault tolerance, but it does none of that unprompted. You have to tell it what you want.
The Kubernetes overview docs describe it as a platform for automating deployment, scaling, and management of containerized applications. That's technically correct. But it misses the real point.
What Kubernetes actually does is turn infrastructure into a declarative API.
You describe your desired state — "I want 3 copies of this service, port 8080, health check endpoint /health, restart on failure" — and Kubernetes makes it happen. It constantly reconciles actual state with desired state. That's the core loop.
yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: sivaro/api:v2.3.1
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
That yaml file is not configuration. It's a contract with the infrastructure. You write it, commit it to git, and the cluster makes it real. This is the fundamental shift Kubernetes brings — from imperative operations to declarative management.
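To make the reconcile loop concrete, here is a toy sketch of a single reconciliation pass in shell. The `reconcile` function is a hypothetical name for illustration; it only models the comparison of desired and actual replica counts and is not how the Kubernetes controllers are actually implemented.

```shell
#!/usr/bin/env bash
# Toy model of one reconciliation pass (illustration only, not real
# controller code): compare desired and actual replica counts and print
# the action a controller would take to close the gap.
reconcile() {
  local desired="$1" actual="$2"
  if [ "$actual" -lt "$desired" ]; then
    echo "start $((desired - actual))"
  elif [ "$actual" -gt "$desired" ]; then
    echo "stop $((actual - desired))"
  else
    echo "noop"
  fi
}

# The manifest asks for 3 replicas, but one pod just crashed.
reconcile 3 2   # prints "start 1"
```

A real controller runs this comparison continuously against live cluster state, which is why a crashed pod gets replaced without anyone issuing a command.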
The Five Real Reasons to Use Kubernetes
I've talked to probably 200 engineers about this. The answers vary, but five patterns keep showing up.
1. Self-Healing Without a Pager
When a container dies, Kubernetes restarts it. When a node dies, it reschedules the workloads. When traffic spikes, it scales up, provided you've configured an autoscaler. That is what Kubernetes actually does: it handles the boring operational work so you don't have to.
At SIVARO, we had a memory leak in a data ingestion service. It would crash every 72 hours. Before Kubernetes, someone patched it at 3 AM. After Kubernetes, we set memory limits and a restart policy. The service crashed 47 times in a month. No one noticed. The cluster just kept running.
yaml
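apiVersion: apps/v1
kind: Deployment
metadata:
  name: crashy-service
spec:
  replicas: 2
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: crashy-service
  template:
    metadata:
      labels:
        app: crashy-service
    spec:
      # Always is the default (and only allowed) policy for Deployment pods
      restartPolicy: Always
      containers:
        - name: main
          image: sivaro/crashy:v1.0
          resources:
            limits:
              memory: "512Mi"
            requests:
              memory: "256Mi"
          livenessProbe:
            exec:
              command: ["cat", "/tmp/healthy"]
            initialDelaySeconds: 5
            periodSeconds: 5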
One company I know — a fintech startup — was running 12 microservices on bare EC2 instances. They had 3 engineers on call. Their MTTR (mean time to recovery) was 45 minutes. After migrating to Kubernetes, it dropped to 4 minutes. The difference wasn't better code. It was automated recovery.
2. Scaling Without Guessing
Traditional autoscaling is reactive. You set a threshold, wait for alarms, then add capacity. Kubernetes does horizontal pod autoscaling based on real-time metrics. CPU, memory, custom metrics — pick your poison.
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
But here's the thing most people miss: scaling isn't just about handling more load. It's about spending less money when there's no load.
A startup I advised was running 8 servers 24/7, even at 2 AM when traffic dropped to 5% of peak. They were burning $3,400/month. After Kubernetes, they scaled to 2 pods at night, 14 during peak. Their bill dropped to $1,100/month. Same workload, about a third of the cost.
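The replica count the autoscaler aims for follows the formula in the Kubernetes documentation: desired = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The sketch below reproduces that arithmetic in shell. `hpa_desired` is a hypothetical helper name, and the sketch ignores the tolerance band and stabilization windows the real controller applies.

```shell
#!/usr/bin/env bash
# hpa_desired CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Simplified HPA arithmetic: ceil(current * util / target), then clamp.
# Ignores the controller's tolerance and stabilization behaviour.
hpa_desired() {
  local current="$1" util="$2" target="$3" min="$4" max="$5"
  # Integer ceiling division.
  local desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

# 3 pods at 140% CPU against a 70% target, bounds 3..20
hpa_desired 3 140 70 3 20    # prints 6
# 10 pods at 35% CPU: scale down
hpa_desired 10 35 70 3 20    # prints 5
```

The second call is the cost story in miniature: when utilization falls well below target, the same formula shrinks the deployment.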
3. Deployments That Don't Break Things
The old way: git pull, build, restart service, hope it works.
The Kubernetes way: rolling updates with health checks, rollback on failure, zero-downtime deployments.
bash
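# Rolling update (the Deployment default): pods are replaced gradually
kubectl set image deployment/api-server api=sivaro/api:v2.4.0
# Block until the rollout finishes or fails
kubectl rollout status deployment/api-server
# If something goes wrong, return to the previous revision
kubectl rollout undo deployment/api-server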
I've seen teams deploy 50 times a day after adopting Kubernetes. Before, they deployed once a week. The difference isn't cultural — it's safety. When rolling back is a one-liner, you deploy more. When deploying means SSH and sudo, you delay.
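The deploy-then-undo sequence can be wrapped so a failed rollout reverts itself. This is a minimal sketch, not a recommended production pipeline: `deploy_with_rollback` and the `KUBECTL` override variable are hypothetical names introduced here, while `kubectl set image`, `kubectl rollout status --timeout`, and `kubectl rollout undo` are standard kubectl commands.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with a stub in tests); hypothetical.
KUBECTL="${KUBECTL:-kubectl}"

# deploy_with_rollback DEPLOYMENT CONTAINER IMAGE [TIMEOUT]
# Updates the image, waits for the rollout, and undoes it on failure.
deploy_with_rollback() {
  local deployment="$1" container="$2" image="$3" timeout="${4:-120s}"
  "$KUBECTL" set image "deployment/$deployment" "$container=$image" || return 1
  if "$KUBECTL" rollout status "deployment/$deployment" --timeout="$timeout"; then
    echo "rollout of $image succeeded"
  else
    echo "rollout of $image failed, rolling back" >&2
    "$KUBECTL" rollout undo "deployment/$deployment"
    return 1
  fi
}

# Usage (requires a cluster):
#   deploy_with_rollback api-server api sivaro/api:v2.4.0
```

The timeout matters: without it, `rollout status` can wait a long time on a deployment whose new pods never become ready.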
4. Multi-Cloud Without Lock-In
This one is controversial. A lot of people say Kubernetes gives you cloud portability. I think that's mostly marketing. But there's a kernel of truth.
Running the same Kubernetes manifests on AWS, GCP, or Azure means your application deployment is portable. Not your infrastructure — your application. The Google Cloud documentation makes this point well: Kubernetes abstracts the underlying compute layer.
Here's a practical example. We ran a batch processing job on GKE (Google Kubernetes Engine). Then we wanted to move to EKS (Amazon Elastic Kubernetes Service). The Kubernetes manifests worked almost unchanged: the only differences were the cloud-specific load balancers and storage classes.
yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-store
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  # GCP version
  storageClassName: standard
  # AWS version (swap this)
  # storageClassName: gp2
Is Kubernetes the same as AWS? No. Kubernetes runs on AWS (via EKS), but it's a separate system. Think of it as the operating system for containers: AWS provides the hardware, Kubernetes manages the scheduling.
5. Resource Efficiency Through Bin Packing
This is the one nobody talks about in blog posts. Kubernetes can pack workloads onto nodes with insane efficiency.
Without Kubernetes, you provision servers for peak load. A service needs 2 CPU cores, so you buy a 4-core instance. Another service needs 4GB RAM, so you buy another server. You're leaving 40-60% of capacity on the floor.
Kubernetes schedules multiple workloads on the same node, bin-packing pods by their CPU and memory requests. The result? You use 30-50% less infrastructure for the same workload.
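A toy first-fit packer shows why sharing nodes saves capacity. This is a sketch under simplifying assumptions: it packs on one dimension (CPU millicores) only, and `first_fit` is a hypothetical helper. The real kube-scheduler filters and scores nodes on many more criteria, so treat this as an illustration of the idea rather than the scheduler's algorithm.

```shell
#!/usr/bin/env bash
# first_fit NODE_CAPACITY REQUEST...
# Places each request on the first node with enough free capacity,
# adding a node when none fits. Prints the number of nodes used.
first_fit() {
  local capacity="$1"; shift
  local nodes="" req free placed updated
  for req in "$@"; do
    if [ "$req" -gt "$capacity" ]; then
      echo "request $req exceeds node capacity $capacity" >&2
      return 1
    fi
    placed=0
    updated=""
    # nodes is a space-separated list of free capacity per node
    for free in $nodes; do
      if [ "$placed" -eq 0 ] && [ "$free" -ge "$req" ]; then
        free=$((free - req))
        placed=1
      fi
      updated="$updated $free"
    done
    if [ "$placed" -eq 0 ]; then
      updated="$updated $((capacity - req))"
    fi
    nodes="$updated"
  done
  set -- $nodes
  echo "$#"
}

# Six services on 4-core (4000m) nodes
first_fit 4000 2000 2000 1000 1000 1000 500   # prints 2
```

With one server per service, those six workloads would occupy six machines; packed by request, they fit on two.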
When You Should NOT Use Kubernetes
I've seen teams go all-in on Kubernetes because "that's what Netflix uses." Is Netflix using Kubernetes? Not as its primary platform: they use Titus, their own container orchestration platform built in-house. They don't run Kubernetes at scale for their streaming services. They built something custom.
Here's my rule: Don't use Kubernetes unless at least one of these is true:
- You're running more than 5 microservices
- You need to deploy more than once a week
- Your team has at least 2 people who understand distributed systems
- You're spending more than $5,000/month on infrastructure
A Hacker News discussion titled "I Didn't Need Kubernetes, and You Probably Don't Either" makes this point brutally. The author ran a successful SaaS on a single server for years. Kubernetes would have been overhead, not value.
The company Ona wrote about leaving Kubernetes entirely. Their take: the complexity cost exceeded the value. For their team of 15 engineers, managing the cluster consumed 20% of their time. They moved to a simpler platform and cut operational overhead by 80%.
Why are people moving away from Kubernetes? Three reasons I've seen firsthand:
- Cost management complexity: Kubernetes doesn't manage cost. You still need to figure out node sizing, spot instances, and reserved capacity.
- Debugging hell: When something breaks in Kubernetes, the error chain can involve six different components. Network policy, service mesh, ingress controller, pod networking, storage driver, and application code can each fail independently.
- Skill requirements: A junior engineer can launch an EC2 instance. It takes months to understand Kubernetes RBAC, network policies, and scheduling semantics.
The Kubernetes hater's guide has a point: Kubernetes solves real problems but creates new ones. The trick is knowing if your problems are bigger than the problems Kubernetes introduces.
The Architecture You Actually Need (Based on Real Workloads)
Most Kubernetes tutorials show you three-tier web apps. My world is different. We build data infrastructure and production AI systems. Here's what a real architecture looks like.
yaml
# Data pipeline with streaming and batch
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stream-processor
spec:
  replicas: 4
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: stream-processor
  template:
    metadata:
      labels:
        app: stream-processor   # also the label the anti-affinity rule matches
    spec:
      containers:
        - name: processor
          image: sivaro/stream:v1.5
          env:
            - name: KAFKA_BROKERS
              value: "kafka-cluster:9092"
            - name: INPUT_TOPIC
              value: "raw-events"
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
            limits:
              cpu: "4"
              memory: "8Gi"
      # Spread replicas so no two land on the same node
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - stream-processor
              topologyKey: "kubernetes.io/hostname"
Notice the resource requests vs limits. This is critical. Requests are what the scheduler reserves for the pod. Limits are the ceiling it can burst to: CPU above the limit is throttled, and memory above the limit gets the container killed. Set requests close to actual usage, limits 2x higher. This gives the scheduler visibility into actual needs while allowing temporary spikes.
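The "requests near actual usage, limits at 2x" rule is simple arithmetic. A small sketch that turns observed usage into a `resources` fragment is shown below; `suggest_resources` is a hypothetical helper, and the observed figures would come from your own monitoring (for example `kubectl top pod`).

```shell
#!/usr/bin/env bash
# suggest_resources OBSERVED_CPU_MILLICORES OBSERVED_MEMORY_MI
# Applies the rule of thumb from the text: requests = observed usage,
# limits = 2x requests. Prints a YAML fragment for a container spec.
suggest_resources() {
  local cpu_m="$1" mem_mi="$2"
  cat <<EOF
resources:
  requests:
    cpu: "${cpu_m}m"
    memory: "${mem_mi}Mi"
  limits:
    cpu: "$((cpu_m * 2))m"
    memory: "$((mem_mi * 2))Mi"
EOF
}

# 2 cores and 4Gi observed, matching the stream-processor example
suggest_resources 2000 4096
```

The 2x multiplier is this article's rule of thumb, not a Kubernetes default; tighter or looser ratios may suit workloads with different burst profiles.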
Debugging Kubernetes: The Stuff Nobody Teaches You
Let me save you hundreds of hours. Here are the three most common failures I've seen in production Kubernetes clusters.
1. Pods stuck in CrashLoopBackOff
This isn't a Kubernetes problem. It's an application problem. But Kubernetes makes it look like an infrastructure issue.
bash
# See why it's crashing
kubectl logs pod-name --previous
kubectl describe pod pod-name | grep -A 10 "Last State"
kubectl exec -it pod-name -- /bin/sh # If it runs long enough
In my experience, about 90% of crash loops come from misconfigured environment variables. The other 10% come from memory limits set too low.
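To tell those two causes apart quickly, you can read the last termination reason from the pod status: a container killed for exceeding its memory limit reports `OOMKilled`. The sketch below wraps that lookup. `crash_hint` and the `KUBECTL` override are hypothetical names, and it only inspects the first container in the pod.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with a stub in tests); hypothetical.
KUBECTL="${KUBECTL:-kubectl}"

# crash_hint POD_NAME
# Reads the previous termination reason of the pod's first container
# and prints a hint about where to look next.
crash_hint() {
  local pod="$1" reason
  reason=$("$KUBECTL" get pod "$pod" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}')
  case "$reason" in
    OOMKilled) echo "$pod: OOMKilled, raise the memory limit or fix the leak" ;;
    "")        echo "$pod: no previous termination recorded" ;;
    *)         echo "$pod: exited with reason '$reason', check logs and env vars" ;;
  esac
}

# Usage (requires a cluster):
#   crash_hint pod-name
```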
2. DNS resolution fails
Kubernetes has a built-in DNS service (CoreDNS). It breaks silently.
bash
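# Test DNS from inside a throwaway pod (busybox 1.28 is pinned because
# nslookup in some newer busybox images is known to misbehave)
kubectl run dns-test --image=busybox:1.28 --restart=Never --rm -it -- nslookup kubernetes.default.svc.cluster.local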
If this fails, your CoreDNS deployment is unhealthy. Nine times out of ten, increasing CoreDNS replicas from 2 to 3 fixes it.
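Checking and scaling CoreDNS might look like the sketch below. The namespace `kube-system`, the label `k8s-app=kube-dns`, and the deployment name `coredns` are assumptions that hold on many kubeadm and EKS clusters but vary by distribution, so confirm them in your cluster first. The function names and the `KUBECTL` override are hypothetical.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with a stub in tests); hypothetical.
KUBECTL="${KUBECTL:-kubectl}"

# Assumed names: namespace kube-system, label k8s-app=kube-dns,
# deployment "coredns". Verify these in your own cluster.
coredns_status() {
  "$KUBECTL" get pods -n kube-system -l k8s-app=kube-dns -o wide
}

coredns_logs() {
  "$KUBECTL" logs -n kube-system -l k8s-app=kube-dns --tail=20
}

# coredns_scale [REPLICAS]  (defaults to 3)
coredns_scale() {
  local replicas="${1:-3}"
  "$KUBECTL" scale deployment coredns -n kube-system --replicas="$replicas"
}

# Usage (requires a cluster):
#   coredns_status && coredns_logs
#   coredns_scale 3
```

On managed clusters the provider may reconcile the DNS deployment itself, in which case a manual scale can be reverted.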
3. Nodes running out of resources
Kubernetes evicts pods when nodes run low on memory or disk. Your application goes down without clear errors.
bash
# Check node resources
kubectl top nodes
kubectl describe node node-name | grep -i pressure
kubectl get events --field-selector reason=Evicted
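Prevention: set memory requests correctly so the scheduler doesn't overcommit nodes, and use PodDisruptionBudgets. Note that a PodDisruptionBudget only guards against voluntary disruptions such as node drains; the kubelet's node-pressure eviction does not honor it, so accurate requests are the real defense here.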
yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
The Real Cost Story
Let me be honest about money. Kubernetes itself is free. The cluster management, egress, storage, and monitoring — those cost money.
A production-grade cluster on AWS EKS (3 worker nodes, 1 control plane) runs about $1,200/month just for the infrastructure. Add monitoring (Datadog or Grafana Cloud) for $500/month. Add CI/CD (GitLab, GitHub Actions) for $200/month. A managed Kubernetes cluster costs $1,500-$2,500/month before you run any application workloads.
Is it worth it? For a team running 20 microservices processing 1 million requests/day — yes. For a team running 2 services processing 10,000 requests/day — no.
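The cost figures above are easy to sanity-check. A small sketch of the arithmetic, using only numbers quoted in this article, is below; `monthly_total` and `savings_percent` are hypothetical helper names, and results use integer division.

```shell
#!/usr/bin/env bash
# monthly_total COST... : sums monthly line items (whole dollars)
monthly_total() {
  local sum=0 item
  for item in "$@"; do
    sum=$((sum + item))
  done
  echo "$sum"
}

# savings_percent BEFORE AFTER : percentage saved, rounded down
savings_percent() {
  local before="$1" after="$2"
  echo $(( (before - after) * 100 / before ))
}

# Infrastructure + monitoring + CI/CD from the paragraph above
monthly_total 1200 500 200    # prints 1900
# The autoscaling example: $3,400/month down to $1,100/month
savings_percent 3400 1100     # prints 67
```

The $1,900 total sits inside the $1,500-$2,500 range quoted above, and it is the fixed overhead you pay before any workload runs.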
The Reddit thread asking "what is the main reason you would give a company to use Kubernetes" has the best answer: Kubernetes buys you time. Not money. Time. It automates the boring parts of operations so your engineers work on product features, not infrastructure maintenance.
FAQ: What People Actually Ask Me
Q: Is Kubernetes the same as Docker?
No. Docker runs containers. Kubernetes orchestrates containers. Think of Docker as a car engine. Kubernetes is the steering wheel, pedals, and dashboard. You need both, but they're different things.
Q: Is Kubernetes the same as AWS?
No. Kubernetes is a platform that runs on AWS (via EKS), on-premises, on GCP, or on Azure. AWS provides the physical infrastructure; Kubernetes manages the logical infrastructure.
Q: Is Netflix using Kubernetes?
Partly. They use it for some backend services, but their primary container orchestration is Titus (their own system). They built custom tooling because Kubernetes didn't meet their scale requirements for streaming.
Q: Does Kubernetes replace Docker?
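No. Kubernetes runs containers through a container runtime such as containerd or CRI-O; built-in support for Docker Engine (dockershim) was removed in Kubernetes 1.24. Images built with Docker still run unchanged. Kubernetes is an additional layer, not a replacement.
Q: What exactly is Kubernetes used for in production?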
Orchestrating containerized applications at scale. Managing deployments, scaling, networking, and fault tolerance. Think of it as a distributed operating system for your applications.
Q: Do I need a team to run Kubernetes?
Yes. You need at least one person who understands cluster management. Managed Kubernetes (EKS, GKE, AKS) reduces the burden but doesn't eliminate it.
Q: What does Kubernetes actually do that Docker Compose doesn't?
Docker Compose works on a single machine. Kubernetes works across dozens or hundreds of machines. Kubernetes also handles self-healing, autoscaling, rolling updates, service discovery, and load balancing.
Q: Why are people moving away from Kubernetes?
Some teams find the complexity outweighs the benefits. Smaller teams with simpler workloads often move to platforms like Render, Railway, or managed services. Larger teams sometimes move to serverless or PaaS solutions.
Final Thought
Kubernetes is a tool. Not a religion, not a career path, not a solution to every infrastructure problem. It solves a specific set of problems around container orchestration at scale.
If you need to run 3 containers on a single server, Kubernetes is overkill. If you need to run 300 containers across 30 servers, anything else is insufficient.
The best engineers I know don't ask "should we use Kubernetes?" They ask "what problem are we solving, and does Kubernetes solve it better than alternatives?"
That's the right question.
Nishaant Dixit — Founder of SIVARO. Building data infrastructure and production AI systems since 2018. Built systems processing 200K events/sec.