Is Netflix Using Kubernetes? The Real Story Behind Their Infrastructure
I get asked this question constantly. "Is Netflix using Kubernetes?" It comes up in every consulting engagement, every conference Q&A, every Reddit thread where someone's trying to justify their cloud spend to a skeptical CTO.
Here's the short answer: Netflix does not use Kubernetes for their core streaming infrastructure. They built their own custom container management system called Titus.
But that's not the whole story. And the whole story matters more than the headline.
Netflix does use Kubernetes in some places — but not the way you'd expect. They run it for internal tools, machine learning workloads, and experimental services. Their core playback pipeline, the system that delivers Stranger Things to your living room? That's all custom.
I've spent years building data infrastructure at SIVARO. I've watched teams burn millions on Kubernetes clusters they didn't need. And I've watched other teams drown trying to replicate Netflix's architecture without Netflix's engineering capacity.
Let me walk you through what Netflix actually runs, why they made those choices, and — more importantly — what you should learn from it.
What Does Kubernetes Actually Do?
Before we get deeper into Netflix, let's be brutally honest about what Kubernetes actually is.
Kubernetes is a container orchestrator. That's it. It schedules containers across machines, handles networking, manages service discovery, and tries to keep your apps running when things break.
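The "keep your apps running when things break" part is a reconciliation loop: compare what you asked for with what is actually running, then act on the difference. Here is a toy sketch of that idea in Python. The function name and the action strings are made up for illustration; this is not Kubernetes code, just the shape of the control loop.

```python
from __future__ import annotations


def reconcile(desired: int, running: list[str], crashed: set[str]) -> list[str]:
    """Return the actions that move the observed state toward the desired state.

    Hypothetical helper: `running` is the list of container names the
    orchestrator knows about, `crashed` is the subset that failed a health check.
    """
    healthy = [name for name in running if name not in crashed]

    # Crashed containers are removed first.
    actions = [f"remove {name}" for name in running if name in crashed]

    # Too few healthy containers: start replacements.
    for i in range(desired - len(healthy)):
        actions.append(f"start replacement {i + 1}")

    # Too many healthy containers: stop the surplus.
    for name in healthy[desired:]:
        actions.append(f"stop {name}")

    return actions


if __name__ == "__main__":
    # Three replicas wanted, one of them crashed.
    print(reconcile(3, ["web-a", "web-b", "web-c"], {"web-b"}))
```

A real orchestrator runs this comparison continuously and for many resource types at once, which is where the operational weight comes from.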
But here's what most people miss: Kubernetes doesn't solve business problems. It solves operational problems. It's a tool for managing complexity you've already created, not a magic wand that simplifies your architecture.
What exactly is Kubernetes used for in practice? Running microservices at scale. CI/CD pipelines. Batch processing. Machine learning training jobs. Database management (with operators). And, increasingly, stateful workloads that previously required dedicated infrastructure.
The question isn't "can you run it on Kubernetes?" The question is "should you?"
Netflix answered that question with a clear "no" for their core platform.
Netflix's Real Container Story
Netflix built Titus in 2016. It's their internal container management platform. It runs on top of AWS EC2, handles their streaming pipelines, their encoding jobs, their CDN management, and their chaos engineering experiments.
Why didn't they use Kubernetes? Timing. Kubernetes v1.0 only launched in July 2015, and it was far too immature for Netflix's scale when Titus was being built.
By the time Kubernetes became production-ready, Netflix already had Titus handling millions of containers per day. The migration cost wasn't worth the benefit.
But here's the part nobody talks about: Netflix runs Kubernetes too. Their machine learning platform uses Kubernetes. Their experimentation platform uses Kubernetes. Their internal developer tools use Kubernetes.
Is Kubernetes the same as AWS? No. Kubernetes is a platform for running containers. AWS is a cloud provider that offers compute, storage, and networking. Kubernetes runs on AWS (through EKS or self-managed). But Netflix's core infrastructure isn't on Kubernetes because they didn't need the abstraction layer Kubernetes provides. They needed deep AWS integration.
Why People Are Moving Away From Kubernetes
I've seen the trend. Everyone's asking why people are moving away from Kubernetes. The answer isn't simple, but it's real.
Complexity costs. Ona left Kubernetes because they couldn't justify the operational overhead for their team size. They had 20 engineers running 10 microservices. Kubernetes made everything harder, not easier.
Cost surprises. Kubernetes control plane costs add up. etcd storage. Load balancers. Monitoring infrastructure. You can burn $500/month on a cluster that handles 3 containers.
Skill scarcity. Good Kubernetes admins are expensive. And rare. Most "Kubernetes experts" can spin up a cluster and deploy an app — they can't debug a flaky scheduler or tune network policies for production throughput.
The abstraction penalty. Every layer of abstraction adds latency, complexity, and debugging friction. The Kubernetes haters guide nails this: Kubernetes makes simple things complex and complex things possible. But most teams only need simple things.
I consulted for a fintech company last year. They had 14 microservices running on 3 VMs. They'd spent 6 months migrating to Kubernetes. The migration cost $200K in engineering time. After migration? Their latency increased 12%. Their ops team went from 2 people to 5. And they had zero deployments that couldn't have been handled by Docker Compose and a cron job.
Do you actually NEED Kubernetes? Most of the time? No.
What Netflix Teaches Us About Platform Decisions
Netflix's infrastructure decisions tell us something important: build platforms, not abstractions.
Titus handles container scheduling, but it's deeply integrated with AWS networking, storage, and security primitives. Netflix didn't abstract away the cloud — they embraced it and built tooling around it.
What is the main reason you would give a company to use Kubernetes? The top answer on a Reddit thread asking exactly that: "When you have multiple teams deploying multiple services that need to share infrastructure." That's it. If you're a team of 10 running 2 services, Kubernetes is overhead.
Netflix has hundreds of teams. Thousands of services. Millions of containers. They needed orchestration. But they needed their orchestration, not a generic one.
How to Think About Your Own Infrastructure
I've been on both sides. I've run Kubernetes clusters handling 50,000 requests per second. I've also run production systems on a single VM with Docker Compose.
Here's my framework for deciding:
Use plain VMs or Docker Compose if:
- You have fewer than 10 services
- Your team has fewer than 5 engineers
- You deploy less than once per week
- Your traffic patterns are predictable
Consider Kubernetes if:
- You have multiple teams deploying independently
- You need dynamic scaling based on real-time metrics
- You're running 20+ services that share infrastructure
- You have dedicated ops/SRE headcount
Don't use Kubernetes for:
- Simple web apps
- Prototypes
- Teams that can't afford to lose a week when the control plane breaks
- Services that run on cron schedules (use a batch system instead)
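If it helps to make the checklist concrete, here is a rough sketch of it as a Python function. The field names, the "three signals" thresholds, and the returned labels are my own assumptions for illustration; the lists above don't say how many boxes you need to tick, so treat the cutoffs as a starting point rather than a rule.

```python
from dataclasses import dataclass


@dataclass
class TeamProfile:
    """Hypothetical inputs mirroring the checklist above."""
    services: int
    engineers: int
    deploys_per_week: float
    independent_teams: int
    predictable_traffic: bool
    needs_dynamic_scaling: bool
    has_dedicated_ops: bool


def recommend(p: TeamProfile) -> str:
    # Signals from the "plain VMs or Docker Compose" list.
    simple_signals = sum([
        p.services < 10,
        p.engineers < 5,
        p.deploys_per_week < 1,
        p.predictable_traffic,
    ])
    # Signals from the "consider Kubernetes" list.
    k8s_signals = sum([
        p.independent_teams > 1,
        p.needs_dynamic_scaling,
        p.services >= 20,
        p.has_dedicated_ops,
    ])

    # Assumption: without ops/SRE headcount, Kubernetes is off the table.
    if p.has_dedicated_ops and k8s_signals >= 3:
        return "consider-kubernetes"
    if simple_signals >= 3:
        return "vms-or-compose"
    return "in-between"


if __name__ == "__main__":
    startup = TeamProfile(3, 4, 0.5, 1, True, False, False)
    print(recommend(startup))
```

The "in-between" result is deliberate. It means the checklist alone can't decide for you and you need to look at your actual pain points.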
Netflix's Actual Architecture (What They Share Publicly)
Netflix open-sourced parts of Titus. Their GitHub shows how they handle:
Container scheduling: Titus was originally built on Apache Mesos (yes, they were Mesos users). It manages resource allocation across EC2 instances.
Networking: They built an ENI (Elastic Network Interface) binding system. Each container gets its own IP address in the VPC. No overlays, no NAT, no port mapping.
Storage: EBS volumes attached directly to containers. No PVCs, no CSI drivers.
Security: IAM roles per container. Not per pod, per container.
Here's the thing: Kubernetes could do all of this today. EKS supports ENI-based networking through AWS VPC CNI. You can use IAM roles for service accounts. You can attach EBS volumes through CSI drivers.
But Netflix built this before Kubernetes could do any of it. And they built it better for their specific use cases.
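To show what the Kubernetes-side equivalent of the IAM piece looks like, here is a small sketch that builds a ServiceAccount manifest with the `eks.amazonaws.com/role-arn` annotation used by IAM roles for service accounts on EKS. The function name, the service account name, and the role ARN are placeholders I made up. Note that this binds a role to a service account, so the granularity is per pod, not per container.

```python
import json


def service_account_with_iam_role(name: str, namespace: str, role_arn: str) -> dict:
    """Build a ServiceAccount manifest (as a plain dict) annotated with an IAM role."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Annotation key read by EKS to map the service account to an IAM role.
                "eks.amazonaws.com/role-arn": role_arn,
            },
        },
    }


if __name__ == "__main__":
    manifest = service_account_with_iam_role(
        name="encoder",                      # hypothetical service account
        namespace="default",
        role_arn="arn:aws:iam::111122223333:role/example-encoder-role",  # placeholder ARN
    )
    # JSON is valid YAML, so this output can be fed to kubectl apply -f -
    print(json.dumps(manifest, indent=2))
```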
When Kubernetes Makes Sense (Real Examples)
I'm not anti-Kubernetes. I'm anti-Kubernetes-for-the-wrong-reasons.
Spotify uses Kubernetes extensively. They run over 1,000 microservices on it. Their team is over 6,000 engineers. Kubernetes lets them standardize deployment patterns across squads.
Pinterest runs Kubernetes for their ML training infrastructure. They spawn thousands of training jobs daily. Kubernetes handles resource contention and prioritization better than their previous system.
Shopify uses Kubernetes for their merchant-facing APIs. They needed global scaling across multiple regions. Kubernetes with multi-cluster federation works for them.
These companies share a pattern: large engineering organizations, multiple teams, high deployment frequency, and a need for standardization.
Netflix doesn't share that pattern — their engineering is centralized around streaming infrastructure, with deep specialization in video encoding, CDN management, and recommendation systems.
The Code Problem No One Talks About
Here's something I've learned the hard way: Kubernetes YAML is code, and it has a cost.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
```
That's about 20 lines for a simple web server. Now add health checks, resource limits, service accounts, config maps, secrets, ingress rules, network policies, pod disruption budgets, and horizontal pod autoscalers.
Your simple web server is now 200 lines of YAML.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
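For a sense of what those numbers do at runtime, here is a sketch of the scaling rule the Kubernetes documentation describes: desired replicas equal the current replicas times the ratio of observed to target utilization, rounded up, then clamped to the min and max. The function name is mine, and I'm ignoring the tolerance band and stabilization windows the real controller applies.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float = 70.0,
    min_replicas: int = 3,
    max_replicas: int = 10,
) -> int:
    """Simplified HPA rule: ceil(current * observed / target), clamped to bounds."""
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 3 pods averaging 90% CPU against a 70% target.
    print(desired_replicas(3, 90.0))
```

Defaults mirror the manifest above (70% target, 3 to 10 replicas).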
And that's just the HPA. You also need network policies:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-nginx-ingress
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 80
```
Each YAML file is a potential failure point. A misconfigured network policy can silently block traffic. A missing resource limit can cause pod evictions. A wrong label selector can orphan your services.
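Two of those failure modes are cheap to catch before anything reaches a cluster. The sketch below works on manifests already loaded into plain Python dicts, and checks that a Service's selector actually matches the Deployment's pod labels and that every container declares resource limits. The function name and message wording are my own; real policy tools do far more than this.

```python
def lint(deployment: dict, service: dict) -> list[str]:
    """Return a list of human-readable problems found in the two manifests."""
    problems = []
    template = deployment.get("spec", {}).get("template", {})
    pod_labels = template.get("metadata", {}).get("labels", {})

    # A Service only routes to pods carrying every label in its selector.
    for key, value in service.get("spec", {}).get("selector", {}).items():
        if pod_labels.get(key) != value:
            problems.append(f"Service selector {key}={value} matches no pod label")

    # Containers without limits are the ones that surprise you under load.
    for container in template.get("spec", {}).get("containers", []):
        if "limits" not in container.get("resources", {}):
            problems.append(f"container {container.get('name', '?')} has no resource limits")

    return problems


if __name__ == "__main__":
    deployment = {
        "spec": {"template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {"containers": [{"name": "nginx", "image": "nginx:1.14.2"}]},
        }}
    }
    service = {"spec": {"selector": {"app": "ngnix"}}}  # deliberate typo
    for problem in lint(deployment, service):
        print(problem)
```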
Netflix's Titus configuration? It's Python code. They wrote their own DSL. Their engineers don't touch YAML. They write Python objects that compile down to container specs.
That's the lesson: abstract away your platform. Don't make every developer learn Kubernetes internals.
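Here is a minimal sketch of what "Python objects that compile down to container specs" can look like on top of Kubernetes. To be clear, this is not Titus's API or anything Netflix has published. `WebService` and its fields are hypothetical, and the output is just the Deployment from earlier, generated instead of hand-written.

```python
from dataclasses import dataclass


@dataclass
class WebService:
    """Hypothetical developer-facing object: the only knobs a team sees."""
    name: str
    image: str
    port: int = 80
    replicas: int = 3

    def to_deployment(self) -> dict:
        # Labels are derived once, so selector and template can never drift apart.
        labels = {"app": self.name}
        return {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": f"{self.name}-deployment"},
            "spec": {
                "replicas": self.replicas,
                "selector": {"matchLabels": labels},
                "template": {
                    "metadata": {"labels": labels},
                    "spec": {
                        "containers": [{
                            "name": self.name,
                            "image": self.image,
                            "ports": [{"containerPort": self.port}],
                        }]
                    },
                },
            },
        }


if __name__ == "__main__":
    spec = WebService(name="nginx", image="nginx:1.14.2").to_deployment()
    print(spec["metadata"]["name"], spec["spec"]["replicas"])
```

The point of the design is that the platform team owns `to_deployment`, so defaults like limits, probes, and policies can be added in one place without every developer editing YAML.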
The Real Answer to "Is Netflix Using Kubernetes?"
Yes and no.
No, they don't use Kubernetes for their core streaming infrastructure. That's Titus, which was built on Mesos.
Yes, they use Kubernetes for ML workloads, internal tools, and experimental services.
But the real answer is more nuanced: Netflix uses whatever solves the problem efficiently. They don't cargo-cult. They don't adopt technologies because they're trendy. They build what they need.
What exactly is Kubernetes used for? For Netflix? Not their critical path. For your company? Maybe, maybe not.
What You Should Do Instead of Copying Netflix
Stop trying to be Netflix. You don't have their engineering capacity, their traffic patterns, or their infrastructure debt.
Instead:
1. Start simple. Use a single VM. If you outgrow it, add another VM with a load balancer. If you outgrow that, consider containers. If you need multiple services with independent scaling, then look at orchestration.
2. Use managed services. AWS ECS Fargate handles container scheduling without Kubernetes complexity. Google Cloud Run runs containers serverlessly. Azure Container Instances does the same. Don't jump to Kubernetes until you've exhausted simpler options.
3. Measure before you migrate. Track your current deployment time, resource utilization, and engineering overhead. If Kubernetes saves you 2 hours per week in deployment time but costs 10 hours per week in maintenance, you're losing.
4. Hire for ops before platform. A good SRE can run 10 services on 3 VMs better than a bad SRE can run 200 services on Kubernetes. Build ops competence before platform complexity.
5. Remember: many people don't need Kubernetes. The HN thread is brutal but honest. Kubernetes solves problems that most companies don't have yet.
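The "measure before you migrate" step in point 3 is simple arithmetic, and it is worth writing down before anyone opens a migration ticket. A small sketch, with function names of my own choosing and a made-up 400-hour migration estimate in the example:

```python
from typing import Optional


def weekly_net_hours(hours_saved: float, hours_maintenance: float) -> float:
    """Positive means the platform pays for itself each week; negative means it costs you."""
    return hours_saved - hours_maintenance


def payback_weeks(
    migration_hours: float, hours_saved: float, hours_maintenance: float
) -> Optional[float]:
    """Weeks until the one-off migration effort is recovered, or None if it never is."""
    net = weekly_net_hours(hours_saved, hours_maintenance)
    if net <= 0:
        return None
    return migration_hours / net


if __name__ == "__main__":
    # The example from point 3: 2 hours saved, 10 hours of maintenance per week.
    print(weekly_net_hours(2, 10))
    # Hypothetical 400-hour migration under those same numbers.
    print(payback_weeks(400, 2, 10))
```

If the second function returns None, the migration never pays back on time saved alone, and you need a different justification for it.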
The Future: Where Kubernetes Is Actually Going
Three trends I'm watching:
1. Kubernetes as a distribution channel. Cloud providers are using Kubernetes as the interface for everything — databases, messaging, AI services. You deploy a "Kubernetes operator" and it manages an external service. This is interesting because it decouples the orchestration from the vendor.
2. Serverless Kubernetes. AWS Fargate for EKS, Google Cloud Run for Anthos, and Azure Container Instances (through AKS virtual nodes) let you run containers on Kubernetes without managing nodes. This removes the #1 pain point: cluster operations.
3. Platform engineering. Companies are building internal platforms on top of Kubernetes. They abstract away the YAML, the networking, the monitoring. Developers see a simple API: "deploy my service." This is what Netflix did with Titus, but on Kubernetes.
I think the third trend is the most important. Kubernetes isn't dying — it's becoming infrastructure that most teams shouldn't touch directly.
FAQ
Q: Does Netflix use Kubernetes in production?
A: For some workloads, yes. Their ML training platform and experimentation infrastructure run on Kubernetes. But their core streaming platform runs on Titus (their custom container manager, originally built on Apache Mesos).
Q: Is Netflix moving to Kubernetes?
A: Not for their core systems. The migration cost doesn't justify the benefit. They've invested years in Titus, and it handles millions of containers per day. They continue to use Kubernetes for newer, less critical workloads.
Q: Does it matter for my startup whether Netflix uses Kubernetes?
A: No. Netflix is an outlier. Their scale, engineering team size, and infrastructure debt mean their decisions don't apply to most companies. You should look at companies at your stage, not hyperscalers.
Q: What should I use instead of Kubernetes for small teams?
A: Docker Compose on a single VM. AWS ECS Fargate. Google Cloud Run. A plain app server with a process manager. "If it fits on one machine, keep it there."
Q: Why does Kubernetes get so much hate?
A: Because it's hyped as a solution for everything. When you ask why people hate Kubernetes, the answer is usually: complexity overhead, unclear cost models, and the expectation that "Kubernetes will fix our architecture problems." It won't. It only makes bad architecture more complex.
Q: Is Kubernetes the same as AWS?
A: No. AWS is a cloud provider. Kubernetes runs on AWS (or GCP, Azure, or on-premises). Kubernetes manages containers across machines. AWS manages virtual machines, storage, networking, and databases. They solve different problems.
Q: Can I run Netflix on Kubernetes?
A: Technically? Yes. But Netflix doesn't. And they have 200 million subscribers. The fact that they chose not to use Kubernetes for their core platform should tell you something.
A Final Thought
I've been building infrastructure for 15 years. I've seen Docker, Mesos, Kubernetes, Nomad, Lambda, and half a dozen "container orchestration" platforms come and go.
The question isn't "is Netflix using Kubernetes?"
The question is: is your team's architecture justified by the problem you're solving?
If you're running a monolith on three servers, you don't need Kubernetes. If you're running 200 microservices across 50 teams, you probably do. If you're somewhere in between, you need to think hard about whether Kubernetes helps or hurts.
Netflix built their own platform because they had to. You probably don't.
Nishaant Dixit — Founder of SIVARO. Building data infrastructure and production AI systems since 2018. Built systems processing 200K events/sec.