Is Netflix Using Kubernetes?

By Nishaant Dixit

Let me kill the suspense: Yes, Netflix uses Kubernetes. But not the way you think. And not everywhere. And honestly, their relationship with Kubernetes is more complicated than most conference talks let on.

I'm Nishaant Dixit. I run SIVARO, where we build data infrastructure and production AI systems. Over the last six years, I've watched the Kubernetes hype cycle come full circle: from "Kubernetes fixes everything" to "Kubernetes is too complex" to "actually Kubernetes works great for specific things." Netflix sits right in the middle of that story.

When people ask me "is Netflix using Kubernetes?", they're usually trying to figure out if they should be using Kubernetes too. The answer isn't a binary yes or no. It's "for what, exactly?"

Let me walk you through what Netflix actually does with Kubernetes, what they don't use it for, and — more importantly — what you should learn from their approach.


What Exactly Is Kubernetes Used For?

Before we talk Netflix, we need to get specific about what Kubernetes actually is. Because most people think it's "orchestration for containers." That's technically true, but uselessly vague.

Here's what Kubernetes does well:

  • Schedules containers across a cluster of machines
  • Handles service discovery so containers can find each other
  • Manages scaling — both up and down
  • Handles rolling updates without downtime
  • Self-heals: restarts failed containers and reschedules pods away from failed nodes
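
To make the service-discovery point concrete, here is a minimal sketch of a Service manifest. Everything in it (the `recs-api` name, the `demo` namespace, the port numbers) is a hypothetical placeholder, not anything taken from Netflix's setup.

```yaml
# Hypothetical example: a ClusterIP Service that gives a stable name
# to whichever pods currently carry the label app=recs-api.
apiVersion: v1
kind: Service
metadata:
  name: recs-api
  namespace: demo
spec:
  type: ClusterIP
  selector:
    app: recs-api        # matches pod labels, not pod names or IPs
  ports:
  - name: http
    port: 80             # port clients connect to
    targetPort: 8080     # port the container listens on
```

With the default cluster domain, other pods can reach this as `recs-api.demo.svc.cluster.local`, or simply `recs-api` from inside the same namespace. Pods can come and go behind it without callers noticing.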

Netflix uses Kubernetes primarily for their streaming control plane — the systems that decide what content to show you, how to personalize recommendations, and how to manage the metadata layer.

But here's the contrarian take: Kubernetes isn't running Netflix's actual video streaming. That still runs on the custom infrastructure they built over a decade ago. You don't replace something that handles 200 million subscribers' video delivery with a generic orchestrator. That'd be stupid.


The Netflix Infrastructure: A Two-System Story

Netflix runs two fundamentally different infrastructure stacks:

1. The Streaming Plane (No Kubernetes)

This is the hard stuff. Serving video to 200M+ devices across 190 countries. Netflix delivers the video itself through Open Connect, its purpose-built CDN, and built the supporting services on AWS with custom tooling: EVCache for caching, Zuul for routing, Hystrix for resilience. Those services run on EC2 instances and are deployed with Spinnaker (the delivery tool Netflix created). No Kubernetes involved.

Why? Because video streaming requires:

  • Sub-millisecond latency decisions
  • Massive throughput at predictable cost
  • Deep integration with CDN partners (ISPs, peering points)
  • Hardware-level optimization (they run encoding farms)

You can't get that from a generic container orchestrator. Not without massive, painful customization.

2. The Control Plane (Heavy Kubernetes Use)

Everything around the video — recommendations, search, A/B testing, user profiles, content metadata — that's where Kubernetes lives. Netflix started migrating these workloads to Kubernetes around 2018-2019. Today, many of their microservices run on Kubernetes-managed clusters.

Why the split? Because these workloads have different requirements:

  • Lower performance sensitivity (100ms instead of 1ms)
  • Variable traffic patterns (spikes around new releases)
  • Higher development velocity needs (teams ship daily)
  • Standard stateless architectures

Kubernetes fits here. It doesn't fit for the video pipeline. And Netflix was smart enough to know the difference.


How Netflix Uses Kubernetes (The Details)

I've studied Netflix's approach extensively. Here's what their Kubernetes setup actually looks like:

Container Runtime: Docker + containerd

They run containers. Obviously. But they've moved from pure Docker to containerd for better performance and security isolation.

Orchestration: Kubernetes on AWS (kops + custom operators)

Netflix manages their own Kubernetes clusters on EC2. They don't use EKS (Amazon's managed service) for most production workloads. Why? Control. They need custom networking, custom security groups, and deep integration with their existing AWS setup.

Service Mesh: Their own (not Istio)

Most companies reaching for Kubernetes also reach for Istio. Netflix went their own way on service mesh, much as they did on container management with their internal platform, "Titus" (wait for it).

Deployment: Spinnaker + Jenkins

They still use Spinnaker for deployment. Kubernetes just becomes another target provider for Spinnaker pipelines. This was smart — you don't rewrite deployment systems from scratch. You extend them.

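Here's a simplified, illustrative sketch of what a Deployment manifest for such a service might look like on Kubernetes. The names and values are hypothetical, not Netflix's actual configuration: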

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: recommendation-engine
  namespace: personalization
spec:
  replicas: 100
  selector:
    matchLabels:
      app: recommendation-engine
  template:
    metadata:
      labels:
        app: recommendation-engine
        tier: control-plane
    spec:
      containers:
      - name: main
        image: netflix/recommendation:v2.4.1
        ports:
        - containerPort: 8080
        env:
        - name: CACHE_CLUSTER
          value: "evcache-prod-recs"
        - name: FEATURE_FLAG_SERVICE
          value: "https://ffs.netflix.net"
        resources:
          requests:
            memory: "4Gi"
            cpu: "2"
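        # Readiness gates traffic: the pod only receives requests once this passes
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8080
        # Liveness restarts the container if this check keeps failing
        livenessProbe:
          httpGet:
            path: /health
            port: 8080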

Notice the environment variables pointing to Netflix's own EVCache and feature flag service. Kubernetes doesn't replace your existing infrastructure. It works alongside it.


Why People Ask "Is Netflix Using Kubernetes?" and the Real Answer

People ask because they're trying to decide if Kubernetes is right for them. And Netflix is a bellwether. If Kubernetes works for Netflix at their scale, surely it'll work for you, right?

Wrong. That logic is backwards.

Netflix uses Kubernetes because they have the engineering depth to handle its complexity. Netflix has teams dedicated to building Kubernetes operators, custom controllers, and cluster management tools. They can absorb the cognitive load.

Most organizations can't. And they shouldn't try.

The real question isn't "is Netflix using Kubernetes?" — it's "should I be using Kubernetes?" And the answer depends entirely on your situation.


When Kubernetes Makes Sense

Based on what Netflix does well (and what I've seen at SIVARO), here's the honest breakdown:

Good for Kubernetes:

  • Microservices with variable traffic — if your services see 10x traffic swings, Kubernetes autoscaling is a lifesaver
  • Multi-team organizations — namespace isolation lets 10 teams deploy independently without stepping on each other
  • Complex deployment strategies: Kubernetes handles rolling updates natively and provides the primitives (labels, Services) that tools like Spinnaker use to build canary, blue-green, and A/B releases
  • Hybrid or multi-cloud — if you need portability, Kubernetes gives you a common abstraction layer

Bad for Kubernetes:

  • Simple monolithic apps — you don't need an orchestrator for five containers
  • Latency-sensitive real-time systems — the networking overhead matters
  • Teams with 2-3 engineers — the operational burden will crush you
  • Unique hardware needs — GPU clusters, FPGA boards, custom networking — Kubernetes fights you here

Netflix's split strategy validates this. They use Kubernetes where it fits and don't force it where it doesn't.
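
As a rough sketch of the namespace-isolation point in the list above, this is what a per-team namespace with a resource quota could look like. The team name and every number here are made-up assumptions for illustration.

```yaml
# Hypothetical example: one namespace per team, with a cap on what
# that team's workloads can request in total.
apiVersion: v1
kind: Namespace
metadata:
  name: team-search
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-search-quota
  namespace: team-search
spec:
  hard:
    requests.cpu: "40"        # sum of CPU requests across all pods
    requests.memory: 160Gi    # sum of memory requests across all pods
    limits.cpu: "80"
    limits.memory: 320Gi
    pods: "200"               # maximum number of pods in the namespace
```

One consequence worth knowing: once a quota covers CPU or memory, pods in that namespace must declare the corresponding requests and limits (or inherit them from a LimitRange), otherwise the API server rejects them.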


What Exactly Is Kubernetes Used For in Netflix's Architecture?

Let me get specific. Here are the concrete workloads Netflix runs on Kubernetes:

  1. Recommendation engine — the system that figures out what you'd like to watch
  2. Search backend — indexing and retrieval for 15,000+ titles
  3. Personalization service — per-user UI customizations
  4. A/B testing framework — manages thousands of concurrent experiments
  5. Content metadata service — what movie is this, who's in it, what's the rating
  6. User profile service — account management, viewing history, ratings

All of these are stateless, horizontally scalable, and traffic-sensitive. Textbook Kubernetes use cases.

Here's an illustrative example of how Netflix might configure autoscaling for their recommendation service:

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: recommendation-engine-hpa
  namespace: personalization
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: recommendation-engine
  minReplicas: 50
  maxReplicas: 500
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: 500

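This scales between 50 and 500 pods based on both CPU utilization and a custom per-pod request-rate metric. The custom metric only works if a metrics adapter is installed in the cluster to expose requests_per_second to the autoscaler. At Netflix's scale, you'd expect a very large number of autoscaling configurations like this one.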
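
The arithmetic behind that config is simple. Per the Kubernetes documentation, each metric proposes a replica count using the formula below, the largest proposal wins, and the result is clamped to the minReplicas/maxReplicas range. The "current" values in the worked example (100 pods, 90% average CPU, 400 requests per second per pod) are invented numbers for illustration only.

```latex
\[
\text{desiredReplicas} =
  \left\lceil \text{currentReplicas} \times
  \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]

% CPU metric: 90% observed against a 70% target
\[
\left\lceil 100 \times \frac{90}{70} \right\rceil
  = \lceil 128.57 \rceil = 129
\]

% Request-rate metric: 400 rps per pod observed against a 500 rps target
\[
\left\lceil 100 \times \frac{400}{500} \right\rceil = 80
\]

% The larger proposal wins and already lies within [50, 500]
\[
\max(129, 80) = 129
\]
```

So in this made-up scenario the autoscaler would grow the Deployment from 100 to 129 pods, driven by CPU even though the request rate alone would have allowed a scale-down.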


Why Are People Moving Away from Kubernetes?

You see articles about this constantly. "Why are people moving away from Kubernetes?" It's a real trend. But it's almost always misunderstood.

The people "moving away from Kubernetes" are usually:

  • Startups that didn't need it in the first place
  • Teams that adopted Kubernetes before they had the operational maturity
  • Organizations trying to run everything on Kubernetes (including stateful databases — bad idea)

Netflix isn't moving away. They're being surgical about where they apply it.

At SIVARO, I've seen the same pattern. We had a client who spent 6 months trying to run their PostgreSQL cluster on Kubernetes. It was a nightmare. StatefulSets, persistent volumes, backup coordination — it kept breaking. They eventually moved PostgreSQL to managed RDS and kept their stateless services on Kubernetes. Problem solved.

The question "why are people moving away from Kubernetes?" is almost always really "why are people using Kubernetes wrong?"


The Tools Netflix Uses with Kubernetes

Netflix built their own ecosystem around Kubernetes. Some notable pieces:

Titus — Their Container Platform

Netflix actually has their own container management platform called Titus. It predates their Kubernetes adoption and was originally built on Apache Mesos. Over time, they've been converging Titus with Kubernetes, and today Titus runs on top of Kubernetes in many places. In that setup it acts as a layer that adds Netflix-specific features: GPU support, custom networking, deep AWS integration.

Spinnaker — Deployment Pipeline

Spinnaker manages the deployment process. It pushes to Kubernetes, EC2, or Titus depending on the service. Spinnaker handles canary analysis, manual approval gates, and rollback automation.

Wedgetail — Cost Management

Netflix built internal tooling to track Kubernetes resource utilization and allocate costs back to teams. Without this, cloud costs balloon fast.

Custom Operators

Netflix writes operators for everything — backup coordination, traffic routing, and security policy enforcement. If you're running Kubernetes at scale, you end up writing operators.


How to Think Like Netflix About Kubernetes

If you take one thing from this article, make it this: Kubernetes is a tool, not a strategy.

Netflix didn't adopt Kubernetes because it was cool. They adopted it because it solved specific operational problems — multi-team isolation, dynamic scaling, deployment automation — better than their previous solutions.

Here's how to decide whether Kubernetes is right for you:

Step 1: List your actual problems

  • Are deployments unreliable?
  • Is scaling manual and slow?
  • Are teams stepping on each other's resources?

Step 2: Check if Kubernetes solves those problems

  • Deployments → Kubernetes rolling updates help
  • Manual scaling → HPA and cluster autoscaler help
  • Team conflicts → Namespace isolation helps
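
For the "rolling updates help" item in Step 2, a minimal sketch of an explicit rollout policy might look like this. The service name, image, replica count, and probe path are all hypothetical.

```yaml
# Hypothetical Deployment showing an explicit rolling-update policy.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-api
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2          # up to 2 extra pods may be created during a rollout
      maxUnavailable: 0    # never drop below the desired 6 available pods
  selector:
    matchLabels:
      app: checkout-api
  template:
    metadata:
      labels:
        app: checkout-api
    spec:
      containers:
      - name: main
        image: registry.example.com/checkout-api:1.0.0
        ports:
        - containerPort: 8080
        # New pods count as available only after this check passes,
        # so a broken release stalls the rollout instead of taking traffic.
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8080
```

If a release does go wrong, `kubectl rollout undo deployment/checkout-api` returns to the previous revision. Note that this only covers plain rolling updates; canary analysis and approval gates still need a tool on top.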

Step 3: Evaluate the cost

  • Operational complexity (dedicated DevOps team needed)
  • Learning curve (3-6 months for team proficiency)
  • Infrastructure cost (Kubernetes itself is free, the cluster isn't)

Most companies skip step 1. They hear "Kubernetes is the future" and jump in. That's why so many hate it.


Is Netflix Using Kubernetes in Production Today?

As of 2024, yes. But not universally. Netflix runs thousands of Kubernetes pods in production. They also run thousands of non-Kubernetes workloads. The ratio shifts each quarter as they migrate more services.

The interesting thing — Netflix isn't trying to move everything to Kubernetes. They're perfectly comfortable with a heterogeneous infrastructure. This is rare. Most companies want one true platform. Netflix has learned that different workloads need different tools.

Here's a simplified view of their current infrastructure decision tree:

Is this a stateless microservice with variable traffic?
    → Yes → Run on Kubernetes
    → No → Is it latency-sensitive video serving?
        → Yes → Custom EC2/Spinnaker stack
        → No → Is it a stateful datastore?
            → Yes → Dedicated datastore tier (Cassandra, EVCache) or managed service (RDS)
            → No → Evaluate case-by-case

This decision tree is worth more than a hundred Kubernetes tutorials.


FAQ

Is Netflix using Kubernetes for video streaming?

No. Video delivery runs on custom infrastructure (Netflix's own Open Connect CDN, backed by services on EC2) rather than on Kubernetes. A general-purpose orchestrator adds overhead that isn't worth paying for video serving.

What percentage of Netflix runs on Kubernetes?

Netflix hasn't published exact numbers. Rough outside estimates put it at 30-50% of their microservices, but treat that as a guess rather than a measured figure. The remaining workloads are on EC2/Spinnaker or managed services.

Does Netflix use managed Kubernetes like Amazon EKS?

Not extensively in production. They prefer to run their own clusters using kops for maximum control. They do use EKS for some lower-stakes environments.

How many Kubernetes clusters does Netflix run?

Multiple. Each major service area (recommendations, search, personalization) likely runs its own clusters, plus staging and testing clusters. Probably dozens in total across regions.

Did Netflix build their own container system?

Yes: Titus. It predates Netflix's use of Kubernetes, was originally built on Apache Mesos, and was designed for Netflix's specific needs. They've converged Titus with Kubernetes over time.

What problems did Netflix solve with Kubernetes?

Mainly deployment velocity, resource utilization, and multi-team isolation. Kubernetes let them move from VM-per-service to container-per-service, improving density and reducing costs.

What doesn't Netflix like about Kubernetes?

Operational complexity and opinionated defaults. Netflix had to write extensive custom tooling to make Kubernetes fit their environment. The learning curve for new engineers is significant.

Should I use Kubernetes like Netflix?

Probably not. You should use Kubernetes if your specific problems match what Kubernetes solves. Copying Netflix's architecture because "it works for them" is cargo-cult engineering. Understand the why, not just the what.


What This Means for You

Netflix's relationship with Kubernetes teaches us something important: infrastructure decisions should be specific, not ideological.

The question "is Netflix using Kubernetes?" misses the point. The real question is "what problems does Kubernetes solve, and do I have those problems?"

At SIVARO, we see teams oscillate between "Kubernetes all the things!" and "Kubernetes is evil!" Both are wrong. Kubernetes is useful software with real trade-offs. It's good for stateless microservices with variable loads. It's bad for most stateful workloads and all latency-critical systems.

If you're building a product engineering team today, spend the time to learn what Kubernetes is actually used for, not from hype but from production experience. Run a small cluster with a real workload. Break things. Fix them. Then decide.

Because the hardest infrastructure lesson — and Netflix proves this — isn't choosing the right tool. It's understanding that there is no single right tool. There's only what works for your specific constraints.


Nishaant Dixit — Founder of SIVARO. Building data infrastructure and production AI systems since 2018. Built systems processing 200K events/sec.
