Is Kafka Good or Evil? The Truth About the Tech (and the Writer)
Here's what nobody tells you about Kafka: there are two of them. There's Franz Kafka, the early 20th century writer whose name became an adjective for bureaucratic nightmare. And there's Apache Kafka, the distributed event streaming platform that runs half the world's real-time data pipelines.
Both are having a moment. Gen Z is obsessing over the writer: a video titled "Why GenZ is SECRETLY OBSESSED with this author" has millions of views. Meanwhile, every engineering team I talk to is either running Kafka or trying to run from it.
So is Kafka good or evil? The answer is complicated. And it depends entirely on which Kafka you're asking about.
Let me save you the suspense: Franz Kafka the writer? Probably good, definitely misunderstood. Apache Kafka the technology? It's a tool. Tools aren't evil. But how they get used? That's another story.
The Other Kafka: Why Gen Z Can't Stop Reading Franz
I wasn't prepared for this. I started building data systems in 2018 and only knew Franz Kafka as the guy whose name got slapped on a distributed commit log. Then my 22-year-old intern started talking about The Metamorphosis like it was a Spotify Wrapped reveal.
Turns out she was onto something.
"Why Gen-z is so obsessed by Kafka?" has become a recurring thread across social platforms. The numbers don't lie. TikTok videos tagged #Kafka have accumulated hundreds of millions of views. Booksellers report Kafka's works selling faster than contemporary fiction.
Why? Because Gen-Z's obsession with Kafka & Dostoevsky isn't about literary trendiness. It's about recognition.
Franz Kafka wrote about:
- Bureaucratic systems that make no sense but can't be escaped
- Isolation in a world that demands constant connection
- The feeling of waking up one day and being fundamentally different with no explanation
Sound familiar? It should. That's the modern workplace. That's social media. That's the gig economy.
"Why is Gen Z obsessed with Kafka?" asks the question directly. The answer is brutal: because they're living his novels.
The Kafkaesque isn't abstract anymore. It's the insurance claim that gets denied for reasons nobody can explain. It's the job application that vanishes into an ATS black hole. It's the apartment lease fine print that traps you for a year.
100 years after his death, Gen Z loves Franz Kafka — and they should actually read him, not just quote him. Because the real Franz Kafka wasn't writing about despair. He was writing about the absurdity of systems that claim to be rational but aren't.
Most people think Kafka was a pessimist. They're wrong. He was a realist who happened to have a dark sense of humor.
Do you think that F. Kafka wanted his writings destroyed after his death? He told his friend Max Brod to burn everything. Brod didn't. Thank god. Or maybe Kafka knew exactly what he was doing by telling Brod. Hard to say.
Franz Kafka (1883-1924) lived a short life. He died of tuberculosis at 40. He never got to see his work become part of language itself.
Here's the irony: a man who wrote about feeling invisible and powerless became immortal. His name is now an adjective and a streaming platform.
The Other Kafka: Apache Kafka
Now let's talk about the tech. Because this is where "good or evil" gets real.
Apache Kafka was created at LinkedIn and open-sourced in 2011. Jay Kreps, Neha Narkhede, and Jun Rao needed a way to handle the fire hose of data flowing through LinkedIn's systems. They built a distributed commit log. It was fast, durable, and could replay data as many times as you wanted.
I used Kafka for the first time in 2019. We were building a real-time fraud detection pipeline. Traffic spike hits, we scale up. Traffic drops, we scale down. Kafka sat in the middle, swallowing events like a patient monster.
# Basic Kafka producer in Python (kafka-python client)
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    # Serialize dict payloads to UTF-8 encoded JSON
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
)

# Send an event, then block until buffered records are delivered
producer.send('fraud-alerts', {'user_id': 1234, 'amount': 5000.00, 'flagged': True})
producer.flush()
Simple, right? Until it isn't.
The Good
Kafka does one thing exceptionally well: it decouples producers from consumers. Your frontend can fire events without caring who reads them. Your analytics team can consume those events days later, replaying from the exact point they need.
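To make the decoupling concrete, here is a toy in-memory model of the idea, not Kafka itself. The `EventLog` class and the group names are hypothetical. It only shows that each consumer group keeps its own read position over the same append-only log, so one group can rewind and replay without affecting another.

```python
class EventLog:
    """Toy append-only log; each consumer group tracks its own offset."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # group name -> next offset to read

    def append(self, event):
        """Producer side: append and return the new event's offset."""
        self._events.append(event)
        return len(self._events) - 1

    def poll(self, group, max_events=10):
        """Consumer side: read from this group's position and advance it."""
        start = self._offsets.get(group, 0)
        batch = self._events[start:start + max_events]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        """Rewind (or skip ahead) one group without touching the others."""
        self._offsets[group] = offset


log = EventLog()
for amount in (100, 5000, 42):
    log.append({"amount": amount})

fraud_batch = log.poll("fraud-analytics")
reporting_batch = log.poll("reporting")  # independent read of the same events

log.seek("fraud-analytics", 1)           # replay from offset 1
replayed = log.poll("fraud-analytics")
```

The producer never learns who is reading. In real Kafka the same roles are played by topics, consumer groups, and committed offsets, with durability and partitioning layered on top.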
I've seen Kafka handle 200K events per second on modest hardware. That's not a boast — that's what it's designed for.
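// Kafka consumer with explicit offset management
import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "fraud-analytics");
props.put("enable.auto.commit", "false");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("fraud-alerts"));

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        processFraudAlert(record.value());  // application-specific handler
    }
    // Commit once per batch, after every record in it has been processed.
    // commitSync() commits the offsets of the whole poll() result, so calling
    // it inside the loop would mark unprocessed records as done.
    if (!records.isEmpty()) {
        consumer.commitSync();
    }
}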
The durability model is what sells it. Kafka writes to disk. It replicates across brokers. If a node dies, another picks up without missing a beat. We tested this at SIVARO by killing a broker during production traffic. Zero data loss. That's not luck — that's design.
The Evil
Here's where the "evil" accusations start.
Kafka is deceptively complex. The five-minute "hello world" demo works beautifully. The production deployment will haunt your dreams.
You need to understand:
- Partitioning strategies (get this wrong and your consumers idle while others choke)
- Replication factor vs. ISR (in-sync replicas) configurations
- Exactly-once semantics (which aren't exactly exactly-once in practice)
- Consumer group rebalancing (which can trigger cascading failures)
- ZooKeeper or KRaft (ZooKeeper mode was removed in Kafka 4.0; KRaft is the way forward, but migrating an existing cluster is a project of its own)
I watched a team at a Series B startup lose 3 days of production data because they set min.insync.replicas to 1 and a broker went down. The docs said "at least 2." They thought they knew better.
# A common Kafka tragedy waiting to happen:
# min.insync.replicas=1 is the setting that will bite you
kafka-topics.sh --create --topic critical-orders \
  --partitions 3 --replication-factor 2 \
  --config min.insync.replicas=1 \
  --bootstrap-server localhost:9092
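Why does that one setting matter so much? Here is a simplified model of how `acks` and `min.insync.replicas` interact, as I understand the documented behavior. The function `write_outcome` is hypothetical and is not broker code. It only captures the rule that with `acks=all` the broker rejects a write when the in-sync replica set is smaller than `min.insync.replicas`.

```python
def write_outcome(isr_size, min_insync_replicas, acks):
    """Return (accepted, copies_guaranteed_at_ack) for one produce request.

    Simplified model: ignores unclean leader election, retries, and timing.
    """
    if acks == "all":
        if isr_size < min_insync_replicas:
            # Broker refuses the write; the producer sees an error and can retry.
            return (False, 0)
        # Acknowledged only after every in-sync replica has the record.
        return (True, isr_size)
    if acks == "1":
        # Leader-only acknowledgement; min.insync.replicas is not enforced.
        return (True, 1)
    # acks=0: fire and forget, nothing is guaranteed.
    return (True, 0)


# Replication factor 2, one broker down, so only one replica is in sync.
risky = write_outcome(isr_size=1, min_insync_replicas=1, acks="all")
safe = write_outcome(isr_size=1, min_insync_replicas=2, acks="all")
```

With `min.insync.replicas=1`, the write is accepted with a single copy on disk, and losing that broker loses the data. With 2, the producer gets a loud failure instead of a silent risk.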
Another team I consulted for had Kafka eating 70% of their AWS bill. They had 15 brokers running on r5.xlarge instances. Most were idle. Nobody had tuned the retention policies. They were storing 14 days of data when 48 hours was plenty.
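The retention math is worth doing on a napkin before you size a cluster. This is a rough sketch: the throughput, event size, and replication factor below are made-up illustration values, not that team's numbers, and the estimate ignores compression and index overhead.

```python
def retained_bytes(events_per_sec, avg_event_bytes, retention_hours, replication_factor):
    """Rough on-disk footprint: ingest rate x retention window x replicas."""
    seconds = retention_hours * 3600
    return events_per_sec * avg_event_bytes * seconds * replication_factor


# Hypothetical workload: 5,000 events/sec, 1 KB each, replication factor 3.
fourteen_days = retained_bytes(5_000, 1_000, 14 * 24, 3)
two_days = retained_bytes(5_000, 1_000, 48, 3)

# The value you would set for the topic-level retention.ms config (48 hours).
retention_ms_48h = 48 * 3600 * 1000

print(f"14 days: {fourteen_days / 1e12:.2f} TB, 48 hours: {two_days / 1e12:.2f} TB")
```

Under these assumptions the 14-day window holds seven times the data of the 48-hour one, and storage is usually what forces the extra brokers.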
The Real Evil: How Teams Misuse It
Kafka isn't evil. But the way teams deploy it? Sometimes close.
The "Kafka as a database" trap. Kafka isn't a database. It doesn't do queries. It doesn't have indexes. It's a log. Treating it like a database leads to pain.
The "let Kafka solve everything" approach. Kafka handles streaming. It doesn't handle batch processing well. It doesn't handle small message volumes efficiently. You don't need Kafka for 500 events a day. You need a Postgres queue.
The "we'll figure out monitoring later" mistake. Kafka without monitoring is a time bomb. You need to track consumer lag, broker disk usage, network throughput, and request rates. Most teams set this up after the first outage.
At a previous company, we had a consumer lag issue that went undetected for 3 weeks. The monitoring dashboard was "on the roadmap." By the time we noticed, the backlog was 50 million messages. Took us 4 days to clear it.
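Lag itself is simple arithmetic: for each partition, the log end offset minus the consumer group's committed offset. Here is a minimal sketch with hypothetical function names and made-up offsets and rates. In a real setup the offsets would come from your client library or an admin tool.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag = log end offset - committed offset.

    A partition with no committed offset is treated as starting from 0,
    which is a simplification.
    """
    return {p: end_offsets[p] - committed_offsets.get(p, 0) for p in end_offsets}


def hours_to_drain(total_lag, consume_rate, produce_rate):
    """Hours to clear a backlog while new messages keep arriving."""
    net_rate = consume_rate - produce_rate  # messages/sec of actual catch-up
    if net_rate <= 0:
        return float("inf")  # the consumer never catches up
    return total_lag / net_rate / 3600


lag = consumer_lag(
    end_offsets={0: 1200, 1: 900, 2: 1500},
    committed_offsets={0: 1200, 1: 400, 2: 1000},
)
total = sum(lag.values())

# Hypothetical rates: consuming 1,000 msg/s while 800 msg/s keep arriving.
drain_hours = hours_to_drain(50_000_000, consume_rate=1_000, produce_rate=800)
```

The second function is the one that hurts. Catch-up speed is the gap between consume and produce rates, which is why a large backlog takes days rather than hours to clear.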
The Developer Experience: Where Kafka Fails
Let me be direct. The developer experience around Kafka isn't great.
You need to run local clusters for testing. That means Docker Compose files with ZooKeeper, multiple brokers, and Schema Registry. Your CI pipeline needs the same. Setup time ranges from "annoying" to "I quit."
# Minimal docker-compose.yml (still not minimal)
# Images are pinned to a 7.x tag: ZooKeeper mode was removed in Kafka 4.0,
# so this ZooKeeper-based setup should not be expected to work on newer images.
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
Compare that to Redis. Or RabbitMQ. Or even SQS. One command. Done.
The ecosystem has gotten better. Confluent's Kafka REST Proxy lets you avoid client libraries entirely. But if you're using Kafka-native clients, you're managing state, serialization, and offset tracking yourself.
And don't get me started on Schema Registry. Yes, Avro is efficient. Yes, schema evolution matters. But now you have another service to deploy, another failure domain, another thing that can break your pipeline.
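If a registry is more than you can take on yet, plain JSON with a version field still gives consumers something to branch on. This is a minimal sketch. The `schema_version` field, the `amount_cents` field, and the v1-to-v2 change are all hypothetical, invented for illustration.

```python
import json


def parse_fraud_alert(raw):
    """Normalize any supported payload version into one internal shape."""
    event = json.loads(raw)
    # Payloads written before versioning existed are treated as v1.
    version = event.get("schema_version", 1)

    if version == 1:
        # Hypothetical v1: amount is a float in dollars.
        amount_cents = round(event["amount"] * 100)
    elif version == 2:
        # Hypothetical v2: amount is an integer number of cents.
        amount_cents = event["amount_cents"]
    else:
        # Fail loudly on unknown versions instead of guessing.
        raise ValueError(f"unsupported schema_version: {version}")

    return {"user_id": event["user_id"], "amount_cents": amount_cents}


old = parse_fraud_alert('{"user_id": 1234, "amount": 5000.0}')
new = parse_fraud_alert('{"schema_version": 2, "user_id": 1234, "amount_cents": 500000}')
```

Both payloads come out in the same shape, so producers can upgrade on their own schedule. What you give up compared to a registry is enforcement: nothing stops a producer from shipping a v3 nobody agreed on.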
When Kafka Is Actually the Right Choice
I've worked with teams that should use Kafka and don't. And teams that shouldn't and do.
Use Kafka when:
- You need replayable event streams (multiple consumers reading the same data at different times)
- You're building event sourcing or CQRS architectures
- You need to handle spikes gracefully (Kafka buffers naturally)
- You have multiple downstream systems that need the same data
- Your throughput exceeds 10K messages/second
Don't use Kafka when:
- You have one producer and one consumer
- Your message volume is under 100/second
- You need simple request-reply patterns
- Your team has no experience running distributed systems
- You're trying to replace a relational database
I'm not saying Kafka is bad for small teams. I'm saying it's expensive in operational complexity. That cost is worth it at scale. It's not worth it for your MVP.
The Evil of Kafka as a Cultural Phenomenon
Here's a thought that keeps me up.
Franz Kafka wrote about dehumanizing systems. Apache Kafka enables systems that process human behavior at scale — clicks, purchases, location data, conversations. We're building the machinery that makes the Kafkaesque possible.
Every time a user's data flows through a Kafka pipeline without their knowledge or consent, we're living the metaphor. The system works. But for whom?
"Why GenZ is ADDICTED To This Author?" points out that Kafka's appeal lies in naming the invisible structures that control us. The surveillance capitalism that tracks our every move? That runs on Kafka clusters.
I've built systems that process user behavior streams. We logged every click, every scroll, every pause. The Kafka topic was called "user-events" — sterile, technical, harmless. But that data fed ML models that optimized for engagement at the expense of well-being.
Am I the bureaucrat in Kafka's novel? Maybe.
How to Use Kafka Without Becoming Evil
Here's what I've learned from running Kafka in production for 6 years.
1. Set hard retention limits from day one. Don't store data "forever" because you might need it. Set TTLs. Your 90-day-old data is a liability, not an asset.
2. Monitor consumer lag before anything else. This is the single metric that tells you if your system is healthy. Lag grows → something is broken. Fix it now, not next sprint.
3. Never use default configurations. Every Kafka cluster I've seen using default settings has had a production incident within 3 months. Tune retention, replication, and partition counts for your specific workload.
4. Test partition strategy carefully. Bad partitioning means hot partitions that starve consumers while others idle. Hash by a key with good cardinality. UUIDs work. Customer IDs work. Timestamps alone do not work.
5. Have a plan for schema evolution. Avro with Schema Registry is the standard. But even JSON with a version field is better than having no plan. Because your producers will change, and your consumers will break.
6. Accept that Kafka isn't magic. It's a distributed log. It has tradeoffs. It can lose data (if you configure it wrong). It can duplicate data (exactly-once only holds inside Kafka; side effects in external systems can still repeat). It can stall (consumer rebalancing in large groups is terrifying).
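Point 4 is easy to see in a few lines. The sketch below uses SHA-256 as a stand-in hash. Kafka's default partitioner hashes keyed records with murmur2, so the exact partition numbers will differ, but the skew behaves the same way. The key formats and partition count are assumptions for illustration.

```python
import hashlib
from collections import Counter


def partition_for(key, num_partitions):
    """Map a key to a partition with a stand-in hash (not Kafka's murmur2)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def distribution(keys, num_partitions):
    """Count how many records land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# High-cardinality key: 10,000 distinct customers.
customer_keys = [f"customer-{i}" for i in range(10_000)]
# Timestamp-only key: a burst of 10,000 events within the same second.
burst_keys = ["2024-01-01T12:00:00"] * 10_000

by_customer = distribution(customer_keys, 6)
by_timestamp = distribution(burst_keys, 6)
```

Customer IDs spread across all six partitions. The timestamp burst lands entirely on one, which is the hot partition that starves one consumer while the others idle.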
The Verdict: Good or Evil?
Here's where I land.
Franz Kafka the writer was good. He wrote about the horror of modern life so that we could recognize it, name it, and maybe survive it. "Has anyone read anything by Franz Kafka?" Yes, and they found a mirror.
Apache Kafka the technology is neutral. It's a hammer. You can build a house with it or break a window.
The evil is in how we use it. When Kafka pipelines become the infrastructure that powers surveillance, manipulation, and control, we've built exactly the world Kafka the writer warned us about.
So is Kafka good or evil? It's both. It's a tool that reflects its creators. If you build systems that serve users instead of exploiting them, you're doing it right. If you're optimizing for engagement at any cost, you're the bureaucrat.
I choose good. But I check my partitions every morning.
FAQ
Q: Who is Franz Kafka and why should I care?
A: Franz Kafka was a Prague-born, German-language writer who died in 1924. His works (The Metamorphosis, The Trial, The Castle) describe bureaucratic absurdity and existential isolation. The Franz Kafka Wikipedia page covers the basics. You should care because his writing predicted modern work life.
Q: Why is Gen Z so obsessed with Kafka?
A: Gen Z sees their lives in Kafka's work. Student debt they can't escape. Job applications that vanish into HR systems. Social media algorithms that control attention. "Why is Gen Z obsessed with Kafka?" explains it well.
Q: Is Apache Kafka hard to learn?
A: Yes. The basics are simple (produce, consume, topics). Production deployment is complex (partitioning, replication, monitoring, rebalancing). Plan for 2-4 weeks of learning before you're productive.
Q: Should I use Kafka for my startup's MVP?
A: Probably not. Use a simple queue or database. Add Kafka when you need multiple consumers, replayability, or high throughput. Premature Kafka adoption kills teams.
Q: Does Kafka guarantee exactly-once delivery?
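A: Within Kafka, yes: idempotent producers and the transactional APIs give exactly-once for read-process-write flows that stay inside Kafka. Once a consumer writes to a database or calls an external API, you'll see duplicates after retries and rebalances. Design your consumers for idempotency. Assume at-least-once semantics.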
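Designing for idempotency can be as small as remembering which event IDs you have already applied. This is a minimal sketch with hypothetical names (`IdempotentHandler`, `event_id`). It assumes producers attach a unique ID to every event, and it keeps the seen-set in memory where a real system would use a durable store.

```python
class IdempotentHandler:
    """Apply each event at most once, keyed by a producer-assigned event ID."""

    def __init__(self, apply):
        self._apply = apply
        self._seen = set()  # in-memory for the sketch; use a durable store in practice

    def handle(self, event):
        """Return True if the event was applied, False if it was a duplicate."""
        event_id = event["event_id"]
        if event_id in self._seen:
            return False
        self._apply(event)
        self._seen.add(event_id)
        return True


balances = {}


def credit(event):
    """Side effect that must not run twice for the same event."""
    user = event["user_id"]
    balances[user] = balances.get(user, 0) + event["amount"]


handler = IdempotentHandler(credit)
deliveries = [
    {"event_id": "e1", "user_id": 1234, "amount": 50},
    {"event_id": "e1", "user_id": 1234, "amount": 50},  # redelivered duplicate
    {"event_id": "e2", "user_id": 1234, "amount": 25},
]
results = [handler.handle(e) for e in deliveries]
```

The duplicate is skipped and the balance ends at 75, not 125. For this to survive a crash, the side effect and the seen-ID record should be written in the same transaction.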
Q: Is Kafka the same as Franz Kafka?
A: No. Apache Kafka is named after Franz Kafka. Co-creator Jay Kreps has said that because the software is "a system optimized for writing," a writer's name made sense, and he liked Franz Kafka's work. The name is a tribute, not a connection.
Q: What's cheaper: Kafka or managed alternatives?
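A: In my experience, managed Kafka (Confluent Cloud, AWS MSK) or a Kafka-compatible managed service like Redpanda Cloud costs 2-3x more per month than the raw infrastructure for self-hosting. Self-hosting is cheaper on the invoice but costs engineering time. At around 50K events/second, self-hosting breaks even. Under that, managed wins.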
Q: Is Kafka dying?
A: No. Kafka is still the standard for event streaming. Alternatives like Redpanda (Kafka API compatible) and Pulsar exist. But Kafka's ecosystem is massive. It's not going anywhere in the next 5 years.
Nishaant Dixit — Founder of SIVARO. Building data infrastructure and production AI systems since 2018. Built systems processing 200K events/sec.