Case Study

ClickHouse vs. PostgreSQL

Escaping the vacuum trap — How migrating from PostgreSQL to ClickHouse handled 180K events/sec and reduced costs by 64%.

310K

Inserts/sec

64%

Cost Reduction

210ms

P95 Query

2.4TB

Storage

01 / Context

A European IoT platform (Series A, 40 million connected devices) was ingesting telemetry data from industrial sensors — temperature, vibration, energy consumption — at 180,000 events per second during peak hours. Daily volume exceeded 15 billion records.

02 / Problem

Write Throughput

PostgreSQL could sustain only 40,000 inserts/sec. Backlog grew to 200M events, causing 15+ minute latency.

40K/sec

Vacuum Overhead

Autovacuum consumed 35% of CPU, causing lock contention that blocked writes.

35% CPU

Storage Bloat

9 TB compressed + 30% dead tuple overhead. Storage costs reached $18,000/month.

$18K/mo

Query Performance

Analytical queries (24h aggregates) took 12–30 seconds, even on read replicas.

12-30s

15 min

Kafka lag at peak, leading to delayed equipment anomaly alerts and SLA breaches

03 / Constraints

Data Integrity

Zero data loss required

Downtime

No ingestion pause

Team

5 engineers, no ClickHouse exp.

Cost Target

40%+ reduction

Retention

90-day + GDPR

04 / Approach

Rejected: PostgreSQL + TimescaleDB

Improved to 55K inserts/sec but still fell short of 180K. Fundamental row-store limitation.

Rejected: Apache Cassandra

Optimized for key-based lookups, not analytics. Would need separate analytics layer.

Selected: ClickHouse

Columnar storage, no vacuum, native Kafka engine. Trade-off: eventual consistency.

Why ClickHouse?

Columnar Storage

Optimized for analytical aggregations over time ranges

MergeTree Engine

Background merges eliminate vacuum overhead

Kafka Engine

Direct consumption from existing topics

05 / Implementation

Dual-Write, Gradual Cutover

10-week migration with dual-write period

10 Weeks

Total Timeline

Week 1-2

Cluster Setup

6-node ClickHouse on i3en.6xlarge (NVMe), 12 shards, 2x replication

Week 3-4

Kafka Integration

Replaced Python consumer with ClickHouse Kafka engine. Dual-write period.

Week 5-6

Query Migration

Rewrote top 20 dashboard queries. New read-only user for dashboards.

Week 7-8

Traffic Shift

Feature flag: 10% → 50% → 100%. PostgreSQL as fallback for 1 month.

Week 9-10

Decommission

Stopped legacy consumer. Backed up PostgreSQL to S3. Deleted RDS.

Final Architecture

Kafka

ClickHouse

S3 (Backup)

06 / Results

Metric	PostgreSQL (RDS)	ClickHouse (EC2)
Write throughput	55,000/sec	310,000/sec
Ingestion latency (p95)	8.2 sec	0.7 sec
Query latency (p95)	18 sec	210 ms
Storage footprint	9 TB + 30%	2.4 TB
Monthly compute cost	$14,200	$5,800
Monthly storage cost	$3,800	$720
Kafka lag (max)	200M	<500K

Total Cost

$18K → $6.5K

64% reduction

Query Speed

85x

Faster (18s → 210ms)

Uptime

90 days

Zero incidents

07 / Key Insight

The difference is architectural alignment with workload.

PostgreSQL excels at transactional workloads, but when the workload is continuous high-rate inserts + analytical aggregations, MVCC and vacuum become fatal bottlenecks. ClickHouse's columnar storage and asynchronous merges eliminate the vacuum trap entirely.

Choose PostgreSQL

• Need ACID transactions
• Complex joins across many tables
• Write volume under 10K/sec

Choose ClickHouse

• Ingesting >50K events/sec
• Queries are aggregations over time ranges
• Can tolerate eventual consistency

Related Case Studies

INFRASTRUCTURE

Evaluating databases for high-volume analytics?

We specialize in data infrastructure engineering and production AI systems. Let's discuss your architecture.

Start a Conversation More Case Studies