Case Study

ClickHouse vs. PostgreSQL

Escaping the vacuum trap — How migrating from PostgreSQL to ClickHouse handled 180K events/sec and reduced costs by 64%.

310K

Inserts/sec

64%

Cost Reduction

210ms

P95 Query

2.4TB

Storage

01 / Context

A European IoT platform (Series A, 40 million connected devices) was ingesting telemetry data from industrial sensors — temperature, vibration, energy consumption — at 180,000 events per second during peak hours. Daily volume exceeded 15 billion records.

02 / Problem

Write Throughput

PostgreSQL could sustain only 40,000 inserts/sec. Backlog grew to 200M events, causing 15+ minute latency.

40K/sec

Vacuum Overhead

Autovacuum consumed 35% of CPU, causing lock contention that blocked writes.

35% CPU

Storage Bloat

9 TB compressed + 30% dead tuple overhead. Storage costs reached $18,000/month.

$18K/mo

Query Performance

Analytical queries (24h aggregates) took 12–30 seconds, even on read replicas.

12-30s
15 min
Kafka lag at peak, leading to delayed equipment anomaly alerts and SLA breaches
03 / Constraints

Data Integrity

Zero data loss required

Downtime

No ingestion pause

Team

5 engineers, no ClickHouse exp.

Cost Target

40%+ reduction

Retention

90-day + GDPR

04 / Approach

Rejected: PostgreSQL + TimescaleDB

Improved to 55K inserts/sec but still fell short of 180K. Fundamental row-store limitation.

Rejected: Apache Cassandra

Optimized for key-based lookups, not analytics. Would need separate analytics layer.

Selected: ClickHouse

Columnar storage, no vacuum, native Kafka engine. Trade-off: eventual consistency.

Why ClickHouse?

Columnar Storage

Optimized for analytical aggregations over time ranges

MergeTree Engine

Background merges eliminate vacuum overhead

Kafka Engine

Direct consumption from existing topics

05 / Implementation

Dual-Write, Gradual Cutover

10-week migration with dual-write period

10 Weeks

Total Timeline

Week 1-2

Cluster Setup

6-node ClickHouse on i3en.6xlarge (NVMe), 12 shards, 2x replication

Week 3-4

Kafka Integration

Replaced Python consumer with ClickHouse Kafka engine. Dual-write period.

Week 5-6

Query Migration

Rewrote top 20 dashboard queries. New read-only user for dashboards.

Week 7-8

Traffic Shift

Feature flag: 10% → 50% → 100%. PostgreSQL as fallback for 1 month.

Week 9-10

Decommission

Stopped legacy consumer. Backed up PostgreSQL to S3. Deleted RDS.

Final Architecture

Kafka
ClickHouse
S3 (Backup)
06 / Results
Metric PostgreSQL (RDS) ClickHouse (EC2)
Write throughput 55,000/sec 310,000/sec
Ingestion latency (p95) 8.2 sec 0.7 sec
Query latency (p95) 18 sec 210 ms
Storage footprint 9 TB + 30% 2.4 TB
Monthly compute cost $14,200 $5,800
Monthly storage cost $3,800 $720
Kafka lag (max) 200M <500K

Total Cost

$18K → $6.5K

64% reduction

Query Speed

85x

Faster (18s → 210ms)

Uptime

90 days

Zero incidents

07 / Key Insight

The difference is architectural alignment with workload.

PostgreSQL excels at transactional workloads, but when the workload is continuous high-rate inserts + analytical aggregations, MVCC and vacuum become fatal bottlenecks. ClickHouse's columnar storage and asynchronous merges eliminate the vacuum trap entirely.

Choose PostgreSQL

  • • Need ACID transactions
  • • Complex joins across many tables
  • • Write volume under 10K/sec

Choose ClickHouse

  • • Ingesting >50K events/sec
  • • Queries are aggregations over time ranges
  • • Can tolerate eventual consistency

Related Case Studies

Evaluating databases for high-volume analytics?

We specialize in data infrastructure engineering and production AI systems. Let's discuss your architecture.