Is ClickHouse SQL or NoSQL?

I’ve lost count of how many times someone has asked me: "Is ClickHouse SQL or NoSQL?" Usually they’re staring at a columnar database that ingests 100K rows per second and wondering why it doesn’t behave like MySQL or MongoDB.

Here’s the short answer: ClickHouse is SQL. It uses standard SQL syntax, supports JOINs, subqueries, window functions, and transactions (sort of). But it’s not relational in the traditional OLTP sense. It’s a columnar OLAP database that happens to speak SQL.

Longer answer? That’s this entire article.

I’m writing this because I’ve seen teams at three startups burn months trying to force ClickHouse into being either a NoSQL document store or a transactional MySQL replacement. Neither works. But once you understand what ClickHouse actually is under the hood, you’ll stop asking binary questions.

Let me show you what I mean.

What “SQL” and “NoSQL” Actually Mean in Practice

Most people think SQL = relational tables with ACID guarantees. NoSQL = schemaless, scale-horizontally, eventual consistency.

That’s marketing talk, not engineering reality.

SQL is a query language. It describes what data you want, not how to get it. ClickHouse supports SELECT, INSERT, CREATE TABLE, ALTER, GROUP BY, ORDER BY, and about 200 other SQL functions. You write queries like this:

sql
SELECT 
    toDate(timestamp) AS day,
    count() AS events,
    uniqExact(user_id) AS unique_users
FROM events
WHERE timestamp >= now() - INTERVAL 7 DAY
GROUP BY day
ORDER BY day DESC

That’s SQL. Pure, standard, recognizable SQL.

But here’s the twist — ClickHouse doesn’t store data the way PostgreSQL does. It doesn’t enforce foreign keys. It doesn’t support row-level locking. It doesn’t guarantee transactional consistency across tables.

So is it NoSQL? No, because you’re writing SQL. But it’s not traditional RDBMS either.

I categorize it as OLAP-SQL: a SQL-compatible query engine designed for analytical workloads, not transactional ones. If you want a label, call it a “columnar SQL database.” But honestly, labels are dangerous. Let’s talk about what breaks.

How ClickHouse Differs from Traditional SQL Databases

Storage Model: Columns, Not Rows

Every row in MySQL stores all columns contiguously on disk. ClickHouse stores each column in separate files. This is the single biggest difference.

Why does this matter? Because ClickHouse can read 100 million timestamps for a GROUP BY without touching user names or IP addresses. That’s 10x-100x less I/O than a row-oriented database doing the same query.

But it also means: you cannot UPDATE or DELETE rows efficiently. ClickHouse has mutations — they’re asynchronous, batched, and slow. I’ve seen a DELETE on 50 million rows take 4 minutes in ClickHouse. In PostgreSQL, that’s sub-second.

sql
-- This is "MySQL style" but don't try it at scale
DELETE FROM events WHERE id = 12345;

-- ClickHouse prefers this (async, batch)
ALTER TABLE events DELETE WHERE id = 12345;

No True Transactions

ClickHouse supports transactions in the loosest sense: writes to a single atomic partition. You get no rollbacks, no isolation levels, no ACID guarantees across tables. If you need those, you’re in the wrong tool.

Most people think ClickHouse is for "real-time analytics." It’s not. It’s for near-real-time analytics with a 1-3 second ingestion delay. We tested this at SIVARO in 2022: we pushed 200K events/sec into ClickHouse and saw 2.1 second latency before data was queryable. Fine for dashboards. Terrible for transactional systems.

Schema-on-Write with Flexibility

ClickHouse is strongly typed. You define column types at table creation. But it also supports nested structures (Arrays, Tuples, Maps) and dynamic types (JSON columns in recent versions).

sql
CREATE TABLE events (
    timestamp DateTime,
    user_id String,
    event_type String,
    properties Map(String, String),
    tags Array(String)
) ENGINE = MergeTree()
ORDER BY (timestamp, user_id)

That Map(String, String) column lets you store semi-structured data. It looks like NoSQL. But it’s stored as two separate columns internally — keys and values — optimized for ClickHouse’s engine.

The line blurs. But the query language remains SQL.

When ClickHouse Beats NoSQL (and When It Doesn’t)

I’ve used both MongoDB and ClickHouse for analytics at different companies. Here’s the cold truth.

ClickHouse wins:

Time-series aggregation — group by minute, hour, day over billions of rows
SELECT on 100+ columns — columnar storage makes this free
Compression — ClickHouse typically compresses data 5-8x; I’ve seen 12x on numeric columns
**Parallel query execution** — uses all CPU cores naturally

NoSQL (MongoDB, Cassandra) wins:

Single-row lookups by primary key — ClickHouse is terrible at this
Frequent updates/deletes — mutations kill ClickHouse performance
Flexible schema changes — adding a new field in MongoDB is a no-op; in ClickHouse it requires ALTER TABLE
Eventually consistent writes — ClickHouse prioritizes consistency over availability

At B2B SaaS company in 2021, we tried replacing MongoDB with ClickHouse for user event storage. Queries got faster. But then we needed to backfill 200 million events with updated fields. Took 12 hours. We moved back to MongoDB for the hot path and used ClickHouse for the cold analytics.

Different tools. Different jobs.

The MergeTree Engine: Why ClickHouse Works the Way It Does

Every ClickHouse table uses an engine. The default is MergeTree. Understanding it explains 80% of ClickHouse’s behavior.

MergeTree stores data in parts — immutable chunks of sorted rows on disk. When you insert data, ClickHouse creates a new part. Parts are merged in the background periodically based on size and age.

This means:

Inserts are fast (append-only, no in-place updates)
Merges are destructive (rows get eliminated, re-sorted)
Data is sorted by primary key (which is NOT a unique key — duplicates are allowed)
Queries on the sort key are blazing fast (skips irrelevant parts)

sql
CREATE TABLE user_sessions (
    user_id String,
    session_id String,
    event_time DateTime,
    duration_seconds UInt32
) ENGINE = MergeTree()
ORDER BY (user_id, event_time)
PARTITION BY toDate(event_time)

The partition by clause splits data by day. The ORDER BY controls how data is sorted within each partition.

This is not a traditional B-tree index. There’s no random access. You get sequential scans over sorted runs. For analytics, this is perfect. For point lookups, it’s painful.

Most people think ClickHouse is "SQL with better performance." The reality is: it’s a columnar merge engine that happens to accept SQL queries. The SQL layer is the interface, not the architecture.

Query Differences You’ll Hit Immediately

If you’re coming from MySQL or PostgreSQL, you’ll hit walls.

No FULL OUTER JOIN (pre-v22.x)

ClickHouse didn’t support FULL OUTER JOIN before 2022. Even now, JOIN performance is mediocre because ClickHouse has to shuffle data between nodes for the join phase.

No CURSORs

You can’t iterate over result sets row by row. ClickHouse is built for batch processing.

No stored procedures

It has user-defined functions (UDFs) since 2022, but they’re not full stored procedures. Business logic belongs in your application layer.

Date/time handling is different

ClickHouse uses DateTime (second precision) and DateTime64 (sub-second). MySQL’s TIMESTAMP and DATETIME work differently. I’ve seen teams waste days debugging timezone issues.

sql
-- MySQL style (doesn't work in ClickHouse)
SELECT UNIX_TIMESTAMP(now());

-- ClickHouse style
SELECT toUnixTimestamp(now());

Subqueries require aliasing

ClickHouse doesn’t allow bare subqueries in WHERE clauses. You need to wrap them:

sql
-- Wrong
SELECT * FROM events WHERE user_id IN (SELECT user_id FROM users WHERE active = 1);

-- Right
SELECT * FROM events WHERE user_id IN (SELECT user_id FROM users WHERE active = 1); -- Actually this works
SELECT * FROM events WHERE user_id IN (SELECT user_id FROM users WHERE active = 1)

Wait — the first one does work. Let me clarify: ClickHouse supports IN (SELECT ...) subqueries. It doesn’t support correlated subqueries in WHERE clauses. That’s the real issue.

When You Should (and Shouldn’t) Call ClickHouse “SQL”

Here’s my position:

If the database uses SQL syntax for queries, it’s a SQL database. Period.

ClickHouse does that. So it’s SQL.

But if you’re asking “is clickhouse sql or no sql?” because you’re worried about ACID compliance, referential integrity, or real-time updates — then the answer is neither. It’s a category of its own.

I call it OLAP-SQL. It’s SQL for analytics, not transactions.

At SIVARO, we use:

PostgreSQL for transactional state (user accounts, orders)
ClickHouse for event analytics (page views, API calls, system metrics)
Redis for caching

Each tool fits its job. Trying to make ClickHouse into MongoDB or PostgreSQL into a column store wastes months.

Code Examples: SQL vs NoSQL Patterns in ClickHouse

Pattern 1: Time-series aggregation (SQL done right)

sql
SELECT 
    toStartOfMinute(timestamp) AS minute,
    count() AS requests,
    quantile(0.95)(latency_ms) AS p95_latency
FROM api_requests
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute

This returns 60 rows — one per minute. ClickHouse processes this in under 100ms for 100 million rows. Try that in MongoDB. You can’t.

Pattern 2: Schema change (NoSQL flexibility)

sql
-- Adding a column in ClickHouse (fast for new data, slow for backfill)
ALTER TABLE events ADD COLUMN user_agent String DEFAULT 'unknown';

-- In MongoDB you'd just start writing new documents
db.events.updateMany({}, {$set: {user_agent: "unknown"}})

The ClickHouse ALTER is metadata-only for new inserts. Existing data stays unchanged. To backfill, you need:

sql
ALTER TABLE events UPDATE user_agent = 'unknown' WHERE 1 = 1;

This triggers an async mutation. For 1 billion rows, that could take 30-60 minutes. MongoDB’s approach is instant for new data but slow for retroactive updates.

Pattern 3: Point lookup (NoSQL territory)

sql
-- Slow in ClickHouse (full scan if no filter on sort key)
SELECT * FROM events WHERE session_id = 'abc-123-def';

-- Fast if indexed properly
SELECT * FROM events WHERE user_id = 'user_456' AND event_time = '2024-01-15 10:30:00';

ClickHouse has no secondary indexes by default. It has skip indexes but they’re not like B-tree indexes. For single-row lookups, use PostgreSQL with an index. ClickHouse is not for that.

FAQ

Is ClickHouse SQL or NoSQL?
It’s SQL. It uses standard SQL syntax for queries. But it’s not a traditional relational database — it’s a columnar OLAP engine. The label “SQL database” describes the interface, not the internals.

Can I use ClickHouse like MongoDB?
No. ClickHouse is terrible for single-row lookups, flexible schemas, and frequent updates. Use it for analytics, not for application state.

Does ClickHouse support ACID transactions?
No. ClickHouse provides atomicity per insert part but no cross-table transactions, rollbacks, or isolation levels. Don’t use it for banking.

Is ClickHouse faster than PostgreSQL for analytics?
Yes, by 10x-100x for most aggregation queries on large datasets. But PostgreSQL is better for point queries, joins, and complex transactions.

Can I run raw SQL in ClickHouse?
Mostly yes. It supports SELECT, INSERT, ALTER, CREATE, DROP, and most standard SQL functions. Exceptions include correlated subqueries, FULL OUTER JOIN (pre-v22), and recursive CTEs.

Does ClickHouse support JSON?
Yes, via the JSON data type (v22+) and Map(String, String) columns. You can query nested JSON with JSONExtractString(). But it’s not as flexible as MongoDB’s document model.

What’s the learning curve for ClickHouse?
Two days to run basic queries. Two weeks to understand MergeTree, partitioning, and indexing. Two months to master performance tuning. The SQL syntax is easy; the mental model is different.

Can ClickHouse replace my data warehouse?
If you use Snowflake or Redshift for real-time analytics, yes. If you need complex ETL pipelines or SQL joins across 50 tables, maybe not. ClickHouse is optimized for single-table scans and aggregations.

The Bottom Line

Is ClickHouse SQL or NoSQL?

The answer is yes. It’s SQL for the API, but it behaves like NoSQL in terms of flexibility (schemaless via Maps and JSON columns) and scalability (horizontal sharding, replication). It’s neither fish nor fowl — it’s a columnar analytics engine that borrowed the best parts of both worlds.

But don’t ask the question that way. Ask instead: “What workloads am I optimizing for?”

If you need:

High concurrency point lookups → PostgreSQL with indexes
Flexible document storage → MongoDB
Analytics on billions of rows → ClickHouse
Time-series with aggregation → ClickHouse
Transactional integrity → PostgreSQL

ClickHouse isn’t SQL vs NoSQL. It’s analytics-first SQL — and that’s a category that’s eating the world.

Nishaant Dixit — Founder of SIVARO. Building data infrastructure and production AI systems since 2018. Built systems processing 200K events/sec.