Database architecture for IoT systems: when to use TimescaleDB, when to use ClickHouse
The two most popular time-series engines for sensor data — selection criteria, schema design, and a data lifecycle strategy so costs don't explode with device count.
One of the first questions when designing an IoT system is: where will sensor data — millions of data points per day, from hundreds or thousands of devices — be stored? A wrong answer at this stage usually doesn't cause an immediate failure, but quietly accumulates into infrastructure costs and query latency after 6-12 months of operation. This article describes how we approach this problem when designing a Pilot Build for an industrial sensor monitoring system — not abstract theory, but a concrete architecture you can apply right away.
Two leading candidates: TimescaleDB and ClickHouse
TimescaleDB is a PostgreSQL extension that turns a regular table into a "hypertable" — automatically partitioned by time. Its biggest advantage is that you still get the full PostgreSQL toolkit: foreign key constraints, transactions, JOINs with metadata tables (device lists, installation locations, alert thresholds), and an ORM/driver ecosystem familiar to every backend engineer.
ClickHouse is a columnar database purpose-built for analytics — fast reads across billions of rows, very good compression (typically 5-10× compared to row-based storage) — but with trade-offs: no real transactions, slow and complex UPDATE/DELETE, and JOINs between large tables aren't its strength.
Selection criteria — not "which is better" but "which fits the problem"
- Device count under 5,000 and write frequency under 1 point/second/device → TimescaleDB is usually enough, and simplifies operations since you only need one Postgres instance
- Large device counts (tens of thousands+) or long-term storage of high-frequency raw data (vibration, audio, tens of Hz) → ClickHouse delivers noticeably better read performance and compression ratios
- Need a low-latency, real-time operations dashboard (operators viewing live metrics) → TimescaleDB with continuous aggregates handles this well without adding another system
- Need long-term historical analysis, BI, or ML model training on raw data → ClickHouse is the natural choice for the data warehouse layer
In many cases, the real answer is both — but playing two different roles within the same architecture, not replacing each other.
A reference architecture: hot path + cold path
The approach we typically propose for an industrial sensor monitoring system (e.g., vibration, temperature, pressure from a few thousand sensors) is to split storage into two layers with clearly defined roles:
Hot path — TimescaleDB
Data from the last 7-30 days, downsampled to a resolution suitable for operations dashboards (e.g., 1 point/minute instead of 1 point/second). Timescale's continuous aggregates pre-compute average/min/max values per time window, so the dashboard loads instantly without scanning raw data. This is the layer that operators, the alert engine, and the application's API interact with directly.
Cold path — ClickHouse or object storage (Parquet on S3/R2)
Full raw data, retained long-term for analysis, audit, and as training data for forecasting/anomaly-detection models. A batch job (running daily or hourly) pushes "cooled" data from Timescale to ClickHouse or down to Parquet, then deletes it from the hot path to keep the hot table small and queries fast.
Design principle: whatever layer serves real-time operations must be small and fast; whatever layer serves analysis/training can be large and cheap — but doesn't need to be instantly fast.
Schema design — a few common pitfalls
- Always keep a separate devices table (metadata: location, model, alert thresholds, firmware version) and a time-series table containing only device_id, timestamp, metric, value. Avoid stuffing metadata into every measurement row — it wastes storage and makes updates painful when a device's configuration changes
- Use the narrowest data type possible for value — float4 instead of float8/numeric if precision allows, because the difference multiplied across billions of rows is significant
- Set a retention policy from day one, not when the disk fills up. TimescaleDB has add_retention_policy() to automatically drop old chunks; ClickHouse has TTL at the table/partition level
- Indexing on (device_id, timestamp) is nearly mandatory — this is the most common query pattern ("get data for device X over time range Y")
Cost — a factor often overlooked until it's too late
A common mistake is designing a system that performs well at pilot scale (a few dozen devices) but whose storage costs grow linearly — or worse, super-linearly as indexes balloon — once scaled to thousands of devices. Timescale's compression (column-oriented storage for older chunks) can cut storage by 90%+ compared to a regular heap, but it must be explicitly configured (compress_chunk policy) — it isn't enabled by default in every version.
Conclusion
There's no single correct answer for every IoT system — but there is a right question to answer first: who will read this data, at what latency, and for how long? Answering that before choosing a database helps avoid having to migrate the entire storage system midstream — one of the most expensive and risky tasks in operating a data system. This is also part of the Discovery Sprint we run before every Pilot Build: mapping out data flows and expected volumes before writing a single line of infrastructure-as-code.
If your team is designing — or struggling with cost/latency issues on — an existing time-series system, this is exactly the kind of problem that falls within the Server & Database layer of the closed loop KonexForge builds — similar to how ClickHouse serves as the cold path in the water-quality monitoring system for 12 districts we deployed.