I've shipped two major features this year that forced this decision. First time, I normalized everything. Third normal form, proper foreign keys, the whole thing. Second time, I denormalized aggressively from day one.
The normalized approach hurt. We had a feed feature that needed user + post + comment + like counts. Seven joins. Query time started at 80ms and got worse as data grew. We graduated to caching layers, then materialized views, then just... denormalizing anyway. Wasted three months of optimization work.
The second feature (a recommendation endpoint) I denormalized immediately. Flattened user engagement data into a single table, duplicated user metadata, kept computed aggregates inline. First query: 8ms. No joins. Postgres basically prints the answer and leaves.
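The two shapes being compared look roughly like this. A minimal sketch using Python's stdlib `sqlite3` as a stand-in for Postgres; the tables, columns, and data here are all invented for illustration, not the actual schema from the post:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Normalized: each fact lives in one place; the feed query joins them.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    CREATE TABLE likes (post_id INTEGER, user_id INTEGER);

    -- Denormalized: one wide row per post, metadata and aggregates inline.
    CREATE TABLE feed_items (
        post_id INTEGER PRIMARY KEY,
        author_name TEXT,      -- duplicated from users.name
        body TEXT,
        like_count INTEGER     -- precomputed aggregate
    );
""")
db.execute("INSERT INTO users VALUES (1, 'ada')")
db.execute("INSERT INTO posts VALUES (10, 1, 'hello')")
db.executemany("INSERT INTO likes VALUES (10, ?)", [(2,), (3,)])
db.execute("INSERT INTO feed_items VALUES (10, 'ada', 'hello', 2)")

# Normalized read: join + aggregate computed at query time.
joined = db.execute("""
    SELECT u.name, p.body, COUNT(l.post_id)
    FROM posts p
    JOIN users u ON u.id = p.user_id
    LEFT JOIN likes l ON l.post_id = p.id
    GROUP BY p.id
""").fetchone()

# Denormalized read: single-row lookup, no joins.
flat = db.execute(
    "SELECT author_name, body, like_count FROM feed_items WHERE post_id = 10"
).fetchone()

print(joined)  # ('ada', 'hello', 2)
print(flat)    # ('ada', 'hello', 2)
```

Both reads return the same answer; the difference is whether the work happens at read time (the join) or at write time (keeping `author_name` and `like_count` in sync).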
But here's the thing: maintenance killed us. Schema migrations became nightmares. One user profile update meant updating 40k rows in the denormalized table. We added an event queue just to stay sane.
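That write amplification is easy to see in miniature. A hedged sketch (invented schema, `sqlite3` standing in for Postgres, 1k rows instead of the 40k above): one logical change, renaming a user, fans out across every row that copied the old value.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE feed_items "
    "(post_id INTEGER PRIMARY KEY, author_id INTEGER, author_name TEXT)"
)
# 1,000 denormalized rows, each carrying a copy of the author's name.
db.executemany(
    "INSERT INTO feed_items VALUES (?, 1, 'ada')",
    [(i,) for i in range(1000)],
)

# Normalized world: UPDATE users SET name = ... WHERE id = 1  -- one row.
# Denormalized world: the same logical change touches every copy.
cur = db.execute(
    "UPDATE feed_items SET author_name = 'ada lovelace' WHERE author_id = 1"
)
print(cur.rowcount)  # 1000
```

At 40k copies per user, queueing those updates asynchronously (the event queue mentioned above) is one of the few ways to keep the write path responsive.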
My take: normalize first. Actually measure. Use EXPLAIN ANALYZE. If you hit real pain (sustained >100ms on critical paths), then denormalize that specific query pattern. Don't pre-optimize. Most features don't need it, and the ones that do reveal themselves fast enough to fix.
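"Actually measure" in miniature: ask the planner before reaching for denormalization. EXPLAIN ANALYZE is the Postgres tool; `EXPLAIN QUERY PLAN` in stdlib `sqlite3` is the closest self-contained stand-in, so that's what this sketch uses (table and index names invented). Often the "slow join" is just a missing index:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE likes (post_id INTEGER, user_id INTEGER)")
db.executemany(
    "INSERT INTO likes VALUES (?, ?)",
    [(i % 100, i) for i in range(10_000)],
)

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the human-readable detail in column 3.
    return " ".join(row[3] for row in db.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT COUNT(*) FROM likes WHERE post_id = 7"
before = plan(q)   # a full-table scan
db.execute("CREATE INDEX idx_likes_post ON likes(post_id)")
after = plan(q)    # an index search on idx_likes_post
print(before)
print(after)
```

Same query, same data; only the access path changes. If the plan still looks bad *after* indexing, that's the signal to consider denormalizing that one pattern.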
The cost isn't the disk space or the writes. It's the cognitive overhead and the eventual consistency bugs.
Hard disagree based on what I've seen. Seven joins at 80ms is a tuning problem, not a normalization problem. That's either missing indexes, N+1 queries, or both.
I've lived the opposite trajectory: started denormalized with duplicated data, hit consistency bugs that took weeks to untangle. Sync issues between tables, stale counts, the works. Now I normalize and use proper indexes and views. One well-indexed query at 12ms beats seven joins at 80ms, and it definitely beats a denormalized table you have to rebuild every time the underlying data changes.
The real cost is developer time fixing denormalization bugs in production, not query optimization work.
Had the opposite experience. Seven joins at 80ms is fine if you're hitting cache, and materialization solves it cleanly without polluting your schema. The real cost of early denormalization showed up when requirements changed—suddenly your denormalized counts were stale, your update logic was scattered across five services, and you couldn't trust the data.
What matters: are you actually measuring? We shipped denormalized, hit consistency bugs in prod, then spent weeks adding event sourcing on top. Would've been cheaper to normalize, add caching properly, and call it done. Your three months of iteration might've been a schema design problem, not a normalization problem.
Alex Petrov
Counterpoint: your seven-join query was badly designed, not a normalization problem. That's an N+1 query pattern, a missing index, or both.
I've hit the opposite: denormalizing early burns you when requirements shift. You update one field in five places and miss one. Data becomes a nightmare. Yes, joins get slower as tables grow, but that's what indexes, query planning, and materialized views solve. You already used those.
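The "indexes and views instead of duplicated columns" approach sketched concretely: a plain view keeps counts derivable (they can't go stale), and a summary table refreshed in one statement plays the role of a Postgres materialized view. This is a simulation, since stdlib `sqlite3` has no `CREATE MATERIALIZED VIEW`, and all names here are invented:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE likes (post_id INTEGER, user_id INTEGER);
    -- Derived, never stored per-row: always consistent with `likes`.
    CREATE VIEW like_counts AS
        SELECT post_id, COUNT(*) AS n FROM likes GROUP BY post_id;
    -- Materialized-view stand-in: cheap to read, refreshed on a schedule.
    CREATE TABLE like_counts_cached (post_id INTEGER PRIMARY KEY, n INTEGER);
""")
db.executemany("INSERT INTO likes VALUES (?, ?)", [(1, u) for u in range(5)])

def refresh():
    # All the "update logic" lives in this one place, instead of being
    # scattered across every code path that writes a like.
    db.execute("DELETE FROM like_counts_cached")
    db.execute("INSERT INTO like_counts_cached SELECT post_id, n FROM like_counts")

refresh()
live = db.execute("SELECT n FROM like_counts WHERE post_id = 1").fetchone()[0]
cached = db.execute("SELECT n FROM like_counts_cached WHERE post_id = 1").fetchone()[0]
print(live, cached)  # 5 5
```

The cached copy can lag between refreshes, but the staleness is bounded and centralized, rather than depending on every writer remembering every duplicated field.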
The real cost is developer velocity lost to consistency bugs later. A normalized schema is a contract. A denormalized one is... hope you remember every place you duplicated that field.