Write complexity and data integrity

Writing data in the fully denormalized strategy takes essentially the same amount of work as it did with the partially denormalized layout. Our storage needs do increase a bit: we're now storing one full copy of each status update for every follower the author has. However, storage is cheap, and writing data in Cassandra is cheap, so we've managed to make our timeline read pattern far more efficient at a low cost.
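To make the storage trade-off concrete, here is a minimal sketch of the fan-out write, using in-memory Python dictionaries as stand-ins for the two tables rather than a real Cassandra session. The follow graph, the function name post_status_update, and the tuple layouts are assumptions made for illustration; only the table names come from the text.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two Cassandra tables.
user_status_updates = defaultdict(list)  # author -> [(update_id, body)]
home_status_updates = defaultdict(list)  # timeline owner -> [(update_id, author, body)]

# Hypothetical follow graph: author -> set of followers.
followers = {"alice": {"bob", "carol", "dave"}}


def post_status_update(author, update_id, body):
    """Write the canonical row, then one full copy per follower.

    Returns the number of timeline copies written, which is the extra
    storage this single update costs.
    """
    user_status_updates[author].append((update_id, body))
    copies = 0
    for follower in followers.get(author, ()):
        home_status_updates[follower].append((update_id, author, body))
        copies += 1
    return copies


copies_written = post_status_update("alice", 1, "Hello, world")
print(copies_written)  # one copy per follower of alice
```

The extra storage per update grows linearly with the author's follower count, which is the cost being traded for a timeline read that touches a single partition.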

One concern in any sort of denormalized scenario is data integrity. At the Cassandra level, nothing stops us from adding a status update to the user_status_updates table and forgetting to add copies as appropriate to the home_status_updates table, or vice versa. ...
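The kind of drift described above can be illustrated with a small consistency check. This is only a sketch over plain Python data structures, not a Cassandra query; the function name missing_timeline_copies and the shapes of its arguments are hypothetical.

```python
def missing_timeline_copies(user_updates, home_updates, followers):
    """Return (follower, author, update_id) for every expected copy that is absent.

    user_updates: author -> list of update ids (stand-in for user_status_updates)
    home_updates: timeline owner -> set of (author, update_id) pairs
                  (stand-in for home_status_updates)
    followers:    author -> set of follower usernames (assumed follow graph)
    """
    missing = []
    for author, update_ids in user_updates.items():
        for update_id in update_ids:
            # Sort followers so the report order is deterministic.
            for follower in sorted(followers.get(author, ())):
                if (author, update_id) not in home_updates.get(follower, set()):
                    missing.append((follower, author, update_id))
    return missing


# Example: carol's timeline never received alice's second update.
user_updates = {"alice": [1, 2]}
followers = {"alice": {"bob", "carol"}}
home_updates = {
    "bob": {("alice", 1), ("alice", 2)},
    "carol": {("alice", 1)},
}
gaps = missing_timeline_copies(user_updates, home_updates, followers)
print(gaps)
```

Nothing in the schema itself would flag this gap; a check like this has to live in application code or an offline job.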
