Stumbling on tombstones

We've seen that tombstones are critical to ensuring that Cassandra can correctly identify that a piece of data has been deleted in a distributed environment, but tombstones also have a downside. Since tombstones are stored in place of the deleted values, they continue to occupy positions in the range of clustering columns in a given partition, and a read that covers that range still has to scan past them. In some situations, this can lead to unexpected performance degradation and even errors.
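To make the cost concrete, here is a minimal sketch of the idea in Python. It is a toy in-memory model, not Cassandra's storage engine: the names `ToyPartition`, `Cell`, and `read_latest` are hypothetical, and the only behavior it assumes is the one described above, namely that a delete writes a tombstone in place of the value and that a read over the clustering range has to pass over those tombstones.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Cell:
    clustering_key: int
    value: Optional[str]  # None marks a tombstone in this toy model


class ToyPartition:
    """Hypothetical stand-in for one partition, ordered by clustering key."""

    def __init__(self) -> None:
        self._cells: Dict[int, Cell] = {}

    def write(self, key: int, value: str) -> None:
        self._cells[key] = Cell(key, value)

    def delete(self, key: int) -> None:
        # The entry is not removed; a tombstone takes the value's place.
        self._cells[key] = Cell(key, None)

    def stored_cells(self) -> int:
        # Counts live values and tombstones alike.
        return len(self._cells)

    def read_latest(self, limit: int) -> Tuple[List[str], int]:
        """Return up to `limit` live values, newest first,
        along with the number of tombstones scanned on the way."""
        live: List[str] = []
        tombstones_scanned = 0
        for key in sorted(self._cells, reverse=True):
            if len(live) == limit:
                break
            cell = self._cells[key]
            if cell.value is None:
                tombstones_scanned += 1
            else:
                live.append(cell.value)
        return live, tombstones_scanned


partition = ToyPartition()
for i in range(50):
    partition.write(i, f"update {i}")

# Delete the ten most recent entries, then ask for the five newest live ones.
for i in range(40, 50):
    partition.delete(i)

latest, scanned = partition.read_latest(5)
print(latest)
print("tombstones scanned:", scanned)
print("cells still stored:", partition.stored_cells())
```

In this sketch the read returns only five values, yet it has to step over every tombstone that sorts ahead of them, and the partition holds just as many cells after the deletes as it did before.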

To use a somewhat artificial illustration, let's say that alice is now a long-time MyStatus user and has created tens of thousands of status updates. Let's also assume that alice decides one day that she wants to delete 1,000 recent status updates she's created. Once she's done with that process, ...
