Data distribution in Cassandra

In a traditional relational database such as MySQL or PostgreSQL, the entire contents of the database typically reside on a single machine. At a certain scale, the hardware capacity of that server becomes a constraint, and simply migrating to more powerful hardware yields diminishing returns.

Let's imagine ourselves in this scenario: we have an application running on a single-machine database that has reached the limits of its capacity to scale vertically. In that case, we'll want to split the data between multiple machines, a process known as sharding or federation. Assuming we want to stick with the same underlying tool, we'll end up with multiple database instances, each of which holds a ...
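
To make the idea of splitting data between machines concrete, here is a minimal sketch of application-level sharding in Python. The text does not say how rows are assigned to instances, so the scheme below (hash the key, then take the result modulo the number of shards) is an assumption chosen for illustration. The shard names and the helper functions shard_for and distribute are likewise hypothetical.

```python
import hashlib

# Hypothetical names for three independent database instances.
SHARDS = ["db-0", "db-1", "db-2"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick the instance responsible for a key.

    Assumed scheme: hash the key with MD5 (stable across processes,
    unlike Python's built-in hash()), then take the value modulo the
    number of shards.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


def distribute(keys, shards=SHARDS):
    """Group keys by the instance that would store them."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[shard_for(key, shards)].append(key)
    return placement


if __name__ == "__main__":
    users = ["alice", "bob", "carol", "dave", "erin"]
    for shard, members in distribute(users).items():
        print(shard, members)
```

In this sketch the application itself has to know which instance to query for a given key, and changing the number of shards reassigns most keys to a different instance.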
