Designing a primary key

When designing a primary key, Cassandra modelers should have two main considerations:

  • Does this primary key allow for even distribution, at scale?
  • Does this primary key match the required query pattern(s), at scale?

Note that each question ends with the phrase "at scale." A model that writes all of its table's data into a single partition distributes evenly on a 3-node cluster (assuming a replication factor of 3, every node holds a replica of that partition), but it no longer distributes evenly when you scale up to a 30-node cluster. That type of model also puts your table in danger of approaching the limit of 2,000,000,000 cells per partition. Likewise, almost any model will support high-performing unbound queries (queries without a WHERE clause) when there are 10 rows across the ...
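
The single-partition point can be illustrated with a rough simulation. This is only a toy sketch, not Cassandra's actual partitioner or token ring: it assumes a replication factor of 3, uses an MD5 hash with a modulo as a stand-in for token assignment, and the names replica_nodes and nodes_holding_data are hypothetical helpers invented for this illustration.

```python
import hashlib
from collections import Counter


def replica_nodes(partition_key, node_count, replication_factor=3):
    """Toy placement (assumption): hash the key to a starting node,
    then take the following nodes in ring order as replicas."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % node_count
    replicas = min(replication_factor, node_count)
    return [(start + i) % node_count for i in range(replicas)]


def nodes_holding_data(partition_keys, node_count, replication_factor=3):
    """Count how many row replicas land on each node."""
    load = Counter()
    for key in partition_keys:
        for node in replica_nodes(key, node_count, replication_factor):
            load[node] += 1
    return load


# Every row shares one partition key versus one partition key per row.
single_partition = ["all_rows"] * 10_000
many_partitions = [f"user_{i}" for i in range(10_000)]

for node_count in (3, 30):
    single = nodes_holding_data(single_partition, node_count)
    spread = nodes_holding_data(many_partitions, node_count)
    print(f"{node_count} nodes: single partition uses {len(single)}, "
          f"many partitions use {len(spread)}")
```

Under these assumptions, the single-partition model occupies three nodes whether the cluster has 3 nodes or 30, which is why it looks balanced on the small cluster and badly skewed on the large one. The model with many distinct partition keys keeps using more nodes as the cluster grows.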
