In this chapter, we’ll gain an understanding of Cassandra’s design goals, data model, and some general behavior characteristics.
For developers and administrators coming from the relational world, the Cassandra data model can be very difficult to understand initially. Some terms, such as “keyspace,” are completely new, and some, such as “column,” exist in both worlds but have different meanings. It can also be confusing if you’re trying to sort through the Dynamo or Bigtable source papers, because although Cassandra may be based on them, it has its own model.
So in this chapter we start from common ground and then work through the unfamiliar terms. Then we do some actual modeling to help you understand how to bridge the gap between the relational world and the world of Cassandra.
In a relational database, we have the database itself, which is the outermost container that might correspond to a single application. The database contains tables. Tables have names and contain one or more columns, which also have names. When we add data to a table, we specify a value for every column defined; if we don’t have a value for a particular column, we use null. This new entry adds a row to the table, which we can later read if we know the row’s unique identifier (primary key), or by using a SQL statement that expresses some criteria that row might meet. If we want to update values in the table, we can update all of the rows or just some ...
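To make the relational starting point concrete, here is a minimal sketch of those operations using Python's standard `sqlite3` module with an in-memory database. SQLite stands in here for any relational database, and the `hotel` table, its columns, and the sample values are hypothetical names chosen only for illustration.

```python
import sqlite3


def demo():
    # An in-memory database plays the role of the outermost container.
    conn = sqlite3.connect(":memory:")

    # A named table with named columns (hypothetical schema).
    conn.execute(
        "CREATE TABLE hotel (id INTEGER PRIMARY KEY, name TEXT NOT NULL, phone TEXT)"
    )

    # Every column gets a value; None is stored as NULL when no value is known.
    conn.execute(
        "INSERT INTO hotel (id, name, phone) VALUES (?, ?, ?)", (1, "Cambria", None)
    )
    conn.execute(
        "INSERT INTO hotel (id, name, phone) VALUES (?, ?, ?)", (2, "Clarion", "555-0100")
    )

    # Read a row by its primary key.
    by_key = conn.execute(
        "SELECT name, phone FROM hotel WHERE id = ?", (1,)
    ).fetchone()

    # Read rows by a criterion instead of the key.
    by_criteria = conn.execute(
        "SELECT id FROM hotel WHERE phone IS NULL"
    ).fetchall()

    # Update only the rows that match a condition.
    conn.execute("UPDATE hotel SET phone = ? WHERE id = ?", ("555-0199", 1))
    updated = conn.execute("SELECT phone FROM hotel WHERE id = 1").fetchone()[0]

    conn.close()
    return by_key, by_criteria, updated
```

Calling `demo()` returns the row fetched by key, the ids of the rows matching the null check, and the phone value after the update. Dropping the `WHERE` clause from the `UPDATE` statement would apply the change to every row in the table.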