Some Systems in Detail

Now that we've briefly surveyed the different categories of NoSQL databases, we can explore some individual databases in depth. We can't cover every database available today, but we can cover those that have been getting the most attention lately. It's interesting to note the similarities between these databases and how their design decisions affect system operability. For example, several of them use an append-only data structure, which means a backup can be as simple as an rsync. Let's dive in and take a look.

Cassandra

Cassandra is a highly distributed database that is used in production by Digg, Twitter, Facebook, Rackspace, and Reddit, to name a few. A few key philosophies inform its basic design decisions. Cassandra takes the stance that writes are harder to scale than reads, so it's heavily optimized for writes. In fact, the disk head never has to seek on a write operation, because the only immediate write that's needed is an append to a log.
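The write path described above can be sketched in a few lines of Python. This is a toy illustration of the append-only idea, not Cassandra's actual storage engine: the TinyWriteLog class, its JSON record format, and the demo function are all hypothetical names invented for this example. Every write is appended to the end of a log file and then applied to an in-memory table, and the table can be rebuilt by replaying the log from the start.

```python
import json
import os
import tempfile


class TinyWriteLog:
    """Toy write path (hypothetical, not Cassandra's code):
    append each write to a log, then update an in-memory table."""

    def __init__(self, path):
        self.path = path
        self.memtable = {}
        # Mode "a" means every write lands at the end of the file,
        # so the file is only ever extended, never rewritten in place.
        self.log = open(path, "a", encoding="utf-8")

    def write(self, key, value):
        record = json.dumps({"key": key, "value": value})
        self.log.write(record + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # ask the OS to push the append to disk
        self.memtable[key] = value   # reads are served from memory here

    def close(self):
        self.log.close()

    @classmethod
    def recover(cls, path):
        """Rebuild the in-memory table by replaying the log in order."""
        store = cls(path)
        with open(path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                store.memtable[rec["key"]] = rec["value"]
        return store


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commit.log")
        store = TinyWriteLog(path)
        store.write("user:1", "alice")
        store.write("user:1", "alicia")  # a later record wins on replay
        store.write("user:2", "bob")
        store.close()

        recovered = TinyWriteLog.recover(path)
        state = dict(recovered.memtable)
        recovered.close()
        return state
```

Because the log file only ever grows at its end, a copy taken by a tool such as rsync contains every record written up to that point, which is the operational benefit mentioned earlier in this section.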

Another of Cassandra's philosophies is that there should be no single point of failure. For this reason, there's no "coordination" server, "elected master," or anything of that kind. Cassandra servers know about each other via a gossip protocol that lets them propagate information across a cluster relatively quickly. Writes for any piece of data can happen to any node in a cluster, and reads can do so, ...
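To give a feel for how gossip spreads information without a coordinator, here is a minimal simulation in Python. It is a generic sketch of the gossip idea and assumes a simple exchange-and-merge scheme; it does not reproduce Cassandra's real protocol or message format. The function names, the node names, and the heartbeat counters are all hypothetical. Each node starts out knowing only about itself, and in every round each node swaps what it knows with one randomly chosen peer.

```python
import random


def gossip_round(views, rng):
    """One round: every node exchanges state with one random peer."""
    nodes = list(views)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        # Merge both views, keeping the newest heartbeat seen per node.
        merged = dict(views[node])
        for name, beat in views[peer].items():
            if beat > merged.get(name, -1):
                merged[name] = beat
        views[node] = dict(merged)
        views[peer] = dict(merged)


def rounds_until_converged(node_count, seed=0, max_rounds=100):
    """Count rounds until every node has heard of every other node.

    Requires node_count >= 2 so that each node has a peer to talk to.
    """
    rng = random.Random(seed)
    # Initially each node only knows its own heartbeat.
    views = {f"node{i}": {f"node{i}": 1} for i in range(node_count)}
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == node_count for view in views.values()):
            return round_number, views
    raise RuntimeError("cluster view did not converge")


if __name__ == "__main__":
    rounds, _ = rounds_until_converged(16)
    print(f"16 nodes learned about each other in {rounds} rounds")
```

No node in this sketch has a special role: any node can be removed from the dictionary and the rest keep exchanging state, which is the property the no-single-point-of-failure philosophy is after.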
