Chapter 14. Deploying and Integrating

In this, our final chapter, we share a few last pieces of advice as you work toward deploying Cassandra in production. We’ll discuss factors to consider in planning deployments and explore options for deploying Cassandra in various cloud environments. We’ll close with a few thoughts on some technologies that complement Cassandra well.

Planning a Cluster Deployment

A successful deployment of Cassandra starts with good planning. You’ll want to consider the amount of data that the cluster will hold, the network environment in which the cluster will be deployed, and the computing resources (whether physical or virtual) on which the instances will run.

Sizing Your Cluster

An important first step in planning your cluster is to consider the amount of data that it will need to store. You will, of course, be able to add and remove nodes to adjust the cluster’s capacity over time, but calculating the initial size and its planned growth will help you better anticipate costs and make sound decisions as you plan your cluster configuration.

To calculate the required size of the cluster, you’ll first need to determine the storage size of each of the supported tables using the formulas we introduced in Chapter 5. This calculation is based on the columns within each table and the estimated number of rows, and it yields an estimated size of one copy of your data on disk.
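
As a rough illustration of this kind of per-table estimate, the following Python sketch multiplies an assumed average row size by an estimated row count and sums the result across tables. It is a deliberate simplification and not the Chapter 5 formulas themselves; the `TableEstimate` class, the table names, and all byte and row counts are hypothetical values chosen only for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableEstimate:
    """Rough sizing inputs for one table (all values are assumptions)."""
    name: str
    column_bytes: List[int]  # estimated average size of each column, in bytes
    estimated_rows: int      # expected number of rows in the table


def table_size_bytes(table: TableEstimate) -> int:
    """Approximate one copy of the table: bytes per row times row count."""
    return sum(table.column_bytes) * table.estimated_rows


def total_size_bytes(tables: List[TableEstimate]) -> int:
    """Sum the single-copy estimates across all tables."""
    return sum(table_size_bytes(t) for t in tables)


# Hypothetical tables, used only to exercise the functions.
tables = [
    TableEstimate("hotels", [16, 50, 20], 1_000),
    TableEstimate("rooms", [16, 4], 500),
]
total = total_size_bytes(tables)
print(f"Estimated size of one copy of the data: {total} bytes")
```

The sketch ignores per-cell metadata and other storage overhead, so treat its result as a starting point to refine with the more detailed formulas rather than as a final figure.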

In order to estimate the actual physical amount ...
