Live Online Training

Building Applications with Apache Cassandra

A quickstart guide to Cassandra for Java developers and architects

Jeff Carpenter

Apache Cassandra is an open source distributed database management system designed to handle large amounts of data across many commodity servers, providing horizontal scalability and high availability with no single point of failure. Join Jeff Carpenter for this interactive training course to hit the ground running with Cassandra. You'll learn how to create data models and client applications in Java. And you'll look under the hood to understand Cassandra’s architecture and how to get the most out of this powerful, highly scalable database.

What you'll learn and how you can apply it

By the end of this live, online course, you’ll understand:

  • Cassandra’s architecture and how it works
  • How Cassandra is different from traditional databases, and how those differences affect data modeling and application development
  • How to determine whether Cassandra is a good fit for your application
  • Cassandra’s features, especially as available through the client drivers
  • Configuration options

And you’ll be able to:

  • Design Cassandra data models that perform and scale effectively
  • Create application programs using the DataStax Java Driver
  • Use tools including cqlsh, nodetool, CCM, and DataStax DevCenter

This training course is for you because...

  • You are a developer looking to get started quickly building applications using Cassandra
  • You are an architect who's interested in understanding Cassandra's internals and data modeling in order to leverage it appropriately and guide development teams

Prerequisites

  • Background in Java development
  • Some familiarity with basic database concepts

Recommended Preparation:

Cassandra: The Definitive Guide, 2nd Ed.

Learning Apache Cassandra

Using the Java Development Kit

Day 1 preparation:

Read Cassandra: The Definitive Guide, 2nd Ed.: chapters 1 and 2 (all) and chapter 4 through “Cassandra’s Data Model”

  • The virtual machine can be downloaded from the following Dropbox location: https://www.dropbox.com/s/2cubcjs13nn3o3a/Building%20Apps%20Apache%20Cassandra%20Jan%202018.ova?dl=0
  • Clone the repository
  • Use your own IDE, preferably IntelliJ IDEA (community edition is a free download).

The contents of this VM include:

  • Apache Cassandra 3.11.1
      - Provided in archive format in the Downloads folder (we install Cassandra as part of the course)
  • Java Development Kit 8 (OpenJDK)
  • IntelliJ IDEA Community Edition
      - This is the recommended IDE for this course; exercise instructions are given in terms of this free version
  • DataStax DevCenter
      - This tool provides a nice interface for creating and executing Cassandra Query Language (CQL) scripts, including syntax highlighting and tracing
      - Available at https://academy.datastax.com/downloads
  • Cassandra Cluster Manager
      - CCM is a script to create and manage small Cassandra clusters on localhost
      - Requires Python 2.7 (as does Cassandra’s cqlsh tool)
  • Repositories cloned from GitHub, containing examples used in this course
      - git@github.com:jeffreyscarpenter/reservation-service.git
      - git@github.com:jeffreyscarpenter/cassandra-guide.git

Using the VM is recommended, but its contents are listed above in case you have difficulty running the VM or prefer to run the exercises in your own Linux or macOS environment. (Windows is not recommended.)

Cut and paste are not available by default in the Ubuntu VM. If you would like to enable this for your environment, you can use one of the following links based on your toolset:

  • VirtualBox: https://docs.oracle.com/cd/E36500_01/E36502/html/qs-guest-additions.html
  • VMware: https://www.vmware.com/support/ws55/doc/ws_newguest_tools_linux.html

Day 2 preparation:

Read chapters 6 and 7 of Cassandra: The Definitive Guide, 2nd Ed.

About your instructor

  • Jeff Carpenter is a technical evangelist at DataStax, where he leverages his background in system architecture, microservices, and Apache Cassandra to help developers and operations engineers build distributed systems that are scalable, reliable, and secure. Jeff has worked on projects ranging from a complex battle planning system in an austere network environment to a cloud-based hotel reservation system. He is the author of Cassandra: The Definitive Guide, 2nd Edition.

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Day 1:

Unit 1: Introduction, install and cqlsh

Lecture: Introducing Cassandra (15 minutes)

Discussion: Is Cassandra a Good Fit for my project? (5 minutes)

Individual Activity: Installing Cassandra (15 minutes)

Q&A: Installation (5 minutes)

Group Activity: Running cqlsh (15 minutes)

Q&A: cqlsh (5 minutes)

Break (5 minutes)

Unit 2: Data Modeling

Group Activity: Cassandra data types (15 minutes)

Q&A: data types (5 minutes)

Individual Activity: Query options (10 minutes)

Lecture: Cassandra Data Modeling Principles (15 minutes)

Individual Activity: Data modeling (10 minutes)

Discussion/Q&A: Data Modeling Debrief (5 minutes)

Break (10 minutes)

Unit 3: Client Development

Individual Activity: DataStax Java Driver (15 minutes)

Q&A (5 minutes)

Individual Activity: Additional Statements (15 minutes)

Q&A (5 minutes)

Wrap up day 1

Day 1 review: Review chapters 3, 4 (remainder), and 5 of Cassandra: The Definitive Guide, 2nd Ed., to reinforce concepts covered on Day 1

Day 2 preparation: Read chapters 6 and 7 of Cassandra: The Definitive Guide, 2nd Ed.

Day 2:

Unit 4: Architecture Concepts

Individual Activity: Creating a Cluster (10 minutes)

Lecture: Cassandra Clusters (10 minutes)

Individual Activity: Log inspection (5 minutes)

Q&A (5 minutes)

Lecture: Replication, consistency and the CAP theorem (10 minutes)

Group Activity: Cassandra statistics (10 minutes)

Q&A (5 minutes)

Break (5 minutes)

Unit 5: Advanced Client Development

Individual Activity: Batches and transactions (10 minutes)

Lecture: Batches, Lightweight transactions and Paxos (10 minutes)

Individual Activity: Paging (5 minutes)

Group Activity: Other Language Drivers (10 minutes)

Lecture: Other Driver Features (10 minutes)

Break (10 minutes)

Unit 6: Under the Hood: the Read and Write Paths

Individual Activity: Tracing (10 minutes)

Lecture: Cassandra Read Path (10 minutes)

Group Activity: Data files (5 minutes)

Lecture: Cassandra Write Path (10 minutes)

Individual Activity: Deleting Data (5 minutes)

Lecture: TTL, Deletion and Tombstones (5 minutes)

Q&A: Read, Write and Deletion (5 minutes)

Discussion: What’s next (10 minutes)

Day 2 review:

Read chapters 8 and 9 of Cassandra: The Definitive Guide, 2nd Ed., to reinforce concepts covered on Day 2