Learning Apache Cassandra

Video Description

Build an efficient, scalable, fault-tolerant, and highly available data layer into your applications by managing large amounts of data with Apache Cassandra

About This Video

  • Master database operations – creating a database, creating a table, inserting data, and modelling data
  • Create a powerful sample application
  • Work with Cassandra nodes to get the behavior you need

In Detail

Cassandra is a decentralized, fault-tolerant, scalable, and low-cost NoSQL database, which makes it a core component of cloud computing systems. More recent versions have greatly improved security features, making it suitable for use in enterprise systems.

In this tutorial, you’ll see how Cassandra overcomes the challenges that relational databases face under high scalability demands. You will become familiar with Cassandra terminology, its components, and their roles. Then you will learn how to create a multi-node Cassandra cluster, understand the roles and responsibilities of Cassandra components, and see the data flow during database operations that demand speed, accuracy, and durability.

You will then see how Cassandra stores data in files on disk, how to optimize those files to improve performance, and how to monitor Cassandra database performance using logs and metrics.

We’ll demonstrate the factors that can affect the performance SLAs of a Cassandra database. Next, you will learn how to optimize the data model to provide performance guarantees and a consistent performance SLA over time. You’ll also learn how to build a data model in Cassandra and integrate the database with your application.

In the later sections, you’ll connect to Cassandra from Spark to read and write data. You’ll also learn how to process live streaming data with Spark and persist it in Cassandra for consumption by downstream systems.

By the end of the course, you’ll be able to build powerful, scalable Cassandra database layers for your applications. You’ll design rich schemas to capture the relationships between different data types and master the advanced features available in Cassandra.

Table of Contents

  1. Chapter 1 : Introduction to Cassandra
    1. The Course Overview 00:01:54
    2. What Is Apache Cassandra? 00:04:00
    3. Key Space, Table Schema, Partition Key, and Clustering Key 00:03:40
    4. Start a Single Node Cassandra Database 00:01:15
    5. Introduction to Cqlsh Command Line Client 00:03:57
    6. Loading and Reading Data 00:03:39
  2. Chapter 2 : Cassandra Distributed Architecture
    1. Node and Ring Structure 00:05:36
    2. Replication and Consistency Model 00:05:05
    3. Racks and Datacenters 00:05:28
    4. CAP Theorem 00:06:43
    5. Gossip 00:02:42
    6. Read Repair, Hinted Handoff 00:04:04
  3. Chapter 3 : Diagnostics
    1. Understanding Files in the Data Directory 00:06:07
    2. Use Nodetool to Examine Performance Statistics 00:04:21
    3. System and Output Logs 00:03:02
    4. JMX to Monitor Metrics 00:02:27
    5. Choosing the Appropriate Compaction Strategy 00:07:52
  4. Chapter 4 : Data Modelling Principles
    1. Primary Key and Cluster Ordering 00:05:18
    2. Denormalization and Design for the Read Performance 00:04:54
    3. Optimizing for BlindWrites 00:03:50
  5. Chapter 5 : Data Modelling in Cassandra
    1. Collection Types 00:03:02
    2. Static Columns 00:01:40
    3. Indexes, Materialized Views 00:03:31
    4. Data Aggregation 00:01:53
    5. compareAndSet 00:02:15
    6. Counter Type 00:02:50
  6. Chapter 6 : Optimization of Data
    1. The Impact of Frequent Updates and Delete 00:03:23
    2. Wide Rows and Primary Key Considerations 00:03:15
    3. Load Testing with CQL Stress 00:03:31
    4. Logged and Unlogged Batching 00:04:26
  7. Chapter 7 : Integrating Cassandra Database with Your Application
    1. A Maven Project Using the Java Driver 00:01:57
    2. Connection Information for the Driver 00:03:22
    3. Basic Statements 00:03:44
    4. Using Prepared Statements 00:04:28
    5. Understanding Errors 00:04:27
  8. Chapter 8 : Overview of Apache Spark
    1. What Is Apache Spark and Spark Architecture 00:05:48
    2. Get Started with Spark 00:02:00
    3. Working with Spark’s Data Structures – RDD, Data Frame, and Dataset 00:04:28
    4. Setting Up the Spark Connector 00:02:00
  9. Chapter 9 : Connecting Spark with Cassandra
    1. Writing Data to Cassandra from Spark 00:02:29
    2. Reading Data from Cassandra Using Spark RDD 00:04:03
    3. Join, Aggregate Data Using Spark Data Frame API and Spark SQL 00:04:32
    4. Cassandra Aware Partitioning in Spark 00:03:52
  10. Chapter 10 : Integrate Cassandra with Spark Streaming
    1. Use Cases for Near Real Time Stream Processing Using Spark Streaming 00:05:23
    2. Advanced Stream Receiver Using Kafka Connectors 00:02:32
    3. Stateless and Stateful Transformations 00:03:35
    4. Persistence of Live Stream on to Cassandra 00:10:24