Fast Data Processing Systems with SMACK stack

Video Description

Build data processing platforms that can take on even the hardest of your data troubles!

About This Video

  • This highly practical tutorial shows you how to use the best of the big data technologies to solve your response-critical problems

  • Learn to build a cheap yet effective big data architecture without resorting to complex Greek-letter architectures

  • Use this easy-to-follow video to build fast data processing systems for your organization

In Detail

SMACK is an open source stack for big data architectures that combines Spark, Mesos, Akka, Cassandra, and Kafka. Developers have begun using this stack to tackle critical real-time analytics on big data. This highly practical tutorial will teach you how to integrate these technologies to create a highly efficient data analysis system for fast data processing.

We’ll start off with an introduction to SMACK and show you when to use it. First, you’ll get to grips with functional thinking and problem solving using Scala. Next, you’ll come to understand the Akka architecture. Then you’ll learn how to improve the data structure architecture and optimize resources using Apache Spark. Moving forward, you’ll learn how to scale databases linearly with Apache Cassandra and how to work with high-throughput distributed messaging systems using Apache Kafka. We’ll show you how to build a cheap but effective cluster infrastructure with Apache Mesos. Finally, you will dive deep into the different aspects of SMACK through two practical case studies. By the end of the video, you will be able to integrate all the components of the SMACK stack and use them together to achieve highly effective and fast data processing.

Table of Contents

    1. Chapter 1 : An Introduction to SMACK
      1. The Course Overview 00:05:19
      2. Modern Data-Processing Challenges 00:06:28
      3. The Data-Processing Pipeline Architecture 00:07:09
      4. SMACK Technologies 00:07:04
      5. Understanding Data Expert Profiles and Changing the Data Center Operations 00:08:36
    2. Chapter 2 : The Language – Scala
      1. Scala Collections 00:07:45
      2. Iterators in Scala 00:03:43
      3. More Functions with Scala 00:18:53
    3. Chapter 3 : The Model – Akka
      1. Actor Model In a Nutshell 00:13:32
      2. Working with Actors 00:09:39
    4. Chapter 4 : The Engine – Apache Spark
      1. Spark Concepts 00:06:44
      2. Resilient Distributed Datasets 00:22:01
      3. Spark in Cluster Mode 00:20:26
      4. Spark Streaming 00:20:02
    5. Chapter 5 : The Storage – Apache Cassandra
      1. NoSQL 00:04:33
      2. Apache Cassandra Installation 00:09:50
      3. Backup and Compression 00:04:18
      4. Recovery Techniques 00:03:32
      5. Recovery Techniques – DBMS Optimization, Bloom Filter, and More 00:15:09
      6. The Spark Cassandra Connector 00:04:47
    6. Chapter 6 : Connectors – Spark, Cassandra, and Akka
      1. Introduction to the Spark Cassandra Connector 00:05:20
      2. Cassandra and Spark Streaming Basics 00:03:36
      3. Functions with Cassandra 00:11:57
      4. Akka and Cassandra 00:10:54
    7. Chapter 7 : The Broker – Apache Kafka
      1. Introducing Kafka 00:10:46
      2. Installation 00:02:16
      3. Cluster 00:13:14
      4. Architecture 00:09:56
      5. Producers 00:06:00
      6. Consumers 00:07:20
      7. Integration and Administration 00:14:01
    8. Chapter 8 : Connectors – Akka, Spark, Kafka, and Cassandra
      1. Akka, Spark, and Kafka 00:08:53
      2. Kafka and Cassandra 00:02:09
    9. Chapter 9 : The Manager – Apache Mesos
      1. The Apache Mesos Architecture 00:16:28
      2. Resource Allocation 00:20:34
      3. Running a Mesos Cluster on a Private Data Center 00:10:01
      4. Scheduling and Managing the Frameworks 00:15:16
      5. Apache Aurora 00:04:54
      6. Singularity 00:03:42
      7. Apache Spark on Apache Mesos 00:04:56
      8. Apache Cassandra on Apache Mesos 00:02:12
      9. Apache Kafka on Apache Mesos 00:06:27