Advanced Machine Learning with Spark 2.x

Video Description

Get in-depth knowledge of Machine Learning libraries, analytics, and prediction with Apache Spark

About This Video

  • Learn the best practices involved in building, evaluating, tuning, and deploying Spark pipelines.
  • Stream applications to provide real-time insights and predictions for your business.
  • Perform Natural Language Processing and Deep Learning in Spark.

In Detail

The aim of this course is to provide a practical understanding of advanced Machine Learning algorithms in Apache Spark, so that you can make predictions and recommendations and derive insights from large distributed datasets. The course starts with an introduction to the key concepts and data types that are fundamental to understanding distributed data processing and Machine Learning with Spark.

Building on this foundation, we provide practical recipes that demonstrate some of the most popular algorithms in Spark, leading to the creation of sophisticated Machine Learning pipelines and applications. The final sections are dedicated to more advanced use cases for Machine Learning: streaming, Natural Language Processing, and Deep Learning. In each section, we briefly establish the theoretical basis of the topic under discussion and then cement our understanding with practical use cases.

Table of Contents

  1. Chapter 1 : Introduction to Key Concepts and Data Types
    1. The Course Overview 00:03:43
    2. Spark Data Structures — RDD, DataFrames, and Datasets 00:12:03
    3. Dense and Sparse Vectors 00:03:18
    4. Labeled Points, Matrix, and Other Data Types 00:04:59
    5. Key Concepts, Machine Learning Pipelines, and Operations 00:03:02
  2. Chapter 2 : Machine Learning at Scale
    1. Feature Engineering 00:06:03
    2. Supervised Learning – Classification, Regression 00:04:55
    3. Unsupervised Learning 00:04:15
    4. Recommendation Engines 00:09:58
  3. Chapter 3 : ML Pipelines, Evaluation, Tuning, and Deployment
    1. Deep Dive into Regression Models 00:05:02
    2. Deep Dive into Decision Tree Models 00:05:43
    3. Evaluating and Tuning Our Model 00:03:57
    4. Saving and Deploying Our Model 00:04:04
  4. Chapter 4 : Spark Machine Learning and Streaming
    1. Overview of Spark Streaming 00:06:03
    2. Your Own Streaming Application with Kafka 00:03:49
    3. Your First Streaming Application 00:08:05
    4. Analyzing Sensors Data in a Streaming Way 00:10:11
  5. Chapter 5 : Advanced Topics – Natural Language Processing
    1. Natural Language Processing Overview 00:04:05
    2. Feature Generation from Text — CountVectorizer, TFIDF, and LDA 00:09:53
    3. Feature Generation from Text — Word Embeddings 00:05:09
    4. NLP Document Classification Application 00:04:04
  6. Chapter 6 : Deep Learning with Spark
    1. The Spark Versus Deep Learning Use Case 00:08:36
    2. Spark for Parallelizing Deep Learning Evaluation 00:05:28
    3. Deep Learning as a Feature Generator for Existing Spark ML Algorithms 00:05:45
    4. Spark/Deep Learning Made Easy 00:02:40