Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large-Scale Data Processing, Machine Learning, and Graph Analytics, and High-Velocity Data Stream Processing

Publisher(s): Apress

Cover
Title
Copyright
Dedication
Contents at a Glance
Contents
About the Author
About the Technical Reviewers
Acknowledgments
Introduction
Chapter 1: Big Data Technology Landscape
1. Hadoop
2. Data Serialization
3. Columnar Storage
  1. RCFile
  2. ORC
  3. Parquet
4. Messaging Systems
  1. Kafka
  2. ZeroMQ
5. NoSQL
  1. Cassandra
  2. HBase
6. Distributed SQL Query Engine
7. Summary
Chapter 2: Programming in Scala
1. Functional Programming (FP)
2. Scala Fundamentals
  1. Getting Started
  2. Basic Types
  3. Variables
  4. Functions
  5. Classes
  6. Singletons
  7. Case Classes
  8. Pattern Matching
  9. Operators
  10. Traits
  11. Tuples
  12. Option Type
  13. Collections
3. A Standalone Scala Application
4. Summary
Chapter 3: Spark Core
1. Overview
  1. Key Features
  2. Ideal Applications
2. High-level Architecture
3. Application Execution
  1. Terminology
  2. How an Application Works
4. Data Sources
5. Application Programming Interface (API)
6. Lazy Operations
  1. Action Triggers Computation
7. Caching
8. Spark Jobs
9. Shared Variables
  1. Broadcast Variables
  2. Accumulators
10. Summary
Chapter 4: Interactive Data Analysis with Spark Shell
1. Getting Started
  1. Download
  2. Extract
  3. Run
2. REPL Commands
3. Using the Spark Shell as a Scala Shell
4. Number Analysis
5. Log Analysis
6. Summary
Chapter 5: Writing a Spark Application
1. Hello World in Spark
2. Compiling and Running the Application
3. Monitoring the Application
4. Debugging the Application
5. Summary
Chapter 6: Spark Streaming
1. Introducing Spark Streaming
2. Application Programming Interface (API)
3. A Complete Spark Streaming Application
4. Summary
Chapter 7: Spark SQL
1. Introducing Spark SQL
2. Performance
3. Applications
4. Application Programming Interface (API)
5. Built-in Functions
  1. Aggregate
  2. Collection
  3. Date/Time
  4. Math
  5. String
  6. Window
6. UDFs and UDAFs
7. Interactive Analysis Example
8. Interactive Analysis with Spark SQL JDBC Server
9. Summary
Chapter 8: Machine Learning with Spark
1. Introducing Machine Learning
2. Spark Machine Learning Libraries
3. MLlib Overview
4. The MLlib API
5. An Example MLlib Application
  1. Dataset
  2. Goal
  3. Code
6. Spark ML
7. An Example Spark ML Application
  1. Dataset
  2. Goal
  3. Code
8. Summary
Chapter 9: Graph Processing with Spark
1. Introducing Graphs
2. Introducing GraphX
3. GraphX API
4. Summary
Chapter 10: Cluster Managers
1. Standalone Cluster Manager
2. Apache Mesos
3. YARN
  1. Architecture
  2. Running a Spark Application on a YARN Cluster
4. Summary
Chapter 11: Monitoring
1. Monitoring a Standalone Cluster
  1. Monitoring a Spark Master
  2. Monitoring a Spark Worker
2. Monitoring a Spark Application
3. Summary
Bibliography
Index
