Preface

The success of Hadoop as a big data platform raised user expectations, both in terms of solving different analytics challenges as well as reducing latency. Various tools evolved over time, but when Apache Spark came, it provided one single runtime to address all these challenges. It eliminated the need to combine multiple tools with their own challenges and learning curves. By using memory for persistent storage besides compute, Apache Spark eliminates the need to store intermedia data in disk and increases processing speed up to 100 times. It also provides a single runtime, which addresses various analytics needs such as machine-learning and real-time streaming using various libraries.

This book covers the installation and configuration ...

Get Spark Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.