Introduction

With the recent advancements in cluster computing coupled with the rise of big data, the field of machine learning has been pushed to the forefront of computing. The need for an interactive platform that enables data science at scale has long been a dream that is now a reality.

The following three areas together have enabled and accelerated interactive data science at scale:

  • Apache Spark: A unified technology platform for data science that combines a fast compute engine and fault-tolerant data structures into a well-designed and integrated offering
  • Machine learning: A field of artificial intelligence that enables machines to mimic some of the tasks originally reserved exclusively for the human brain
  • Scala: A modern JVM-based ...

Get Apache Spark 2.x Machine Learning Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.