Preface

Data science and machine learning are at the core of many innovative technologies and products, and they are expected to keep disrupting industries and business models across the globe for the foreseeable future. Until recently, though, most of this innovation was constrained by the limited availability of data.

With the introduction of Apache Hadoop, all of that has changed. Hadoop provides a platform for storing, managing, and processing large datasets inexpensively and at scale, which makes data science on large datasets practical. In this new world of large-scale advanced analytics, data science is a core competency that enables organizations to remain competitive and innovate beyond their traditional business models. ...

