Foreword

Hadoop and data science have each been sought-after skill sets over the last five years. However, few publications have attempted to bring the two together by teaching data science within the Hadoop context. For practitioners looking for an introduction to data science, combined with solving data science problems at scale using Hadoop and related tools, this book will prove to be an excellent resource.

The book introduces data science through topics including data ingest, munging, feature extraction, machine learning, predictive modeling, anomaly detection, and natural language processing. The platforms of choice for the examples and implementation of these topics are Hadoop, Spark, and the other parts of the Hadoop ecosystem. ...

