Technology and implementation options for scaling-up Machine learning

In this section, we will explore some parallel programming techniques and distributed platform options that Machine learning implementations can adopt. The Hadoop platform will be introduced in the next chapter, and we will look into some practical examples starting from Chapter 3, An Introduction to Hadoop's Architecture and Ecosystem with some real-world examples.

MapReduce programming paradigm

MapReduce is a parallel programming paradigm that abstracts the parallelizing computing and data complexities in a distributed computing environment. It works on the concept of taking the compute function to the data rather than taking the data to the compute function.

MapReduce is more ...

Get Practical Machine Learning now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.