This chapter explored how to process the large volumes of data discussed in the previous chapter. In particular, we covered:
- The conceptual model of MapReduce, the only processing model available in Hadoop 1
- The Java API to MapReduce, and how to use it to build examples ranging from a word count to sentiment analysis of Twitter hashtags
- How MapReduce is implemented in practice, including a walkthrough of the execution of a MapReduce job
- How Hadoop stores data, and the classes that represent input and output formats and record readers and writers
- The limitations of MapReduce that led to the development of YARN, opening the door to multiple computational models on the Hadoop platform
- The YARN architecture ...
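The map/shuffle/reduce flow recapped above can be illustrated with a minimal word count in plain Java. This is a conceptual sketch of the three phases, not the Hadoop API; the class and method names are illustrative only:

```java
import java.util.*;
import java.util.stream.*;

// Conceptual word count: map emits (word, 1) pairs, shuffle groups them
// by key, and reduce sums the values for each key.
public class WordCountSketch {

    // Map phase: split a line of input into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\W+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Shuffle phase: group all intermediate values by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    // Reduce phase: sum the grouped values to get a count per word.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        grouped.forEach((word, ones) ->
                counts.put(word, ones.stream().mapToInt(Integer::intValue).sum()));
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("the quick brown fox", "the lazy dog");
        List<Map.Entry<String, Integer>> pairs = input.stream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.toList());
        System.out.println(reduce(shuffle(pairs)));
        // Prints: {brown=1, dog=1, fox=1, lazy=1, quick=1, the=2}
    }
}
```

In the real framework the shuffle is performed by Hadoop itself between the map and reduce tasks, and the map and reduce functions are the only parts the programmer writes.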