Learning Hadoop 2 by Garry Turkington, Gabriele Modena

Summary

This chapter explored how to process the large volumes of data discussed in the previous chapter. In particular, we covered:

  • MapReduce, the only processing model available in Hadoop 1, and its conceptual model
  • The Java API to MapReduce, and how to use this to build some examples, from a word count to sentiment analysis of Twitter hashtags
  • How MapReduce is implemented in practice, including a walkthrough of the execution of a MapReduce job
  • How Hadoop stores data, and the classes that represent input and output formats, record readers, and record writers
  • The limitations of MapReduce that led to the development of YARN, opening the door to multiple computational models on the Hadoop platform
  • The YARN architecture ...
