Chapter 3. Understanding MapReduce

The previous two chapters discussed the problems that Hadoop allows us to solve and provided some hands-on experience of running example MapReduce jobs. With this foundation, we will now go a little deeper.

In this chapter we will be:

  • Understanding how key/value pairs are the basis of Hadoop tasks
  • Learning the various stages of a MapReduce job
  • Examining the workings of the map, reduce, and optional combiner stages in detail
  • Looking at the Java API for Hadoop and using it to develop some simple MapReduce jobs
  • Learning about Hadoop input and output
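
As a preview of the stages listed above, the following is a minimal sketch of a word count expressed as map, shuffle, and reduce steps over key/value pairs. It is plain Java that runs in a single process and does not use the Hadoop API; the class name MapReduceSketch and its methods are hypothetical names chosen for illustration, and the optional combiner stage is left out.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory illustration of the map -> shuffle -> reduce flow.
// This is NOT the Hadoop API; it only mimics the shape of the data.
public class MapReduceSketch {

    // Map stage: turn one line of input into a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle stage: group every value emitted by the map stage under its key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce stage: collapse the list of values for one key into a single total.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Run all three stages in order over the given input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(mapped).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {hadoop=1, hello=2, world=1}
        System.out.println(run(Arrays.asList("hello hadoop", "hello world")));
    }
}
```

In a real Hadoop job the framework performs the grouping step between map and reduce and distributes the work across a cluster; the sketch only shows how the data changes shape from one stage to the next.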

Key/value pairs

Since Chapter 1, What It's All About, we have been talking about operations that process and provide the output in terms of key/value pairs without explaining ...
