Chapter 4. Writing basic MapReduce programs

This chapter covers

  • Patent data as an example data set to process with Hadoop
  • Skeleton of a MapReduce program
  • Basic MapReduce programs to count statistics
  • Hadoop’s Streaming API for writing MapReduce programs using scripting languages
  • Combiner to improve performance

The MapReduce programming model is unlike most programming models you may have learned, and it takes some time and practice to become familiar with it. To help develop your proficiency, we go through many example programs in the next couple of chapters. These examples illustrate various MapReduce programming techniques. By applying MapReduce in multiple ways, you’ll start to develop an intuition and a habit of “MapReduce thinking.” The examples ...
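
As a first taste of that way of thinking, here is a minimal sketch in plain Python of a counting job with a map step, an optional combiner, and a reduce step. It does not use Hadoop at all. The record format ("citing,cited"), the sample records, and every function name are assumptions made up for illustration, not the patent data set or the API this chapter goes on to describe.

```python
from collections import defaultdict

# Hypothetical "citing,cited" records, invented for illustration only.
# Each inner list stands in for one input split handled by one mapper.
SPLITS = [
    ["1,100", "2,100", "3,200"],
    ["4,100", "5,200", "6,300"],
]


def map_record(record):
    """Map: emit (cited_patent, 1) for one input record."""
    _citing, cited = record.split(",")
    yield cited, 1


def sum_counts(key, values):
    """Reduce: sum the counts for one key. Reused here as the combiner."""
    return key, sum(values)


def group_by_key(pairs):
    """Shuffle: group values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def run_job(splits, use_combiner=True):
    """Run map, optional combine, shuffle, and reduce over all splits."""
    shuffled = []  # pairs that would be sent from mappers to reducers
    for split in splits:
        pairs = [kv for record in split for kv in map_record(record)]
        if use_combiner:
            # Pre-aggregate each mapper's output before it is shuffled.
            pairs = [sum_counts(k, vs) for k, vs in group_by_key(pairs).items()]
        shuffled.extend(pairs)
    result = dict(sum_counts(k, vs) for k, vs in group_by_key(shuffled).items())
    return result, len(shuffled)


counts, shuffled_pairs = run_job(SPLITS)
print(counts, shuffled_pairs)
```

In this sketch the combiner is the same function as the reducer, which is safe only because addition is associative and commutative. Calling run_job(SPLITS, use_combiner=False) gives the same counts but passes more intermediate pairs through the shuffle, which is the cost a combiner is meant to reduce.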
