Chapter 1. Getting Started with Hadoop v2

In this chapter, we will cover the following recipes:

  • Setting up standalone Hadoop v2 on your local machine
  • Writing a WordCount MapReduce application, bundling it, and running it using Hadoop local mode
  • Adding a combiner step to the WordCount MapReduce program
  • Setting up HDFS
  • Setting up Hadoop YARN in a distributed cluster environment using Hadoop v2
  • Setting up the Hadoop ecosystem in a distributed cluster environment using a Hadoop distribution
  • HDFS command-line file operations
  • Running the WordCount program in a distributed cluster environment
  • Benchmarking HDFS using DFSIO
  • Benchmarking Hadoop MapReduce using TeraSort

Introduction

We are living in the era of big data, where the exponential growth of phenomena such as the web, ...
