Chapter 3. Hadoop Essentials – Configurations, Unit Tests, and Other APIs

In this chapter, we will cover:

  • Optimizing Hadoop YARN and MapReduce configurations for cluster deployments
  • Shared user Hadoop clusters – using Fair and Capacity schedulers
  • Setting classpath precedence to user-provided JARs
  • Speculative execution of straggling tasks
  • Unit testing Hadoop MapReduce applications using MRUnit
  • Integration testing Hadoop MapReduce applications using MiniYARNCluster
  • Adding a new DataNode
  • Decommissioning DataNodes
  • Using multiple disks/volumes and limiting HDFS disk usage
  • Setting the HDFS block size
  • Setting the file replication factor
  • Using the HDFS Java API

Introduction

This chapter describes how to perform advanced administration steps in your Hadoop cluster, ...
