Benchmarking HDFS using DFSIO

Hadoop contains several benchmarks that you can use to verify whether your HDFS cluster is set up properly and performs as expected. DFSIO is a benchmark test that comes with Hadoop, which can be used to analyze the I/O performance of an HDFS cluster. This recipe shows how to use DFSIO to benchmark the read/write performance of an HDFS cluster.

Getting ready

You must set up and deploy HDFS and Hadoop v2 YARN MapReduce prior to running these benchmarks. Locate the hadoop-mapreduce-client-jobclient-*-tests.jar file in your Hadoop installation.

How to do it...

The following steps will show you how to run the write and read DFSIO performance benchmarks:

  1. Execute the following command to run the HDFS write performance benchmark. ...

Get Hadoop MapReduce v2 Cookbook - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.