MRv1 versus MRv2

MRv1 (MapReduce version 1) is part of Apache Hadoop 1.x and is an implementation of the MapReduce programming paradigm.

The MapReduce project itself can be broken into the following parts:

  • End-user MapReduce API: This is the API used to develop MapReduce applications.
  • MapReduce framework: This is the runtime implementation of various phases, such as the map phase, the sort/shuffle/merge aggregation phase, and the reduce phase.
  • MapReduce system: This is the backend infrastructure required to run MapReduce applications, including cluster resource management and job scheduling.
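
The split between the first two parts can be illustrated with a toy, single-process sketch. This is not Hadoop's API: the names wc_map, wc_reduce, and run_job are hypothetical and exist only for this example. The two small functions stand in for what an application author writes against the end-user API, while run_job stands in for the framework that drives the map, sort/shuffle/merge, and reduce phases. The third part, the MapReduce system (resource management and scheduling), is deliberately not modeled here.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator, List, Tuple

# --- "End-user API" layer (hypothetical): code the application author writes ---

def wc_map(_key: int, line: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1

def wc_reduce(word: str, counts: Iterable[int]) -> Iterator[Tuple[str, int]]:
    """Sum all counts that were grouped under one word."""
    yield word, sum(counts)

# --- "Framework" layer (hypothetical): runs the phases around the user code ---

def run_job(
    records: Iterable[Tuple[int, str]],
    map_fn: Callable[[int, str], Iterable[Tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], Iterable[Tuple[str, int]]],
) -> List[Tuple[str, int]]:
    # Map phase: apply the user's map function to every input record.
    intermediate: List[Tuple[str, int]] = []
    for key, value in records:
        intermediate.extend(map_fn(key, value))

    # Sort/shuffle/merge phase: group intermediate values by key.
    groups = defaultdict(list)
    for k, v in intermediate:
        groups[k].append(v)

    # Reduce phase: call the user's reduce function once per key, in key order.
    output: List[Tuple[str, int]] = []
    for k in sorted(groups):
        output.extend(reduce_fn(k, groups[k]))
    return output

lines = ["YARN runs MapReduce", "MapReduce runs on YARN"]
result = run_job(enumerate(lines), wc_map, wc_reduce)
print(result)
```

In this sketch the application code never touches grouping or ordering; it only supplies the two functions. In a real cluster the framework layer would also have to move intermediate data between machines, and the system layer would decide where and when each task runs.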

Hadoop 1.x was written solely as an MR engine. Since it runs on a cluster, its cluster management component was also tightly ...
