Understanding where YARN fits into Hadoop

Referring to the Hadoop 1.x architecture in the first figure of this chapter, we can see that the responsibilities of the JobTracker mainly included the following:

  • Managing the computational resources in terms of map and reduce slots
  • Scheduling submitted jobs
  • Monitoring the execution of tasks on the TaskTrackers
  • Restarting failed tasks
  • Performing speculative execution of tasks
  • Calculating the job counters

Clearly, the JobTracker alone handled all of these tasks and was overloaded with work.
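
To make this concentration of duties concrete, here is a minimal toy sketch in Python. It is not the Hadoop API: the class `ToyJobTracker` and its methods `submit`, `schedule`, and `report` are hypothetical names invented for illustration, and speculative execution is left out for brevity. It only shows how slot accounting, scheduling, monitoring, task restarts, and counter aggregation all end up in one component.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJobTracker:
    """Toy stand-in for a Hadoop 1.x JobTracker; not the real Hadoop API."""
    map_slots: int                                 # free map slots in the cluster
    reduce_slots: int                              # free reduce slots in the cluster
    pending: list = field(default_factory=list)    # (task_id, kind) awaiting a slot
    running: dict = field(default_factory=dict)    # task_id -> kind
    counters: dict = field(default_factory=dict)   # aggregated job counters

    def submit(self, task_id, kind):
        """Scheduling: queue a task whose kind is 'map' or 'reduce'."""
        self.pending.append((task_id, kind))

    def schedule(self):
        """Resource management: give each pending task a slot of its own kind."""
        still_pending = []
        for task_id, kind in self.pending:
            attr = f"{kind}_slots"
            if getattr(self, attr) > 0:
                setattr(self, attr, getattr(self, attr) - 1)
                self.running[task_id] = kind
            else:
                still_pending.append((task_id, kind))
        self.pending = still_pending

    def report(self, task_id, succeeded, records=0):
        """Monitoring: free the slot, update counters, requeue failed tasks."""
        kind = self.running.pop(task_id)
        attr = f"{kind}_slots"
        setattr(self, attr, getattr(self, attr) + 1)
        if succeeded:
            self.counters["records"] = self.counters.get("records", 0) + records
        else:
            self.pending.append((task_id, kind))  # restart the failed task


tracker = ToyJobTracker(map_slots=2, reduce_slots=1)
for i in range(3):
    tracker.submit(f"m{i}", "map")
tracker.schedule()
# Two map tasks run; the third waits even though a reduce slot is idle.
tracker.report("m0", succeeded=False)
tracker.report("m1", succeeded=True, records=100)
tracker.schedule()
```

Even in this small model, a single object owns every decision, and a map task cannot borrow the idle reduce slot, because resources are counted per slot type.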

This overloading led to the redesign of the JobTracker. YARN reduces its responsibilities in the following ways:

  • Cluster resource management and Scheduling responsibilities were moved ...
