Sharing resources

In Hadoop 1, the only time resource sharing needed consideration was when choosing a scheduler for the MapReduce JobTracker. Since all jobs were eventually translated into MapReduce code, a resource-sharing policy at the MapReduce level was usually sufficient to manage cluster workloads overall.

Hadoop 2 and YARN changed this picture. As well as running many MapReduce jobs, a cluster might also be running many other applications atop other YARN ApplicationMasters. Tez and Spark, for example, are frameworks in their own right, and additional applications run atop the interfaces they provide.

If everything runs on YARN, then it provides ways of configuring the maximum resource allocation (in terms of CPU, memory, and soon ...
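As a minimal sketch of what such configuration looks like, the following yarn-site.xml fragment caps the resources any single container request may be granted; the property names are the standard YARN scheduler settings, though the values shown here are illustrative and would need tuning for a real cluster:

```xml
<!-- yarn-site.xml: illustrative values, not recommendations -->
<configuration>
  <!-- Largest memory allocation (MB) the scheduler will grant to one container -->
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <!-- Largest number of virtual cores grantable to one container -->
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>4</value>
  </property>
  <!-- Total memory (MB) each NodeManager offers to containers -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>32768</value>
  </property>
</configuration>
```

Requests exceeding the scheduler maximums are rejected, so these settings act as a cluster-wide ceiling regardless of which framework submitted the application.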
