Summary

YARN has opened up the Hadoop ecosystem to a wide range of applications. It has not only alleviated the scaling bottlenecks of traditional MapReduce-based Hadoop but also helped organizations use their infrastructure more efficiently. This was made possible by:

  • Separating out application-specific logic from resource management. The ResourceManager is solely responsible for cluster resource management and is agnostic of any application.
  • Providing common and generic abstractions for resource specifications. Resources are specified in terms of cores and memory.
  • Maintaining backward compatibility with existing Hadoop APIs. Existing Hadoop programs run on YARN after recompilation, without any code changes.
  • Providing a variety of pluggable ...
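
The first two points can be illustrated with a small sketch. This is a toy model in Python, not YARN's actual API: the names Resource, ToyResourceManager, and allocate are hypothetical and chosen only for illustration. It assumes a request is described purely by memory and cores, and that the manager decides on capacity alone, without knowing what kind of application is asking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical generic resource specification: memory (MB) and cores."""
    memory_mb: int
    vcores: int


class ToyResourceManager:
    """Tracks free capacity only; it has no knowledge of application logic."""

    def __init__(self, capacity: Resource) -> None:
        self.free = capacity

    def allocate(self, request: Resource) -> bool:
        """Grant the request if it fits in the remaining capacity."""
        fits = (request.memory_mb <= self.free.memory_mb
                and request.vcores <= self.free.vcores)
        if fits:
            # Deduct the granted resources from the free pool.
            self.free = Resource(self.free.memory_mb - request.memory_mb,
                                 self.free.vcores - request.vcores)
        return fits


# Any application, MapReduce or otherwise, asks in the same terms.
rm = ToyResourceManager(Resource(memory_mb=8192, vcores=4))
granted_first = rm.allocate(Resource(memory_mb=2048, vcores=1))
granted_second = rm.allocate(Resource(memory_mb=8192, vcores=1))
print(granted_first, granted_second, rm.free)
```

The second request is refused because only 6144 MB remain after the first grant. Nothing in the manager depends on the type of application making the request, which is the separation the first point describes.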
