Chapter 8. Failures in YARN

Dealing with failures in distributed systems is challenging and time consuming. Hadoop and YARN run on commodity hardware, and cluster sizes today can vary from several nodes to several thousand nodes. Handling failure scenarios and ever-growing scaling issues is therefore very important. This chapter focuses on failures in the YARN framework: their causes and how to overcome them.

In this chapter, we will cover the following topics:

ResourceManager failures
ApplicationMaster failures
NodeManager failures
Container failures
Hardware failures

For each of these, we will examine the root causes of the failures and the solutions to them.

ResourceManager failures ...

YARN Essentials by Amol Fasale, Nirmal Kumar
