ApplicationMaster failures

Recovering an application's state after an ApplicationMaster failure is the responsibility of the ApplicationMaster itself. When an ApplicationMaster fails, the ResourceManager simply starts another container with a new ApplicationMaster running in it, as a new application attempt. The new ApplicationMaster must recover the state of the previous one, which is possible only if ApplicationMasters persist their state to an external location where a later attempt can read it. Alternatively, an ApplicationMaster can skip recovery and rerun the application from scratch.
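The persist-then-recover pattern described above can be sketched in a few lines. This is a minimal illustration, not YARN's API: `CheckpointStore`, `run_attempt`, and the `fail_after` parameter are hypothetical names, and a local JSON file stands in for the external location (in practice this might be HDFS or another durable store).

```python
import json
import os
import tempfile


class CheckpointStore:
    """Hypothetical external store; a local JSON file stands in for durable storage."""

    def __init__(self, path):
        self.path = path

    def load(self):
        # A missing file means no earlier attempt saved any state.
        if not os.path.exists(self.path):
            return set()
        with open(self.path) as f:
            return set(json.load(f))

    def save(self, completed):
        # Write to a temporary file, then rename, so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(sorted(completed), f)
        os.replace(tmp, self.path)


def run_attempt(store, tasks, fail_after=None):
    """Run one application attempt and return the tasks executed in it.

    fail_after is an illustrative knob: the attempt raises once it has
    executed that many tasks, simulating an ApplicationMaster failure.
    """
    completed = store.load()  # recover whatever earlier attempts persisted
    executed = []
    for task in tasks:
        if task in completed:
            continue  # finished in a previous attempt, so skip it
        if fail_after is not None and len(executed) >= fail_after:
            raise RuntimeError("simulated ApplicationMaster failure")
        executed.append(task)  # stand-in for doing the real work
        completed.add(task)
        store.save(completed)  # persist progress after every task
    return executed


if __name__ == "__main__":
    store = CheckpointStore(os.path.join(tempfile.mkdtemp(), "am_state.json"))
    tasks = ["t1", "t2", "t3", "t4"]
    try:
        run_attempt(store, tasks, fail_after=2)  # first attempt fails partway
    except RuntimeError:
        pass
    print(run_attempt(store, tasks))  # second attempt runs only the remainder
```

Pointing a new attempt at an empty store corresponds to the other option mentioned above: nothing is recovered and every task runs again from scratch.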

For example, an ApplicationMaster can recover its completed ...
