Hardware Failures

As the Hadoop and YARN frameworks use commodity hardware for the cluster setup and scaling from several nodes to several thousand nodes, all the components of Hadoop or YARN are designed on the assumption that hardware failures are very common. Therefore, these failures would be automatically handled by the framework so that important data is not lost because of them. For this, Hadoop provides data replication across the nodes/racks so that even if the whole rack fails, data would be recovered from another node on another rack, and jobs would be restarted over another replica dataset to compute the results.

Get YARN Essentials now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.