Chapter 10. HDFS Federation

In the initial versions of Hadoop, the NameNode component of HDFS was a single point of failure. Later versions added a secondary NameNode, which periodically checkpoints the primary NameNode's metadata but does not act as a failover standby. Until Hadoop 2.X, the NameNode could handle only a single namespace, which limited scalability and made tenants difficult to isolate in a multitenant HDFS environment. Scalability and isolation were the two most sought-after requirements for enterprise Hadoop deployments, because most organizations shared infrastructure among teams with varying availability and authorization requirements.

HDFS Federation is a feature that enables Hadoop to have multiple namespaces, making it better suited to shared-cluster scenarios. ...
