Chapter 7. Apache Sentry (Incubating)

Over the lifetime of the various Hadoop ecosystem projects, secure authorization has been added in a variety of ways. It has become increasingly challenging for administrators to implement and maintain a common system of authorization across multiple components. To compound the problem, the various components have different levels of granularity and enforcement of authorization controls, which often leaves an administrator confused as to what a given user can actually do (or not do) in the Hadoop environment. These issues, and many others, were the driving force behind the proposal for Apache Sentry (Incubating).

The Sentry proposal identified a need for fine-grained role-based access controls (RBAC) to give administrators more flexibility to control what users can access. As covered earlier, HDFS authorization controls have traditionally been limited to simple POSIX-style permissions and extended ACLs. What about frameworks that work on top of HDFS, such as Hive, Cloudera Impala, Solr, HBase, and others? Sentry’s goal is to implement authorization for Hadoop ecosystem components in a unified way, so that security administrators can easily control what users and groups have access to without needing to know the ins and outs of every single component in the Hadoop stack.
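To make the idea of role-based access control concrete, the following is a minimal, generic sketch in Python. It is not Sentry's API or policy format. The class names, the group and role names, and the resource strings are all hypothetical and chosen purely for illustration. The sketch assumes a common RBAC arrangement in which privileges are granted to roles, roles are granted to groups, and a user is allowed an action if any of the user's groups holds a role that carries the matching privilege.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Privilege:
    """A single permission: an action on a resource (both hypothetical strings)."""
    resource: str  # e.g. "db=sales/table=orders" (illustrative format only)
    action: str    # e.g. "select"


@dataclass
class RbacPolicy:
    """Generic RBAC store: groups hold roles, roles hold privileges."""
    group_roles: Dict[str, Set[str]] = field(default_factory=dict)
    role_privileges: Dict[str, Set[Privilege]] = field(default_factory=dict)

    def grant_role(self, group: str, role: str) -> None:
        # Attach a role to a group.
        self.group_roles.setdefault(group, set()).add(role)

    def grant_privilege(self, role: str, privilege: Privilege) -> None:
        # Attach a privilege to a role.
        self.role_privileges.setdefault(role, set()).add(privilege)

    def is_allowed(self, user_groups: Iterable[str], resource: str, action: str) -> bool:
        # Allow the request if any role of any of the user's groups
        # carries an exactly matching privilege.
        wanted = Privilege(resource, action)
        for group in user_groups:
            for role in self.group_roles.get(group, set()):
                if wanted in self.role_privileges.get(role, set()):
                    return True
        return False


# Hypothetical example: analysts may read one table and nothing else.
policy = RbacPolicy()
policy.grant_privilege("sales_reader", Privilege("db=sales/table=orders", "select"))
policy.grant_role("analysts", "sales_reader")

can_read = policy.is_allowed(["analysts"], "db=sales/table=orders", "select")
can_write = policy.is_allowed(["analysts"], "db=sales/table=orders", "insert")
outsider = policy.is_allowed(["interns"], "db=sales/table=orders", "select")
```

The appeal of this arrangement for an administrator is that a change to a role's privileges takes effect for every group holding that role, rather than having to be repeated per user or per component.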

Sentry Concepts

Each component that leverages Sentry for authorization must have a Sentry binding. The binding is a plug-in that the component uses to delegate authorization ...
