Other Apache projects

Whether you use a bundled distribution or stick with the base Apache Hadoop download, you will encounter many references to other, related Apache projects. We have covered Hive, Sqoop, and Flume in this book; we'll now highlight some of the others.

Note that this coverage seeks to point out the highlights (from my perspective) and to give a taste of the wide range of projects available. As before, keep looking; new ones launch all the time.

HBase

Perhaps the most popular Apache Hadoop-related project that we didn't cover in this book is HBase; its homepage is at http://hbase.apache.org. Based on the BigTable model of data storage publicized by Google in an academic paper (sound familiar?), ...
