Chapter 3. Distributed HBase, HDFS, and MapReduce

 

This chapter covers
  • HBase as a distributed storage system
  • When to use MapReduce instead of the key-value API
  • MapReduce concepts and workflow
  • How to write MapReduce applications with HBase
  • How to use HBase for map-side joins in MapReduce
  • Examples of using HBase with MapReduce

 

As you’ve realized, HBase is built on Apache Hadoop. What may not yet be clear is why. More important, what benefits do we, as application developers, gain from this relationship? HBase depends on Hadoop for two separate concerns. Hadoop MapReduce provides a distributed computation framework for high-throughput data access. The Hadoop Distributed File System (HDFS) gives HBase a storage layer providing availability ...
