Using the filesystem

HBase depends on the Hadoop Distributed File System (HDFS).

HDFS fundamentally is a distributed file system, which relies on following core principles:

Getting ready

The following are the benefits of using HDFS:

  • It's designed to work as a fault-tolerant system and is rack aware.
  • It works on the low-cost commodity hardware.
  • HDFS relaxes core system POSIX requirements to facilitate streaming access to the underlying OS access of file system data.
  • It's designed to write once and read many times. It also supports parallel reading and processing the data (read, write, and append). It doesn't support random writes of data.
  • It's designed to scale at a very large level, which means file size like petabyte of data.
  • It works with minimum data ...

Get HBase High Performance Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.