Precreating regions using your own algorithm

When we create a table in HBase, the table starts with a single region, and all data inserted into the table goes to that region. As the data keeps growing and the size of the region reaches a threshold, region splitting occurs: the single region is split into two halves so that the table can handle more data.
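To get a feel for this default behavior under uniform writes, here is a toy model in Python. It is only a sketch under stated assumptions, not HBase's actual split policy: the function name `simulate_splits`, the abstract "units" of key space and data, and the threshold value are all hypothetical and chosen to keep the arithmetic simple.

```python
def simulate_splits(steps, keyspace=64, threshold=640):
    """Toy model of threshold-based region splitting (not HBase code).

    Each region is a (width, size) pair: `width` is how much of the key
    space it covers, `size` is how much data it holds. All values are
    hypothetical units chosen for illustration.
    """
    regions = [(keyspace, 0)]   # a new table starts with a single region
    splits_per_step = {}        # step number -> number of splits in that step

    for step in range(1, steps + 1):
        next_regions = []
        for width, size in regions:
            # Uniform writes: every key receives one unit of data per step,
            # so a region grows in proportion to the key range it covers.
            size += width
            if size >= threshold and width > 1:
                # Split the region into two halves of its key range,
                # dividing the stored data between them.
                half = width // 2
                next_regions.append((half, size // 2))
                next_regions.append((width - half, size - size // 2))
                splits_per_step[step] = splits_per_step.get(step, 0) + 1
            else:
                next_regions.append((width, size))
        regions = next_regions

    return regions, splits_per_step


if __name__ == "__main__":
    regions, splits = simulate_splits(80)
    print(len(regions))  # 16
    print(splits)        # {10: 1, 20: 2, 40: 4, 80: 8}
```

In this model the regions all have the same size at any moment, so they all cross the threshold in the same step: one split, then two at once, then four, then eight. Real clusters are less regular, but the same clustering of splits is what uniform growth tends to produce.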

In a write-heavy HBase cluster, this default behavior causes several problems:

  • The split/compaction storm issue.

    As data grows uniformly, most of the regions are split at the same time, which causes heavy disk I/O and network traffic.

  • Load is not well balanced until enough regions have been split.

    Especially right after the table is created, all requests go to the same region server where ...
