Tuning block size to improve seek performance

HBase data is stored in StoreFiles, which use the HFile format. A StoreFile is composed of HFile blocks. The HFile block is the smallest unit of data that HBase reads from a StoreFile, and it is also the basic unit that the region server caches in the block cache.

The HFile block size is an important tuning parameter. For better performance, choose the block size based on the average Key/Value size and the disk I/O speed. Like the block cache and Bloom filter settings, the HFile block size is configurable at the column family level.
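
To make the relationship between block size and Key/Value size concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of HBase: the function name block_stats and the sample sizes are hypothetical, and the arithmetic ignores compression, data block encoding, and per-block overhead. The only HBase fact it relies on is the default block size of 64 KB.

```python
import math

DEFAULT_BLOCK_SIZE = 64 * 1024  # HBase's default HFile block size, in bytes


def block_stats(avg_kv_bytes, store_file_bytes, block_size=DEFAULT_BLOCK_SIZE):
    """Estimate (KeyValues per block, number of blocks) for one StoreFile.

    Hypothetical helper for illustration only; it assumes uncompressed,
    unencoded data and treats every block as exactly block_size bytes.
    """
    # At least one KeyValue always lands in a block, even an oversized one.
    kvs_per_block = max(1, block_size // avg_kv_bytes)
    # Each block needs roughly one entry in the HFile block index.
    num_blocks = math.ceil(store_file_bytes / block_size)
    return kvs_per_block, num_blocks


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for size_kb in (8, 64, 256):
        kvs, blocks = block_stats(
            avg_kv_bytes=100,          # assumed average Key/Value size
            store_file_bytes=one_gib,  # assumed StoreFile size
            block_size=size_kb * 1024,
        )
        print(f"{size_kb:>3} KB blocks: ~{kvs} KeyValues per block, ~{blocks} blocks")
```

Under these assumptions, a smaller block means a random read loads and scans fewer KeyValues to find one cell, at the cost of more blocks and therefore a larger block index. A larger block reverses that trade-off and tends to suit sequential scans better.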

In this recipe, we will describe how to show the average Key/Value size and how to tune the block size to improve seek performance.

Getting ready

Log in to your HBase client node.
