HBase rowkey design

Rowkey design is a very crucial part of HBase table design. During key design, proper care must be taken to avoid hotspotting. In case of poorly designed keys, all of the data will be ingested into just a few nodes, leading to cluster imbalance. Then, all the reads have to be pointed to those few nodes, resulting in slower data reads. We have to design a key that will help load data equally to all nodes of the cluster. Hotspotting can be avoided by the following techniques:

  • Key salting: It means adding an arbitrary value at the beginning of the key to make sure that the rows are distributed equally among all the table regions. Examples are aa-customer_id, bb-customer_id, and so on.
  • Key hashing: The key can be hashed and ...

Get Modern Big Data Processing with Hadoop now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.