This chapter goes deeper into the various design implications imposed by HBase’s storage architecture. It is important to have a good understanding of how to design tables, row keys, column names, and so on, to take full advantage of the architecture.
HBase has two fundamental key structures: the row key and the column key. Both can be used to convey meaning, by either the data they store, or by exploiting their sorting order. In the following sections, we will use these keys to solve commonly found problems when designing storage solutions.
The first concept to explain in more detail is the logical layout of a table, compared to on-disk storage. HBase’s main unit of separation within a table is the column family—not the actual columns as expected from a column-oriented database in their traditional sense. Figure 9-1 shows the fact that, although you store cells in a table format logically, in reality these rows are stored as linear sets of the actual cells, which in turn contain all the vital information inside them.
The top-left part of the figure shows the logical layout of your data—you have rows and columns. The columns are the typical HBase combination of a column family name and a column qualifier, forming the column key. The rows also have a row key so that you can address all columns in one logical row.
Figure 9-1. Rows stored as linear ...