Chapter 7. Implementation of an Underlying Storage Engine

In the previous chapter, we described how Omneo uses different Hadoop technologies to implement its use case. In this chapter, we look more closely at the parts that involve HBase. We will not discuss every implementation detail, but we will cover the tools and examples you need to understand what matters during this phase.

As with any HBase project, the first consideration is the table schema, which is the most important part of the design. Designing an HBase schema can be straightforward but, depending on the use case, it can also be quite complex and require significant planning and testing. It is good practice to always start with this task, keeping in mind how data is received from your application (write path) and how you will need to retrieve it (read path). Read and write access patterns will dictate most of the table design.
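To make the link between access patterns and key design concrete, here is a minimal sketch in plain Python (no HBase client involved). It is not the schema used by Omneo; the key layout, the `NUM_BUCKETS` value, and the function names `make_row_key` and `scan_prefix` are all hypothetical. It assumes a write path that receives timestamped records per entity and a read path that wants the most recent records for one entity first.

```python
import hashlib
import struct

MAX_TS = 2**63 - 1   # largest signed 64-bit value, used to reverse timestamps
NUM_BUCKETS = 16     # hypothetical number of salt buckets


def _salt(entity_id: str) -> int:
    """Derive a stable bucket number from the entity id."""
    digest = hashlib.md5(entity_id.encode("utf-8")).digest()
    return digest[0] % NUM_BUCKETS


def scan_prefix(entity_id: str) -> bytes:
    """Read path: the key prefix shared by every row of one entity."""
    return bytes([_salt(entity_id)]) + entity_id.encode("utf-8") + b"\x00"


def make_row_key(entity_id: str, ts_millis: int) -> bytes:
    """Write path: prefix plus a reversed, big-endian timestamp.

    Reversing the timestamp makes newer rows sort before older ones
    when keys are compared byte by byte.
    """
    reverse_ts = MAX_TS - ts_millis
    return scan_prefix(entity_id) + struct.pack(">q", reverse_ts)


if __name__ == "__main__":
    older = make_row_key("part-42", 1_000)
    newer = make_row_key("part-42", 2_000)
    # Sorting the keys as bytes places the newer record first.
    print(sorted([older, newer])[0] == newer)
```

Because the salt is derived from the entity id rather than chosen at random, the read path can recompute the same prefix and retrieve one entity with a single prefix scan, while writes for different entities are spread over several key ranges. A different read pattern, such as scanning all entities by time range, would call for a different key layout.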

Table Design

As we have said, and will continue to say throughout the book, table design is one of the most important parts of your project. The table schema, the key you choose, and the parameters you configure will all have an impact not only on the performance of your application but also on its consistency. This is why, for all the use cases we describe, we are going to spend a lot of time on the table’s design. After your application is running for weeks and storing terabytes of data, moving ...
