The Read pipeline

A read in HBase proceeds through the following steps:

  1. The client sends a read request. The RegionServer receives the request and identifies the Regions whose HFiles may hold the requested data.
  2. The MemStore of the Region is queried first; if the data is present there, the request is served from it.
  3. If the data is not in the MemStore, the BlockCache is checked; if it holds the data, the request is served from the cache.
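  4. If the data is not in the BlockCache either, it is read from the Region's HFiles on disk and returned to the client. The data that was read is then cached in the BlockCache so that later reads can be served from memory. (The MemStore is a write buffer and is not populated by reads.)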
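
The lookup order in the steps above can be sketched as a small toy model. This is only an illustration of the order in which the three stores are consulted, not HBase's actual implementation: the `Region` class, its `put` and `get` methods, and the dictionary-backed stores are all hypothetical names invented for this sketch, and it assumes that data read from disk is cached in the BlockCache only.

```python
class Region:
    """Toy model of one Region's read path (hypothetical, not HBase's real classes)."""

    def __init__(self, hfile_data):
        self.memstore = {}               # recent writes, held in memory
        self.block_cache = {}            # data recently read from HFiles
        self.hfiles = dict(hfile_data)   # stands in for the on-disk HFiles

    def put(self, row_key, value):
        # Writes land in the MemStore first.
        self.memstore[row_key] = value

    def get(self, row_key):
        """Return (value, source), where source names the store that served the read."""
        # Step 2: check the MemStore first.
        if row_key in self.memstore:
            return self.memstore[row_key], "memstore"
        # Step 3: then check the BlockCache.
        if row_key in self.block_cache:
            return self.block_cache[row_key], "blockcache"
        # Step 4: fall back to the HFiles and cache what was read.
        if row_key in self.hfiles:
            value = self.hfiles[row_key]
            self.block_cache[row_key] = value
            return value, "hfile"
        return None, "miss"


region = Region({"row1": "a"})
region.put("row2", "b")
print(region.get("row2"))  # served from the MemStore
print(region.get("row1"))  # first read goes to the HFiles
print(region.get("row1"))  # second read is served from the BlockCache
```

Reading `row1` twice shows the effect of step 4: the first read reaches the HFiles, and the second is answered from the cache without touching disk.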
