CHAPTER 14

image

Building Real-Time Systems Using HBase

So far in this book, you have studied MapReduce and its derivative APIs. You learned about batch processes that emphasize throughput (the amount of work done per unit of time) over latency (the response time). Hadoop jobs can take hours to run, but the amount of work done per unit time is phenomenal. Yet there are use-cases for which response time is important. When a Facebook user posts a comment, it is sent as a notification to all the friends. This function needs to happen in near–real time, and it would not be ideal to have to wait until the end of the day for a batch job to execute to get ...

Get Pro Apache Hadoop, Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.