Understanding common EMR use cases

Many customers run HBase on HDFS for random access at massive scale. EMR now also supports HBase with its HFiles stored in the S3 object store, along with the ability to run a Read Replica HBase cluster in another Availability Zone (AZ). Shifting to S3 can cut your storage costs by 50% or more. Instead of sizing the cluster for HDFS capacity, you can size it for the processing power that the HBase RegionServers require. The S3 option is also good for load balancing and disaster recovery across AZs. Because S3 is available across a Region, you don’t have to replicate the data twice; that is, you don’t need two full HDFS clusters. Now you can set up a smaller cluster for the Read Replicas ...
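As a rough illustration of the S3-backed setup described above, the following Python sketch builds the configuration objects that an EMR cluster could be given to store HBase data in S3, optionally as a read replica. The helper name `build_hbase_s3_configurations` and the bucket path `s3://example-bucket/hbase-root` are hypothetical, and the classification and property names (`hbase.emr.storageMode`, `hbase.emr.readreplica.enabled`, `hbase.rootdir`) reflect the EMR documentation as best understood here, so check them against your EMR release before relying on them.

```python
import json


def build_hbase_s3_configurations(root_dir, read_replica=False):
    """Return EMR configuration objects for HBase on S3.

    Hypothetical helper for illustration only; it creates no AWS resources.
    """
    if not root_dir.startswith("s3://"):
        raise ValueError("root_dir must be an s3:// URI")

    # Tell EMR to keep HBase data in S3 instead of HDFS.
    hbase_props = {"hbase.emr.storageMode": "s3"}
    if read_replica:
        # A read-replica cluster points at the same root directory
        # as the primary cluster and serves read-only traffic.
        hbase_props["hbase.emr.readreplica.enabled"] = "true"

    return [
        {"Classification": "hbase", "Properties": hbase_props},
        {
            "Classification": "hbase-site",
            "Properties": {"hbase.rootdir": root_dir},
        },
    ]


if __name__ == "__main__":
    # Assumed bucket name; replace with your own.
    root = "s3://example-bucket/hbase-root"
    print(json.dumps(build_hbase_s3_configurations(root), indent=2))
    print(json.dumps(build_hbase_s3_configurations(root, read_replica=True), indent=2))
```

The resulting list could be saved as a JSON file and supplied when the cluster is created. Both the primary and the read-replica cluster use the same `hbase.rootdir`, which is why a second full copy of the data is not needed.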

Get Learning AWS - Second Edition now with the O’Reilly learning platform.
