HBase with R

The Apache HBase database allows users to store and process non-relational data on top of HDFS. Inspired by Google's Bigtable, HBase is an open source, distributed, consistent, and scalable database that provides real-time read and write access to massive amounts of data. It is a columnar, or key-column-value, data store with no default schema; users can define the schema at any time.
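To make the key-column-value idea concrete, here is a minimal in-memory sketch in Python. It is a toy model only, not the HBase API and not the R tooling used in this tutorial; the names `table`, `put`, and `get`, the column names, and the sample values are all hypothetical and invented for illustration.

```python
from collections import defaultdict

# Toy model of a key-column-value store (an assumption-laden sketch,
# NOT the HBase API). Layout: row key -> "family:qualifier" -> value.
table = defaultdict(dict)

def put(row_key, column, value):
    """Store a value under a row key and a column name."""
    table[row_key][column] = value

def get(row_key, column=None):
    """Return one cell, or the whole row as a dict when no column is given."""
    row = table.get(row_key, {})
    return dict(row) if column is None else row.get(column)

# Rows need not share the same columns: no fixed schema is declared up front.
put("row1", "price:amount", "250000")
put("row1", "property:town", "LEEDS")
put("row2", "price:amount", "180000")
put("row2", "property:type", "flat")

print(get("row1"))                    # all cells stored for row1
print(get("row2", "property:town"))   # a missing cell is simply absent
```

In this sketch, each row carries only the columns that were actually written to it, which is roughly what the absence of a default schema means in practice.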

The following tutorial walks through the essential steps for importing our previously used Land Registry Price Paid Data into the HBase store on a Microsoft Azure HDInsight cluster and then retrieving specific slices of the data using RStudio Server.

Azure HDInsight with HBase and RStudio Server

The process of launching ...
