HDInsight - a multi-node Hadoop cluster on Azure

In the Online Chapter, Pushing R Further (https://www.packtpub.com/sites/default/files/downloads/5396_6457OS_PushingRFurther.pdf), we briefly introduced HDInsight, a fully managed Apache Hadoop service that is part of the Microsoft Azure platform and is designed specifically for heavy data crunching. In this section, we will deploy a multi-node HDInsight cluster with R and RStudio Server installed, and then run a number of MapReduce jobs on smart electricity meter readings (~414,000,000 cases, four variables, ~12 GB in size) from the Energy Demand Research Project. The data is available to download from the UK Data Service's online Discover catalog at https://discover.ukdataservice.ac.uk/catalogue/?sn=7591 ...

