A single-node Hadoop in Cloud

By now, you should understand what you can achieve by running MapReduce jobs in Hadoop or by using other Hadoop components. In this chapter, we will put theory into practice.

We will begin by creating a Linux-based virtual machine with a pre-installed Hortonworks distribution of Hadoop through Microsoft Azure. We opt for a pre-installed, ready-to-use Hadoop because this book is not about Hadoop per se, and because we want you to start implementing MapReduce jobs in the R language as soon as possible.

Once you have your Hadoop virtual machine configured and prepared for Big Data crunching, we will present you with a simple word count example initially carried out ...
