Chapter 4. Hadoop and MapReduce Framework for R

In this chapter, we enter the diverse world of Big Data tools and applications that can be integrated relatively easily with the R language. We will present a set of guidelines and tips on the following topics:

  • Deploying cloud-based virtual machines with Hadoop, the ready-to-use Hadoop Distributed File System (HDFS), and MapReduce frameworks
  • Configuring your instance/virtual machine to include essential libraries and useful supplementary tools for data management in HDFS
  • Managing HDFS using shell/Terminal commands and running a simple MapReduce word count in Java for comparison
  • Integrating the R statistical environment with Hadoop on a single-node cluster
  • Managing files in ...
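
To give a feel for the word count mentioned in the list above, the following is a minimal in-memory sketch of the three MapReduce phases (map, shuffle, reduce) in plain Java. It is an illustration only and does not use the Hadoop API; the class name `WordCountSketch` and its methods are hypothetical names chosen for this example, and the tokenization rule (lowercasing and splitting on non-word characters) is an assumption.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of the map, shuffle, and reduce phases.
// This does NOT use the Hadoop API; all names here are hypothetical.
final class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every token in one line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        // Assumption: words are lowercased and split on non-word characters.
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key, sorted by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum all values collected for a single key.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Runs the three phases in sequence over a list of input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {and=1, big=2, data=2, hadoop=1, r=1, with=1}
        System.out.println(wordCount(List.of("big data with R", "Big Data and Hadoop")));
    }
}
```

In a real Hadoop job the map and reduce steps would run as separate tasks across the cluster and the framework would perform the shuffle; here everything runs in a single process purely to show the data flow.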
