O'Reilly logo

Hadoop in Practice by Alex Holmes

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 8. Integrating R and Hadoop for statistics and more

 

This chapter covers
  • Integrating your R scripts with MapReduce and Streaming
  • Understanding Rhipe, RHadoop, and R + Streaming

 

R is a statistical programming language for performing data analysis and graphing the results. The capabilities of R[1] let you perform statistical and predictive analytics, data mining, and visualization functions on your data. Its breadth of coverage and applicability across a wide range of sectors (such as finance, life sciences, manufacturing, retail, and more) make it a popular tool.

1 R contains built-in as well as user-created packages which can be accessed via CRAN, its package distribution system; see (http://cran.r-project.org/web/packages/

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required