SparkR

SparkR is an interactive CLI, bundled with Spark, that provides an R interface for processing large amounts of data, whether from a single source or aggregated from multiple sources. It is the statistician's CLI for interacting with data. Because R is a statistician's language, SparkR is a little more complicated than its Python counterpart, owing to the limitations and architecture of R.

SparkR can be found in the bin directory of the binary installation. It also supports running in local or pseudo mode; depending on the mode, the master and worker web UIs may or may not be available, but the application web UI is accessible either way. For further information, refer to the SparkR docs, R on Spark: https://spark.apache.org/docs/latest/sparkr.html ...
