Running Spark standalone

Spark can run in several deployment modes. To get started, we are going to look at how to install Apache Spark on a single, standalone machine.

Getting ready

To perform this recipe, download a Spark binary release from the download page at http://spark.apache.org/downloads.html. This recipe uses Apache Spark 1.6.0.

How to do it...

Apache Spark is a computation engine that ships with a built-in cluster manager; it can also run on external cluster managers such as YARN or Mesos. In this recipe, we are going to use the built-in standalone cluster manager provided by Spark:

  1. Copy the downloaded Spark binary to a desired location.
  2. Extract the tar ball:
    $ sudo tar -xzf spark-1.6.0-bin-hadoop2.6.tgz
    
  3. Rename the extracted directory to a version-independent name for easier access:
    $ sudo mv spark-1.6.0-bin-hadoop2.6 spark
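The extract-and-rename steps above can be sketched as a runnable script. To keep it self-contained, a local placeholder tarball with the same name stands in for the downloaded Spark binary (an assumption for illustration); with the real download, skip the placeholder block and run the `tar` and `mv` commands against the actual `.tgz` file:

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

# --- Placeholder standing in for the downloaded Spark binary (assumption,
# --- so the sketch runs anywhere without the ~300 MB download) -----------
mkdir spark-1.6.0-bin-hadoop2.6
echo "stub release marker" > spark-1.6.0-bin-hadoop2.6/RELEASE
tar -czf spark-1.6.0-bin-hadoop2.6.tgz spark-1.6.0-bin-hadoop2.6
rm -r spark-1.6.0-bin-hadoop2.6
# -------------------------------------------------------------------------

# Step 2: extract the tarball (on a real system you may need sudo,
# depending on the target directory's permissions)
tar -xzf spark-1.6.0-bin-hadoop2.6.tgz

# Step 3: rename the versioned directory to a stable, version-independent path
mv spark-1.6.0-bin-hadoop2.6 spark

# The Spark files now live under ./spark
ls spark
```

Renaming to a fixed path such as `spark` means scripts and environment variables (for example, `SPARK_HOME`) do not need updating when you upgrade to a newer Spark version.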
