Building the Uber JAR

The first step in deploying our Spark application on a cluster is to bundle it into a single Uber JAR, also known as the assembly JAR. In this recipe, we'll look at how to use the SBT assembly plugin to generate the assembly JAR. We'll be using this assembly JAR in subsequent recipes when we run Spark in distributed mode. Alternatively, we could add dependent JARs to the driver's classpath using the spark.driver.extraClassPath property (https://spark.apache.org/docs/1.3.1/configuration.html#runtime-environment). However, with a large number of dependent JARs, this quickly becomes inconvenient.
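For reference, using the classpath property instead of an assembly JAR would mean listing every dependency explicitly, along the lines of the following spark-defaults.conf entry (the JAR paths here are hypothetical, for illustration only). With more than a handful of dependencies, maintaining such a list is tedious compared to shipping one assembly JAR:

```
# conf/spark-defaults.conf -- hypothetical dependency paths, for illustration only
spark.driver.extraClassPath  /opt/deps/dep1.jar:/opt/deps/dep2.jar
```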

How to do it...

The goal is to build a single fat JAR that contains our Spark application along with all of its dependencies. Refer to the following screenshot, ...
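As a rough sketch of the pieces involved (the project name, version numbers, and merge rules below are assumptions and should be adapted to your own project), the sbt-assembly plugin is first registered in project/plugins.sbt:

```scala
// project/plugins.sbt
// Registers the sbt-assembly plugin; the version shown is illustrative
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")
```

The build definition then marks Spark itself as a provided dependency (the cluster already supplies the Spark classes, so they should not be bundled) and tells the plugin how to resolve files that appear in more than one dependency JAR:

```scala
// build.sbt -- a minimal sketch; adjust names and versions for your project
name := "spark-app"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // "provided": Spark's own classes are not packed into the assembly JAR
  "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
)

// Resolve duplicate files pulled in by transitive dependencies
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```

Running sbt assembly then writes the fat JAR to a path such as target/scala-2.10/spark-app-assembly-1.0.jar, which is the JAR we will hand to the cluster in the subsequent recipes.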
