There's more...

A Spark session has numerous parameters and APIs that can be set and exercised, but it is worth consulting the Spark documentation, since some of the methods/parameters are marked Experimental or have no status at all (at least 15 such cases as of our last examination).

Another change to be aware of is the table location setting: Spark 2.0 uses spark.sql.warehouse.dir, rather than hive.metastore.warehouse.dir, to set the warehouse location where tables are stored. The default value of spark.sql.warehouse.dir is System.getProperty("user.dir").
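As a minimal sketch, the warehouse location can be set when building the session; the path, master, and application name below are illustrative placeholders only, and running this requires a Spark 2.x environment:

```scala
import org.apache.spark.sql.SparkSession

// Set spark.sql.warehouse.dir (Spark 2.0+) instead of the older
// hive.metastore.warehouse.dir; the path here is just an example.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("WarehouseDirExample")
  .config("spark.sql.warehouse.dir", "/tmp/spark-warehouse")
  .getOrCreate()

// Verify the effective setting on the running session
println(spark.conf.get("spark.sql.warehouse.dir"))

spark.stop()
```

If the property is not set explicitly, the session falls back to the default described above.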

Also see spark-defaults.conf, where such properties can be set cluster-wide, for more details.
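For reference, the same property can be placed in spark-defaults.conf so it applies to every application submitted to the cluster; the values below are examples only, not required settings:

```
# Illustrative spark-defaults.conf entries (example values)
spark.sql.warehouse.dir    /tmp/spark-warehouse
spark.master               local[*]
```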

Also noteworthy are the following:

  • Some of our favorite and interesting APIs from the Spark ...
