Spark configuration

Configuration can be applied to Spark in the following ways:

  • Spark properties control application-level settings such as execution behavior, memory management, dynamic allocation, scheduling, and security. They can be defined in the following order of precedence:
    • Via a programmatic SparkConf configuration object defined in your driver program (see the sketch after this list)
    • Via command-line arguments passed to spark-submit or spark-shell
    • Via default options set in conf/spark-defaults.conf
  • Environment variables control per-machine settings, such as the IP address of each worker node, and can be defined in conf/spark-env.sh.
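For example, the first (highest-precedence) option can be sketched in PySpark by building a SparkConf in the driver program. The application name, master URL, and property values below are illustrative only; choose settings appropriate to your own cluster.

```python
# Minimal sketch: defining Spark properties programmatically via SparkConf.
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = (
    SparkConf()
    .setAppName("spark-config-demo")                   # application-level setting (illustrative name)
    .setMaster("local[*]")                             # illustrative: run locally on all cores
    .set("spark.executor.memory", "2g")                # memory management
    .set("spark.dynamicAllocation.enabled", "false")   # dynamic allocation
)

spark = SparkSession.builder.config(conf=conf).getOrCreate()

# Verify that the property was picked up by the running application
print(spark.sparkContext.getConf().get("spark.executor.memory"))

spark.stop()
```

The same properties could equivalently be passed on the command line (for example, spark-submit --conf spark.executor.memory=2g ...) or placed in conf/spark-defaults.conf; values set programmatically in SparkConf take precedence over both, per the ordering above.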
