Spark architecture

Let's have a look at Apache Spark architecture, including a high level overview and a brief description of some of the key software components.

High level overview

At the high level, Apache Spark application architecture consists of the following key software components and it is important to understand each one of them to get to grips with the intricacies of the framework:

  • Driver program
  • Master node
  • Worker node
  • Executor
  • Tasks
  • SparkContext
  • SQL context
  • Spark session

Here's an overview of how some of these software components fit together within the overall architecture:

High level overview

Figure 1.15: Apache Spark application architecture - Standalone mode ...

Get Learning Apache Spark 2 now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.