In this section, we discuss strategies for optimizing the performance of our Spark jobs, along with a few best practices for Spark and Spark SQL.
Performance tuning is a subjective and open-ended topic. The very first step is to answer the question, "Do we really need to tune the performance of our jobs?" Before we answer this question, we need to consider the following aspects:
If so, then there is no need for performance tuning.
For example, expecting all Spark jobs (irrespective of data size or computations performed) to be completed in milliseconds is unrealistic. ...