  • Kai Zhang thinks this is interesting:

df.repartition(5)

From

Spark: The Definitive Guide

Note

After running this command, I ran df.rdd.getNumPartitions again, and the partition count was still 1, not 5.
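
A likely explanation for the result in this note is that Spark DataFrames are immutable: repartition returns a new DataFrame and leaves df untouched, so the result has to be assigned to a value before its partition count is checked. The sketch below illustrates this under a few assumptions: the local SparkSession settings, the sample data, and the name repartitioned are hypothetical stand-ins, not taken from the book.

```scala
import org.apache.spark.sql.SparkSession

object RepartitionExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("repartition-example")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical stand-in for the df used in the book.
    val df = spark.range(1000).toDF("id")

    // The returned DataFrame is discarded here, so df is unchanged.
    df.repartition(5)
    println(df.rdd.getNumPartitions) // same count as before the call

    // Keep the returned DataFrame and inspect that one instead.
    val repartitioned = df.repartition(5)
    println(repartitioned.rdd.getNumPartitions) // partition count of the new DataFrame

    spark.stop()
  }
}
```

With the result assigned, the second println should report 5, while the count for df stays whatever it was originally.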