Hour 12. Advanced Spark Programming

What You’ll Learn in This Hour:

Shared variables in Spark: broadcast variables and accumulators

Partitioning and repartitioning of Spark RDDs

Processing RDD data with external programs

In this hour, I will cover the additional programming tools at your disposal with the Spark API, including broadcast variables and accumulators as shared variables across different workers. I will also dive deeper into the important ...

