In this recipe, we introduce the key-value pair RDD and its join operations, join(), leftOuterJoin(), rightOuterJoin(), and fullOuterJoin(), as an alternative to the more traditional and more expensive set operations available through the RDD API, such as intersection(), union(), subtract(), distinct(), and cartesian().
We'll demonstrate join(), leftOuterJoin(), rightOuterJoin(), and fullOuterJoin() to show the power and flexibility of key-value pair RDDs.
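To make the difference between the join variants concrete, here is a minimal sketch of the inner, left outer, and right outer joins. The names keyValueRDD and keyValueCity2RDD are borrowed from the recipe's code, but the sample data, the local master setting, and the JoinSketch object name are assumptions made purely for illustration.

```scala
import org.apache.spark.sql.SparkSession

object JoinSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("JoinSketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical sample data; the recipe's actual contents may differ.
    val keyValueRDD = sc.parallelize(
      Seq(("north", 1), ("south", 2), ("east", 3), ("west", 4)))
    val keyValueCity2RDD = sc.parallelize(
      Seq(("north", "Madison"), ("west", "SanJose")))

    // Inner join: keeps only keys present in both RDDs.
    // Element type: (String, (Int, String))
    val joinedRDD = keyValueRDD.join(keyValueCity2RDD)

    // Left outer join: keeps every key from the left RDD.
    // Element type: (String, (Int, Option[String]))
    val leftJoinedRDD = keyValueRDD.leftOuterJoin(keyValueCity2RDD)

    // Right outer join: keeps every key from the right RDD.
    // Element type: (String, (Option[Int], String))
    val rightJoinedRDD = keyValueRDD.rightOuterJoin(keyValueCity2RDD)

    println("Joined RDD = ")
    joinedRDD.collect().foreach(println(_))
    println("Left Joined RDD = ")
    leftJoinedRDD.collect().foreach(println(_))
    println("Right Joined RDD = ")
    rightJoinedRDD.collect().foreach(println(_))

    spark.stop()
  }
}
```

The order of rows returned by collect() is not guaranteed, so sort by key first if you need stable output.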
println("Full Joined RDD = ")
val fullJoinedRDD = keyValueRDD.fullOuterJoin(keyValueCity2RDD)
fullJoinedRDD.collect().foreach(println(_))
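Because fullOuterJoin() keeps keys from both sides, each value in the result is a pair of Option values, and downstream code usually has to unpack them. The following sketch shows one way to do that with pattern matching. The describe helper, the FullJoinFormat object, and the sample data are hypothetical and not part of the recipe.

```scala
import org.apache.spark.sql.SparkSession

object FullJoinFormat {
  // Hypothetical helper: turns one full-outer-join row into a readable string.
  def describe(row: (String, (Option[Int], Option[String]))): String = row match {
    case (key, (Some(n), Some(city))) => s"$key: value=$n, city=$city"
    case (key, (Some(n), None))       => s"$key: value=$n, no city"
    case (key, (None, Some(city)))    => s"$key: no value, city=$city"
    // A full outer join never emits a key missing from both sides;
    // this case only keeps the match exhaustive.
    case (key, (None, None))          => s"$key: empty"
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("FullJoinFormat")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical sample data; each side has a key the other lacks.
    val keyValueRDD = sc.parallelize(Seq(("north", 1), ("south", 2)))
    val keyValueCity2RDD = sc.parallelize(
      Seq(("north", "Madison"), ("west", "SanJose")))

    // Element type: (String, (Option[Int], Option[String]))
    val fullJoinedRDD = keyValueRDD.fullOuterJoin(keyValueCity2RDD)

    // Format on the driver after collecting the (small) result.
    fullJoinedRDD.collect().map(describe).foreach(println)

    spark.stop()
  }
}
```

Matching on the Option pair makes the missing-side cases explicit instead of relying on the printed Some(...) and None wrappers.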