- Set up the data structures for the example:
val keyValuePairs = List(("north", 1), ("south", 2), ("east", 3), ("west", 4))
val keyValueCity1 = List(("north", "Madison"), ("south", "Miami"), ("east", "NYC"), ("west", "SanJose"))
val keyValueCity2 = List(("north", "Madison"), ("west", "SanJose"))
- Turn the lists into RDDs:
val keyValueRDD = spark.sparkContext.parallelize(keyValuePairs)
val keyValueCity1RDD = spark.sparkContext.parallelize(keyValueCity1)
val keyValueCity2RDD = spark.sparkContext.parallelize(keyValueCity2)
- We can access the keys and values of a pair RDD separately; each call returns a new RDD:
val keys = keyValueRDD.keys
val values = keyValueRDD.values
- We apply the mapValues() function to the pair RDDs to demonstrate the transformation. In this example ...