See also

Documentation for randomSplit() is available at https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.api.java.JavaRDD@randomSplit(weights:Array%5BDouble%5D):Array%5Borg.apache.spark.api.java.JavaRDD%5BT%5D%5D.

randomSplit() is a method of the RDD. While the number of RDD methods can be overwhelming, mastering this Spark concept and API is a must.

The API signature is as follows:

def randomSplit(weights: Array[Double]): Array[JavaRDD[T]]

Randomly splits this RDD with the provided weights.
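
A minimal sketch of how the call might look in practice is shown here. It uses the Scala RDD API rather than the JavaRDD wrapper documented above, and the object name RandomSplitSketch, the helper trainTestSplit, the 80/20 weights, and the seed value 42 are all illustrative assumptions, not taken from the recipe. It also assumes Spark 2.x is on the classpath and runs in local mode.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object RandomSplitSketch {

  // Hypothetical helper: splits an RDD into two parts using 80/20 weights.
  // Passing an explicit seed makes the split repeatable across runs.
  def trainTestSplit(data: RDD[Int], seed: Long): (RDD[Int], RDD[Int]) = {
    val Array(training, test) = data.randomSplit(Array(0.8, 0.2), seed)
    (training, test)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would use cluster settings.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("RandomSplitSketch")
      .getOrCreate()

    val data = spark.sparkContext.parallelize(1 to 1000)
    val (training, test) = trainTestSplit(data, seed = 42L)

    // Each element is sampled independently, so the counts are only
    // approximately 800 and 200; together they cover all 1000 elements.
    println(s"training: ${training.count()}, test: ${test.count()}")

    spark.stop()
  }
}
```

The weights are normalized by Spark if they do not sum to 1, so Array(8.0, 2.0) would describe the same split. The resulting sizes are approximate rather than exact, which is worth keeping in mind when working with small datasets.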
