Transforming RDDs with Spark 2.0 using the filter() API

In this recipe, we explore the filter() method of RDD, which selects a subset of the base RDD and returns a new, filtered RDD. Its usage resembles map(), but instead of transforming each element, the function passed in is a predicate that returns true or false, and only the elements for which it returns true are included in the resulting RDD.
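As a minimal sketch of the idea (not the recipe's own listing), the following Scala program keeps only the even numbers of a small RDD. The object name FilterSketch, the local[*] master setting, and the sample data are assumptions chosen for illustration; the program also assumes the Spark 2.x libraries are on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; the name and data are illustrative only.
object FilterSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally, using all available cores.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("FilterSketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Base RDD holding the integers 1 through 10.
    val numbers = sc.parallelize(1 to 10)

    // filter() takes a predicate; elements for which it returns true are kept.
    val evens = numbers.filter(n => n % 2 == 0)

    // collect() brings the filtered elements back to the driver.
    println(evens.collect().mkString(", ")) // prints: 2, 4, 6, 8, 10

    spark.stop()
  }
}
```

Like map(), filter() is a transformation, so nothing is computed until an action such as collect() is called; the base RDD itself is left unchanged.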
