Fitting a linear regression line to data the old-fashioned way

In this recipe, we use RDDs and a closed-form formula to fit a simple linear regression line from scratch. We start with this recipe to demonstrate that you can implement a given statistical learning algorithm directly on RDDs and achieve computational scale with Apache Spark.
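The recipe's own code is not part of this excerpt. As a rough illustration of the idea, the sketch below computes the closed-form least-squares slope and intercept for y = a + b*x from five running sums, using plain Python `map` and `functools.reduce` as stand-ins for the RDD operations of the same names. The function name `fit_line` and the sample points are hypothetical and chosen only for illustration; this is not the book's implementation.

```python
from functools import reduce


def fit_line(points):
    """Return (slope, intercept) of the least-squares line through (x, y) pairs."""
    # "map" step: turn each point into its sufficient statistics
    # (count, x, y, x*y, x*x).
    stats = map(lambda p: (1, p[0], p[1], p[0] * p[1], p[0] * p[0]), points)

    # "reduce" step: add the tuples element-wise. Addition is associative
    # and commutative, so partial sums could be combined in any order,
    # which is what makes this shape suitable for a distributed reduce.
    n, sx, sy, sxy, sxx = reduce(
        lambda a, b: tuple(u + v for u, v in zip(a, b)), stats
    )

    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x values are all identical; slope is undefined")

    # Closed-form solution for simple linear regression.
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical sample data lying exactly on y = 1 + 2x.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
slope, intercept = fit_line(points)
print(slope, intercept)
```

Because only the five sums travel between the map and reduce steps, the same structure should carry over to an RDD of (x, y) pairs without collecting the raw data on one machine, assuming the data is already parsed into numeric pairs.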
