Linear regression API with Lasso and L-BFGS in Spark 2.0

In this recipe, we demonstrate Spark 2.0's LinearRegression() API, a unified, parameterized API that handles linear regression comprehensively and can be extended without the backward-compatibility issues of the RDD-based API and its named methods. We show how to use setSolver() to set the optimization method to the memory-efficient L-BFGS, a quasi-Newton method that needs only first-order gradient information and copes easily with a large number of parameters, especially in sparse configurations.

In this recipe, .setSolver() is set to l-bfgs, which makes L-BFGS (see RDD-based regression for more detail) the selected optimization method. The .setElasticNetParam() is not set, so the default of 0 remains ...
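The solver setting described above can be sketched as follows. This is a minimal illustration, not the recipe's own code: the toy data set, the application name, the local master, and the setMaxIter() and setRegParam() values are all assumptions chosen for the example. Only setSolver("l-bfgs") and the untouched elastic-net default come from the text.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

object LinearRegressionLbfgsSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("LinearRegressionLbfgsSketch")
      .getOrCreate()

    // Hypothetical toy data following y = 2x + 1, in the (label, features)
    // layout that LinearRegression expects by default.
    val training = spark.createDataFrame(Seq(
      (3.0, Vectors.dense(1.0)),
      (5.0, Vectors.dense(2.0)),
      (7.0, Vectors.dense(3.0)),
      (9.0, Vectors.dense(4.0))
    )).toDF("label", "features")

    val lr = new LinearRegression()
      .setMaxIter(100)     // assumed value for this sketch
      .setRegParam(0.01)   // assumed small regularization strength
      .setSolver("l-bfgs") // select L-BFGS as the optimization method
    // setElasticNetParam() is deliberately not called, so the default of 0.0
    // stays in effect (0.0 is a pure L2 penalty, 1.0 a pure L1/Lasso penalty).

    val model = lr.fit(training)

    println(s"Solver: ${lr.getSolver}, elasticNetParam: ${lr.getElasticNetParam}")
    println(s"Coefficients: ${model.coefficients}, intercept: ${model.intercept}")

    spark.stop()
  }
}
```

Because every option is a setter on the same estimator, switching solvers or penalties is a one-line change rather than a switch to a differently named class, which is the extensibility the parameterized API is meant to provide.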
