Doing ridge regression
Ridge regression is an alternative to lasso for improving prediction quality. In lasso, many features have their coefficients set to zero and are therefore eliminated from the equation. In ridge, the coefficients of predictors (features) are penalized and shrink toward zero, but they are never set exactly to zero.
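To make the contrast concrete, here is a minimal sketch for the simplest possible case: a single feature, no intercept, and the objectives written as the squared error plus `lambda * w^2` (ridge) or `lambda * |w|` (lasso). The object name `ShrinkageSketch`, its two helper functions, and the toy data are hypothetical and chosen only for illustration. They are not part of Spark or of this recipe, and the code needs nothing beyond the Scala standard library.

```scala
// Illustrative only: one feature, model y ≈ w * x, no intercept, no Spark.
// Assumed objectives (one common scaling; other texts use a factor of 1/2):
//   ridge: sum((y - w*x)^2) + lambda * w^2
//   lasso: sum((y - w*x)^2) + lambda * |w|
object ShrinkageSketch {
  private def sumXY(xs: Seq[Double], ys: Seq[Double]): Double =
    xs.zip(ys).map { case (x, y) => x * y }.sum

  private def sumXX(xs: Seq[Double]): Double =
    xs.map(x => x * x).sum

  // Ridge has the closed form w = Σxy / (Σx² + λ):
  // the penalty enlarges the denominator, so w shrinks but stays nonzero.
  def ridgeCoef(xs: Seq[Double], ys: Seq[Double], lambda: Double): Double =
    sumXY(xs, ys) / (sumXX(xs) + lambda)

  // Lasso soft-thresholds Σxy by λ/2 before dividing by Σx²:
  // once λ/2 exceeds |Σxy|, the coefficient is exactly zero.
  def lassoCoef(xs: Seq[Double], ys: Seq[Double], lambda: Double): Double = {
    val sxy = sumXY(xs, ys)
    math.signum(sxy) * math.max(math.abs(sxy) - lambda / 2, 0.0) / sumXX(xs)
  }

  def main(args: Array[String]): Unit = {
    // Toy data with a true slope of 0.5 (Σxy = 7, Σx² = 14).
    val xs = Seq(1.0, 2.0, 3.0)
    val ys = Seq(0.5, 1.0, 1.5)
    for (lambda <- Seq(0.0, 5.0, 20.0)) {
      println(s"lambda=$lambda ridge=${ridgeCoef(xs, ys, lambda)} lasso=${lassoCoef(xs, ys, lambda)}")
    }
  }
}
```

With this toy data and `lambda = 20`, the ridge coefficient shrinks from 0.5 to 7/34 (about 0.206), while the lasso coefficient becomes exactly 0.0, which is the behavior described above. With several correlated features there is no such simple closed form for lasso, so treat this as intuition rather than as what the Spark implementation computes.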
How to do it…
- Start the Spark shell:
$ spark-shell
- Import the statistics and related classes:
scala> import org.apache.spark.mllib.linalg.Vectors
scala> import org.apache.spark.mllib.regression.LabeledPoint
scala> import org.apache.spark.mllib.regression.RidgeRegressionWithSGD
- Create the LabeledPoint array with the house price as the label:
scala> val points = Array(
  LabeledPoint(1,Vectors.dense(5,3,1,2,1,3,2,2,1)),
  LabeledPoint(2,Vectors.dense(9,8,8,9,7,9,8,7,9))
...