Lasso

Lasso regularization is based on the L1-norm of the parameter vector:

Contrary to Ridge, which shrinks all the weights, Lasso can shift the smallest one to zero, creating a sparse parameter vector. The mathematical proof is beyond the scope of this book; however, it's possible to understand it intuitively by considering the following diagram (bidimensional):

Lasso (L1) regularization

The zero-centered square represents the Lasso boundaries. If we consider a generic line, the probability of being tangential to the square is higher at the ...

Get Mastering Machine Learning Algorithms now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.