There's more...

The LogisticRegressionWithLBFGS() object has a method called setNumClasses() that lets it handle multinomial classification (that is, more than two classes). The default is two, which gives binary logistic regression.
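As a minimal sketch of the setNumClasses() call described above, the following trains a three-class model on a tiny in-memory dataset. The object name, the toy data points, and the local master setting are illustrative assumptions and do not come from the recipe; labels must be 0.0, 1.0, and 2.0 for three classes.

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.sql.SparkSession

// Hypothetical object name, for illustration only.
object MultinomialLbfgsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")              // assumption: run locally
      .appName("MultinomialLbfgsSketch")
      .getOrCreate()

    // Made-up toy data: three well-separated classes labeled 0.0, 1.0, 2.0.
    val training = spark.sparkContext.parallelize(Seq(
      LabeledPoint(0.0, Vectors.dense(0.0, 0.1)),
      LabeledPoint(0.0, Vectors.dense(0.2, 0.0)),
      LabeledPoint(1.0, Vectors.dense(5.0, 5.1)),
      LabeledPoint(1.0, Vectors.dense(5.2, 4.9)),
      LabeledPoint(2.0, Vectors.dense(10.0, 0.1)),
      LabeledPoint(2.0, Vectors.dense(9.8, 0.3))
    ))

    // setNumClasses(3) switches from the binary default to multinomial.
    val model = new LogisticRegressionWithLBFGS()
      .setNumClasses(3)
      .run(training)

    // predict returns the class label as a Double.
    val predicted = model.predict(Vectors.dense(5.1, 5.0))
    println(s"Predicted class: $predicted")

    spark.stop()
  }
}
```

Leaving out the setNumClasses() call keeps the default of two classes, in which case the labels must be 0.0 and 1.0.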

L-BFGS is a limited-memory adaptation of the original BFGS (Broyden-Fletcher-Goldfarb-Shanno) method, and it is well suited for regression models that deal with a large number of variables. Rather than forming and storing the full Hessian matrix, it approximates the curvature from a small number of recent position and gradient updates while it searches through the large search space, so memory use grows linearly with the number of variables.
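To make the "limited memory" idea concrete, here is a small sketch of the standard two-loop recursion that L-BFGS uses to turn a gradient into a search direction from a handful of stored update pairs. It is plain Scala with no Spark dependency, and it is not how the recipe or Spark's internals are written; the names TwoLoopSketch, direction, and pairs are hypothetical.

```scala
// Hypothetical helper, for illustration only.
object TwoLoopSketch {
  type Vec = Array[Double]

  def dot(a: Vec, b: Vec): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Returns y + alpha * x, element by element.
  def axpy(alpha: Double, x: Vec, y: Vec): Vec =
    y.zip(x).map { case (yi, xi) => yi + alpha * xi }

  // pairs holds the most recent (s, y) updates, oldest first, where
  // s = change in position and y = change in gradient between steps.
  def direction(grad: Vec, pairs: Seq[(Vec, Vec)]): Vec = {
    val rhos = pairs.map { case (s, y) => 1.0 / dot(y, s) }
    val alphas = new Array[Double](pairs.length)

    // First loop: newest pair to oldest.
    var q = grad.clone()
    for (i <- pairs.indices.reverse) {
      val (s, y) = pairs(i)
      alphas(i) = rhos(i) * dot(s, q)
      q = axpy(-alphas(i), y, q)
    }

    // Scale by an initial inverse-Hessian guess built from the newest pair.
    val gamma = pairs.lastOption
      .map { case (s, y) => dot(s, y) / dot(y, y) }
      .getOrElse(1.0)
    var r = q.map(_ * gamma)

    // Second loop: oldest pair to newest.
    for (i <- pairs.indices) {
      val (s, y) = pairs(i)
      val beta = rhos(i) * dot(y, r)
      r = axpy(alphas(i) - beta, s, r)
    }

    // Negate to get a descent direction.
    r.map(x => -x)
  }
}
```

With no stored pairs, direction simply returns the negative gradient, which is plain gradient descent. Each stored pair costs two vectors of memory, so keeping only the last few pairs avoids ever materializing the Hessian.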

We encourage the reader to step back and look at the problem as regression plus an optimization technique (regression with SGD versus regression with L-BFGS). In this recipe, ...
