There's more...

The step size, or learning rate, is one of the most important parameters to master when you first start with GD. If the step size is too small, gradient descent converges so slowly that it wastes computation and can give the appearance of not converging to a solution at all. While choosing a step size is trivial for demos and small projects, a poorly chosen value can be very costly on large ML projects. On the other hand, if the step size is too large, the updates overshoot the minimum and ping-pong back and forth, or diverge outright; this usually shows up as a blown-up error curve, where the error increases rather than decreases with each iteration. The toy sketch that follows illustrates both failure modes.
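As a minimal sketch of this behavior in plain Scala (not Spark's optimizer), the following runs gradient descent on the toy objective f(w) = (w - 3)^2, whose gradient is 2(w - 3); the objective, starting point, and step sizes are illustrative assumptions, chosen only to show slow convergence, healthy convergence, and divergence:

// Gradient descent on the toy objective f(w) = (w - 3)^2.
// The objective, starting point, and step sizes are illustrative only.
object StepSizeDemo extends App {
  def gradient(w: Double): Double = 2.0 * (w - 3.0)
  def loss(w: Double): Double = math.pow(w - 3.0, 2)

  // Run a fixed number of iterations from an arbitrary start and return the final w.
  def descend(stepSize: Double, iterations: Int): Double = {
    var w = 0.0
    for (_ <- 1 to iterations) w -= stepSize * gradient(w)
    w
  }

  // 0.001 crawls, 0.1 converges cleanly, 1.1 blows up the error.
  for (step <- Seq(0.001, 0.1, 1.1)) {
    val w = descend(step, iterations = 100)
    println(f"step=$step%-6s final w=$w%14.4f loss=${loss(w)}%14.4e")
  }
}

With the small step, the loss is still far from zero after 100 iterations; with the large step, each update multiplies the distance to the minimum by -1.2, so the loss explodes rather than shrinking.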

In our experience, it is best to look at the error versus iteration chart and use the knee of the curve as a guide when tuning the step size.
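To produce such a trace, you can log the error at each iteration and plot or scan it; in a sketch like the one above, that is a one-line change (again, the objective and step size are only illustrative):

// Log the loss every few iterations so the error-versus-iteration
// curve, and its knee, can be inspected or plotted.
var w = 0.0
val stepSize = 0.1
for (i <- 1 to 100) {
  w -= stepSize * 2.0 * (w - 3.0)   // gradient of (w - 3)^2
  if (i % 10 == 0) println(f"iter=$i%3d loss=${math.pow(w - 3.0, 2)}%12.6e")
}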
