There's more...

As we mentioned before, the optimization algorithm is sensitive to the choice of learning rate. The following table summarizes the effects of this choice, and a short code sketch illustrating both behaviors follows the table:

| Learning rate size | Advantages/disadvantages | Uses |
| --- | --- | --- |
| Smaller learning rate | Converges more slowly, but gives more accurate results | If the solution is unstable, try lowering the learning rate first |
| Larger learning rate | Less accurate, but converges faster | For some problems, helps prevent the solution from stagnating |
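Here is a minimal sketch of both behaviors (this uses the TensorFlow 2.x eager API rather than this recipe's code, and the toy loss f(x) = (x - 3)^2 and the specific rates are illustrative choices, not values from the book):

```python
import tensorflow as tf

def minimize(learning_rate, steps=50):
    """Run plain gradient descent on f(x) = (x - 3)^2 and return the final x."""
    x = tf.Variable(0.0)
    optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            loss = (x - 3.0) ** 2
        grads = tape.gradient(loss, [x])
        optimizer.apply_gradients(zip(grads, [x]))
    return x.numpy()

# The minimum is at x = 3.0.
print(minimize(0.01))  # small rate: stable, but still well short of 3.0 after 50 steps
print(minimize(0.4))   # larger rate: reaches 3.0 in a handful of steps
print(minimize(1.1))   # too large: each step overshoots and the iterate grows without bound
```

For this particular loss, any rate above 1.0 makes each update overshoot the minimum by more than the previous error, which is exactly the instability that the first row of the table suggests fixing by lowering the rate.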

Sometimes, the standard gradient descent algorithm can get stuck or slow down significantly. This can happen when the optimization is stuck in the flat spot of a saddle point. To combat this, there is another algorithm that takes into account a momentum term, which adds a fraction of the previous step's update to the current one.
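Here is a minimal sketch of the momentum idea (again using the TensorFlow 2.x eager API; the ill-conditioned toy loss and the momentum value of 0.9 are illustrative choices). The loss below is nearly flat along w[1], so plain gradient descent barely moves in that direction, while the momentum variant accumulates speed across steps:

```python
import tensorflow as tf

def run(optimizer, steps=100):
    """Minimize a loss that is nearly flat along w[1]; return the final weights."""
    w = tf.Variable([1.0, 1.0])
    for _ in range(steps):
        with tf.GradientTape() as tape:
            # Steep in w[0], nearly flat in w[1]: plain gradient descent
            # crawls along the flat direction.
            loss = w[0] ** 2 + 0.01 * w[1] ** 2
        grads = tape.gradient(loss, [w])
        optimizer.apply_gradients(zip(grads, [w]))
    return w.numpy()

# Both coordinates have their minimum at 0.0; compare how far w[1] gets.
print(run(tf.keras.optimizers.SGD(learning_rate=0.1)))
print(run(tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9)))
```

In the TensorFlow 1.x API used elsewhere in this book, the corresponding optimizer is tf.train.MomentumOptimizer.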
