SGD with momentum in Keras

When using Keras, it's possible to customize the SGD optimizer by directly instantiating the SGD class and passing it when compiling the model:

from keras.optimizers import SGD

...

sgd = SGD(lr=0.0001, momentum=0.8, nesterov=True)

model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
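
For reference, the following is a minimal end-to-end sketch of the same compile step; the toy dense network and the synthetic data are illustrative assumptions, not part of the original example:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

# Hypothetical toy model: one hidden layer over 20 features, 4 classes,
# included only so the compile and fit steps are runnable
model = Sequential([
    Dense(32, activation='relu', input_shape=(20,)),
    Dense(4, activation='softmax')
])

# SGD with momentum mu = 0.8 and Nesterov acceleration, as in the text
sgd = SGD(lr=0.0001, momentum=0.8, nesterov=True)

model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Synthetic data, only to exercise a training step
X = np.random.rand(64, 20)
Y = np.eye(4)[np.random.randint(0, 4, size=64)]
model.fit(X, Y, epochs=1, batch_size=16)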

The SGD class accepts the parameter lr (the learning rate η, with a default of 0.01), momentum (the parameter μ), nesterov (a boolean indicating whether to employ Nesterov momentum), and an optional decay parameter, which decays the learning rate over the updates with the following formula:

η(t) = η / (1 + decay · t)

where η is the initial learning rate and t is the update counter.
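
As a sketch of how this time-based decay behaves (assuming t counts individual parameter updates, that is, batches, as in the legacy Keras implementation):

def decayed_lr(lr0, decay, t):
    # Time-based decay: eta(t) = eta_0 / (1 + decay * t)
    return lr0 / (1.0 + decay * t)

# The learning rate halves after 1/decay updates
print(decayed_lr(0.01, 1e-4, 0))       # 0.01
print(decayed_lr(0.01, 1e-4, 10000))   # 0.005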
