Dropout

This method was proposed by Hinton et al. (Improving Neural Networks by Preventing Co-adaptation of Feature Detectors, Hinton G. E., Srivastava N., Krizhevsky A., Sutskever I., Salakhutdinov R. R., arXiv:1207.0580 [cs.NE]) as an alternative way to prevent overfitting and to allow bigger networks to explore more regions of the sample space. The idea is rather simple: during every training step, given a predefined percentage n_d, a dropout layer randomly selects n_d·N of its N incoming units and sets them to 0.0. The operation is only active during the training phase; it is completely removed when the model is employed for new predictions.
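To make the mechanics concrete, the following is a minimal NumPy sketch of such a layer (the function name, parameter names, and the exact unit-selection strategy are illustrative assumptions, not taken from the paper): during training it zeroes out n_d·N randomly chosen units, while at prediction time it passes the input through unchanged.

```python
import numpy as np

def dropout(x, n_d=0.5, training=True, rng=None):
    # Sketch of the dropout operation described above (names are
    # illustrative): during training, randomly select n_d * N of the
    # N incoming units and set them to 0.0; at prediction time the
    # layer is a no-op and simply returns the input.
    if not training:
        return x
    rng = np.random.default_rng() if rng is None else rng
    out = x.copy()
    n_units = out.shape[-1]
    # Pick n_d * N distinct unit indices and zero them out
    drop_idx = rng.choice(n_units, size=int(n_d * n_units), replace=False)
    out[..., drop_idx] = 0.0
    return out

x = np.ones(10)
print(dropout(x, n_d=0.3))                   # 3 of the 10 units are zeroed
print(dropout(x, n_d=0.3, training=False))   # input returned unchanged
```

Note that practical implementations (for example, the inverted-dropout variant used by most deep learning frameworks) additionally rescale the surviving units by 1/(1 − n_d) during training, so that the expected activation magnitude matches between training and prediction.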

This operation can be interpreted in many ways. When more dropout layers are employed, the result of ...
