Tuning and optimizing CNN hyperparameters

The following hyperparameters are particularly important and should be tuned carefully to obtain good results; the sketch after the list shows one way they can be set in code.

  • Dropout: Used for random omission of feature detectors to prevent overfitting
  • Sparsity: Used to encourage sparse activations, which helps the network represent sparse/rare inputs
  • AdaGrad: Used for feature-specific (per-parameter) adaptation of the learning rate
  • Regularization: L1 and L2 regularization
  • Weight transforms: Useful for deep autoencoders
  • Probability distribution manipulation: Used for initial weight generation
  • Gradient normalization and clipping: Used to keep gradient magnitudes within a stable range and prevent exploding gradients

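As a minimal sketch, assuming the Deeplearning4j (DL4J) builder API in Scala, several of these knobs can be set together on one configuration. The network shape, learning rate, and regularization strengths below are illustrative placeholders, not values from the original text:

```scala
import org.deeplearning4j.nn.conf.{GradientNormalization, NeuralNetConfiguration}
import org.deeplearning4j.nn.conf.inputs.InputType
import org.deeplearning4j.nn.conf.layers.{ConvolutionLayer, DenseLayer, OutputLayer, SubsamplingLayer}
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.deeplearning4j.nn.weights.WeightInit
import org.nd4j.linalg.activations.Activation
import org.nd4j.linalg.learning.config.AdaGrad
import org.nd4j.linalg.lossfunctions.LossFunctions

val conf = new NeuralNetConfiguration.Builder()
  .seed(12345)
  .updater(new AdaGrad(0.01))            // AdaGrad: per-parameter learning rates
  .weightInit(WeightInit.XAVIER)         // initial weights drawn from a scaled distribution
  .l2(1e-4)                              // L2 regularization on the weights
  .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
  .gradientNormalizationThreshold(1.0)   // clip each gradient element to [-1, 1]
  .list()
  .layer(0, new ConvolutionLayer.Builder(5, 5)
    .nOut(20).stride(1, 1).activation(Activation.RELU).build())
  .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
    .kernelSize(2, 2).stride(2, 2).build())
  .layer(2, new DenseLayer.Builder()
    .nOut(100)
    .dropOut(0.5)                        // dropout: retain each activation with probability 0.5
    .activation(Activation.RELU).build())
  .layer(3, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
    .nOut(10).activation(Activation.SOFTMAX).build())
  .setInputType(InputType.convolutionalFlat(28, 28, 1)) // infers nIn for each layer
  .build()

val model = new MultiLayerNetwork(conf)
model.init()
```

Note that dropout is set per layer here (on the dense layer), while the updater, weight initialization, L2 penalty, and gradient clipping apply globally to the whole network.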
Another important question is: when do you want to add a max pooling layer rather than a convolutional layer with the same stride? A max pooling layer has no parameters at all, whereas a convolutional layer with the same stride has learnable weights and biases that must be trained, stored, and regularized.
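To make the parameter-count contrast concrete, here is a minimal sketch of the two alternatives, again assuming the DL4J layer builders; the kernel size and channel counts are illustrative:

```scala
import org.deeplearning4j.nn.conf.layers.{ConvolutionLayer, SubsamplingLayer}
import org.nd4j.linalg.activations.Activation

// Max pooling with a 2x2 kernel and stride 2: zero learnable parameters.
val pool = new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
  .kernelSize(2, 2)
  .stride(2, 2)
  .build()

// A convolutional layer with the same kernel and stride: with 20 input and
// 20 output channels it learns 2 * 2 * 20 * 20 = 1,600 weights plus 20
// biases, all of which consume memory and training time.
val conv = new ConvolutionLayer.Builder(2, 2)
  .nIn(20)
  .nOut(20)
  .stride(2, 2)
  .activation(Activation.RELU)
  .build()
```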
