Convolutions in three dimensions

MNIST was a grayscale example, so we could represent each image as a two-dimensional matrix of pixel intensity values from 0 to 255. Most of the time, however, we will be working with color images. A color image is a three-dimensional matrix whose dimensions are the image height, image width, and color channel. This gives us separate red, green, and blue values for each pixel in the image.
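To make the difference in shape concrete, here is a minimal NumPy sketch. The 28 x 28 and 32 x 32 sizes, the variable names, and the channels-last (height, width, color) ordering with red, green, blue channels are assumptions chosen for illustration.

```python
import numpy as np

# Grayscale image: one intensity value (0 to 255) per pixel -> (height, width).
# 28 x 28 is assumed here purely for illustration.
gray = np.zeros((28, 28), dtype=np.uint8)

# Color image: three values per pixel -> (height, width, color).
# The 32 x 32 size and the R, G, B channel order are assumptions.
color = np.zeros((32, 32, 3), dtype=np.uint8)

# Set the top-left pixel to pure red: full red, no green, no blue.
color[0, 0] = [255, 0, 0]

print(gray.ndim, gray.shape)    # two dimensions
print(color.ndim, color.shape)  # three dimensions
print(color[0, 0])              # the three channel values of one pixel
```

Indexing a single pixel of the color array returns a length-3 vector instead of a single scalar, which is why the filter needs a third dimension as well.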

So far we have shown two-dimensional filters, but the idea adapts to three dimensions quite simply: we perform the convolution between a (height, width, 3 (colors)) matrix and a 3 x 3 x 3 filter. In the end, we're still left with a two-dimensional output, as we take the elementwise ...
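The following is a rough sketch of that operation in plain NumPy, not the way a deep learning library implements it. It assumes the usual definition (multiply the filter elementwise with each image patch and sum the result), no padding, and a stride of 1. The function name `conv2d_multichannel`, the 32 x 32 input, and the all-ones filter are hypothetical choices for illustration.

```python
import numpy as np

def conv2d_multichannel(image, kernel):
    """Slide a (kh, kw, C) filter over an (H, W, C) image.

    Assumes no padding and stride 1. Like most deep learning libraries,
    this does not flip the filter (strictly speaking, cross-correlation).
    """
    h, w, c = image.shape
    kh, kw, kc = kernel.shape
    assert c == kc, "filter depth must match the number of color channels"

    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # The patch spans all color channels at this spatial position.
            patch = image[i:i + kh, j:j + kw, :]
            # Elementwise product, summed to a single number.
            out[i, j] = np.sum(patch * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))   # hypothetical (height, width, 3) color image
kernel = np.ones((3, 3, 3))       # hypothetical 3 x 3 x 3 filter

feature_map = conv2d_multichannel(image, kernel)
print(feature_map.shape)  # (30, 30)
```

Because the filter is as deep as the image, it only slides along height and width, so each position collapses to one number and the output has no color dimension.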
