Input layer

Given that CNNs are predominantly used to classify images, the input to a CNN consists of image matrices with the dimensions h (height in pixels), w (width in pixels), and d (depth, that is, the number of channels). In the case of RGB images, the depth is three, corresponding to the three color channels: red, green, and blue. This is illustrated in Figure 7.9:

Figure 7.9: Image matrix dimensions
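
The h × w × d layout described above can be sketched as a NumPy array. This is a minimal illustration only, assuming NumPy is available and a height-width-channel axis order; the 4 × 6 image size and the variable names are hypothetical and not taken from the book's own examples.

```python
import numpy as np

# Hypothetical dimensions chosen for illustration: a tiny 4 x 6 RGB image.
h, w, d = 4, 6, 3

# An image matrix of shape (h, w, d); uint8 holds pixel intensities 0-255.
image = np.zeros((h, w, d), dtype=np.uint8)

# Fill the first channel (red, under the assumed R, G, B ordering).
image[:, :, 0] = 255

# Unpack the three dimensions back out of the array.
height, width, depth = image.shape

# Each channel on its own is a 2-D matrix of shape (h, w).
red_channel = image[:, :, 0]

print(height, width, depth)
print(red_channel.shape)
```

A grayscale image would follow the same layout with a depth of one. Some frameworks expect the channel axis first rather than last, so the axis order here is a convention of this sketch, not a requirement of CNNs.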
