Input data

Before we delve further into the schema of our specific dataset, let's first understand how a MLP will actually help us with this problem. Firstly, as we saw in Chapter 5, Unsupervised Learning Using Apache Spark, when studying image segmentation, images can be broken down into a matrix of either pixel-intensity values (for grayscale images) or pixel RGB values (for images with color). A single vector containing (m x n) numerical elements can then be generated, corresponding to the pixel height (m) and width (n) of the image.

Get Machine Learning with Apache Spark Quick Start Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.