Case study 1 – OCR

A good real-world use case for demonstrating MLPs is optical character recognition (OCR). In OCR, the challenge is to recognize human writing by classifying each handwritten symbol as a letter. The English alphabet has 26 letters, so when applied to the English language, OCR is a multiclass classification problem with k = 26 possible classes.
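To make the k = 26 framing concrete, here is a minimal sketch in plain Python of how letters could be mapped to class indices and to one-hot target vectors. The helper names (`letter_to_class`, `class_to_letter`, `one_hot`) are hypothetical and only illustrate the idea; they are not taken from the book's code or from any library.

```python
import string
from typing import List

# Assumption: the classes are the 26 uppercase English letters, A to Z.
LETTERS = string.ascii_uppercase
K = len(LETTERS)  # number of possible classes


def letter_to_class(letter: str) -> int:
    """Map a single letter to a class index in the range [0, K)."""
    upper = letter.upper()
    if len(upper) != 1 or upper not in LETTERS:
        raise ValueError(f"not an English letter: {letter!r}")
    return LETTERS.index(upper)


def class_to_letter(index: int) -> str:
    """Map a class index back to its letter."""
    if not 0 <= index < K:
        raise ValueError(f"class index out of range: {index}")
    return LETTERS[index]


def one_hot(index: int) -> List[float]:
    """Build a length-K target vector with a 1.0 at the class index."""
    if not 0 <= index < K:
        raise ValueError(f"class index out of range: {index}")
    vector = [0.0] * K
    vector[index] = 1.0
    return vector
```

A classifier for this problem would then predict one of the K indices for each handwritten symbol, and `class_to_letter` would turn that prediction back into a readable letter.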

The dataset that we will be using is derived from the University of California, Irvine (UCI) Machine Learning Repository, available at https://archive.ics.uci.edu/ml/index.php. The specific letter recognition dataset that we will use, available from both the GitHub repository accompanying this book and from https://archive.ics.uci.edu/ml/datasets/letter+recognition ...
