Chapter 2

Leveraging Similarity

IN THIS CHAPTER

Understanding differences between examples

Clustering data into meaningful groups

Classifying and regressing after looking for data neighbors

Grasping the difficulties of working in a high-dimensional data space

“A rose by any other name would smell as sweet.”

— JULIET, ROMEO AND JULIET

A rose is a rose. A tree is a tree. A car is a car. Even though you can make simple statements like these, one example of each kind of item doesn’t suffice to identify all the items that fit into that classification. After all, many species of trees and many kinds of roses exist. If you evaluate the problem under a machine learning framework, you find that the examples have features whose values change frequently and features that somehow systematically persist (a tree is always made of wood and has a trunk and roots, for instance). When you look closely for the feature values that repeat constantly, you can guess that certain observed objects are of much the same kind.
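To make the idea of persistent and changing features concrete, here is a minimal Python sketch. The toy data, the feature names (has_trunk, has_roots, height_m, leaf_length_cm), and the helper split_features are all hypothetical and invented for illustration; they don't come from any library or from the book's own code. The sketch simply checks, across a handful of examples, which features always take the same value and which ones vary.

```python
# Hypothetical toy data: each dictionary is one observed tree,
# and each key is a feature of that tree. All values are made up.
trees = [
    {"has_trunk": 1, "has_roots": 1, "height_m": 4.0, "leaf_length_cm": 6.5},
    {"has_trunk": 1, "has_roots": 1, "height_m": 21.0, "leaf_length_cm": 2.0},
    {"has_trunk": 1, "has_roots": 1, "height_m": 9.5, "leaf_length_cm": 12.0},
]


def split_features(examples):
    """Separate features whose value never changes from those that vary."""
    persistent, varying = [], []
    for name in examples[0]:
        # Collect the distinct values this feature takes across all examples.
        values = {example[name] for example in examples}
        if len(values) == 1:
            persistent.append(name)
        else:
            varying.append(name)
    return persistent, varying


persistent, varying = split_features(trees)
print("persistent:", persistent)
print("varying:", varying)
```

In this toy setting, the features that never change are the ones that hint at what all the examples have in common, whereas the varying ones describe how individual examples differ. Real data is rarely this clean, so treat the exact-match test as a simplification.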

So children can figure out by themselves what cars are by looking at the features. After ...
