Finding similar items

Many problems can be formulated as finding similar sets of elements, for example, customers who purchased similar products, web pages with similar content, images with similar objects, users who visited similar websites, and so on.

Two items are considered similar if they are a small distance apart. The main questions are how each item is represented and how the distance between the items is defined. There are two main classes of distance measures:

  • Euclidean distances
  • Non-Euclidean distances

Get Machine Learning in Java - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.