k-nearest neighbors

Our Mahout user-based recommender makes recommendations by looking at the neighborhood of the most similar users. This approach is commonly called k-nearest neighbors, or k-NN.

A user neighborhood may look a lot like the k-means clusters we encountered in the previous chapter, but the two differ in an important way: each user sits at the center of their own neighborhood. With clustering, we aim to establish a smaller number of groupings, whereas with k-NN there are as many neighborhoods as there are users; each user is their own neighborhood centroid.
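To make the difference from clustering concrete, here is a minimal sketch in plain Python rather than Mahout. The ratings data, the function names (cosine_similarity, nearest_neighbors), and the choice of cosine similarity are all assumptions made for illustration; they are not Mahout's API and not necessarily the similarity measure our recommender uses.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0  # no items in common, so treat the users as dissimilar
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def nearest_neighbors(user, ratings, k):
    """Return the k users most similar to `user`, excluding `user` itself."""
    scored = [
        (cosine_similarity(ratings[user], ratings[other]), other)
        for other in ratings
        if other != user
    ]
    # Highest similarity first; ties are broken alphabetically by user name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


# Hypothetical toy ratings: user -> {item: rating}.
ratings = {
    "ann": {"a": 5, "b": 4},
    "bob": {"a": 5, "b": 5},
    "cat": {"b": 1, "c": 5},
    "dan": {"c": 4, "d": 5},
}

# One neighborhood per user, each centered on that user.
neighborhoods = {user: nearest_neighbors(user, ratings, k=2) for user in ratings}

for user, neighbors in neighborhoods.items():
    print(user, "->", neighbors)
```

Every user gets a neighborhood, so four users yield four neighborhoods, and those neighborhoods overlap and need not be symmetric. In this toy data, ann appears in dan's neighborhood even though dan does not appear in ann's, which could not happen if users were partitioned into clusters.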

Note

Mahout also defines ThresholdUserNeighborhood that we could use to construct a neighborhood containing only the users that fall within a certain similarity from ...
