Building a scalable recommendation engine using collaborative filtering in Spark 2.0

In this recipe, we demonstrate a recommendation system based on a technique known as collaborative filtering. At its core, collaborative filtering analyzes the relationships among users and the dependencies among inventory items (for example, movies, books, news articles, or songs) to identify user-to-item relationships based on a set of secondary factors called latent factors (for example, female/male, happy/sad, active/passive). The key point is that you do not need to know the latent factors in advance.

The recommendations will be produced via the alternating least squares (ALS) algorithm, which is a collaborative filtering technique. At a high level, collaborative ...
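
To make the idea of latent factors more concrete, here is a minimal, dependency-free sketch of alternating least squares on a tiny ratings set. It is not Spark's implementation and not the recipe's own code: the function names (solve, update, als, predict, rmse), the toy ratings, and the rank, reg, and iters values are all assumptions chosen for illustration. Each user and each item gets a short vector of learned factors, and a predicted rating is the dot product of the two.

```python
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update(fixed, observed_by_row, rank, reg):
    """Re-solve every factor vector on one side while the other side is held fixed."""
    out = []
    for observed in observed_by_row:
        # Regularized normal equations: (sum v v^T + reg * I) x = sum r * v
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other, r in observed:
            v = fixed[other]
            for i in range(rank):
                b[i] += r * v[i]
                for j in range(rank):
                    a[i][j] += v[i] * v[j]
        out.append(solve(a, b))
    return out


def als(ratings, n_users, n_items, rank=2, reg=0.1, iters=30, seed=0):
    """Alternate between solving for user factors and item factors."""
    rng = random.Random(seed)
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    # Item factors start random; user factors are computed in the first pass.
    items = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    users = [[0.0] * rank for _ in range(n_users)]
    for _ in range(iters):
        users = update(items, by_user, rank, reg)
        items = update(users, by_item, rank, reg)
    return users, items


def predict(users, items, u, i):
    """Predicted rating is the dot product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(users[u], items[i]))


def rmse(ratings, users, items):
    """Root-mean-square error of the predictions on the given ratings."""
    se = sum((r - predict(users, items, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5


# Hypothetical toy data: (user, item, rating) triples with some pairs unobserved.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 4.0), (1, 1, 5.0), (1, 3, 1.0),
    (2, 0, 1.0), (2, 2, 5.0), (2, 3, 4.0),
    (3, 1, 1.0), (3, 2, 4.0), (3, 3, 5.0),
]

users, items = als(ratings, n_users=4, n_items=4)
train_rmse = rmse(ratings, users, items)
print("training RMSE:", train_rmse)
print("user 0, unseen item 3:", predict(users, items, 0, 3))
```

Note that the learned factors carry no labels: nothing in the input says what the two dimensions mean, which is the sense in which the latent factors do not need to be known in advance. In Spark itself, the corresponding estimator is org.apache.spark.ml.recommendation.ALS, which distributes this kind of alternating solve across a cluster.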
