Chapter 8. Lessons Learned

Real-world projects have real-world budgets for resources and effort. It’s important to keep that in mind when moving from cutting-edge academic research in machine learning to practical, deployable recommendation engines that work well in production and deliver profitable results. It therefore matters to recognize which approaches make the biggest difference for the effort expended.

Wisely chosen simplifications often make a huge difference in the practical approach to recommendation. The behavior of a crowd can provide valuable data to predict the relevance of recommendations to individual users. Interesting co-occurrence can be computed at scale with basic algorithms such as ItemSimilarityJob from the Apache Mahout library, making use of log-likelihood ratio (LLR) anomaly-detection tests. Weighting the computed indicators improves their ability to predict relevance for recommendations.
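
To make the co-occurrence idea concrete, here is a minimal in-memory sketch of a log-likelihood ratio test applied to pairs of items. It is not Mahout's ItemSimilarityJob, which runs as a distributed job. The function names, the threshold parameter, and the representation of user histories as lists of item IDs are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def _x_log_x(x):
    return 0.0 if x == 0 else x * math.log(x)


def _entropy(*counts):
    # N times the Shannon entropy (in nats) of the given counts.
    return _x_log_x(sum(counts)) - sum(_x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: users with both items    k12: users with A but not B
    k21: users with B but not A   k22: users with neither
    """
    row = _entropy(k11 + k12, k21 + k22)
    col = _entropy(k11 + k21, k12 + k22)
    mat = _entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


def cooccurrence_indicators(histories, threshold):
    """Return {item: {other_item: score}} for pairs scoring >= threshold.

    `histories` is an iterable of per-user item collections (hypothetical format).
    """
    histories = [set(h) for h in histories]
    n_users = len(histories)
    item_counts = Counter()
    pair_counts = Counter()
    for items in histories:
        ordered = sorted(items)
        item_counts.update(ordered)
        pair_counts.update(combinations(ordered, 2))

    indicators = {}
    for (a, b), k11 in pair_counts.items():
        k12 = item_counts[a] - k11
        k21 = item_counts[b] - k11
        k22 = n_users - k11 - k12 - k21
        score = llr(k11, k12, k21, k22)
        if score >= threshold:
            # Co-occurrence is symmetric, so record it in both directions.
            indicators.setdefault(a, {})[b] = score
            indicators.setdefault(b, {})[a] = score
    return indicators
```

The test scores a pair highly when its co-occurrence count departs from what independent behavior would produce, so items that are merely popular do not become indicators for everything. Because the statistic is two-sided, a production version might also check that the pair co-occurs more often than expected, not less.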

One cost-effective simplification is the innovative use of search capabilities, such as those of Apache Solr/Lucene, to deploy a recommender at scale in production. Search-based, item-based recommendation underlies a two-part design for a recommendation engine: offline learning, plus real-time online recommendations in response to recent user events. The result is a simple and powerful recommender that is much easier to build than many people would expect.
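
The two-part design can be sketched with a toy in-memory index standing in for the search engine. In this sketch, the offline step is assumed to have produced a set of indicator items for each item, and the online step treats a user's recent items as a query against those indicators. The class name, method names, and the simple inverse-document-frequency weight are hypothetical; a real deployment would rely on the indexing and relevance scoring of a search engine such as Solr/Lucene.

```python
import math
from collections import defaultdict


class ToyIndicatorIndex:
    """Tiny stand-in for a search index over per-item indicator fields."""

    def __init__(self, item_indicators):
        # Offline output: item id -> ids of items that indicate interest in it.
        self.docs = {item: set(inds) for item, inds in item_indicators.items()}
        # Inverted index: indicator id -> items whose indicator field has it.
        self.postings = defaultdict(set)
        for item, inds in self.docs.items():
            for ind in inds:
                self.postings[ind].add(item)

    def _weight(self, indicator):
        # Rarer indicators count for more (an IDF-style weight, assumed here).
        df = len(self.postings.get(indicator, ()))
        return math.log(1 + len(self.docs) / df) if df else 0.0

    def recommend(self, recent_items, k=3):
        """Online step: query the indicator field with recent user events."""
        recent = set(recent_items)
        scores = defaultdict(float)
        for ind in recent:
            weight = self._weight(ind)
            for item in self.postings.get(ind, ()):
                if item not in recent:  # skip what the user already has
                    scores[item] += weight
        # Highest score first; ties broken by item id for stable output.
        return sorted(scores, key=lambda i: (-scores[i], i))[:k]
```

Under these assumptions, only the index contents change when the offline learning is rerun, while the online query path stays the same.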

This two-part design for recommendation at large scale can be made easier and even more cost effective when built on ...
