Building Recommendation Engines

Book Description

Understand your data and user preferences to make intelligent, accurate, and profitable decisions

About This Book

  • A step-by-step guide to building recommendation engines that are personalized, scalable, and real time
  • Get to grips with the best tool available on the market to create recommender systems
  • This hands-on guide shows you how to implement different tools for recommendation engines, and when to use which

Who This Book Is For

This book caters to beginners and experienced data scientists who want to understand and build complex predictive decision-making systems and recommendation engines using R, Python, Spark, Neo4j, and Hadoop.

What You Will Learn

  • Build your first recommendation engine
  • Discover the tools needed to build recommendation engines
  • Dive into the various techniques of recommender systems such as collaborative, content-based, and cross-recommendations
  • Create efficient decision-making systems that will ease your work
  • Familiarize yourself with machine learning algorithms in different frameworks
  • Master different versions of recommendation engines from practical code examples
  • Explore various recommender systems and implement them with popular tools such as R, Python, Spark, and others

In Detail

A recommendation engine (sometimes referred to as a recommender system) is a tool that lets algorithm developers predict what a user may or may not like among a list of given items. Recommender systems have become extremely common in recent years and are used in a wide variety of applications. The most popular application areas are movies, music, news, books, research articles, search queries, social tags, and products in general.

The book starts with an introduction to recommendation systems and their applications. You will then start building recommendation engines straight away, from the very basics. As you move along, you will learn to build recommender systems with popular tools such as R, Python, Spark, Neo4j, and Hadoop. You will gain insight into the pros and cons of each type of recommendation engine and learn when to use which, so that you can pick the one that suits you best.

During the course of the book, you will create a simple recommendation engine, a real-time recommendation engine, a scalable recommendation engine, and more. You will familiarize yourself with various techniques of recommender systems, such as collaborative, content-based, and cross-recommendations, before getting to know the best practices for building a recommender system towards the end of the book.

Style and approach

This book follows a step-by-step practical approach in which readers learn to build recommendation engines of increasing complexity with every chapter.

Downloading the example code for this book: You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the code files emailed directly to you.

Table of Contents

  1. Building Recommendation Engines
    1. Building Recommendation Engines
    2. Credits
    3. About the Author
    4. About the Reviewers
    5. www.PacktPub.com
      1. Why subscribe?
    6. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Downloading the color images of this book
        3. Errata
        4. Piracy
        5. Questions
    7. 1. Introduction to Recommendation Engines
      1. Recommendation engine definition
      2. Need for recommender systems
      3. Big data driving the recommender systems
      4. Types of recommender systems
        1. Collaborative filtering recommender systems
        2. Content-based recommender systems
        3. Hybrid recommender systems
        4. Context-aware recommender systems
      5. Evolution of recommender systems with technology
        1. Mahout for scalable recommender systems
        2. Apache Spark for scalable real-time recommender systems
          1. Neo4j for real-time graph-based recommender systems
      6. Summary
    8. 2. Build Your First Recommendation Engine
      1. Building our basic recommendation engine
        1. Loading and formatting data
        2. Calculating similarity between users
        3. Predicting the unknown ratings for users
      2. Summary
    9. 3. Recommendation Engines Explained
      1. Evolution of recommendation engines
      2. Nearest neighborhood-based recommendation engines
        1. User-based collaborative filtering
        2. Item-based collaborative filtering
        3. Advantages
        4. Disadvantages
      3. Content-based recommender systems
        1. User profile generation
        2. Advantages
        3. Disadvantages
      4. Context-aware recommender systems
        1. Context definition
        2. Pre-filtering approaches
        3. Post-filtering approaches
        4. Advantages
        5. Disadvantages
      5. Hybrid recommender systems
        1. Weighted method
        2. Mixed method
        3. Cascade method
        4. Feature combination method
        5. Advantages
      6. Model-based recommender systems
        1. Probabilistic approaches
        2. Machine learning approaches
        3. Mathematical approaches
        4. Advantages
      7. Summary
    10. 4. Data Mining Techniques Used in Recommendation Engines
      1. Neighbourhood-based techniques
        1. Euclidean distance
        2. Cosine similarity
        3. Jaccard similarity
        4. Pearson correlation coefficient
      2. Mathematic model techniques
        1. Matrix factorization
        2. Alternating least squares
        3. Singular value decomposition
      3. Machine learning techniques
        1. Linear regression
        2. Classification models
          1. Linear classification
          2. KNN classification
          3. Support vector machines
          4. Decision trees
          5. Ensemble methods
            1. Random forests
            2. Bagging
            3. Boosting
      4. Clustering techniques
        1. K-means clustering
      5. Dimensionality reduction
        1. Principal component analysis
      6. Vector space models
        1. Term frequency
        2. Term frequency inverse document frequency
      7. Evaluation techniques
        1. Cross-validation
        2. Regularization
          1. Root-mean-square error (RMSE)
          2. Mean absolute error (MAE)
          3. Precision and recall
      8. Summary
    11. 5. Building Collaborative Filtering Recommendation Engines
      1. Installing the recommenderlab package in RStudio
      2. Datasets available in the recommenderlab package
        1. Exploring the Jester5K dataset
          1. Description
          2. Usage
          3. Format
          4. Details
      3. Exploring the dataset
        1. Exploring the rating values
      4. Building user-based collaborative filtering with recommenderlab
        1. Preparing training and test data
        2. Creating a user-based collaborative model
        3. Predictions on the test set
        4. Analyzing the dataset
        5. Evaluating the recommendation model using the k-cross validation
        6. Evaluating user-based collaborative filtering
      5. Building an item-based recommender model
        1. Building an IBCF recommender model
        2. Model evaluation
        3. Model accuracy using metrics
        4. Model accuracy using plots
        5. Parameter tuning for IBCF
      6. Collaborative filtering using Python
        1. Installing the required packages
        2. Data source
      7. Data exploration
        1. Rating matrix representation
        2. Creating training and test sets
        3. The steps for building a UBCF
        4. User-based similarity calculation
        5. Predicting the unknown ratings for an active user
      8. User-based collaborative filtering with the k-nearest neighbors
        1. Finding the top-N nearest neighbors
      9. Item-based recommendations
        1. Evaluating the model
        2. The training model for k-nearest neighbors
        3. Evaluating the model
      10. Summary
    12. 6. Building Personalized Recommendation Engines
      1. Personalized recommender systems
      2. Content-based recommender systems
        1. Building a content-based recommendation system
        2. Content-based recommendation using R
          1. Dataset description
        3. Content-based recommendation using Python
          1. Dataset description
          2. User activity
          3. Item profile generation
          4. User profile creation
      3. Context-aware recommender systems
        1. Building a context-aware recommender systems
        2. Context-aware recommendations using R
          1. Defining the context
          2. Creating context profile
          3. Generating context-aware recommendations
      4. Summary
    13. 7. Building Real-Time Recommendation Engines with Spark
      1. About Spark 2.0
        1. Spark architecture
        2. Spark components
        3. Spark Core
          1. Structured data with Spark SQL
          2. Streaming analytics with Spark Streaming
          3. Machine learning with MLlib
          4. Graph computation with GraphX
        4. Benefits of Spark
        5. Setting up Spark
        6. About SparkSession
        7. Resilient Distributed Datasets (RDD)
        8. About ML Pipelines
      2. Collaborative filtering using Alternating Least Square
      3. Model based recommender system using pyspark
      4. MLlib recommendation engine module
      5. The recommendation engine approach
        1. Implementation
          1. Data loading
          2. Data exploration
          3. Building the basic recommendation engine
          4. Making predictions
        2. User-based collaborative filtering
        3. Model evaluation
        4. Model selection and hyperparameter tuning
          1. Cross-Validation
          2. CrossValidator
          3. Train-Validation Split
          4. Setting the ParamMaps/parameters
          5. Setting the evaluator object
      6. Summary
    14. 8. Building Real-Time Recommendations with Neo4j
      1. Discerning different graph databases
        1. Labeled property graph
          1. Understanding GraphDB core concepts
      2. Neo4j
        1. Cypher query language
          1. Cypher query basics
        2. Node syntax
        3. Relationship syntax
        4. Building your first graph
          1. Creating nodes
          2. Creating relationships
          3. Setting properties to relations
          4. Loading data from csv
      3. Neo4j Windows installation
      4. Installing Neo4j on the Linux platform
        1. Downloading Neo4j
        2. Setting up Neo4j
        3. Starting Neo4j from the command line
      5. Building recommendation engines
        1. Loading data into Neo4j
        2. Generating recommendations using Neo4j
        3. Collaborative filtering using the Euclidean distance
        4. Collaborative filtering using Cosine similarity
      6. Summary
    15. 9. Building Scalable Recommendation Engines with Mahout
      1. Mahout - a general introduction
      2. Setting up Mahout
        1. The standalone mode - using Mahout as a library
        2. Setting Mahout for the distributed mode
      3. Core building blocks of Mahout
        1. Components of a user-based collaborative recommendation engine
        2. Building recommendation engines using Mahout
        3. Dataset description
        4. User-based collaborative filtering
      4. Item-based collaborative filtering
      5. Evaluating collaborative filtering
      6. Evaluating user-based recommenders
      7. Evaluating item-based recommenders
      8. SVD recommenders
      9. Distributed recommendations using Mahout
        1. ALS recommendation on Hadoop
      10. The architecture for a scalable system
      11. Summary
    16. 10. What Next - The Future of Recommendation Engines
      1. Future of recommendation engines
      2. Phases of recommendation engines
        1. Phase 1 - general recommendation engines
        2. Phase 2 - personalized recommender systems
        3. Phase 3 - futuristic recommender systems
          1. End of search
          2. Leaving the Web behind
          3. Emerging from the Web
        4. Next best actions
        5. Use cases to look out for
          1. Smart homes
          2. Healthcare recommender systems
          3. News as recommendations
      3. Popular methodologies
        1. Serendipity
      4. Temporal aspects of recommendation engines
        1. A/B testing
        2. Feedback mechanism
      5. Summary