Methods of attrition prediction

In the previous section, we described our use case of predicting student attrition and prepared our Spark computing platform. In this section, we map that use case to machine learning methods; that is, we select the analytical methods or predictive models (equations) for this attrition prediction project.

To model and predict student attrition, logistic regression and decision trees are among the most suitable models, as both yield good results. Some researchers use neural network and SVM models, but their results are no better than those of logistic regression. Therefore, for this exercise, we will focus our efforts on logistic regression and decision trees, as well as random forest ...
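To make the "equation" behind logistic regression concrete, here is a minimal sketch in plain Python of how such a model turns a student's features into an attrition probability. The feature names (`gpa`, `credits_failed`), the coefficients, and the helper `attrition_probability` are all hypothetical, chosen only for illustration; they are not fitted values or code from this project.

```python
import math

def attrition_probability(features, weights, bias):
    """Logistic regression score: P(attrition) = 1 / (1 + exp(-(bias + w . x)))."""
    # Linear combination of the feature values and their coefficients.
    score = bias + sum(weights[name] * value for name, value in features.items())
    # The logistic (sigmoid) function maps the score into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical coefficients, for illustration only (not fitted to any data).
weights = {"gpa": -1.2, "credits_failed": 0.8}
bias = 0.5

# A hypothetical student record.
student = {"gpa": 3.0, "credits_failed": 1.0}
p = attrition_probability(student, weights, bias)
print(f"Estimated attrition probability: {p:.3f}")
```

In a real project the coefficients would be estimated from historical student data rather than set by hand. A decision tree replaces the single equation with a sequence of threshold tests on the same kinds of features.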
