Methods for fraud detection

In the previous section, we described our business use case and prepared our Spark computing platform and our datasets. In this section, we select the analytical methods, or predictive models (equations), for this fraud detection project; that is, we map our business use case to machine learning methods.

For fraud detection, both supervised and unsupervised machine learning are commonly used. For this case, however, we will use supervised machine learning, because we have good data for our target variable, fraud, and because our practical goal is to reduce fraud while business transactions continue.
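To make the idea of supervised learning concrete, here is a minimal sketch in plain Python rather than Spark. It fits a logistic regression by gradient descent on a handful of labeled transactions. The two features (transaction amount and recent transaction count), the toy values, and the function names are all assumptions made up for illustration; they are not the dataset or the models used in this project.

```python
import math

# Hypothetical toy data, invented for illustration only.
# Each row: ((amount in $1000s, transactions in the past hour), label),
# where label 1 means the transaction was confirmed as fraud.
rows = [
    ((0.2, 1.0), 0),
    ((0.5, 1.0), 0),
    ((0.3, 2.0), 0),
    ((0.8, 1.0), 0),
    ((4.0, 6.0), 1),
    ((3.5, 8.0), 1),
    ((5.0, 5.0), 1),
    ((4.5, 7.0), 1),
]


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that a transaction is fraud."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(rows, learning_rate=0.1, epochs=5000):
    """Fit logistic regression with batch gradient descent on log loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in rows:
            # The known label is what makes this supervised learning.
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


weights, bias = train(rows)

# Score two new, unlabeled (hypothetical) transactions.
print(predict_proba(weights, bias, (0.4, 1.0)))
print(predict_proba(weights, bias, (4.2, 6.0)))
```

The known fraud label drives every weight update, which is why good data for the target variable matters. An unsupervised method would have to find unusual transactions without that label.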

To model and predict frauds, there are many suitable ...
