Model selection

What are we to make of all this? We have the confusion matrices from our models to guide us, but we can be more rigorous when selecting among classification models. An effective tool for comparing classifiers is the Receiver Operating Characteristic (ROC) chart. Put simply, ROC is a technique for visualizing, organizing, and selecting classifiers based on their performance (Fawcett, 2006). On the ROC chart, the y-axis is the True Positive Rate (TPR) and the x-axis is the False Positive Rate (FPR). Both rates are simple to calculate:

TPR = Positives correctly classified / total positives
FPR = Negatives incorrectly classified / total negatives
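
As a minimal sketch of these two calculations, the following base R snippet computes the TPR and FPR at a few probability cutoffs. The labels, the predicted probabilities, the cutoffs, and the helper name roc_point() are all hypothetical, invented purely for illustration; they do not come from the models discussed in this chapter.

```r
# Hypothetical data, invented purely for illustration:
# actual class labels (1 = positive) and a model's predicted probabilities
actual <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
probs  <- c(0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05)

# roc_point() is a hypothetical helper, not part of any package.
# It classifies an observation as positive when its probability
# meets the cutoff, then applies the two formulas above.
roc_point <- function(probs, actual, cutoff) {
  predicted <- as.integer(probs >= cutoff)
  tp <- sum(predicted == 1 & actual == 1)  # positives correctly classified
  fp <- sum(predicted == 1 & actual == 0)  # negatives incorrectly classified
  c(cutoff = cutoff,
    TPR = tp / sum(actual == 1),           # divide by total positives
    FPR = fp / sum(actual == 0))           # divide by total negatives
}

# One (FPR, TPR) pair per cutoff; each pair is one point on an ROC chart
cutoffs <- c(0.25, 0.5, 0.75)
roc_points <- t(sapply(cutoffs, function(k) roc_point(probs, actual, k)))
print(roc_points)
```

Each row of roc_points is a single point on the ROC chart. Lowering the cutoff moves the point up and to the right, because more observations of both classes are classified as positive.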

Plotting the ROC results ...

R: Unleash Machine Learning Techniques by Raghav Bali, Dipanjan Sarkar, Brett Lantz, Cory Lesmeister
