Applying our classifier

Now we put the pedal to the metal. Can we classify gender by height, weight, and BMI? We will get our data from the Kaggle competition at https://www.kaggle.com/c/pf2012-diabetes/data.

We'll be using the SyncPatient and SyncTranscript data. You can look up the details of these datasets in the associated data dictionary. The examples that follow assume that the data files are placed in a directory named data, and that they have been renamed from SyncPatient.csv and SyncTranscript.csv to training_SyncPatient.csv and training_SyncTranscript.csv, respectively.
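Before building anything on top of these files, it can help to confirm that they load and that their numeric columns look plausible. The following is a minimal sketch of such a check, not the harness developed in this chapter. The helper names read_rows and summarize_column are hypothetical, and the Height and Weight column names in the in-memory sample are assumptions made for illustration; consult the data dictionary for the real column names.

```python
import csv
import io
import math
import os

# Directory and file names follow the layout described in the text.
DATA_DIR = "data"
TRANSCRIPT_FILE = os.path.join(DATA_DIR, "training_SyncTranscript.csv")


def read_rows(csv_file):
    """Read an open CSV file that has a header row into a list of dicts."""
    return list(csv.DictReader(csv_file))


def summarize_column(rows, column):
    """Summarize the finite numeric values found in one column.

    Values that are missing or cannot be parsed as numbers are skipped.
    Returns None when the column holds no usable values.
    """
    values = []
    for row in rows:
        try:
            value = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue
        if math.isfinite(value):
            values.append(value)
    if not values:
        return None
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


if __name__ == "__main__":
    if os.path.exists(TRANSCRIPT_FILE):
        with open(TRANSCRIPT_FILE, newline="") as csv_file:
            rows = read_rows(csv_file)
    else:
        # Fall back to a tiny in-memory sample; these column names are assumed.
        rows = read_rows(io.StringIO("Height,Weight\n70,150\nNULL,160\n66,\n"))
    print(len(rows), "rows")
    for column in (rows[0].keys() if rows else []):
        print(column, summarize_column(rows, column))
```

Running the script prints one summary per column, which makes it easy to spot columns that are mostly empty or that contain implausible minimum or maximum values.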

Our first step will be to create a harness that will let us explore our data to make sure that it seems reasonable. Before we do this, we should create a new method on our ...
