Measuring prediction performance

We have already seen that the machine learning process consists of the following steps:

  • Model selection: We first select a suitable model for our data. Do we have labels? How many samples are available? Is the data separable? How many dimensions do we have? This step is nontrivial, and the right choice depends on the actual problem. As of Fall 2015, the scikit-learn documentation contains a much-appreciated flowchart called "Choosing the right estimator". It is short but very informative, and worth a closer look.
  • Training: We have to bring the model and the data together. In scikit-learn, this usually happens in the model's fit method.
  • Application: Once we have trained our model, we are able to make predictions ...
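
The three steps above can be sketched in a few lines of scikit-learn. This is only an illustration under stated assumptions: the iris dataset, the k-nearest-neighbors classifier, and the 25% hold-out split are choices made for this example, not ones taken from the text. The import paths follow recent scikit-learn releases; in versions from around 2015, train_test_split lived in sklearn.cross_validation instead.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small labeled dataset bundled with scikit-learn (assumed example data).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples so predictions are made on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Model selection: a k-nearest-neighbors classifier (an illustrative choice).
model = KNeighborsClassifier(n_neighbors=5)

# Training: fit brings the model and the data together.
model.fit(X_train, y_train)

# Application: predict labels for the held-out samples.
predictions = model.predict(X_test)

# Fraction of held-out samples that were labeled correctly.
accuracy = (predictions == y_test).mean()
print(f"accuracy on held-out data: {accuracy:.2f}")
```

Comparing the predictions against the held-out labels, as in the last lines, is one simple way to start measuring prediction performance.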
