
This includes algorithms for the most common machine learning tasks, such as classification, regression, clustering, dimensionality reduction, model selection, and preprocessing.

Scikit-learn comes with several real-world data sets for us to practice with. Let's take a look at one of these—the Iris data set:

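from sklearn import datasets

# Load the Iris data set bundled with scikit-learn
iris = datasets.load_iris()
iris_X = iris.data    # feature matrix, one row per sample
iris_y = iris.target  # integer class label for each sample
print(iris_X.shape)
(150, 4)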
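
The labels can be inspected in the same way as the feature matrix. The following is a minimal sketch rather than part of the original listing; it assumes NumPy is installed and uses numpy.unique to count how many samples carry each label:

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
iris_y = iris.target

# One integer label per sample
print(iris_y.shape)

# Distinct class labels and the number of samples carrying each one
labels, counts = np.unique(iris_y, return_counts=True)
print(labels, counts)

# Human-readable class names, indexed by label
print(iris.target_names)
```

The labels are stored as integers, and target_names maps each integer back to the name of the iris type.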

The data set contains 150 samples of three types of irises (Setosa, Versicolor, and Virginica), each with four features. We can get a description of the data set:
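
A minimal sketch of one way to do this, assuming the DESCR and feature_names attributes of the object returned by load_iris():

```python
from sklearn import datasets

iris = datasets.load_iris()

# DESCR holds a plain-text description of the data set
print(iris.DESCR)

# feature_names lists the four measured attributes in column order
print(iris.feature_names)
```

Printing feature_names alongside the description is a quick way to confirm which column of the feature matrix holds which measurement.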


We can see that the four attributes, or features, are sepal length, sepal width, petal length, and petal width, all measured in centimeters. Each sample ...
