Fitting and evaluating the model

Now we will split the data into training and testing sets, train the regressor, and evaluate its predictions:

# In[1]:
from sklearn.linear_model import LinearRegression
import pandas as pd
import matplotlib.pylab as plt
from sklearn.model_selection import train_test_split

df = pd.read_csv('./winequality-red.csv', sep=';')
X = df[list(df.columns)[:-1]]
y = df['quality']
X_train, X_test, y_train, y_test = train_test_split(X, y)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_predictions = regressor.predict(X_test)
print('R-squared: %s' % regressor.score(X_test, y_test))

# Out[1]:
R-squared: 0.398550890379

First, we loaded the data using pandas, and separated the response variable from the explanatory variables. ...
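The score printed by the listing is the coefficient of determination, which LinearRegression.score computes as one minus the ratio of the residual sum of squares to the total sum of squares on the data it is given. As a rough illustration of that calculation, here is a minimal sketch in plain Python. The helper name r_squared and the sample values are assumptions made up for this example; they are not taken from the wine data set or from the book.

```python
# Hypothetical helper (not part of scikit-learn or the book's listing):
# computes R-squared as 1 - SS_res / SS_tot.
def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up quality scores and predictions, for illustration only.
y_true = [5, 6, 5, 7, 6]
y_pred = [5.2, 5.8, 5.4, 6.3, 5.9]
print('R-squared: %s' % r_squared(y_true, y_pred))
```

A model that always predicts the mean of the test responses scores 0, and a perfect model scores 1. Because the listing calls train_test_split without a fixed random_state, the split differs between runs, so the printed R-squared will usually not match the value shown in Out[1] exactly.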
