A regression problem

Given some descriptors of a song, the goal of this problem is to predict the year in which the song was produced. This is a regression problem, since the target variable is a number in the range between 1922 and 2011.

For each song, in addition to the year of production, 90 attributes are provided. All of them relate to timbre: 12 describe the timbre average and 78 describe the timbre covariance. All the features are numerical (integers or floating-point numbers).
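To make the feature layout concrete, here is a minimal sketch of how one record could be split into its three parts. The column order (year first, then the 12 averages, then the 78 covariances), the `split_row` helper, and the synthetic record are assumptions for illustration and are not taken from the text. The check on 78 simply notes that a symmetric 12 x 12 covariance matrix has 12 * 13 / 2 = 78 distinct entries, which is consistent with the attribute counts above.

```python
import random

N_AVG = 12  # timbre-average features (from the text)
N_COV = 78  # timbre-covariance features (from the text)
N_FEATURES = N_AVG + N_COV  # 90 attributes in total

# A symmetric 12 x 12 covariance matrix has 12 * 13 / 2 = 78 distinct entries,
# which is consistent with the 78 covariance attributes.
assert N_AVG * (N_AVG + 1) // 2 == N_COV


def split_row(row):
    """Split one record into (year, timbre averages, timbre covariances).

    Assumes the hypothetical layout [year, 12 averages, 78 covariances].
    """
    if len(row) != 1 + N_FEATURES:
        raise ValueError(f"expected {1 + N_FEATURES} values, got {len(row)}")
    year = int(row[0])
    timbre_avg = row[1:1 + N_AVG]
    timbre_cov = row[1 + N_AVG:]
    return year, timbre_avg, timbre_cov


# Synthetic record for illustration only; these are not real song descriptors.
rng = random.Random(0)
fake_row = [rng.randint(1922, 2011)] + [rng.gauss(0, 1) for _ in range(N_FEATURES)]

year, timbre_avg, timbre_cov = split_row(fake_row)
print(len(timbre_avg), len(timbre_cov))
```

If the real file uses a different column order, only the slice boundaries in `split_row` would need to change.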

The dataset contains more than half a million observations. In the competition behind the dataset, the authors tried to achieve the best results using the first 463,715 observations as a training set and the remaining ...
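A split by position, as described above, can be sketched in a few lines. This is only an illustration: the `ordered_split` helper and the toy list are hypothetical, and only the training-set size of 463,715 comes from the text. The rows are deliberately not shuffled, because the split is defined by the order of the observations.

```python
TRAIN_SIZE = 463_715  # size of the training set stated in the text


def ordered_split(rows, n_train=TRAIN_SIZE):
    """Split by position without shuffling.

    The first n_train rows form the training set; the rest form the test set.
    """
    if n_train < 0:
        raise ValueError("n_train must be non-negative")
    return rows[:n_train], rows[n_train:]


# Toy stand-in for the real observations, with a small n_train for illustration.
toy_rows = list(range(10))
train, test = ordered_split(toy_rows, n_train=7)
print(len(train), len(test))  # prints: 7 3
```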
