Principles of Data Science by Sinan Ozdemir

K folds cross-validation

K folds cross-validation is a much better estimator of our model's performance than a single train-test split. Here's how it works (a code sketch of these steps follows the list):

  1. We will split our data into k equal slices, where k is usually 3, 5, or 10.
  2. For the first "fold" of the cross-validation, we will treat k-1 of the sections as our training set and the remaining section as our test set.
  3. For each of the remaining folds, a different arrangement of k-1 sections becomes the training set and a different section becomes the test set.
  4. We compute our chosen evaluation metric (for example, accuracy) for each fold of the cross-validation.
  5. We average our scores at the end.
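
Here is that procedure in code. This is a minimal sketch using scikit-learn's KFold; the iris dataset and the k-nearest neighbors classifier are assumptions chosen purely for illustration:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import KFold
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)  # illustrative data (an assumption)

    kf = KFold(n_splits=5, shuffle=True, random_state=42)  # step 1: k = 5 slices
    scores = []
    for train_idx, test_idx in kf.split(X):
        # steps 2 and 3: k-1 slices train the model; the held-out slice tests it
        model = KNeighborsClassifier(n_neighbors=5)
        model.fit(X[train_idx], y[train_idx])
        # step 4: compute a metric (accuracy here) for this fold
        scores.append(model.score(X[test_idx], y[test_idx]))

    print(np.mean(scores))  # step 5: average the per-fold scores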

Cross-validation effectively performs multiple train-test splits on the same dataset. ...
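
In practice, scikit-learn can collapse the entire loop above into a single call. A minimal sketch, reusing the same illustrative data and classifier as before:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    # cv=5 performs the five train-test splits described above and
    # returns one accuracy score per fold
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
    print(scores, scores.mean())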
