O'Reilly logo

Learning Predictive Analytics with Python by Ashish Kumar

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Random sampling – splitting a dataset in training and testing datasets

Splitting the dataset in training and testing the datasets is one operation every predictive modeller has to perform before applying the model, irrespective of the kind of data in hand or the predictive model being applied. Generally, a dataset is split into training and testing datasets. The following is a description of the two types of datasets:

  • The training dataset is the one on which the model is built. This is the one on which the calculations are performed and the model equations and parameters are created.
  • The testing dataset is used to check the accuracy of the model. The model equations and parameters are used to calculate the output based on the inputs from the testing ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required