Feature selection

Feature selection ranks variables, or features, by importance: we train a predictive model on them and then determine which variables were most relevant to that model. Each model often has its own set of important features, so for classification we use a random forest model here to estimate which variables are likely to matter in general for classification-based predictions.
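
The ranking step described above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used in this chapter: it assumes the randomForest package is installed, and it uses the built-in iris data as a stand-in dataset, with Species as the class label. The names rf_model, imp, and ranked are hypothetical.

```r
# Assumption: the randomForest package is installed
# (install.packages("randomForest") if it is not).
library(randomForest)

set.seed(42)  # make the bootstrap samples reproducible

# Assumption: iris is only a stand-in dataset; Species is the class to predict
data(iris)

# Fit a random forest classifier; importance = TRUE also records the
# permutation-based importance (mean decrease in accuracy)
rf_model <- randomForest(Species ~ ., data = iris,
                         ntree = 500, importance = TRUE)

# importance() returns a matrix with one row per feature
imp <- importance(rf_model)

# Rank the features by mean decrease in accuracy, most important first
ranked <- imp[order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE), ]
print(ranked)
```

Calling varImpPlot(rf_model) draws the same importance measures as a dot chart. The ranking depends on the random seed and the data, so it is best read as a guide rather than a fixed ordering.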

We perform feature selection for several reasons, which include:

  • Removing redundant or irrelevant features without too much information loss
  • Preventing the overfitting that results from using too many features
  • Reducing the model variance contributed by excess features ...
