Feature selection

In general, when working with high-dimensional datasets, it is a good idea to keep only the most useful features and discard the rest. This can lead to simpler models that generalize better. Feature selection is the process of identifying the most significant features and reducing the set of inputs used for processing and analysis. Selecting features in this way helps produce a workable model: it reduces the cardinality of the input by imposing a limit on the number of features considered when the model is built. In the following figure, a general scheme of a feature selection process is shown:

Usually, the data contains redundant information, or more than the necessary ...
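To make the idea of discarding redundant features concrete, here is a minimal sketch of one simple filter-style approach. It is not the book's method. It ranks features by the absolute value of their correlation with the target, then skips any feature that is nearly collinear with one already kept. The function names (`pearson`, `select_features`), the toy column names, and the `redundancy_cutoff` value of 0.95 are all assumptions chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, max_features, redundancy_cutoff=0.95):
    """Greedy filter (illustrative, hypothetical helper).

    Rank features by |correlation with target|, then keep a feature only if
    its |correlation| with every already-kept feature is below the cutoff.
    """
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if len(kept) == max_features:
            break
        is_redundant = any(
            abs(pearson(columns[name], columns[k])) >= redundancy_cutoff
            for k in kept
        )
        if not is_redundant:
            kept.append(name)
    return kept


# Toy data (made up): "size_copy" is an exact multiple of "size",
# so at most one of the two should survive the filter.
columns = {
    "size": [1.0, 2.0, 3.0, 4.0, 5.0],
    "size_copy": [2.0, 4.0, 6.0, 8.0, 10.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1]

selected = select_features(columns, target, max_features=2)
print(selected)
```

The cap `max_features` plays the role of the limit on the number of features described above. A correlation filter only detects linear relationships, so it is best read as a starting point rather than a complete selection procedure.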
