A random forest classifier is trained to select the most salient features for face classification. The idea is to determine which features are used most often by the ensemble of trees. Using only the most salient features in subsequent steps can increase computation speed while retaining accuracy. The following code snippet shows how to compute the feature importances for the classifier and display the 25 most important Haar-like features:
# For speed, only extract the two first types of features
feature_types = ['type-2-x', 'type-2-y']
# Build a computation graph using dask. This allows using multiple CPUs ...
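The selection step described above can be sketched as follows. This is a minimal illustration, not the original example's code: it assumes scikit-learn's `RandomForestClassifier` and its `feature_importances_` attribute, replaces the Haar-like feature matrix with synthetic data, and uses a hypothetical helper named `top_k_features`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def top_k_features(X, y, k=25, n_estimators=100, random_state=0):
    """Hypothetical helper: rank features by random forest importance.

    Returns the indices of the k most important features (most important
    first) together with the full importance vector.
    """
    clf = RandomForestClassifier(n_estimators=n_estimators,
                                 random_state=random_state)
    clf.fit(X, y)
    # Impurity-based importances, averaged over all trees in the ensemble
    importances = clf.feature_importances_
    order = np.argsort(importances)[::-1]
    return order[:k], importances


# Synthetic stand-in for a matrix of Haar-like feature values (assumption):
# 200 samples, 50 features, of which only columns 3 and 7 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = (X[:, 3] + X[:, 7] > 0).astype(int)

idx, importances = top_k_features(X, y, k=5)

# Keep only the selected columns for the subsequent, cheaper steps
X_reduced = X[:, idx]
```

With real data, `X` would hold one row per image and one column per Haar-like feature, and `idx` would identify which features are worth recomputing in later steps.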