Random forest

Random forest introduces us to a category of learning tasks called ensemble learning. In ensemble learning, we train multiple weak learners over the same or different subsets of the dataset. We then combine their outputs to come up with the final answer. It has been empirically proved that an ensemble of weak learners will perform better than any single weak learner, giving the same performance at worst. Random forest is an ensemble learning algorithm with decision trees as the weak learners. It is a very good choice for datasets with missing data values and data with small 'n' or large 'p' problems. By small 'n', we mean a smaller number of rows as compared to a large number of features or 'p'. We will discuss the major features ...

Get Learning Apache Mahout now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.