In This Chapter
Understanding how a decision tree works
Using Random Forest and other bagging techniques
Taking advantage of the best-performing ensembles through boosting
In this chapter, you go beyond the single machine-learning models you've seen until now and explore the power of ensembles, groups of models that can outperform single models. Ensembles work like the collective intelligence of crowds, using pooled information to make better predictions. The basic idea is that a group of individually weak algorithms can produce better results than a single well-trained model.
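You can see why pooling weak models helps with a minimal majority-vote simulation. The numbers here are assumptions for illustration: each weak model is right only 60 percent of the time, and the models' errors are independent (a condition real ensembles only approximate):

```python
import random

random.seed(0)
N_VOTERS = 51     # odd number of weak models (assumed for the sketch)
P_CORRECT = 0.6   # each weak model is right only 60% of the time
N_TRIALS = 10_000

# Majority vote: the ensemble is right whenever more than half
# of the independent weak models happen to be right.
ensemble_correct = 0
for _ in range(N_TRIALS):
    votes = sum(random.random() < P_CORRECT for _ in range(N_VOTERS))
    if votes > N_VOTERS // 2:
        ensemble_correct += 1

ensemble_accuracy = ensemble_correct / N_TRIALS
print(f"single model: {P_CORRECT:.0%}, ensemble: {ensemble_accuracy:.0%}")
```

With 51 such voters, the majority vote is right far more often than any single voter, which is the statistical intuition behind bagging and voting ensembles. When the models' errors are correlated, the gain shrinks, which is why ensemble methods work hard to keep their members diverse.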
Maybe you’ve participated in one of those games at parties or fairs that ask you to guess the number of sweets in a jar. Even though a single person has a slim chance of guessing the right number, various experiments have confirmed that if you take the wrong answers of a large number of participants and average them, you can get surprisingly close to the right answer! Such incredible shared group knowledge (the wisdom of crowds) is possible because wrong answers tend to distribute around the true one. By taking a mean or median of these wrong answers, you get the direction of ...
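A quick simulation makes the sweets-in-a-jar effect concrete. The true count and the spread of the guesses are made-up numbers; the only real assumption is that individual errors are large but unbiased, so they scatter around the true value:

```python
import random
import statistics

random.seed(42)
TRUE_COUNT = 500  # actual number of sweets in the jar (hypothetical)

# Each guess is noisy: individual errors are large but unbiased.
guesses = [TRUE_COUNT + random.gauss(0, 100) for _ in range(1000)]

crowd_estimate = statistics.mean(guesses)
crowd_error = abs(crowd_estimate - TRUE_COUNT)
typical_individual_error = statistics.mean(abs(g - TRUE_COUNT) for g in guesses)

print(f"crowd error: {crowd_error:.1f}, "
      f"typical individual error: {typical_individual_error:.1f}")
```

Averaging 1,000 guesses shrinks the typical error by roughly the square root of the number of guesses, so the crowd's estimate lands much closer to the truth than almost any individual guess, exactly the effect that averaging ensembles exploit.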