Chapter 3. Exploring the Unknown with Multi-armed Bandits

We'll start this chapter by building a simplistic multi-armed bandit algorithm and measuring its quality. Next, we'll build a much more intelligent algorithm, along with tests that measure the quality improvement we achieve.

Understanding a bandit

A multi-armed bandit problem involves making a choice in the face of complete uncertainty. More specifically, imagine you're placed in front of several slot machines, each with a different but fixed probability of paying out. How could you make as much money as possible?
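The setup above can be sketched in a few lines of code. This is a minimal simulation, not the book's implementation: the arm payout probabilities below are illustrative assumptions, and the player would not know them in advance.

```python
import random

class BanditArm:
    """One slot machine with a fixed (but hidden) payout probability."""
    def __init__(self, payout_probability):
        self.payout_probability = payout_probability

    def pull(self):
        # Returns 1 on a payout, 0 otherwise.
        return 1 if random.random() < self.payout_probability else 0

# Three machines the player cannot tell apart in advance
# (probabilities chosen here purely for illustration).
random.seed(0)
arms = [BanditArm(p) for p in (0.1, 0.3, 0.8)]

# Pulling each arm many times gradually reveals its hidden probability.
estimates = [sum(arm.pull() for _ in range(10_000)) / 10_000 for arm in arms]
print(estimates)
```

Each pull is a Bernoulli trial, so the empirical payout rate converges to the machine's true probability; the interesting question, which this chapter explores, is how to earn money while still learning which machine is best.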

So, this is a metaphor for the problem. It really applies to any situation where you have no information to start with, and where ...
