Chapter 3. Exploring the Unknown with Multi-armed Bandits

We'll start this chapter by building a simplistic multi-armed bandit algorithm and measuring its quality. Next, we'll build a much more intelligent algorithm, along with tests that measure the quality improvement we achieve.

Understanding a bandit

A multi-armed bandit problem involves making a choice in the face of complete uncertainty. More specifically, imagine you're placed in front of several slot machines, each with a different but fixed probability of paying out. How could you make as much money as possible?
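The setup above can be sketched in a few lines of code. This is a minimal simulation, not the book's implementation: the arm payout probabilities below are illustrative assumptions, and the player would not know them in advance.

```python
import random

class BanditArm:
    """One slot machine with a fixed (but hidden) payout probability."""
    def __init__(self, payout_probability):
        self.payout_probability = payout_probability

    def pull(self):
        # Returns 1 on a payout, 0 otherwise.
        return 1 if random.random() < self.payout_probability else 0

# Three machines the player cannot tell apart in advance
# (probabilities chosen here purely for illustration).
random.seed(0)
arms = [BanditArm(p) for p in (0.1, 0.3, 0.8)]

# Pulling each arm many times gradually reveals its hidden probability.
estimates = [sum(arm.pull() for _ in range(10_000)) / 10_000 for arm in arms]
print(estimates)
```

Each pull is a Bernoulli trial, so the empirical payout rate converges to the machine's true probability; the interesting question, which this chapter explores, is how to earn money while still learning which machine is best.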

So, this is a metaphor for the problem. It really applies to any situation where you have no information to start with, and where ...
