Bandit Algorithms for Website Optimization

By John Myles White. Published by O'Reilly Media, Inc.

Chapter 7. Bandits in the Real World: Complexity and Complications

So far we’ve just given you a taste of how bandit algorithms work by showing you three standard algorithms that you can try deploying on your website. In this chapter, we want to give you some tips about ways in which the real world is likely to be more complex than the simplified examples we’ve used to test the algorithms we’ve implemented. We think that bandit algorithms are a very powerful set of tools for building better websites. But they are not a panacea. Bandit algorithms can break if the world you’re exploring is more complicated than the world we’ve used in testing algorithms. Let’s list some questions you should ask yourself when deploying bandit algorithms in the wild:

  • How sure are you that you won’t subtly corrupt your deployment code?
  • How many different tests are you planning to run simultaneously? Will these tests interfere with each other? Will starting a new test while another one is already running corrupt its results?
  • How long do you plan to run your tests?
  • How many users are you willing to expose to non-preferred versions of your site?
  • How well-chosen is your metric of success?
  • How are the arms you’re measuring related to one another?
  • What additional information about context do you have when choosing arms? Do you have demographics based on browser information? Does your site have access to external information about people’s tastes in products you might advertise to them?
  • How much traffic ...
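
The first question in the list, whether the deployment code is subtly corrupted, can be probed offline before any real traffic is involved. One common check is an A/A test: run the algorithm against two arms that are identical by construction and confirm that neither is systematically favored. What follows is a minimal sketch in Python under stated assumptions, not the book's own code: the `EpsilonGreedy` class, the `aa_test` function, and every parameter value (200 simulations, 500 pulls, a 10% reward rate) are hypothetical and chosen only for illustration.

```python
import random


class EpsilonGreedy:
    """Minimal epsilon-greedy learner (illustrative sketch, hypothetical names)."""

    def __init__(self, epsilon, n_arms, rng):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = rng

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        best = max(self.values)
        # Break ties at random so that arm 0 is not silently favored.
        return self.rng.choice([i for i, v in enumerate(self.values) if v == best])

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        # Incremental running mean of the rewards seen for this arm.
        self.values[arm] += (reward - self.values[arm]) / n


def aa_test(n_sims=200, horizon=500, p=0.1, epsilon=0.1, seed=0):
    """Return the fraction of simulations in which arm 0 is pulled most often."""
    rng = random.Random(seed)
    arm0_wins = 0
    for _ in range(n_sims):
        algo = EpsilonGreedy(epsilon, 2, rng)
        for _ in range(horizon):
            arm = algo.select_arm()
            # Both arms pay out with the same probability p (the "A/A" part).
            reward = 1.0 if rng.random() < p else 0.0
            algo.update(arm, reward)
        if algo.counts[0] > algo.counts[1]:
            arm0_wins += 1
    return arm0_wins / n_sims


if __name__ == "__main__":
    print("Fraction of runs where arm 0 was pulled most:", aa_test())
```

Because the two arms are identical, the returned fraction should sit near 0.5. A value far from that would suggest a bias somewhere in the selection or update logic, for example a tie-breaking rule that always returns the first arm.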
