Rewards

A reward, denoted by , is usually a scalar quantity that is provided as feedback to the agent to drive its learning. The goal of the agent is to maximize the sum of the reward, and this signal indicates how well the agent is doing at time step . The following examples of reward signals for different tasks may help you get a more intuitive understanding:

  • For the Atari games we discussed before, or any computer games in general, the reward signal can be +1 for every increase in score and -1 for every decrease in score.
  • For stock trading, ...

Get Hands-On Intelligent Agents with OpenAI Gym now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.