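In the Q-learning algorithm, the game world is treated as a state machine: at any moment the learner is in one state, and each action it takes moves it to another. Four parameters control how it learns, and it helps to keep their meanings in mind: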
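- alpha: The learning rate. It controls how much each new experience changes the stored Q-value; at 0 nothing is learned, and at 1 the old value is replaced outright.
- gamma: The discount rate. It controls how much the value of the state an action leads to counts toward the value of that action.
- rho: The randomness of exploration. It sets how often the learner tries a random action instead of the best one it currently knows.
- nu: The length of the walk. It controls how many actions are carried out in a connected sequence before the learner starts again from a different state.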
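
To make the role of each parameter concrete, here is a minimal Python sketch of a tabular Q-learning loop. It is an illustration under stated assumptions rather than a definitive implementation: `nu` is interpreted here as the probability of abandoning the current walk and restarting from a random state, `rho` as the probability of picking a random action, and the default values are arbitrary. The five-state corridor in `chain_step` and the function names are hypothetical, invented only for this example.

```python
import random

def q_learning(n_states, n_actions, step, iterations,
               alpha=0.3, gamma=0.75, rho=0.2, nu=0.1, seed=0):
    """Tabular Q-learning over a state machine described by `step`.

    `step(state, action)` must return a `(reward, new_state)` pair.
    Parameter defaults are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = rng.randrange(n_states)

    for _ in range(iterations):
        # nu (assumed meaning): chance of ending the current walk and
        # restarting from a random state. Lower nu means longer walks.
        if rng.random() < nu:
            state = rng.randrange(n_states)

        # rho: chance of exploring with a random action instead of
        # exploiting the best action known so far.
        if rng.random() < rho:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])

        reward, new_state = step(state, action)

        # alpha blends the old estimate with the new one; gamma scales
        # how much the best value of the next state contributes.
        best_next = max(q[new_state])
        q[state][action] = ((1 - alpha) * q[state][action]
                            + alpha * (reward + gamma * best_next))

        state = new_state

    return q


GOAL = 4  # hypothetical corridor: states 0..4, with the goal at the right end

def chain_step(state, action):
    """Action 0 moves left, action 1 moves right; entering or staying in GOAL pays 1."""
    move = 1 if action == 1 else -1
    new_state = min(max(state + move, 0), GOAL)
    reward = 1.0 if new_state == GOAL else 0.0
    return reward, new_state


q_values = q_learning(n_states=5, n_actions=2, step=chain_step, iterations=20000)
policy = [max(range(2), key=lambda a: q_values[s][a]) for s in range(5)]
print(policy)  # best known action for each state
```

Setting `alpha` to 0 in this sketch leaves every Q-value at its initial 0, while raising `rho` or `nu` trades exploitation of what has been learned for broader coverage of actions and states respectively.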