To understand Markov decision processes (MDPs), let us first define two environment types:
- A deterministic environment: In a deterministic environment, an action taken in a particular state fully determines the outcome. For example, in chess, out of all the possible moves at the beginning of the game, when we move a pawn from e2 to e4, the resulting board position is certain and does not differ across games. The reward is likewise certain in a deterministic environment, along with the next state.
- A stochastic environment: In a stochastic environment, there is always a level of randomness and uncertainty in terms of the next state of the ...
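
The contrast between the two environment types can be sketched in a few lines of Python. This is only an illustrative toy, not a standard benchmark: the five-state line world, the `slip_prob` parameter, and the function names `deterministic_step` and `stochastic_step` are all assumptions made for this example. In the deterministic version, the same state and action always produce the same next state and reward; in the stochastic version, the chosen action is sometimes replaced by its opposite, so the next state is uncertain.

```python
import random

# Hypothetical toy world: states 0..4 on a line, two actions.
ACTIONS = {"left": -1, "right": +1}
N_STATES = 5


def deterministic_step(state, action):
    """Same (state, action) always yields the same next state and reward."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    # Assumed reward scheme: 1.0 for landing on the rightmost state, else 0.0.
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def stochastic_step(state, action, rng, slip_prob=0.2):
    """With probability slip_prob, the opposite action is executed instead."""
    if rng.random() < slip_prob:
        action = "left" if action == "right" else "right"
    return deterministic_step(state, action)


rng = random.Random(0)  # seeded so the run is reproducible

# Collect the distinct outcomes of taking "right" from state 2 many times.
det_outcomes = {deterministic_step(2, "right") for _ in range(1000)}
sto_outcomes = {stochastic_step(2, "right", rng) for _ in range(1000)}

print("deterministic outcomes:", det_outcomes)
print("stochastic outcomes:", sto_outcomes)
```

The deterministic set contains a single (next state, reward) pair, while the stochastic set contains more than one, which is exactly the uncertainty that an MDP models with transition probabilities.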