Figure 1 shows a person making decisions to arrive at their destination. Suppose, for example, that you always take the same route on your drive from home to work. One day, however, curiosity takes over and you decide to try a different path, hoping for a shorter commute. This dilemma, whether to try new routes or stick with the best-known one, is an example of exploration versus exploitation.
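One common way to balance the two is an epsilon-greedy rule: most days you take the route you currently believe is fastest, and with a small probability you try a random one. The sketch below is only an illustration under assumed values. The function names `choose_route` and `simulate`, the three commute times, the noise level, and `epsilon = 0.1` are all hypothetical and do not come from the text.

```python
import random


def choose_route(estimates, epsilon, rng):
    """Return a route index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Explore: pick any route uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: pick the route with the lowest estimated commute time.
    return min(range(len(estimates)), key=lambda i: estimates[i])


def simulate(true_means, days=500, epsilon=0.1, seed=0):
    """Drive for `days` days and return (estimated times, times each route was used)."""
    rng = random.Random(seed)
    # Starting at 0 minutes is optimistic, so every route looks worth trying once.
    estimates = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    for _ in range(days):
        route = choose_route(estimates, epsilon, rng)
        # Assumed commute model: true mean plus Gaussian noise (std. dev. 2 minutes).
        commute = rng.gauss(true_means[route], 2.0)
        counts[route] += 1
        # Incremental running average of the observed commute times.
        estimates[route] += (commute - estimates[route]) / counts[route]
    return estimates, counts


if __name__ == "__main__":
    # Hypothetical average commute times, in minutes, for three routes.
    estimates, counts = simulate([30.0, 25.0, 40.0])
    print("estimated minutes per route:", [round(e, 1) for e in estimates])
    print("days each route was driven:", counts)
```

Setting `epsilon` to 0 gives a driver who never deliberately tries anything new, while setting it to 1 gives one who picks a route at random every day; values in between trade a few slower commutes for better knowledge of the alternatives.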
RL techniques are used in many areas. One general idea currently being pursued is an algorithm that does not need anything apart from ...