Reinforcement learning is a wonderful research area of machine learning, with roots in behavioral psychology. An agent takes actions in an environment and tries to maximize some notion of cumulative reward; that is, it learns optimal behavior through trial-and-error interactions with a dynamic environment.
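To make "cumulative reward" concrete, here is a minimal base R sketch. It assumes a discounted sum of rewards and a tabular Q-learning update as the trial-and-error rule; the text above names neither. The reward sequence, the state and action names, and the values of alpha and gamma are all hypothetical, chosen only for illustration.

```r
# Hypothetical rewards an agent collects over four steps:
# a small penalty per move, then a payoff at the goal.
rewards <- c(-1, -1, -1, 10)
gamma <- 0.9  # discount factor (assumed value)

# Discounted cumulative reward: sum over t of gamma^t * r_t, with t starting at 0.
discounted_return <- function(rewards, gamma) {
  sum(gamma^(seq_along(rewards) - 1) * rewards)
}

G <- discounted_return(rewards, gamma)

# One tabular Q-learning update (an assumed learning rule, for illustration):
# move Q[s, a] a fraction alpha toward r + gamma * max over a' of Q[s_next, a'].
q_update <- function(Q, s, a, r, s_next, alpha = 0.1, gamma = 0.9) {
  target <- r + gamma * max(Q[s_next, ])
  Q[s, a] <- Q[s, a] + alpha * (target - Q[s, a])
  Q
}

# Toy Q-table with two states and two actions, initialized to zero.
Q <- matrix(0, nrow = 2, ncol = 2,
            dimnames = list(c("s1", "s2"), c("left", "right")))

# The agent tries "right" in s1, receives a reward of 10, and lands in s2.
Q <- q_update(Q, s = "s1", a = "right", r = 10, s_next = "s2")
print(G)
print(Q)
```

Repeating such updates over many observed (state, action, reward, next state) tuples is how an estimate of long-run reward can be built up from experience alone.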
Let's use an R package called ReinforcementLearning. First, let's look at the dataset, shown here:
> library("ReinforcementLearning")
> set.seed(123)
> data <- sampleGridSequence(1000)
> dim(data)
[1] 1000    4
> head(data)
  State Action Reward NextState
1    s2   left     -1        s2
2    s4  right     -1        s4
3    s2   down     -1        s2
4    s4     up     -1        s4
5    s4     up     -1        s4
6    s1   left     -1        s1
> unique(data$State)
...