We will implement the following steps in the same Bandit class we created before.
- Define the required member variables:
    float initialRegret = 10f;
    float[] regret;
    float[] chance;
    RPSAction lastOpponentAction;
    RPSAction[] lastActionRM;
- Define the member function for initialization:
    public void InitRegretMatching()
    {
        if (init)
            return;
        // next steps
    }
- Declare the local loop counter and initialize the member variables:
    numActions = System.Enum.GetNames(typeof(RPSAction)).Length;
    regret = new float[numActions];
    chance = new float[numActions];
    int i;
    for (i = 0; i < numActions; i++)
    {
        regret[i] = initialRegret;
        chance[i] = 0f;
    }
    init = true;
- Define the member function for computing the next action to be taken:
    public RPSAction GetNextActionRM()
    {
        // next ...