modeling multi-agent reinforcement learning

An adequate theoretical framework for modeling multi-agent learning dynamics has long been lacking. Recently, an evolutionary game theoretic approach based on replicator dynamics has been employed to fill this gap. Replicator dynamics are a tool from evolutionary game theory that models how strategies evolve over time. Exploiting the link between reinforcement learning and evolutionary game theory is beneficial for two main reasons: analyzing the dynamics gives further insight into the learning process, and it helps to determine parameter configurations before learners are actually deployed in the task domain.
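As an illustration of how replicator dynamics model the evolution of strategies, the following is a minimal sketch of the standard two-population replicator equations for a single-state matrix game, integrated with a simple Euler scheme. The function names, the step size, and the Prisoner's Dilemma payoffs are assumptions chosen for the example; they are not taken from the work described here.

```python
import numpy as np

def replicator_step(x, y, A, B, dt=0.01):
    """One Euler step of two-population replicator dynamics.

    x, y: mixed strategies of the row and column player.
    A, B: payoff matrices; the row player receives x^T A y,
          the column player receives x^T B y.
    """
    fx = A @ y      # payoff of each row action against y
    fy = B.T @ x    # payoff of each column action against x
    # An action grows in proportion to its advantage over the average payoff.
    dx = x * (fx - x @ fx)
    dy = y * (fy - y @ fy)
    return x + dt * dx, y + dt * dy

def simulate(x, y, A, B, steps=5000, dt=0.01):
    """Iterate the Euler step and return the final strategy pair."""
    for _ in range(steps):
        x, y = replicator_step(x, y, A, B, dt)
    return x, y

# Illustrative Prisoner's Dilemma (assumed payoffs):
# action 0 = cooperate, action 1 = defect.
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])
B = A.T  # symmetric game

x, y = simulate(np.array([0.5, 0.5]), np.array([0.5, 0.5]), A, B)
print(x, y)
```

Starting from uniform strategies, both populations drift toward the dominant action (defect), which is the kind of trajectory one would inspect to choose learning parameters before deploying actual learners.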

In our work we extended the link between multi-agent reinforcement learning and evolutionary game theory to multi-state games. In particular, we introduced the average reward game and state-coupled replicator dynamics as general concepts for applying single-state dynamics to stochastic games with non-absorbing states. The average reward game aggregates payoff information by averaging the immediate rewards obtained at the individual stages. State-coupled replicator dynamics couple the states directly by incorporating the expected payoffs in all states under the current strategies, weighted by the frequency of state occurrences.
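One way to write the state coupling described above is sketched below. The notation is assumed for illustration and is not necessarily the exact formulation used in the work: \pi_i(s) is the probability of playing action i in state s, x_s(\pi) is the frequency with which state s occurs under the joint strategy \pi, and f_i(s, \pi) is the expected payoff of action i in state s in the average reward game.

```latex
\frac{d\pi_i(s)}{dt}
  = \pi_i(s)\, x_s(\pi)
    \Big[ f_i(s,\pi) - \sum_{j} \pi_j(s)\, f_j(s,\pi) \Big]
```

With a single state, x_s(\pi) = 1 and the expression reduces to the ordinary single-state replicator equation.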