Keywords: Multi-Agent Deep Reinforcement Learning, Game Theory, Curriculum Learning
TL;DR: Multi-Agent RL algorithms don't scale well when increasing the number of agents in a common team reward setting. We propose two approaches to tackle this: a new algorithm and the use of incremental learning.
Abstract: Various Reinforcement Learning (RL) algorithms rely on learning state-action value functions to obtain an optimal policy. This framework is easily extended to the Multi-Agent RL (MARL) setting by considering joint actions, making it the most common approach in that scenario. However, such a setting presents challenges due to the exponential growth of the action space, the need for decentralized policies in real-world applications, and the non-stationarity of the environment during learning. This work studies the performance of different MARL methods as the number of agents scales. We also propose two approaches to tackle the scalability issue: a) a new algorithm, **MFQMIX**, which combines different techniques for Q-value factorization, and b) **Incremental Learning**, i.e., slowly increasing the number of agents in the environment. We show that MFQMIX outperforms all baselines when the methods are trained against each other in a non-stationary setting, but its performance stalls as the number of agents increases. Nonetheless, Incremental Learning successfully improves scalability, allowing agents to learn in environments more than twice as crowded.
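To illustrate the Incremental Learning idea described above, here is a minimal sketch of a curriculum loop that slowly grows the agent count while reusing previously learned parameters. This is an assumed outline, not the authors' actual training code; `make_env`, `MARLTrainer`, and the stage schedule are hypothetical placeholders.

```python
# Hypothetical sketch of incremental learning for MARL:
# train with a small team first, then warm-start training as the
# number of agents grows. `make_env` and `MARLTrainer` are placeholders,
# not the paper's implementation.

def incremental_training(agent_counts=(2, 4, 8, 16), steps_per_stage=100_000):
    trainer = None
    for n_agents in agent_counts:
        env = make_env(n_agents=n_agents)  # common team-reward environment
        if trainer is None:
            trainer = MARLTrainer(env)  # e.g. a value-factorization method such as MFQMIX
        else:
            trainer.reset_env(env, keep_weights=True)  # carry weights to the larger team
        trainer.train(total_steps=steps_per_stage)
    return trainer
```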
Submission Number: 9