Keywords: Multi-Agent Deep Reinforcement Learning, Game Theory, Curriculum Learning
TL;DR: Multi-Agent RL algorithms don't scale well when increasing the number of agents in a common team reward setting. We propose two approaches to tackle this: a new algorithm and the use of incremental learning.
Abstract: Various Reinforcement Learning (RL) algorithms rely on learning state-action value functions to obtain an optimal policy. This framework is easily extended to the Multi-Agent RL (MARL) setting by considering joint actions, making it the most common approach in that scenario. However, such a setting presents challenges due to the exponential growth of the action space, the need for decentralized policies in real-world applications, and the non-stationarity of the environment during learning. This work studies the performance of different MARL methods as the number of agents scales. We also propose two approaches to tackle the scalability issue: a) a new algorithm, **MFQMIX**, which combines different techniques for Q-value factorization, and b) **Incremental Learning**, i.e., slowly increasing the number of agents in the environment. We show that MFQMIX outperforms all baselines when the methods are trained against each other in a non-stationary setting, but its performance stalls as the number of agents increases. Nonetheless, Incremental Learning successfully improves scalability, allowing agents to learn in environments more than twice as crowded.
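To illustrate the Incremental Learning idea described above, here is a minimal sketch of a curriculum loop that slowly grows the agent count while reusing previously learned parameters. This is an assumed outline, not the authors' actual training code; `make_env`, `MARLTrainer`, and the stage schedule are hypothetical placeholders.

```python
# Hypothetical sketch of incremental learning for MARL:
# train with a small team first, then warm-start training as the
# number of agents grows. `make_env` and `MARLTrainer` are placeholders,
# not the paper's implementation.

def incremental_training(agent_counts=(2, 4, 8, 16), steps_per_stage=100_000):
    trainer = None
    for n_agents in agent_counts:
        env = make_env(n_agents=n_agents)  # common team-reward environment
        if trainer is None:
            trainer = MARLTrainer(env)  # e.g. a value-factorization method such as MFQMIX
        else:
            trainer.reset_env(env, keep_weights=True)  # carry weights to the larger team
        trainer.train(total_steps=steps_per_stage)
    return trainer
```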
Submission Number: 9