Reinforcement Learning Journal, vol. 1, 2024, pp. 366–379.
Presented at the Reinforcement Learning Conference (RLC), Amherst, Massachusetts, August 9–12, 2024.
In parallel with the success of value function factorization methods, numerous recent studies on cooperative Multi-Agent Reinforcement Learning (MARL) have explored Coordination Graphs (CGs) to model the communication requirements among agents. These coordination problems often exhibit structural sparsity, which enables accurate joint value function learning with CGs. Value-based methods require computing an argmax over the exponentially large joint action space, which has led to the adoption of the max-sum algorithm from the distributed constraint optimization (DCOP) literature. However, the performance of max-sum has been empirically observed to deteriorate as the number of agents grows, an effect attributed to the increased cyclicity of the graph. Previous works have tackled this issue by sparsifying the graph according to a measure of edge importance, demonstrating improved performance; we argue, however, that neglecting the graph's topology during sparsification can adversely affect action selection. Consequently, we advocate explicitly accounting for graph cyclicity alongside edge importance. We demonstrate that this approach yields superior performance across a variety of challenging coordination problems.
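To make the action-selection step concrete, below is a minimal sketch (not the paper's implementation) of max-sum message passing on a pairwise coordination graph: each agent holds an individual utility table and each edge a pairwise payoff table, messages are passed for a fixed number of iterations, and each agent then picks its action greedily from its local belief. The function name `max_sum`, the `n_iters` parameter, and the mean-normalization of messages are illustrative assumptions.

```python
import numpy as np

def max_sum(ind_utils, edge_utils, edges, n_actions, n_iters=20):
    """Approximate the joint argmax over a pairwise coordination graph.

    ind_utils:  (n_agents, n_actions) array of individual utilities q_i(a_i)
    edge_utils: dict mapping (i, j) with i < j to an (n_actions, n_actions)
                payoff table q_ij[a_i, a_j]
    edges:      list of (i, j) pairs with i < j
    """
    n_agents = ind_utils.shape[0]
    # Directed messages: msgs[(i, j)] is agent i's message to agent j, over a_j.
    msgs = {(i, j): np.zeros(n_actions) for (i, j) in edges}
    msgs.update({(j, i): np.zeros(n_actions) for (i, j) in edges})
    neighbours = {i: [] for i in range(n_agents)}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            # Orient the payoff table so rows index a_i and columns index a_j.
            q = edge_utils[(i, j)] if (i, j) in edge_utils else edge_utils[(j, i)].T
            incoming = sum(msgs[(k, i)] for k in neighbours[i] if k != j)
            m = (ind_utils[i] + incoming)[:, None] + q  # shape (a_i, a_j)
            m = m.max(axis=0)                           # maximize out a_i
            new_msgs[(i, j)] = m - m.mean()             # normalize for stability
        msgs = new_msgs

    # Each agent acts greedily with respect to its local belief.
    actions = np.empty(n_agents, dtype=int)
    for i in range(n_agents):
        belief = ind_utils[i] + sum(msgs[(k, i)] for k in neighbours[i])
        actions[i] = int(np.argmax(belief))
    return actions
```

On a tree-structured graph this procedure is exact; on cyclic graphs it is only an approximation, which is the degradation the abstract refers to. The paper's cyclicity regularizer is not reproduced here, but as a hypothetical illustration of trading edge importance against cyclicity, one could keep a maximum-spanning-forest backbone (acyclic by construction) and admit only a small budget of high-importance chords, since each chord added to a forest closes exactly one independent cycle. The `chord_budget` parameter and the union-find bookkeeping below are assumptions of this sketch.

```python
def sparsify(edges, importance, n_agents, chord_budget=1):
    """Keep high-importance edges while limiting graph cyclicity (sketch).

    edges:      list of (i, j) pairs
    importance: dict mapping each edge to a scalar importance score
    """
    parent = list(range(n_agents))

    def find(x):
        # Union-find with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept, chords = [], 0
    for (i, j) in sorted(edges, key=lambda e: -importance[e]):
        ri, rj = find(i), find(j)
        if ri != rj:                      # joins two components: no cycle
            parent[ri] = rj
            kept.append((i, j))
        elif chords < chord_budget:       # this edge would close a cycle
            kept.append((i, j))
            chords += 1
    return kept
```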
Oliver Järnefelt, Mahdi Kallel, and Carlo D'Eramo. "Cyclicity-Regularized Coordination Graphs." Reinforcement Learning Journal, vol. 1, 2024, pp. 366–379.
BibTeX:

@article{jarnefelt2024cyclicity,
  title={Cyclicity-Regularized Coordination Graphs},
  author={J{\"{a}}rnefelt, Oliver and Kallel, Mahdi and D'Eramo, Carlo},
  journal={Reinforcement Learning Journal},
  volume={1},
  pages={366--379},
  year={2024}
}