Skill Matters: Dynamic Skill Learning for Multi-Agent Cooperative Reinforcement Learning
Neural Networks (2025)
Abstract
As intelligent systems become more widespread, cooperation between intelligent machines is increasingly necessary, which has broadened research on cooperative multi-agent reinforcement learning (MARL). Existing approaches typically address this challenge through task decomposition of the environment or role classification of agents. However, these methods may rely on parameter sharing between agents, which makes agent behavior homogeneous and is ineffective for complex tasks; alternatively, they rely on training with external rewards, which is difficult to adapt to sparse-reward scenarios. To address these challenges, we propose a novel dynamic skill learning (DSL) framework in which agents learn more diverse abilities motivated by intrinsic rewards. Specifically, DSL has two components: (i) Dynamic skill discovery, which encourages the production of meaningful skills by exploring the environment in an unsupervised manner, using the inner product between a skill vector and a trajectory representation to generate intrinsic rewards. A Lipschitz constraint on the state representation function is used to ensure the proper trajectory of the learned skills. (ii) Dynamic skill assignment, which uses a policy controller to assign skills to each agent based on its trajectory latent variables. In addition, to avoid the training instability caused by frequent changes in skill selection, we introduce a regularization term that limits skill switching between adjacent time steps. We thoroughly evaluate DSL on two challenging benchmarks, StarCraft II and Google Research Football. Experimental results show that, compared with strong baselines such as QMIX and RODE, DSL effectively improves performance and adapts better to difficult cooperative scenarios.
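The two quantities named in the abstract, an inner-product intrinsic reward and a penalty on skill switching between adjacent time steps, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `weight` parameter, the use of a count of skill changes as the regularizer, and the choice of the embedding difference between consecutive states as the trajectory representation are all assumptions made for illustration.

```python
import numpy as np


def intrinsic_reward(traj_repr, skill):
    """Inner product between a trajectory representation and a skill vector.

    Assumption: both arguments are 1-D vectors of the same length.
    """
    return float(np.dot(traj_repr, skill))


def switch_penalty(skill_ids, weight=0.1):
    """Hypothetical regularizer: weight times the number of skill changes
    between adjacent time steps for one agent."""
    skill_ids = np.asarray(skill_ids)
    changes = np.count_nonzero(skill_ids[1:] != skill_ids[:-1])
    return weight * changes


# Assumed trajectory representation: difference of two state embeddings.
phi_s = np.array([0.0, 1.0])
phi_s_next = np.array([1.0, 3.0])
skill = np.array([0.5, -1.0])

reward = intrinsic_reward(phi_s_next - phi_s, skill)
penalty = switch_penalty([0, 0, 1, 1, 2], weight=0.1)
```

In this sketch the reward is larger when the trajectory moves along the direction of the skill vector in representation space, and the penalty grows with each change of skill along a per-agent skill sequence.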
Keywords
Multi-agent reinforcement learning, Diverse behaviors, Skill discovery, Skill assignment