ForestColl: Efficient Collective Communications on Heterogeneous Network Fabrics.

CoRR (2024)

Abstract
As modern DNN models grow ever larger, collective communications between accelerators (allreduce, etc.) emerge as a significant performance bottleneck. Designing efficient communication schedules is challenging, given today's highly diverse and heterogeneous network fabrics. In this paper, we present ForestColl, a tool that generates performant schedules for any network topology. ForestColl constructs broadcast/aggregation spanning trees as the communication schedule, achieving theoretically optimal throughput. Its schedule generation runs in strongly polynomial time and is highly scalable. ForestColl supports any network fabric, including both switching fabrics and direct connections. We evaluated ForestColl on multi-box AMD MI250 and NVIDIA DGX A100 platforms. ForestColl's schedules delivered up to 130% higher performance compared to the vendors' own optimized communication libraries, RCCL and NCCL, and achieved a 20% speedup in LLM training. ForestColl also outperforms other state-of-the-art schedule generation techniques, with both up to 61% more efficient generated schedules and orders of magnitude faster schedule generation speed.
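To make the idea of a tree-based schedule concrete, the following is a minimal sketch of an allreduce over a single spanning tree: values are aggregated from the leaves up to a root, and the total is then broadcast back down. This is not ForestColl's algorithm. The abstract does not describe how its trees are constructed, and a single breadth-first tree ignores link bandwidth and makes no throughput claim. The function names `bfs_spanning_tree` and `tree_allreduce`, the adjacency-list representation, and the 4-node ring are all assumptions made for illustration.

```python
from collections import deque


def bfs_spanning_tree(adjacency, root):
    """Return {node: parent} for a breadth-first spanning tree rooted at `root`.

    Illustrative only: a plain BFS tree, not a throughput-optimal construction.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


def tree_allreduce(adjacency, values, root):
    """Sum `values` across all nodes using one aggregation tree and one broadcast tree."""
    parent = bfs_spanning_tree(adjacency, root)
    # Nodes were inserted in BFS order (nondecreasing depth), so walking the
    # order in reverse visits every child before its parent.
    order = list(parent)

    # Aggregation phase: each node forwards its partial sum to its parent.
    partial = dict(values)
    for node in reversed(order):
        if parent[node] is not None:
            partial[parent[node]] += partial[node]

    # Broadcast phase: the root's total is sent back down the same tree.
    total = partial[root]
    return {node: total for node in order}


# Hypothetical topology: four accelerators connected in a ring.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
values = {0: 1, 1: 2, 2: 3, 3: 4}
result = tree_allreduce(ring, values, root=0)
print(result)
```

After the call, every node holds the same sum of all inputs. A single tree like this leaves many links idle, which is one reason a schedule generator would use several trees at once rather than one.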