Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine

Title: Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine
Publication Type: Conference Paper
Year of Publication: 2023
Authors: Mor, O., G. Bosilca, and M. Snir
Conference Name: 52nd International Conference on Parallel Processing (ICPP 2023)
Date Published: 2023-09
Publisher: ACM
Conference Location: Salt Lake City, Utah
Keywords: asynchronous many-task, dynamic runtime, lightweight communication, low-rank Cholesky, message-passing, MPI, strong scaling
Abstract

There is a growing interest in Asynchronous Many-Task (AMT) runtimes as an efficient way to map irregular and dynamic parallel applications onto heterogeneous computing resources. In this work, we show that AMTs nonetheless struggle with communication bottlenecks under strong scaling and that the design of commonly used communication libraries such as MPI contributes to these bottlenecks. We replace MPI with LCI, a Lightweight Communication Interface designed for dynamic, asynchronous frameworks, as the communication layer for the PaRSEC runtime. The result is a significant reduction of end-to-end latency in communication microbenchmarks and a reduction of overall time-to-solution by up to 12% in HiCMA, a tile-based low-rank Cholesky factorization package.

URL: http://snir.cs.illinois.edu/listed/icpp2023-69.pdf
DOI: 10.1145/3605573.3605642