Synchronizing MPI Processes in Space and Time

Title: Synchronizing MPI Processes in Space and Time
Publication Type: Conference Proceedings
Year of Publication: 2023
Authors: Schuchart, J., S. Hunold, and G. Bosilca
Conference Name: EUROMPI '23: 30th European MPI Users' Group Meeting
Date Published: 2023-09
Publisher: ACM
Conference Location: Bristol, United Kingdom
ISBN Number: 9798400709135
Abstract

Performance benchmarks are an integral part of the development and evaluation of parallel algorithms, both in distributed applications and in MPI implementations themselves. The first step of a benchmark is to obtain a common timestamp that marks the start of an operation across all involved processes, and the state of the art in many applications and widely used MPI benchmark suites is to use MPI barriers for this purpose. In this paper, we show that the synchronization in space provided by MPI_Barrier is insufficient for obtaining reliable benchmark results for parallel distributed algorithms, using MPI collective operations as examples. The resulting lack of a global start timestamp for an operation leads to skewed results that depend significantly on the barrier algorithm used. To mitigate these issues, we propose and discuss the implementation of MPIX_Harmonize, which extends the synchronization in space provided by MPI_Barrier with a synchronization in time to guarantee a common starting timestamp across all involved processes. By replacing MPI_Barrier with MPIX_Harmonize, benchmark implementors can eliminate skews resulting from barrier algorithms and achieve stable benchmark results. We show that proper time synchronization can have a significant impact on the benchmark results for various implementations of MPI_Allreduce, MPI_Reduce, and MPI_Bcast.
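The effect described in the abstract can be illustrated with a small toy model. The sketch below is not the paper's implementation and does not call MPI. It assumes a perfectly synchronized global clock, a hypothetical barrier whose release reaches rank r after r network hops, and an idealized allreduce-like operation that finishes on every rank a fixed time after the last rank has entered it. All function names and numbers are illustrative assumptions; times are in microseconds.

```python
def measured_durations(start_times, op_cost):
    """Per-rank duration of an idealized allreduce-like collective.

    Assumption: the operation completes on every rank `op_cost`
    microseconds after the last rank has entered it.
    """
    finish = max(start_times) + op_cost
    return [finish - start for start in start_times]


def barrier_exit_times(num_ranks, hop_latency):
    """Hypothetical barrier: the release reaches rank r after r hops."""
    return [rank * hop_latency for rank in range(num_ranks)]


def harmonized_start_times(num_ranks, hop_latency, slack):
    """Every rank waits for one agreed timestamp after the slowest exit."""
    common_start = (num_ranks - 1) * hop_latency + slack
    return [common_start] * num_ranks


NUM_RANKS = 8     # illustrative values, not taken from the paper
HOP_LATENCY = 2   # microseconds per hop of the barrier release
OP_COST = 50      # microseconds for the collective itself
SLACK = 10        # extra wait so that every rank reaches the start time

after_barrier = measured_durations(
    barrier_exit_times(NUM_RANKS, HOP_LATENCY), OP_COST)
after_harmonize = measured_durations(
    harmonized_start_times(NUM_RANKS, HOP_LATENCY, SLACK), OP_COST)

print("barrier start:   ", after_barrier)
print("harmonized start:", after_harmonize)
```

In this model, the ranks that leave the barrier early also measure the time they spend waiting for the last rank, so the reported durations depend on the barrier's exit pattern. With a common start timestamp, every rank measures only the cost of the operation.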

URL: https://dl.acm.org/doi/proceedings/10.1145/3615318
DOI: 10.1145/3615318.3615325