The paper presents a novel approach to estimate the performance of MPI collective communications. Our objective is to help researchers to make appropriate decisions on their message-passing applications. For each collective communication, we attempt to apply LogGP and P-LogP standard point-to-point models. The resulted models are compared with the empirical data in order to identify the most suitable for performance characterization of collective operations. For the communications on large clusters with large size messages, the network contention problem can significantly affect the performance. Hence, to reduce the relative gap between the prediction and the measured runtime, the contention issue is also modeled, by a queuing theory analysis method, and taken in account with the total performance estimation. The experiments performed on a cluster which consists of 64 processors interconnected by Gigabit Ethernet network show encouraging results. For any collective operation, given a number of processors and a range of message sizes, there is at least one model that predicts the performance precisely. We could achieve a gap between the predicted and the measured run-time around 15%. Thus, by handling the contention problem, we could reduce around 80% of the relative gap.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Hyacinthe NZIGOU MAMADOU, Takeshi NANRI, Kazuaki MURAKAMI, "Performance Models for MPI Collective Communications with Network Contention" in IEICE TRANSACTIONS on Communications,
vol. E91-B, no. 4, pp. 1015-1024, April 2008, doi: 10.1093/ietcom/e91-b.4.1015.
Abstract: The paper presents a novel approach to estimate the performance of MPI collective communications. Our objective is to help researchers to make appropriate decisions on their message-passing applications. For each collective communication, we attempt to apply LogGP and P-LogP standard point-to-point models. The resulted models are compared with the empirical data in order to identify the most suitable for performance characterization of collective operations. For the communications on large clusters with large size messages, the network contention problem can significantly affect the performance. Hence, to reduce the relative gap between the prediction and the measured runtime, the contention issue is also modeled, by a queuing theory analysis method, and taken in account with the total performance estimation. The experiments performed on a cluster which consists of 64 processors interconnected by Gigabit Ethernet network show encouraging results. For any collective operation, given a number of processors and a range of message sizes, there is at least one model that predicts the performance precisely. We could achieve a gap between the predicted and the measured run-time around 15%. Thus, by handling the contention problem, we could reduce around 80% of the relative gap.
URL: https://global.ieice.org/en_transactions/communications/10.1093/ietcom/e91-b.4.1015/_p
Copy
@ARTICLE{e91-b_4_1015,
author={Hyacinthe NZIGOU MAMADOU, Takeshi NANRI, Kazuaki MURAKAMI, },
journal={IEICE TRANSACTIONS on Communications},
title={Performance Models for MPI Collective Communications with Network Contention},
year={2008},
volume={E91-B},
number={4},
pages={1015-1024},
abstract={The paper presents a novel approach to estimate the performance of MPI collective communications. Our objective is to help researchers to make appropriate decisions on their message-passing applications. For each collective communication, we attempt to apply LogGP and P-LogP standard point-to-point models. The resulted models are compared with the empirical data in order to identify the most suitable for performance characterization of collective operations. For the communications on large clusters with large size messages, the network contention problem can significantly affect the performance. Hence, to reduce the relative gap between the prediction and the measured runtime, the contention issue is also modeled, by a queuing theory analysis method, and taken in account with the total performance estimation. The experiments performed on a cluster which consists of 64 processors interconnected by Gigabit Ethernet network show encouraging results. For any collective operation, given a number of processors and a range of message sizes, there is at least one model that predicts the performance precisely. We could achieve a gap between the predicted and the measured run-time around 15%. Thus, by handling the contention problem, we could reduce around 80% of the relative gap.},
keywords={},
doi={10.1093/ietcom/e91-b.4.1015},
ISSN={1745-1345},
month={April},}
Copy
TY - JOUR
TI - Performance Models for MPI Collective Communications with Network Contention
T2 - IEICE TRANSACTIONS on Communications
SP - 1015
EP - 1024
AU - Hyacinthe NZIGOU MAMADOU
AU - Takeshi NANRI
AU - Kazuaki MURAKAMI
PY - 2008
DO - 10.1093/ietcom/e91-b.4.1015
JO - IEICE TRANSACTIONS on Communications
SN - 1745-1345
VL - E91-B
IS - 4
JA - IEICE TRANSACTIONS on Communications
Y1 - April 2008
AB - The paper presents a novel approach to estimate the performance of MPI collective communications. Our objective is to help researchers to make appropriate decisions on their message-passing applications. For each collective communication, we attempt to apply LogGP and P-LogP standard point-to-point models. The resulted models are compared with the empirical data in order to identify the most suitable for performance characterization of collective operations. For the communications on large clusters with large size messages, the network contention problem can significantly affect the performance. Hence, to reduce the relative gap between the prediction and the measured runtime, the contention issue is also modeled, by a queuing theory analysis method, and taken in account with the total performance estimation. The experiments performed on a cluster which consists of 64 processors interconnected by Gigabit Ethernet network show encouraging results. For any collective operation, given a number of processors and a range of message sizes, there is at least one model that predicts the performance precisely. We could achieve a gap between the predicted and the measured run-time around 15%. Thus, by handling the contention problem, we could reduce around 80% of the relative gap.
ER -