Modern optimizing compilers tend to be conservative and often fail to vectorize programs that would benefit from it. In this paper, we propose a way to predict the compiler command-line options that lead it to choose the most profitable vectorization strategy. Machine learning has proven to be a suitable approach to this problem: given features that describe the software, a machine learning model is trained to predict an appropriate optimization strategy. Related work relies on control and data flow graphs as software features. Here we consider tensor contraction programs, which are useful in various scientific simulations, especially in chemistry. Depending on how they access memory, different tensor contraction kernels may perform very differently. However, they exhibit identical control and data flow graphs, which puts them completely out of reach of the related work. We therefore propose an original set of software features that capture the important properties of tensor contraction kernels. Targeting the Intel Merom processor architecture with the Intel Compiler, we model the task as a classification problem and solve it using a support vector machine. Our technique predicts the best-suited vectorization options of the compiler with a cross-validation accuracy of 93.4%, leading to a speedup of up to 3 times compared to the default behavior of the Intel Compiler. The article ends with an original qualitative discussion of the performance of software metrics by means of visualization. All our measurements are made available for the sake of reproducibility.
Antoine TROUVÉ
Engineering Department of Kyushu University
Arnaldo J. CRUZ
Engineering Department of Kyushu University
Kazuaki J. MURAKAMI
Engineering Department of Kyushu University
Masaki ARAI
Fujitsu Laboratories Limited
Tadashi NAKAHIRA
Fujitsu Laboratories Limited
Eiji YAMANAKA
Fujitsu Limited
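The abstract's observation that tensor contraction kernels can share the same control and data flow graphs while accessing memory very differently can be illustrated with a small sketch. The code below is not the feature set proposed in the paper; it is a hypothetical example, assuming row-major storage, made-up tensor layouts, and illustrative helper names (`row_major_strides`, `innermost_strides`). It computes the stride each tensor sees when the innermost loop index advances, which is one kind of property a graph-based feature would miss.

```python
# Illustrative sketch only: NOT the feature set from the paper.
# All names, layouts, and extents here are assumptions for demonstration.

def row_major_strides(index_order, extents):
    """Stride (in elements) of each index of a tensor stored in row-major order."""
    strides = {}
    step = 1
    for name in reversed(index_order):
        strides[name] = step
        step *= extents[name]
    return strides


def innermost_strides(tensors, innermost, extents):
    """Stride seen by each tensor when the innermost loop index advances by one.

    A stride of 0 means the tensor does not depend on that index,
    so the access is invariant in the innermost loop.
    """
    return {
        label: row_major_strides(order, extents).get(innermost, 0)
        for label, order in tensors.items()
    }


extents = {"i": 8, "j": 8, "k": 8}

# Kernel A: C[i][j] += A[i][k] * B[k][j], with j as the innermost loop.
kernel_a = {"C": ("i", "j"), "A": ("i", "k"), "B": ("k", "j")}

# Kernel B: C[i][j] += A[k][i] * B[j][k], same loop nest, different layouts.
kernel_b = {"C": ("i", "j"), "A": ("k", "i"), "B": ("j", "k")}

features_a = innermost_strides(kernel_a, "j", extents)
features_b = innermost_strides(kernel_b, "j", extents)

print(features_a)  # B is read contiguously along j
print(features_b)  # B is read with a stride of 8 elements along j
```

Both kernels have the same three-deep loop nest and the same multiply-accumulate body, yet the second one reads `B` with a non-unit stride in the innermost loop. Numeric descriptors of this kind could, in principle, be fed to a classifier, though which descriptors the authors actually use is not stated in the abstract.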
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Antoine TROUVÉ, Arnaldo J. CRUZ, Kazuaki J. MURAKAMI, Masaki ARAI, Tadashi NAKAHIRA, Eiji YAMANAKA, "Guide Automatic Vectorization by means of Machine Learning: A Case Study of Tensor Contraction Kernels" in IEICE TRANSACTIONS on Information, vol. E99-D, no. 6, pp. 1585-1594, June 2016, doi: 10.1587/transinf.2015EDP7440.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015EDP7440/_p
@ARTICLE{e99-d_6_1585,
author={Antoine TROUVÉ and Arnaldo J. CRUZ and Kazuaki J. MURAKAMI and Masaki ARAI and Tadashi NAKAHIRA and Eiji YAMANAKA},
journal={IEICE TRANSACTIONS on Information},
title={Guide Automatic Vectorization by means of Machine Learning: A Case Study of Tensor Contraction Kernels},
year={2016},
volume={E99-D},
number={6},
pages={1585-1594},
keywords={},
doi={10.1587/transinf.2015EDP7440},
ISSN={1745-1361},
month={June},}
TY - JOUR
TI - Guide Automatic Vectorization by means of Machine Learning: A Case Study of Tensor Contraction Kernels
T2 - IEICE TRANSACTIONS on Information
SP - 1585
EP - 1594
AU - Antoine TROUVÉ
AU - Arnaldo J. CRUZ
AU - Kazuaki J. MURAKAMI
AU - Masaki ARAI
AU - Tadashi NAKAHIRA
AU - Eiji YAMANAKA
PY - 2016
DO - 10.1587/transinf.2015EDP7440
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E99-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2016
ER -