Guide Automatic Vectorization by means of Machine Learning: A Case Study of Tensor Contraction Kernels

Antoine TROUVÉ; Arnaldo J. CRUZ; Kazuaki J. MURAKAMI; Masaki ARAI; Tadashi NAKAHIRA; Eiji YAMANAKA

doi:10.1587/transinf.2015EDP7440

IEICE TRANSACTIONS on Information

Summary:

Modern optimizing compilers tend to be conservative and often fail to vectorize programs that would have benefited from it. In this paper, we propose a way to predict the relevant command-line options of the compiler so that it chooses the most profitable vectorization strategy. Machine learning has proven to be a relevant approach for this matter: fed with features that describe the software to the compiler, a machine learning device is trained to predict an appropriate optimization strategy. The related work relies on the control and data flow graphs as software features. In this article, we consider tensor contraction programs, useful in various scientific simulations, especially chemistry. Depending on how they access the memory, different tensor contraction kernels may yield very different performance figures. However, they exhibit identical control and data flow graphs, making them completely out of reach of the related work. In this paper, we propose an original set of software features that capture the important properties of the tensor contraction kernels. Considering the Intel Merom processor architecture with the Intel Compiler, we model the problem as a classification problem and we solve it using a support vector machine. Our technique predicts the best suited vectorization options of the compiler with a cross-validation accuracy of 93.4%, leading to up to a 3-times speedup compared to the default behavior of the Intel Compiler. This article ends with an original qualitative discussion on the performance of software metrics by means of visualization. All our measurements are made available for the sake of reproducibility.
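
As a rough illustration of the claim that tensor contraction kernels can share the same control flow yet access memory differently, the following sketch shows two loop nests that compute the same contraction. The only difference is whether the first operand is stored transposed. The helper `innermost_stride` is a hypothetical feature invented for this example; it is not one of the features proposed in the paper, and the function names are assumptions.

```python
def contract_row_major(a, b):
    """C[i][j] = sum_k A[i][k] * B[k][j], with A stored as n x p."""
    n, p, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for k in range(p):
                c[i][j] += a[i][k] * b[k][j]
    return c


def contract_transposed(a_t, b):
    """Same contraction and same loop nest, but A is stored transposed (p x n)."""
    n, p, m = len(a_t[0]), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for k in range(p):
                c[i][j] += a_t[k][i] * b[k][j]
    return c


def innermost_stride(shape, subscripts, loop_var):
    """Hypothetical feature: row-major stride (in elements) of one array
    access when loop_var advances by one iteration."""
    stride = 1
    result = 0
    # Walk dimensions from fastest-varying (last) to slowest (first).
    for extent, name in zip(reversed(shape), reversed(subscripts)):
        if name == loop_var:
            result += stride
        stride *= extent
    return result


a = [[1, 2], [3, 4], [5, 6]]      # 3 x 2
a_t = [[1, 3, 5], [2, 4, 6]]      # 2 x 3, transpose of a
b = [[1, 0, 2], [0, 1, 3]]        # 2 x 3

same_result = contract_row_major(a, b) == contract_transposed(a_t, b)
stride_plain = innermost_stride((3, 2), ("i", "k"), "k")
stride_transposed = innermost_stride((2, 3), ("k", "i"), "k")
```

Both kernels have the same three nested loops, so a feature set built only on control and data flow graphs cannot tell them apart, whereas the stride of the innermost access to the first operand differs between them (unit stride in one case, a stride of one full row in the other).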

Publication: IEICE TRANSACTIONS on Information Vol.E99-D No.6 pp.1585-1594

Publication Date: 2016/06/01

Publicized: 2016/03/22

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2015EDP7440

Type of Manuscript: PAPER

Category: Artificial Intelligence, Data Mining

Authors

Antoine TROUVÉ
  Engineering Department of Kyushu University
Arnaldo J. CRUZ
  Engineering Department of Kyushu University
Kazuaki J. MURAKAMI
  Engineering Department of Kyushu University
Masaki ARAI
  Fujitsu Laboratories Limited
Tadashi NAKAHIRA
  Fujitsu Laboratories Limited
Eiji YAMANAKA
  Fujitsu Limited

Keyword

automatic vectorization, machine learning, software optimization

Cite this

Antoine TROUVÉ, Arnaldo J. CRUZ, Kazuaki J. MURAKAMI, Masaki ARAI, Tadashi NAKAHIRA, Eiji YAMANAKA, "Guide Automatic Vectorization by means of Machine Learning: A Case Study of Tensor Contraction Kernels" in IEICE TRANSACTIONS on Information, vol. E99-D, no. 6, pp. 1585-1594, June 2016, doi: 10.1587/transinf.2015EDP7440.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015EDP7440/_p

@ARTICLE{e99-d_6_1585,
author={Antoine TROUVÉ and Arnaldo J. CRUZ and Kazuaki J. MURAKAMI and Masaki ARAI and Tadashi NAKAHIRA and Eiji YAMANAKA},
journal={IEICE TRANSACTIONS on Information},
title={Guide Automatic Vectorization by means of Machine Learning: A Case Study of Tensor Contraction Kernels},
year={2016},
volume={E99-D},
number={6},
pages={1585-1594},
abstract={Modern optimizing compilers tend to be conservative and often fail to vectorize programs that would have benefited from it. In this paper, we propose a way to predict the relevant command-line options of the compiler so that it chooses the most profitable vectorization strategy. Machine learning has proven to be a relevant approach for this matter: fed with features that describe the software to the compiler, a machine learning device is trained to predict an appropriate optimization strategy. The related work relies on the control and data flow graphs as software features. In this article, we consider tensor contraction programs, useful in various scientific simulations, especially chemistry. Depending on how they access the memory, different tensor contraction kernels may yield very different performance figures. However, they exhibit identical control and data flow graphs, making them completely out of reach of the related work. In this paper, we propose an original set of software features that capture the important properties of the tensor contraction kernels. Considering the Intel Merom processor architecture with the Intel Compiler, we model the problem as a classification problem and we solve it using a support vector machine. Our technique predicts the best suited vectorization options of the compiler with a cross-validation accuracy of 93.4%, leading to up to a 3-times speedup compared to the default behavior of the Intel Compiler. This article ends with an original qualitative discussion on the performance of software metrics by means of visualization. All our measurements are made available for the sake of reproducibility.},
keywords={automatic vectorization, machine learning, software optimization},
doi={10.1587/transinf.2015EDP7440},
ISSN={1745-1361},
month={June},
}

TY - JOUR
TI - Guide Automatic Vectorization by means of Machine Learning: A Case Study of Tensor Contraction Kernels
T2 - IEICE TRANSACTIONS on Information
SP - 1585
EP - 1594
AU - Antoine TROUVÉ
AU - Arnaldo J. CRUZ
AU - Kazuaki J. MURAKAMI
AU - Masaki ARAI
AU - Tadashi NAKAHIRA
AU - Eiji YAMANAKA
PY - 2016
DO - 10.1587/transinf.2015EDP7440
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E99-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - 2016/06/01
AB - Modern optimizing compilers tend to be conservative and often fail to vectorize programs that would have benefited from it. In this paper, we propose a way to predict the relevant command-line options of the compiler so that it chooses the most profitable vectorization strategy. Machine learning has proven to be a relevant approach for this matter: fed with features that describe the software to the compiler, a machine learning device is trained to predict an appropriate optimization strategy. The related work relies on the control and data flow graphs as software features. In this article, we consider tensor contraction programs, useful in various scientific simulations, especially chemistry. Depending on how they access the memory, different tensor contraction kernels may yield very different performance figures. However, they exhibit identical control and data flow graphs, making them completely out of reach of the related work. In this paper, we propose an original set of software features that capture the important properties of the tensor contraction kernels. Considering the Intel Merom processor architecture with the Intel Compiler, we model the problem as a classification problem and we solve it using a support vector machine. Our technique predicts the best suited vectorization options of the compiler with a cross-validation accuracy of 93.4%, leading to up to a 3-times speedup compared to the default behavior of the Intel Compiler. This article ends with an original qualitative discussion on the performance of software metrics by means of visualization. All our measurements are made available for the sake of reproducibility.
ER -
