To accelerate deep learning (DL) processing on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture, whereas the A64FX CPU is based on the Armv8-A architecture. oneDNN dynamically generates the executable code for its computation kernels, which are implemented at the granularity of x86_64 instructions using Xbyak, a Just-In-Time (JIT) assembler for the x86_64 architecture. Porting oneDNN to A64FX therefore requires rewriting these kernels with Armv8-A instructions using Xbyak_aarch64, a JIT assembler for the Armv8-A architecture. This is challenging because the code to be rewritten exceeds several tens of thousands of lines. This study presents Xbyak_translator_aarch64, a binary translator that converts dynamically generated x86_64 executable code into Armv8-A executable code at runtime. Xbyak_translator_aarch64 eliminates the need to rewrite the source code and allows oneDNN to be ported to A64FX quickly.
Kentaro KAWAKAMI
Fujitsu Limited
Kouji KURIHARA
Fujitsu Limited
Masafumi YAMAZAKI
Fujitsu Limited
Takumi HONDA
Fujitsu Limited
Naoto FUKUMOTO
Fujitsu Limited
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kentaro KAWAKAMI, Kouji KURIHARA, Masafumi YAMAZAKI, Takumi HONDA, Naoto FUKUMOTO, "A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU," in IEICE TRANSACTIONS on Electronics, vol. E105-C, no. 6, pp. 222-231, June 2022, doi: 10.1587/transele.2021LHP0001.
URL: https://global.ieice.org/en_transactions/electronics/10.1587/transele.2021LHP0001/_p
@ARTICLE{e105-c_6_222,
author={Kentaro KAWAKAMI and Kouji KURIHARA and Masafumi YAMAZAKI and Takumi HONDA and Naoto FUKUMOTO},
journal={IEICE TRANSACTIONS on Electronics},
title={A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU},
year={2022},
volume={E105-C},
number={6},
pages={222--231},
doi={10.1587/transele.2021LHP0001},
ISSN={1745-1353},
month={June}
}
TY - JOUR
TI - A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU
T2 - IEICE TRANSACTIONS on Electronics
SP - 222
EP - 231
AU - Kentaro KAWAKAMI
AU - Kouji KURIHARA
AU - Masafumi YAMAZAKI
AU - Takumi HONDA
AU - Naoto FUKUMOTO
PY - 2022
DO - 10.1587/transele.2021LHP0001
JO - IEICE TRANSACTIONS on Electronics
SN - 1745-1353
VL - E105-C
IS - 6
JA - IEICE TRANSACTIONS on Electronics
Y1 - June 2022
ER -