
Open Access
A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU

Kentaro KAWAKAMI, Kouji KURIHARA, Masafumi YAMAZAKI, Takumi HONDA, Naoto FUKUMOTO


Summary:

To accelerate deep learning (DL) processing on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture, whereas the A64FX is based on the Armv8-A architecture. oneDNN dynamically generates the executable code of its computation kernels, which are implemented at the granularity of x86_64 instructions using Xbyak, the Just-In-Time (JIT) assembler for the x86_64 architecture. To port oneDNN to the A64FX, these kernels must be rewritten in Armv8-A instructions using Xbyak_aarch64, the JIT assembler for the Armv8-A architecture. This is challenging because the code to be rewritten spans several tens of thousands of lines. This study presents Xbyak_translator_aarch64, a binary translator that converts, at runtime, the dynamically generated x86_64 executable code into executable code for the Armv8-A architecture. Xbyak_translator_aarch64 eliminates the need to rewrite the kernel source code and allows us to port oneDNN to the A64FX quickly.
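The following is a minimal sketch of the JIT-generation style the summary describes; it is not taken from oneDNN, and the AddOneKernel class and its trivial body are hypothetical, chosen only to show that the kernel body is emitted as individual x86_64 instructions at runtime via Xbyak and then executed through a function pointer. Porting such a kernel by hand means rewriting each emitted instruction with Xbyak_aarch64, which is exactly the work Xbyak_translator_aarch64 is designed to avoid.

```cpp
// Minimal Xbyak JIT example (illustrative only, not an oneDNN kernel).
#include <xbyak/xbyak.h>
#include <cstdio>

// Hypothetical kernel: returns its 32-bit integer argument plus one.
struct AddOneKernel : Xbyak::CodeGenerator {
    AddOneKernel() {
        mov(eax, edi);  // System V AMD64 ABI: first argument in edi, return value in eax
        add(eax, 1);    // the kernel body is written instruction by instruction
        ret();
    }
};

int main() {
    AddOneKernel jit;
    // Obtain a callable pointer to the code generated at runtime.
    auto add_one = jit.getCode<int (*)(int)>();
    std::printf("%d\n", add_one(41));  // prints 42
    return 0;
}
```

Because the x86_64 instructions only exist in the JIT buffer at runtime, a translator operating on that generated binary code can retarget it to Armv8-A without touching the C++ source that emitted it.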

Publication
IEICE TRANSACTIONS on Electronics Vol.E105-C No.6 pp.222-231
Publication Date
2022/06/01
Publicized
2021/12/03
Online ISSN
1745-1353
DOI
10.1587/transele.2021LHP0001
Type of Manuscript
Special Section PAPER (Special Section on Low-Power and High-Speed Chips)

Authors

Kentaro KAWAKAMI
  Fujitsu Limited
Kouji KURIHARA
  Fujitsu Limited
Masafumi YAMAZAKI
  Fujitsu Limited
Takumi HONDA
  Fujitsu Limited
Naoto FUKUMOTO
  Fujitsu Limited

Keyword