IEICE TRANSACTIONS on Electronics

Open Access
Weight Compression MAC Accelerator for Effective Inference of Deep Learning

Asuka MAKI, Daisuke MIYASHITA, Shinichi SASAKI, Kengo NAKATA, Fumihiko TACHIBANA, Tomoya SUZUKI, Jun DEGUCHI, Ryuichi FUJIMOTO


Summary:

Many studies of deep neural networks have reported inference accelerators with improved energy efficiency. We propose methods for further improving energy efficiency while maintaining recognition accuracy, developed through the co-design of a filter-by-filter quantization scheme with variable bit precision and a hardware architecture that fully supports it. Filter-wise quantization reduces the average bit precision of the weights, so inference execution time and energy consumption fall in proportion to the product of the total number of computations and the average bit precision. Hardware utilization is also improved by a bit-parallel architecture suited to such fine-grained variation in weight bit precision. We implement the proposed architecture on an FPGA and demonstrate that execution cycles are reduced to 1/5.3 for ResNet-50 on ImageNet in comparison with a conventional method, while maintaining recognition accuracy.
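The filter-wise variable-precision idea can be illustrated with a short NumPy sketch. This is a minimal illustration under assumed simplifications, not the paper's exact scheme: the shared layer-wide step size, the per-filter bit-width rule, and the function name `quantize_filterwise` are all our assumptions for exposition. The layer shares one quantization step (set by its largest weight at `max_bits` precision); each filter then needs only enough bits to cover its own dynamic range, so small-magnitude filters receive narrow codes and the average bit precision drops below the uniform worst case.

```python
import numpy as np

def quantize_filterwise(weights, max_bits=8):
    """Quantize a conv layer with a shared step size but per-filter bit widths.

    weights : ndarray of shape (num_filters, ...), one slice per filter.
    Returns the dequantized weights and the per-filter bit widths; the
    mean of the bit widths is the "average bit precision" that cycle
    count and energy scale with.
    """
    # Shared quantization step, set by the layer's largest weight magnitude.
    step = max(float(np.max(np.abs(weights))), 1e-12) / (2 ** (max_bits - 1) - 1)
    q_int = np.round(weights / step)              # shared-step integer codes
    bits_per_filter = []
    for i in range(weights.shape[0]):
        m = int(np.max(np.abs(q_int[i])))         # largest code magnitude in filter i
        # Bits needed to hold signed codes in [-m, m] (at least 1 bit).
        bits = max(1, int(np.ceil(np.log2(m + 1))) + 1)
        bits_per_filter.append(bits)
    return q_int * step, bits_per_filter

# Example: 64 filters of shape 3x3x16 with widely varying dynamic ranges,
# so many filters fit in far fewer bits than the uniform max_bits.
rng = np.random.default_rng(0)
w = rng.normal(scale=rng.uniform(0.01, 1.0, size=(64, 1, 1, 1)),
               size=(64, 3, 3, 16))
_, bits = quantize_filterwise(w)
print("average bit precision: %.2f (uniform would be 8)" % np.mean(bits))
```

Under this simplified model, the ratio of `max_bits` to the resulting average bit precision gives a rough estimate of the cycle-count reduction over uniform-precision execution, the quantity the paper's FPGA measurements report.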

Publication
IEICE TRANSACTIONS on Electronics Vol.E103-C No.10 pp.514-523
Publication Date
2020/10/01
Publicized
2020/05/15
Online ISSN
1745-1353
DOI
10.1587/transele.2019CTP0007
Type of Manuscript
Special Section PAPER (Special Section on Analog Circuits and Their Application Technologies)
Category
Integrated Electronics

Authors

Asuka MAKI
  Kioxia Corporation
Daisuke MIYASHITA
  Kioxia Corporation
Shinichi SASAKI
  Kioxia Corporation
Kengo NAKATA
  Kioxia Corporation
Fumihiko TACHIBANA
  Kioxia Corporation
Tomoya SUZUKI
  Kioxia Corporation
Jun DEGUCHI
  Kioxia Corporation
Ryuichi FUJIMOTO
  Kioxia Corporation
