With the continuous refinement of Deep Neural Networks (DNNs), deep and complex networks such as Residual Networks (ResNets) achieve impressive prediction accuracy in image classification tasks. Unfortunately, the structural complexity and computational cost of residual networks make hardware implementation difficult. In this paper, we present the quantized and reconstructed deep neural network (QR-DNN) technique, which first inserts batch normalization (BN) layers into the network during training and later removes them to facilitate efficient hardware implementation. Moreover, an accurate and efficient residual network accelerator (RNA) is presented, based on QR-DNN, with batch-normalization-free structures and weights represented in a logarithmic number system. RNA employs a systolic array architecture that performs shift-and-accumulate operations in place of multiplications. QR-DNN achieves a 1∼2% accuracy improvement over existing techniques, and RNA a comparable gain over the best previous fixed-point accelerators. An FPGA implementation on a Xilinx Zynq XC7Z045 device achieves 804.03 GOPS, 104.15 FPS, and 91.41% top-5 accuracy on the ResNet-50 benchmark; state-of-the-art results are also reported for AlexNet and VGG.
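To make the abstract's two core ideas concrete, here is a minimal Python sketch (not the authors' implementation; the function names, exponent range, and demo values are illustrative assumptions). It shows (1) how a BN layer inserted during training can later be folded into the preceding convolution's weights and bias, yielding a batch-normalization-free structure, and (2) how weights quantized to a logarithmic number system (signed powers of two) turn each multiplication into a shift-and-accumulate operation of the kind RNA's systolic array performs.

import numpy as np

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    # BN(conv(x, w) + b) = gamma * (conv(x, w) + b - mean) / sqrt(var + eps) + beta
    #                    = conv(x, w * s) + ((b - mean) * s + beta),  s = gamma / sqrt(var + eps)
    # Shapes: w is (out_ch, in_ch, kh, kw); b, gamma, beta, mean, var are (out_ch,).
    s = gamma / np.sqrt(var + eps)
    return w * s[:, None, None, None], (b - mean) * s + beta

def log2_quantize(w):
    # Represent each weight as sign * 2**e; e <= 0 assumes |w| <= 1 after folding.
    sign = np.sign(w).astype(int)
    e = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), -15, 0).astype(int)
    return sign, e

def shift_accumulate(x_fixed, sign, e):
    # Fixed-point dot product: multiplying by 2**e with e <= 0 is a right shift by -e.
    acc = 0
    for xi, si, ei in zip(x_fixed, sign, e):
        acc += si * (xi >> -ei)
    return acc

# Demo: weights that are exact powers of two incur no rounding error.
sign, e = log2_quantize(np.array([0.5, -0.25, 0.125]))
print(shift_accumulate([64, 32, 16], sign, e))  # 64>>1 - 32>>2 + 16>>3 = 26

For general weights, the rounding inside log2_quantize is the source of the log-domain quantization error that QR-DNN's training procedure must tolerate.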
Cheng LUO (Fudan University), Wei CAO (Fudan University), Lingli WANG (Fudan University), Philip H. W. LEONG (University of Sydney)
Cheng LUO, Wei CAO, Lingli WANG, Philip H. W. LEONG, "RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks," IEICE Transactions on Information and Systems, vol. E102-D, no. 5, pp. 1037-1045, May 2019, doi: 10.1587/transinf.2018RCP0008.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018RCP0008/_p
@ARTICLE{e102-d_5_1037,
author={Cheng LUO and Wei CAO and Lingli WANG and Philip H. W. LEONG},
journal={IEICE Transactions on Information and Systems},
title={RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks},
year={2019},
volume={E102-D},
number={5},
pages={1037-1045},
doi={10.1587/transinf.2018RCP0008},
ISSN={1745-1361},
month={May}
}
TY - JOUR
TI - RNA: An Accurate Residual Network Accelerator for Quantized and Reconstructed Deep Neural Networks
T2 - IEICE Transactions on Information and Systems
SP - 1037
EP - 1045
AU - Cheng LUO
AU - Wei CAO
AU - Lingli WANG
AU - Philip H. W. LEONG
PY - 2019
DO - 10.1587/transinf.2018RCP0008
JO - IEICE Transactions on Information and Systems
SN - 1745-1361
VL - E102-D
IS - 5
JA - IEICE Transactions on Information and Systems
Y1 - May 2019
ER -