Binary neural networks (BNNs), in which both activations and weights are quantized to {-1, +1}, can greatly accelerate convolutional neural networks (CNNs) on edge devices by reducing computational complexity and memory footprint. However, the non-differentiable binarizing function used in BNNs makes binarized models hard to optimize and causes significant performance degradation compared with full-precision models. Many previous works have corrected the backward gradient of the binarizing function with improved variants of the straight-through estimator (STE) or with gradual approximation schemes, but the gradient suppression problem was neither analyzed nor handled. We therefore propose a novel gradient corrected approximation (GCA) method that narrows the discrepancy between the binarizing function and its backward gradient in a gradual and stable way. Our work makes two primary contributions. The first is to approximate the backward gradient of the binarizing function with a simple leaky-steep function of variable window size. The second is to correct the gradient approximation by standardizing the backward gradient propagated through the binarizing function. Experimental results show that the proposed method outperforms the baseline by 1.5% Top-1 accuracy on the ImageNet dataset without introducing extra computational cost.
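To make the two contributions concrete, the following is a minimal, hypothetical PyTorch sketch, not the authors' released code: it assumes the "leaky-steep" function is a linear ramp of slope 1/w inside a window |x| < w with a small leak slope outside, and that "standardizing" means normalizing the incoming gradient to zero mean and unit variance before it propagates further. All names (LeakySteepBinarize, window, leak) are illustrative.

import torch

class LeakySteepBinarize(torch.autograd.Function):
    """Binarize to {-1, +1} with a leaky-steep gradient approximation."""

    @staticmethod
    def forward(ctx, x, window, leak):
        ctx.save_for_backward(x)
        ctx.window, ctx.leak = window, leak
        # Map x >= 0 to +1 and x < 0 to -1 (plain torch.sign would emit 0 at x == 0).
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        inside = (x.abs() < ctx.window).to(grad_out.dtype)
        # Steep slope 1/window inside the window, small leak slope outside,
        # so no input region has its gradient fully suppressed.
        grad_in = grad_out * (inside / ctx.window + (1.0 - inside) * ctx.leak)
        # Gradient correction: standardize the gradient passing through the
        # binarizer so its scale stays stable as the window shrinks.
        grad_in = (grad_in - grad_in.mean()) / (grad_in.std() + 1e-8)
        return grad_in, None, None

# Usage: shrink `window` over training to sharpen the approximation.
x = torch.randn(4, 8, requires_grad=True)
y = LeakySteepBinarize.apply(x, 0.5, 0.01)
y.sum().backward()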
Song CHENG
University of Chinese Academy of Sciences, Institute of Microelectronics of Chinese Academy of Sciences
Zixuan LI
University of Chinese Academy of Sciences, Institute of Microelectronics of Chinese Academy of Sciences
Yongsen WANG
University of Chinese Academy of Sciences, Institute of Microelectronics of Chinese Academy of Sciences
Wanbing ZOU
University of Chinese Academy of Sciences, Institute of Microelectronics of Chinese Academy of Sciences
Yumei ZHOU
University of Chinese Academy of Sciences, Institute of Microelectronics of Chinese Academy of Sciences
Delong SHANG
Institute of Microelectronics of Chinese Academy of Sciences, IMECAS
Shushan QIAO
University of Chinese Academy of Sciences, Institute of Microelectronics of Chinese Academy of Sciences
Song CHENG, Zixuan LI, Yongsen WANG, Wanbing ZOU, Yumei ZHOU, Delong SHANG, Shushan QIAO, "Gradient Corrected Approximation for Binary Neural Networks" in IEICE TRANSACTIONS on Information,
vol. E104-D, no. 10, pp. 1784-1788, October 2021, doi: 10.1587/transinf.2021EDL8026.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDL8026/_p
@ARTICLE{e104-d_10_1784,
author={Song CHENG and Zixuan LI and Yongsen WANG and Wanbing ZOU and Yumei ZHOU and Delong SHANG and Shushan QIAO},
journal={IEICE TRANSACTIONS on Information},
title={Gradient Corrected Approximation for Binary Neural Networks},
year={2021},
volume={E104-D},
number={10},
pages={1784-1788},
doi={10.1587/transinf.2021EDL8026},
ISSN={1745-1361},
month={October},
}
TY - JOUR
TI - Gradient Corrected Approximation for Binary Neural Networks
T2 - IEICE TRANSACTIONS on Information
SP - 1784
EP - 1788
AU - Song CHENG
AU - Zixuan LI
AU - Yongsen WANG
AU - Wanbing ZOU
AU - Yumei ZHOU
AU - Delong SHANG
AU - Shushan QIAO
PY - 2021
DO - 10.1587/transinf.2021EDL8026
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E104-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2021
ER -