Accelerating the convolution operations of Convolutional Neural Networks (CNNs) in parallel on a field programmable gate array (FPGA) consumes large quantities of logic resources, especially DSP cores. Many previous studies fail to strike a good balance between DSP and LUT6 usage. For better resource efficiency, this paper implements a typical convolution structure with LUT6s. In addition, a novel convolution structure is proposed that further reduces LUT6 consumption by modifying the typical structure. Equations for evaluating the LUT6 resource consumption of both structures are presented and validated. The theoretical evaluation and experimental results show that the novel structure saves 3.5-8% of LUT6s compared with the typical structure.
Huangtao WU
Sun Yat-sen University
Wenjin HUANG
Sun Yat-sen University
Rui CHEN
Sun Yat-sen University
Yihua HUANG
Sun Yat-sen University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Huangtao WU, Wenjin HUANG, Rui CHEN, Yihua HUANG, "Implementation and Area Optimization of LUT6 Based Convolution Structure on FPGA" in IEICE TRANSACTIONS on Fundamentals,
vol. E102-A, no. 12, pp. 1813-1815, December 2019, doi: 10.1587/transfun.E102.A.1813.
Abstract: Accelerating the convolution operations of Convolutional Neural Networks (CNNs) in parallel on a field programmable gate array (FPGA) consumes large quantities of logic resources, especially DSP cores. Many previous studies fail to strike a good balance between DSP and LUT6 usage. For better resource efficiency, this paper implements a typical convolution structure with LUT6s. In addition, a novel convolution structure is proposed that further reduces LUT6 consumption by modifying the typical structure. Equations for evaluating the LUT6 resource consumption of both structures are presented and validated. The theoretical evaluation and experimental results show that the novel structure saves 3.5-8% of LUT6s compared with the typical structure.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E102.A.1813/_p
@ARTICLE{e102-a_12_1813,
author={Huangtao WU and Wenjin HUANG and Rui CHEN and Yihua HUANG},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Implementation and Area Optimization of LUT6 Based Convolution Structure on FPGA},
year={2019},
volume={E102-A},
number={12},
pages={1813--1815},
abstract={Accelerating the convolution operations of Convolutional Neural Networks (CNNs) in parallel on a field programmable gate array (FPGA) consumes large quantities of logic resources, especially DSP cores. Many previous studies fail to strike a good balance between DSP and LUT6 usage. For better resource efficiency, this paper implements a typical convolution structure with LUT6s. In addition, a novel convolution structure is proposed that further reduces LUT6 consumption by modifying the typical structure. Equations for evaluating the LUT6 resource consumption of both structures are presented and validated. The theoretical evaluation and experimental results show that the novel structure saves 3.5-8% of LUT6s compared with the typical structure.},
keywords={},
doi={10.1587/transfun.E102.A.1813},
ISSN={1745-1337},
month={December},}
TY  - JOUR
TI  - Implementation and Area Optimization of LUT6 Based Convolution Structure on FPGA
T2  - IEICE TRANSACTIONS on Fundamentals
SP  - 1813
EP  - 1815
AU  - Huangtao WU
AU  - Wenjin HUANG
AU  - Rui CHEN
AU  - Yihua HUANG
PY  - 2019
DO  - 10.1587/transfun.E102.A.1813
JO  - IEICE TRANSACTIONS on Fundamentals
SN  - 1745-1337
VL  - E102-A
IS  - 12
JA  - IEICE TRANSACTIONS on Fundamentals
Y1  - December 2019
AB  - Accelerating the convolution operations of Convolutional Neural Networks (CNNs) in parallel on a field programmable gate array (FPGA) consumes large quantities of logic resources, especially DSP cores. Many previous studies fail to strike a good balance between DSP and LUT6 usage. For better resource efficiency, this paper implements a typical convolution structure with LUT6s. In addition, a novel convolution structure is proposed that further reduces LUT6 consumption by modifying the typical structure. Equations for evaluating the LUT6 resource consumption of both structures are presented and validated. The theoretical evaluation and experimental results show that the novel structure saves 3.5-8% of LUT6s compared with the typical structure.
ER  -