1-2hit |
Cheng LUO Wei CAO Lingli WANG Philip H. W. LEONG
With the continuous refinement of Deep Neural Networks (DNNs), a series of deep and complex networks such as Residual Networks (ResNets) show impressive prediction accuracy in image classification tasks. Unfortunately, the structural complexity and computational cost of residual networks make hardware implementation difficult. In this paper, we present the quantized and reconstructed deep neural network (QR-DNN) technique, which first inserts batch normalization (BN) layers in the network during training, and later removes them to facilitate efficient hardware implementation. Moreover, an accurate and efficient residual network accelerator (RNA) is presented based on QR-DNN with batch-normalization-free structures and weights represented in a logarithmic number system. RNA employs a systolic array architecture to perform shift-and-accumulate operations instead of multiplication operations. QR-DNN is shown to achieve a 1∼2% improvement in accuracy over existing techniques, and RNA over previous best fixed-point accelerators. An FPGA implementation on a Xilinx Zynq XC7Z045 device achieves 804.03 GOPS, 104.15 FPS and 91.41% top-5 accuracy for the ResNet-50 benchmark, and state-of-the-art results are also reported for AlexNet and VGG.
Chun ZHANG Yu HU Lingli WANG Lei HE Jiarong TONG
Boolean matching is a fundamental problem in FPGA synthesis, but existing Boolean matchers are not scalable to complex PLBs (programmable logic blocks) and large circuits. This paper proposes a filter-based Boolean matching method, F-BM, which accelerates Boolean matching using lookup tables implemented by Bloom filters storing pre-calculated matching results. To show the effectiveness of the proposed F-BM, a post-mapping re-synthesis minimizing area which employs Boolean matching as the kernel has been implemented. Tested on a broad selection of benchmarks, the re-synthesizer using F-BM is 80X faster with 0.5% more area, compared with the one using a SAT-based Boolean matcher.