Compression of Vehicle and Pedestrian Detection Network Based on YOLOv3 Model

Lie GUO; Yibing ZHAO; Jiandong GAO

doi:10.1587/transinf.2022DLP0021

Summary:

Commonly used object detection algorithms based on convolutional neural networks struggle to meet real-time requirements on embedded platforms because of their large model size, heavy computation, and long inference time. Model compression is therefore needed to reduce the amount of network computation and increase inference speed. This paper compresses a vehicle and pedestrian detection network by pruning redundant parameters. The detection network is trained on the YOLOv3 model, using K-means++ to cluster the anchor boxes. Because the number of targets per category in the dataset is unbalanced, detection accuracy is improved by changing the proportion of classification and regression losses for each category in the loss function. A layer and channel pruning algorithm is proposed that combines global channel pruning thresholds with the L1 norm, reducing both the time cost of the network layer transfer process and the amount of computation. Network layer fusion based on TensorRT is then performed, and inference is run in half-precision floating point to improve inference speed. Results show that the compressed vehicle and pedestrian detection network, with 84% of channels and 15 Shortcut modules pruned, reduces the model size by 32% and the amount of computation by 17%. The inference time decreases to 21 ms, which is 1.48 times faster than the network with 84% of channels pruned.
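
The channel-pruning step mentioned in the summary can be illustrated with a minimal sketch. It assumes that each channel is scored by the L1 norm of its filter weights and that a single threshold is chosen over the pooled scores of all layers; the summary does not give the exact criterion, so this is an illustration and not the authors' implementation. The function names, the toy weights, and the rule of keeping at least one channel per layer are all hypothetical.

```python
from typing import Dict, List

# Hypothetical toy representation: a layer is a list of filters, and each
# filter is a flat list of its weights.
Layer = List[List[float]]


def l1_scores(layer: Layer) -> List[float]:
    """Return the L1 norm (sum of absolute weights) of every filter."""
    return [sum(abs(w) for w in filt) for filt in layer]


def global_threshold(scores: Dict[str, List[float]], prune_ratio: float) -> float:
    """Pick one threshold over the scores of all layers pooled together."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    pooled = sorted(s for layer in scores.values() for s in layer)
    if not pooled:
        raise ValueError("no channels to score")
    # Channels scoring strictly below the returned value are pruned. Ties
    # with the threshold are kept, so fewer channels than requested may go.
    index = min(int(len(pooled) * prune_ratio), len(pooled) - 1)
    return pooled[index]


def channel_masks(scores: Dict[str, List[float]], threshold: float) -> Dict[str, List[bool]]:
    """Mark the channels to keep (True) or prune (False) in every layer."""
    masks: Dict[str, List[bool]] = {}
    for name, layer_scores in scores.items():
        mask = [s >= threshold for s in layer_scores]
        if layer_scores and not any(mask):
            # Assumed safeguard: never empty a layer, keep its strongest channel.
            best = max(range(len(layer_scores)), key=layer_scores.__getitem__)
            mask[best] = True
        masks[name] = mask
    return masks


# Toy example with two layers and five channels in total.
layers = {
    "conv1": [[0.5, -0.5], [0.125, 0.0], [1.0, 1.0]],
    "conv2": [[0.25, 0.25], [0.125, -0.125]],
}
scores = {name: l1_scores(layer) for name, layer in layers.items()}
threshold = global_threshold(scores, prune_ratio=0.5)
masks = channel_masks(scores, threshold)
print(masks)
```

In a real network the masks would be used to rebuild each convolution with fewer channels, typically followed by fine-tuning. The pruning of whole layers (the Shortcut modules) and the TensorRT half-precision step are not covered by this sketch.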

Publication: IEICE TRANSACTIONS on Information Vol.E106-D No.5 pp.735-745

Publication Date: 2023/05/01

Publicized: 2022/06/22

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2022DLP0021

Type of Manuscript: Special Section PAPER (Special Section on Deep Learning Technologies: Architecture, Optimization, Techniques, and Applications)

Category: Intelligent Transportation Systems

Authors

Lie GUO
  Dalian University of Technology
Yibing ZHAO
  Dalian University of Technology
Jiandong GAO
  Dalian University of Technology

Keywords

intelligent vehicle, YOLOv3, target detection, model pruning

Cite this

Lie GUO, Yibing ZHAO, Jiandong GAO, "Compression of Vehicle and Pedestrian Detection Network Based on YOLOv3 Model" in IEICE TRANSACTIONS on Information, vol. E106-D, no. 5, pp. 735-745, May 2023, doi: 10.1587/transinf.2022DLP0021.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022DLP0021/_p

@ARTICLE{e106-d_5_735,
author={Lie GUO and Yibing ZHAO and Jiandong GAO},
journal={IEICE TRANSACTIONS on Information},
title={Compression of Vehicle and Pedestrian Detection Network Based on YOLOv3 Model},
year={2023},
volume={E106-D},
number={5},
pages={735--745},
abstract={Commonly used object detection algorithms based on convolutional neural networks struggle to meet real-time requirements on embedded platforms because of their large model size, heavy computation, and long inference time. Model compression is therefore needed to reduce the amount of network computation and increase inference speed. This paper compresses a vehicle and pedestrian detection network by pruning redundant parameters. The detection network is trained on the YOLOv3 model, using K-means++ to cluster the anchor boxes. Because the number of targets per category in the dataset is unbalanced, detection accuracy is improved by changing the proportion of classification and regression losses for each category in the loss function. A layer and channel pruning algorithm is proposed that combines global channel pruning thresholds with the L1 norm, reducing both the time cost of the network layer transfer process and the amount of computation. Network layer fusion based on TensorRT is then performed, and inference is run in half-precision floating point to improve inference speed. Results show that the compressed vehicle and pedestrian detection network, with 84\% of channels and 15 Shortcut modules pruned, reduces the model size by 32\% and the amount of computation by 17\%. The inference time decreases to 21 ms, which is 1.48 times faster than the network with 84\% of channels pruned.},
keywords={intelligent vehicle, YOLOv3, target detection, model pruning},
doi={10.1587/transinf.2022DLP0021},
ISSN={1745-1361},
month={May},}

TY  - JOUR
TI  - Compression of Vehicle and Pedestrian Detection Network Based on YOLOv3 Model
T2  - IEICE TRANSACTIONS on Information
SP  - 735
EP  - 745
AU  - Lie GUO
AU  - Yibing ZHAO
AU  - Jiandong GAO
PY  - 2023
DO  - 10.1587/transinf.2022DLP0021
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E106-D
IS  - 5
JA  - IEICE TRANSACTIONS on Information
Y1  - May 2023
AB  - Commonly used object detection algorithms based on convolutional neural networks struggle to meet real-time requirements on embedded platforms because of their large model size, heavy computation, and long inference time. Model compression is therefore needed to reduce the amount of network computation and increase inference speed. This paper compresses a vehicle and pedestrian detection network by pruning redundant parameters. The detection network is trained on the YOLOv3 model, using K-means++ to cluster the anchor boxes. Because the number of targets per category in the dataset is unbalanced, detection accuracy is improved by changing the proportion of classification and regression losses for each category in the loss function. A layer and channel pruning algorithm is proposed that combines global channel pruning thresholds with the L1 norm, reducing both the time cost of the network layer transfer process and the amount of computation. Network layer fusion based on TensorRT is then performed, and inference is run in half-precision floating point to improve inference speed. Results show that the compressed vehicle and pedestrian detection network, with 84% of channels and 15 Shortcut modules pruned, reduces the model size by 32% and the amount of computation by 17%. The inference time decreases to 21 ms, which is 1.48 times faster than the network with 84% of channels pruned.
ER  -