GAO Shangbing,HUANG Zihe,GENG Xuan,et al.A visual collaborative analysis method for detecting illegal driving behavior[J].CAAI Transactions on Intelligent Systems,2021,16(6):1158-1165.[doi:10.11992/tis.202101024]





A visual collaborative analysis method for detecting illegal driving behavior
高尚兵1,2 黄子赫1 耿璇1 臧晨1 沈晓坤1
1. 淮阴工学院 计算机与软件工程学院,江苏 淮安 223001;
2. 淮阴工学院 江苏省物联网移动互联技术工程实验室,江苏 淮安 223001
GAO Shangbing1,2 HUANG Zihe1 GENG Xuan1 ZANG Chen1 SHEN Xiaokun1
1. College of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian 223001, China;
2. Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huaian 223001, China
driving behavior recognition; model pruning; target detection; attitude estimation; collaborative detection; model optimization; deep learning; convolutional neural network
本文针对危险驾驶识别中主流行为检测算法可靠性差的问题,提出了一种快速、可靠的视觉协同分析方法。对手机、水杯、香烟等敏感物体进行目标检测,提出的LW(low weight)-Yolov4(You only look once v4)通过去除CSPDarknet53(cross stage partial Darknet53)卷积层中不重要的要素通道提升了检测速度,并L1正则化产生稀疏权值矩阵,添加到BN(batch normalization)层的梯度中,实现优化网络模型的目的;提出姿态检测算法对驾驶员指关节关键点进行检测,经过仿射逆变换得到原始帧中的坐标;通过视觉协同分析对比敏感物品的检测框位置与驾驶员手部坐标是否重合,判定驾驶员是否出现违规驾驶行为及类别。实验结果表明,该方法在识别精度与检测速度方面均优于主流的算法,能够满足实时性和可靠性的检测要求。
This study proposes a fast and reliable visual collaborative analysis method to address the poor reliability of mainstream behavior detection algorithms in dangerous driving recognition. First, the method performs target detection on sensitive objects such as mobile phones, water cups, and cigarettes. The proposed LW (low weight)-Yolov4 improves detection speed by removing unimportant feature channels from the CSPDarknet53 (cross stage partial Darknet53) convolutional layers, and L1 regularization produces a sparse weight matrix that is added to the gradient of the BN (batch normalization) layer to optimize the network model. Then, an attitude detection algorithm detects the key points of the driver's knuckles, and their coordinates in the original frame are recovered through an inverse affine transformation. Finally, visual collaborative analysis checks whether the detection boxes of sensitive objects overlap with the driver's hand coordinates to determine whether the driver exhibits illegal driving behavior and, if so, its category. Experimental results show that the proposed method outperforms mainstream algorithms in both recognition accuracy and detection speed, meeting real-time and reliability requirements.
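The final collaborative decision described in the abstract reduces to a simple geometric test: map the pose network's knuckle keypoints back to original-frame coordinates via the inverse affine transform, then check whether any hand point falls inside a sensitive object's detection box. The sketch below illustrates that idea only; the function names, the 2×3 affine matrix, and the box format are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def keypoints_to_frame(keypoints, affine):
    """Map Nx2 keypoints from the pose network's crop space back to the
    original frame by inverting a 2x3 affine matrix (assumed format)."""
    # Extend the 2x3 affine matrix to 3x3 so it can be inverted.
    m = np.vstack([affine, [0.0, 0.0, 1.0]])
    inv = np.linalg.inv(m)
    # Homogeneous coordinates: append a column of ones, transform, drop it.
    pts = np.hstack([keypoints, np.ones((len(keypoints), 1))])
    return (pts @ inv.T)[:, :2]

def detect_violation(hand_points, object_boxes):
    """Return the label of the first sensitive object whose detection box
    contains a hand keypoint, or None if no overlap is found."""
    for label, (x1, y1, x2, y2) in object_boxes:
        for x, y in hand_points:
            if x1 <= x <= x2 and y1 <= y <= y2:
                return label  # e.g. "phone" -> phone use while driving
    return None
```

For example, with a crop transform that scales by 0.5 and offsets by (10, 20), a keypoint at (60, 70) in crop space maps back to (100, 100) in the frame; if a "phone" box covers that point, the behavior is flagged.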


[1] 张慧, 王坤峰, 王飞跃. 深度学习在目标视觉检测中的应用进展与展望[J]. 自动化学报, 2017, 43(8): 1289-1305
ZHANG Hui, WANG Kunfeng, WANG Feiyue. Advances and perspectives on applications of deep learning in visual object detection[J]. Acta automatica sinica, 2017, 43(8): 1289-1305
[2] LE T H N, ZHENG Yutong, ZHU Chenchen, et al. Multiple scale faster-RCNN approach to driver’s cell-phone usage and hands on steering wheel detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Las Vegas, USA, 2016: 46-53.
[3] 李俊俊, 杨华民, 张澍裕, 等. 基于神经网络融合的司机违规行为识别[J]. 计算机应用与软件, 2018, 35(12): 222-227, 319
LI Junjun, YANG Huamin, ZHANG Shuyu, et al. Driver’s illegal behavior recognition based on neural network fusion[J]. Computer applications and software, 2018, 35(12): 222-227, 319
[4] 魏泽发. 基于深度学习的出租车司机违规行为检测[D]. 西安: 长安大学, 2019.
WEI Zefa. Taxi driver violation detection based on deep learning[D]. Xi’an: Chang’an University, 2019.
[5] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]//14th European Conference on Computer Vision. Amsterdam, The Netherlands, 2016: 21-37.
[6] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2020-12-03] https://arxiv.org/abs/1409.1556.
[7] JIN Chongchong, ZHU Zhongjie, BAI Yongqiang, et al. A deep-learning-based scheme for detecting driver cell-phone use[J]. IEEE access, 2020, 8: 18580-18589.
[8] HUANG Chen, WANG Xiaochen, CAO Jiannong, et al. HCF: a hybrid CNN framework for behavior detection of distracted drivers[J]. IEEE access, 2020, 8: 109335-109349.
[9] HE Anqing, CHEN Guohua, ZHENG Wei, et al. Driver cell-phone use detection based on CornerNet-Lite network[C]//IOP Conference Series: Earth and Environmental Science. Smolensk, Russia, 2021: 042004.
[10] LAW H, TENG Yun, RUSSAKOVSKY O, et al. CornerNet-Lite: efficient keypoint based object detection[EB/OL]. (2019-04-18)[2020-12-03] https://arxiv.org/abs/1904.08900.
[11] MASOOD S, RAI A, AGGARWAL A, et al. Detecting distraction of drivers using convolutional neural network[J]. Pattern recognition letters, 2020, 139: 79-85.
[12] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL].(2020-04-23)[2020-12-03] https://arxiv.org/abs/2004.10934.
[13] REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL].(2018-04-08)[2020-12-03] https://arxiv.org/abs/1804.02767.
[14] TAN Mingxing, PANG Ruoming, LE Q V. EfficientDet: scalable and efficient object detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA, 2020: 10778-10787.
[15] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(6): 1137-1149.
[16] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision. Venice, Italy, 2017: 2999-3007.
[17] DAI Jifeng, LI Yi, HE Kaiming, et al. R-FCN: object detection via region-based fully convolutional networks[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain, 2016: 379-387.
[18] CAO Zhe, HIDALGO G, SIMON T, et al. OpenPose: realtime multi-person 2D pose estimation using part affinity fields[J]. IEEE transactions on pattern analysis and machine intelligence, 2021, 43(1): 172-186.
[19] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(9): 1904-1916.
[20] LIU Shu, QI Lu, QIN Haifang, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA, 2018: 8759-8768.
[21] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA, 2017: 936-944.
[22] REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA, 2019: 658-666.
[23] YUN S, HAN D, CHUN S, et al. Cutmix: regularization strategy to train strong classifiers with localizable features[C]//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South), 2019: 6022-6031.
[24] WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, USA, 2020: 1571-1580.
[25] 赵传君, 王素格, 李德玉. 跨领域文本情感分类研究进展[J]. 软件学报, 2020, 31(6): 1723-1746
ZHAO Chuanjun, WANG Suge, LI Deyu. Research progress on cross-domain text sentiment classification[J]. Journal of software, 2020, 31(6): 1723-1746
[26] 王岩. 深度神经网络的归一化技术研究[D]. 南京: 南京邮电大学, 2019.
WANG Yan. Analysis of normalization for deep neural networks[D]. Nanjing: Nanjing University of Posts and Telecommunications, 2019.
[27] ZHANG Shuang, SONG Zongxi. An ethnic costumes classification model with optimized learning rate[C]//The 11th International Conference on Digital Image Processing. Guangzhou, China, 2019: 1179-1185.


更新日期/Last Update: 2021-12-25