
[1] GAO Shangbing, HUANG Zihe, GENG Xuan, et al. A visual collaborative analysis method for detecting illegal driving behavior[J]. CAAI Transactions on Intelligent Systems, 2021, 16(6): 1158-1165. [doi:10.11992/tis.202101024]


A visual collaborative analysis method for detecting illegal driving behavior


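CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 16 Issue: No. 6, 2021 Pages: 1158-1165 Column: Wu Wenjun Artificial Intelligence Science and Technology Award Forum Publication date: 2021-11-05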

Title:: A visual collaborative analysis method for detecting illegal driving behavior

Author(s):: GAO Shangbing¹˒², HUANG Zihe¹, GENG Xuan¹, ZANG Chen¹, SHEN Xiaokun¹; 1. College of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian 223001, China;
2. Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huaian 223001, China

Keywords:: driving behavior recognition; model pruning; target detection; attitude estimation; collaborative detection; model optimization; deep learning; convolutional neural network

CLC number:: TP391.4

DOI:: 10.11992/tis.202101024

Abstract:: To address the poor reliability of mainstream behavior detection algorithms in dangerous driving recognition, this study proposes a fast and reliable visual collaborative analysis method. First, object detection is performed on sensitive objects such as mobile phones, water cups, and cigarettes. The proposed LW (low weight)-Yolov4 (You only look once v4) improves detection speed by removing unimportant channels from the convolutional layers of CSPDarknet53 (cross stage partial Darknet53); L₁ regularization produces a sparse weight matrix, which is added to the gradient of the BN (batch normalization) layer to optimize the network model. Second, a pose detection algorithm detects the keypoints of the driver's finger joints, and an inverse affine transformation maps them to coordinates in the original frame. Finally, visual collaborative analysis checks whether the detection box of a sensitive object coincides with the driver's hand coordinates, which determines whether the driver is engaged in illegal driving behavior and, if so, its category. Experimental results show that the method outperforms mainstream algorithms in both recognition accuracy and detection speed and meets the requirements for real-time, reliable detection.
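The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (`l1_sparsity_gradient`, `invert_affine`, `apply_affine`, `classify_behavior`, the `sparsity` and `margin` parameters) is hypothetical. It assumes that the L₁ penalty acts on the BN scale factors, that the pose network runs on an affinely transformed copy of the frame, and that "coincide" means a hand keypoint falls inside a detection box; the abstract does not spell out any of these details.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def l1_sparsity_gradient(grads: Sequence[float], gammas: Sequence[float],
                         sparsity: float = 1e-4) -> List[float]:
    """Add the L1 subgradient, sparsity * sign(gamma), to each BN scale gradient.

    Assumption: the penalty targets BN scale factors, so channels whose
    factor is driven toward zero become candidates for pruning.
    """
    def sign(v: float) -> int:
        return (v > 0) - (v < 0)
    return [g + sparsity * sign(w) for g, w in zip(grads, gammas)]


def invert_affine(m: Affine) -> Affine:
    """Invert a 2x3 affine matrix that maps frame coordinates to network input."""
    (a, b, tx), (c, d, ty) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("affine transform is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return ((ia, ib, -(ia * tx + ib * ty)),
            (ic, id_, -(ic * tx + id_ * ty)))


def apply_affine(m: Affine, p: Point) -> Point:
    """Apply a 2x3 affine matrix to a single point."""
    (a, b, tx), (c, d, ty) = m
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


def classify_behavior(detections: Sequence[Tuple[str, Box]],
                      hand_points_input: Sequence[Point],
                      frame_to_input: Affine,
                      margin: float = 0.0) -> Optional[str]:
    """Return the label of the first detected object that a hand keypoint touches.

    Keypoints arrive in network-input coordinates and are mapped back to the
    original frame before being compared with the detection boxes.
    """
    input_to_frame = invert_affine(frame_to_input)
    hands = [apply_affine(input_to_frame, p) for p in hand_points_input]
    for label, (x0, y0, x1, y1) in detections:
        for hx, hy in hands:
            if x0 - margin <= hx <= x1 + margin and y0 - margin <= hy <= y1 + margin:
                return label
    return None  # no hand keypoint overlaps any sensitive object


if __name__ == "__main__":
    # Hypothetical transform: scale the frame by 0.5, then shift by (-100, -50).
    frame_to_input = ((0.5, 0.0, -100.0), (0.0, 0.5, -50.0))
    detections = [("cup", (10.0, 10.0, 50.0, 50.0)),
                  ("phone", (380.0, 280.0, 440.0, 340.0))]
    print(classify_behavior(detections, [(100.0, 100.0)], frame_to_input))
```

In this sketch a point-in-box test stands in for the overlap check; a real system might instead require several keypoints inside the box or use a tolerance such as the `margin` parameter to absorb localization noise.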



Memo

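Received: 2021-01-18.
Funding: National Key R&D Program of China (2018YFB1004904); Jiangsu Universities "Qinglan Project"; Major Program of Natural Science Research of Jiangsu Higher Education Institutions (18KJA520001)
About the authors: GAO Shangbing, professor, Ph.D. His main research interests are machine learning, computer vision, pattern recognition, and data mining. He has received the second prize of the Science and Technology Progress Award of the China Simulation Federation and the third prize of the Wu Wenjun Artificial Intelligence Science and Technology Progress Award, and has published more than 100 academic papers; HUANG Zihe, master's student, whose main research interests are computer vision, pattern recognition, and data mining; GENG Xuan, undergraduate student, whose main research interest is image recognition
Corresponding author: GAO Shangbing. E-mail: luxiaofen_2002@126.com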

Last Update: 2021-12-25
