
[1] MO Hongwei, WANG Haibo. Research on human behavior detection based on Faster R-CNN[J]. CAAI Transactions on Intelligent Systems, 2018, 13(6): 967-973. [doi:10.11992/tis.201801025]


Research on human behavior detection based on Faster R-CNN


CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 13 Issue: 2018, No. 6 Pages: 967-973 Column: Academic Papers, Machine Learning Publication date: 2018-10-25

Title:: Research on human behavior detection based on Faster R-CNN

Author(s):: MO Hongwei, WANG Haibo; College of Automation, Harbin Engineering University, Harbin 150001, China

Keywords:: human behavior detection; Faster R-CNN (faster region-based convolutional neural network); online hard example mining (OHEM); deep learning; object detection; convolutional neural network; batch normalization; transfer learning

CLC number:: TP181

DOI:: 10.11992/tis.201801025

Abstract:: Human behaviors show large intra-class differences and high inter-class similarity, and images are further affected by viewing angle and occlusion. Hand-crafted feature extraction therefore struggles to obtain effective features, which leads to low detection rates for human behavior. To address this problem, this paper builds on object detection and applies the faster region-based convolutional neural network (Faster R-CNN) algorithm, which performs well at detection, to human behavior detection. Faster R-CNN is combined with batch normalization and online hard example mining (OHEM) so that deep learning is used effectively for this task. In experiments, the improved algorithm reaches a classification and localization accuracy above 80%, which indicates high recognition accuracy.
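The two techniques that the abstract combines with Faster R-CNN can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `batch_norm` and `select_hard_examples`, the toy feature matrix, and the per-RoI loss values are all assumptions made for illustration, and the sketch omits the rest of the detection pipeline. It only shows the core idea of each step: batch normalization standardizes each feature over the mini-batch before a learned scale and shift, and online hard example mining keeps the region proposals with the highest current loss for the backward pass.

```python
import numpy as np


def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-time batch normalization over the batch axis (axis 0).

    x:     (batch, features) activations
    gamma: (features,) learned scale
    beta:  (features,) learned shift
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta


def select_hard_examples(roi_losses, keep):
    """Return the indices of the `keep` RoIs with the highest loss.

    A stable sort keeps ties in their original order.
    """
    order = np.argsort(-roi_losses, kind="stable")
    return order[:keep]


# Hypothetical activations: 8 samples, 4 features, not zero-centred.
rng = np.random.default_rng(0)
features = rng.normal(loc=3.0, scale=2.0, size=(8, 4))
normalized = batch_norm(features, gamma=np.ones(4), beta=np.zeros(4))

# Hypothetical per-RoI losses from a forward pass over six proposals.
roi_losses = np.array([0.10, 2.30, 0.05, 1.70, 0.90, 0.02])
hard = select_hard_examples(roi_losses, keep=3)
print(hard)  # indices of the three highest-loss RoIs: [1 3 4]
```

In a real detector, only the selected RoIs would contribute gradients, so training effort concentrates on the proposals the current model handles worst.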


Memo

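Received: 2018-01-16.
Funding: National Natural Science Foundation of China (60035117).
About the authors: MO Hongwei. Main research interests: artificial intelligence, brain-inspired computing, and intelligent robots. Has undertaken and completed 17 projects, including National Natural Science Foundation of China and national defense pre-research projects. Deputy director of the Natural Computing and Digital City Technical Committee of the Chinese Association for Artificial Intelligence; council member of the Heilongjiang Society of Biomedical Engineering; senior member of the Chinese Society of Biomedical Engineering; senior member of the China Computer Federation. Editorial board member of International Journal of Swarm Intelligence Research and Acta Electronica Sinica. Associate editor of the 2018 special issue "Big Data Processing in Healthcare" of IEEE Transactions on Industrial Informatics. Has published more than 70 academic papers and 6 monographs, and holds 7 granted invention patents. WANG Haibo, male, born in 1990, master's student. Main research interest: deep learning.
Corresponding author: MO Hongwei. E-mail: honwei2004@126.com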

Last Update: 2018-12-25
