[1] WANG Chengji, LUO Zhiming, ZHONG Zhun, et al. Face detection method fusing multi-layer features[J]. CAAI Transactions on Intelligent Systems, 2018, (01): 138-146. [doi:10.11992/tis.201707018]

Face detection method fusing multi-layer features

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Issue:
2018, No. 01
Pages:
138-146
Column:
Publication date:
2018-01-24

Article Info

Title:
Face detection method fusing multi-layer features
Author(s):
WANG Chengji (1, 2), LUO Zhiming (1, 2), ZHONG Zhun (1, 2), LI Shaozi (1, 2)
1. Intelligent Science & Technology Department, Xiamen University, Xiamen 361005, China;
2. Fujian Key Laboratory of Brain-inspired Computing Technique and Applications, Xiamen University, Xiamen 361005, China
Keywords:
face detection; multi-pose; multi-scale; occlusion; complex scenes; convolutional neural network; feature fusion; non-maximum suppression
CLC number:
TP391.41
DOI:
10.11992/tis.201707018
Abstract:
Because of variations in pose, illumination, and scale, convolutional neural networks (CNNs) need to learn highly discriminative features to handle face detection in complex scenes. Because the receptive field of any given feature layer in a CNN has a limited size, features from a single layer cannot cope with faces across multiple poses and scales. A multi-layer feature fusion method that concatenates features with receptive fields of different sizes is therefore proposed to detect diverse faces. In addition, the commonly used non-maximum suppression algorithm is improved by introducing weighted score reduction, to address missed detections of neighboring faces caused by occlusion. Experimental results on the FDDB and WiderFace datasets show that the proposed multi-layer feature fusion method significantly improves detection results, and that the improved non-maximum suppression algorithm raises detection accuracy for neighboring faces.
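The idea of lowering scores instead of discarding overlapping boxes can be illustrated with a minimal sketch. The abstract does not state the weighting function, so the linear decay used here (multiplying the score by 1 - IoU, in the style of Soft-NMS) is an assumption, as are the function names, the thresholds, and the demo boxes; this is not the authors' implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2) with positive area."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def score_decay_nms(boxes, scores, iou_thresh=0.3, score_thresh=0.05):
    """Return a list of (index, final_score) in the order boxes are selected.

    Assumed rule (not taken from the paper): a box whose IoU with the
    currently selected box exceeds iou_thresh keeps its place in the
    candidate list, but its score is multiplied by (1 - IoU). Boxes whose
    score falls below score_thresh are dropped.
    """
    scores = list(scores)  # copy so the caller's list is not modified
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Select the highest-scoring remaining candidate.
        best = max(remaining, key=lambda i: scores[i])
        kept.append((best, scores[best]))
        remaining.remove(best)
        survivors = []
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            if overlap > iou_thresh:
                scores[i] *= 1.0 - overlap  # down-weight instead of deleting
            if scores[i] >= score_thresh:
                survivors.append(i)
        remaining = survivors
    return kept


# Hypothetical example: two neighboring, partly overlapping faces and one distant face.
demo_boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
demo_scores = [0.9, 0.8, 0.7]
demo_kept = score_decay_nms(demo_boxes, demo_scores)
```

In this example the second box overlaps the first with an IoU of 1/3. Standard non-maximum suppression with the same 0.3 threshold would delete it, whereas the score-decay variant keeps it with a reduced score, which is the behavior that helps with neighboring, partially occluded faces.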



Memo:
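Received: 2017-07-10.
Funding: National Natural Science Foundation of China (61572409, 61402386, 81230087, 61571188).
Author biographies: WANG Chengji, male, born in 1993, master's student; his main research interests are video object detection and image segmentation. LUO Zhiming, male, born in 1989, doctoral student; his main research interests are image segmentation, object detection, and medical image analysis; he has published 8 academic papers. LI Shaozi, male, born in 1963, professor and doctoral supervisor; his main research interests are computer vision, machine learning, and data mining. He has led or participated in multiple projects funded by the National 863 Program, the National Natural Science Foundation of China, the Doctoral Program Foundation of the Ministry of Education, and provincial key science and technology programs, and has published more than 300 academic papers.
Corresponding author: LI Shaozi. E-mail: szlig@xmu.edu.cn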
Last Update: 2018-02-01