[1] XIA Yangyang, GONG Xun, HONG Xijin. Research on the data cleansing problem for face recognition technology[J]. CAAI Transactions on Intelligent Systems, 2017, 12(5): 616-623. [doi:10.11992/tis.201706025]
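CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
12
Issue:
No. 5, 2017
Pages:
616-623
Section:
Academic Papers: Machine Perception and Pattern Recognition
Publication date:
2017-10-25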
- Title:
-
Research on the data cleansing problem for face recognition technology
- Author(s):
-
XIA Yangyang¹, GONG Xun¹, HONG Xijin¹,²
-
1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China;
2. Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, China
-
- Keywords:
-
deep convolutional neural network; DCNN; image cleansing; face recognition; large database
- CLC number:
-
TP391.4
- DOI:
-
10.11992/tis.201706025
- Abstract:
-
Face recognition has advanced considerably with the rapid development of deep convolutional neural networks (DCNNs). These gains stem mainly from deeper DCNN architectures and larger training databases. However, most large-scale (million-image) databases are held by private companies and are not publicly available, and the large databases that are partly open carry too little annotation for label accuracy to be guaranteed, which degrades DCNN training. This study presents an easy-to-use, multi-perspective image cleansing method to improve data accuracy. First, images in which a face detection algorithm finds no face are removed. Second, an existing model extracts features from the images of the cleaned dataset, and pairwise similarities are computed. Finally, for each image within an identity class, the number of other images it is dissimilar to is counted, and the data are cleansed according to improved parameters. Experiments show that models trained on the cleansed database achieve higher accuracy on the LFW (Labeled Faces in the Wild) and YouTube Faces datasets. With a relatively small training set, the trained model reaches 99.17% accuracy on LFW and 93.53% on YouTube Faces.
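The similarity-counting step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the similarity measure, the thresholds, or the exact removal rule. So cosine similarity, the names `sim_threshold` and `max_dissimilar_ratio`, and their default values are all assumptions made for illustration. The feature vectors are assumed to come from images that already passed face detection.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dissimilar_counts(features, sim_threshold):
    """For each image in one identity class, count how many other images
    in the same class have similarity below sim_threshold."""
    n = len(features)
    counts = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(features[i], features[j]) < sim_threshold:
                counts[i] += 1
                counts[j] += 1
    return counts


def clean_class(features, sim_threshold=0.5, max_dissimilar_ratio=0.5):
    """Return indices of images to keep in one identity class.

    Hypothetical rule: drop an image when it is dissimilar to more than
    max_dissimilar_ratio of the other images in its class.
    """
    n = len(features)
    if n < 2:
        return list(range(n))
    counts = dissimilar_counts(features, sim_threshold)
    limit = max_dissimilar_ratio * (n - 1)
    return [i for i, c in enumerate(counts) if c <= limit]


# Toy class: three mutually similar vectors and one outlier (index 3).
features = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
kept = clean_class(features)
print(kept)
```

In this toy class the last vector is dissimilar to all three others, so it is the only one dropped. In practice both thresholds would need tuning against the feature extractor in use.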
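Memo
Received: 2017-06-08.
Funding: National Natural Science Foundation of China (61202191); Open Fund of the Chongqing Key Laboratory of Computational Intelligence (CQ-LCI-2013-06); National Key Research and Development Program of China (2016YFC0802209).
About the authors: XIA Yangyang, male, born in 1990, master's student; his research interests are deep learning, image processing, and face recognition. GONG Xun, male, born in 1980, associate professor, Ph.D.; his research interests are image processing and pattern recognition, 3D face modeling, and face image analysis and recognition. He holds 2 national invention patents and has published more than 30 academic papers and 1 monograph. HONG Xijin, male, born in 1957, distinguished professor, Ph.D.; his research interests are information security, biometrics, cloud computing and big data, and intelligent image processing. He holds 13 invention patents and has published more than 80 SCI journal papers and more than 110 international conference papers.
Corresponding author: GONG Xun. E-mail: xgong@swjtu.edu.cn
Last Update:
2017-10-25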