[1] WANG Kaicheng, LU Huaxiang, GONG Guoliang, et al. Salient object detection method based on the attention mechanism[J]. CAAI Transactions on Intelligent Systems, 2020, 15(5): 956-963. [doi:10.11992/tis.201903001]

Salient object detection method based on the attention mechanism

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
15
Issue:
No. 5, 2020
Pages:
956-963
Section:
Academic Papers: Machine Perception and Pattern Recognition
Publication date:
2020-10-31

Article Info

Title:
Salient object detection method based on the attention mechanism
Author(s):
WANG Kaicheng 1,2; LU Huaxiang 1,3,4; GONG Guoliang 1; CHEN Gang 1
1. Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China;
2. School of Future Technology, University of Chinese Academy of Sciences, Beijing 100089, China;
3. Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China;
4. Semiconductor Neural Network Intelligent Perception and Computing Technology Beijing Key Lab, Beijing 100083, China
Keywords:
salient object detection; deep learning; fully convolutional neural network; visual attention; multi-scale features; image processing; artificial intelligence; computer vision
CLC number:
TP391
DOI:
10.11992/tis.201903001
Document code:
A
Abstract:
Salient object detection simulates the human visual mechanism. Current mainstream methods are based on fully convolutional neural networks. Limited by the receptive fields of the convolution layers, low-level features lack global information about the image, whereas high-level features have low resolution after repeated pooling operations and cannot accurately predict details such as object edges. To address this problem, we propose a salient object detection method based on the attention mechanism. Attention refinement modules are added to a ResNet-50 network, and the spatial attention is learned under supervision from the ground-truth saliency maps of the training samples, so that the correlation between different pixel positions is captured more accurately. In addition, multi-scale features are fused progressively, with low-level features used to optimize the high-level features, which refines the predicted saliency maps. Experiments on the DUT-OMRON and ECSSD datasets show that the proposed method outperforms other comparable methods in terms of both F-measure and mean absolute error.
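The abstract reports results as F-measure and mean absolute error, but this page does not give the formulas. The following is a minimal sketch, assuming the definitions commonly used in salient object detection benchmarks: mean absolute error is the average per-pixel difference between the predicted saliency map and the ground truth (both in [0, 1]), and F-measure is the weighted harmonic mean of precision and recall after binarizing the prediction. The function names, the fixed threshold of 0.5, and the weight `beta_sq = 0.3` are assumptions chosen for illustration and are not taken from the paper.

```python
from typing import Sequence

# A saliency map is modeled as rows of per-pixel values in [0, 1].
Map = Sequence[Sequence[float]]


def mean_absolute_error(pred: Map, gt: Map) -> float:
    """Average per-pixel absolute difference between prediction and ground truth."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def f_measure(pred: Map, gt: Map, threshold: float = 0.5,
              beta_sq: float = 0.3) -> float:
    """Weighted harmonic mean of precision and recall.

    The prediction is binarized at `threshold` (an assumed value);
    `beta_sq` weights precision more heavily than recall when below 1.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted = p >= threshold
            salient = g >= 0.5
            if predicted and salient:
                tp += 1
            elif predicted:
                fp += 1
            elif salient:
                fn += 1
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    # Toy 2x2 example: one salient pixel, one false positive.
    pred = [[0.9, 0.2], [0.6, 0.1]]
    gt = [[1.0, 0.0], [0.0, 0.0]]
    print("MAE:", mean_absolute_error(pred, gt))
    print("F-measure:", f_measure(pred, gt))
```

Lower mean absolute error and higher F-measure indicate better agreement with the ground truth. Published evaluations often sweep the threshold or use an image-adaptive one rather than a fixed value, so numbers from this sketch are not directly comparable to those in the paper.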

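Memo:
Received: 2019-03-02.
Funding: National Natural Science Foundation of China (61701473); STS Program of the Chinese Academy of Sciences (KFJ-STS-ZDTP-070); Beijing Science and Technology Program (Z181100001518006); National Defense Science and Technology Innovation Fund of the Chinese Academy of Sciences (CXJJ-17-M152); Strategic Priority Research Program of the Chinese Academy of Sciences (Category A) (XDA18040400).
About the authors: WANG Kaicheng, master's student; research interests include neural network chips and machine learning. LU Huaxiang, research professor and doctoral supervisor; research interests include neuromorphic computing chips, brain-inspired neural computing technology and application systems, and information and signal processing; has published 1 monograph, holds 10 granted invention patents, and has published more than 40 academic papers. GONG Guoliang, associate research professor; research interests include intelligent algorithms and brain-inspired computing systems, image processing chips, AI chips, and neural network algorithms and their applications; holds 4 granted invention patents and has published 6 academic papers.
Corresponding author: GONG Guoliang. E-mail: gongmianjie@semi.ac.cn
Last Update: 2021-01-15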