[1] SHEN Kai, WANG Xiaofeng, YANG Yadong. Salient object detection based on bidirectional message link convolution neural network[J]. CAAI Transactions on Intelligent Systems, 2019, 14(06):1152-1162. [doi:10.11992/tis.201812003]

Salient object detection based on bidirectional message link convolution neural network

CAAI Transactions on Intelligent Systems [ISSN:1673-4785/CN:23-1538/TP]

Volume: 14
Issue: 2019, No. 06
Pages: 1152-1162
Column:
Publication date: 2019-11-05

Article Info

Title: Salient object detection based on bidirectional message link convolution neural network
Author(s): SHEN Kai (申凯), WANG Xiaofeng (王晓峰), YANG Yadong (杨亚东)
Affiliation: College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
Keywords: salient object detection; convolutional neural network; attention mechanism; bidirectional message link; multi-scale fusion
CLC number: TP391.4
DOI: 10.11992/tis.201812003
Abstract:
Extracting effective features and using them efficiently is one of the most challenging tasks in salient object detection. Ordinary convolutional neural networks (CNNs) struggle to do both at once. This paper proposes a bidirectional message link convolutional neural network (BML-CNN) model that extracts and fuses effective feature information for salient object detection. First, an attention mechanism guides the feature extraction module to extract effective entity features and to select and integrate multi-level context information in a progressive way. Second, a bidirectional message link, composed of a network with a skip connection structure and a message-passing link with a gating function, fuses high-level semantic information with shallow contour information. Finally, a multi-scale fusion strategy encodes the effective convolutional features from multiple layers to generate the final saliency map. Qualitative and quantitative experiments on six benchmark datasets show that BML-CNN achieves state-of-the-art performance under different evaluation metrics.
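The gated bidirectional link and the final fusion step described in the abstract can be illustrated with a toy sketch. This is not the authors' BML-CNN implementation: the function names, the scalar `gate_weight`, the self-gating form, and the use of equal-length lists in place of convolutional feature maps are all assumptions made for illustration, and a real network would learn the gates and resample feature maps between levels.

```python
import math

def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def gated_message(message, gate_weight=1.0):
    """Scale each element of a message by a sigmoid gate computed from that element.
    gate_weight is a hypothetical scalar; a real model would learn its gates."""
    return [sigmoid(gate_weight * m) * m for m in message]

def bidirectional_link(features, gate_weight=1.0):
    """Pass gated messages shallow-to-deep and deep-to-shallow, then sum both passes.
    Assumes every level has the same length (no resampling between levels)."""
    n = len(features)
    # Shallow-to-deep pass: each level adds the gated message from the level below.
    bottom_up = [list(features[0])]
    for i in range(1, n):
        msg = gated_message(bottom_up[i - 1], gate_weight)
        bottom_up.append([f + m for f, m in zip(features[i], msg)])
    # Deep-to-shallow pass: each level adds the gated message from the level above.
    top_down = [None] * n
    top_down[n - 1] = list(features[n - 1])
    for i in range(n - 2, -1, -1):
        msg = gated_message(top_down[i + 1], gate_weight)
        top_down[i] = [f + m for f, m in zip(features[i], msg)]
    return [[b + t for b, t in zip(bu, td)] for bu, td in zip(bottom_up, top_down)]

def fuse_levels(levels):
    """Average the levels position by position and squash to (0, 1) as a toy saliency map."""
    n = len(levels)
    return [sigmoid(sum(vals) / n) for vals in zip(*levels)]

# Three hypothetical feature levels, ordered shallow to deep, three positions each.
features = [[0.5, -1.0, 2.0], [1.0, 0.0, -0.5], [-2.0, 3.0, 0.1]]
merged = bidirectional_link(features)
saliency = fuse_levels(merged)
```

In this sketch the gate only decides how much of a neighbouring level's signal is let through, so each level keeps its own features (the skip-style path) while receiving information from both the shallower and the deeper side.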

References:

[1] ACHANTA R, HEMAMI S, ESTRADA F, et al. Frequency-tuned salient region detection[C]//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA, 2009:1597-1604.
[2] RONNEBERGER O, FISCHER P, BROX T. U-Net:convolutional networks for biomedical image segmentation[J]. arXiv:1505.04597, 2015.
[3] ITTI L, KOCH C, NIEBUR E. A model of saliency-based visual attention for rapid scene analysis[J]. IEEE transactions on pattern analysis and machine intelligence, 1998, 20(11):1254-1259.
[4] LIU Tie, ZHENG Nanning, DING Wei, et al. Video attention:learning to detect a salient object sequence[C]//Proceedings of 2008 19th International Conference on Pattern Recognition. Tampa, USA, 2008:1-4.
[5] LI Guanbin, YU Yizhou. Visual saliency based on multiscale deep features[J]. Computer science, 2015.
[6] WANG Lijun, LU Huchuan, RUAN Xiang, et al. Deep networks for saliency detection via local estimation and global search[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:3183-3192.
[7] WANG Linzhao, WANG Lijun, LU Huchuan, et al. Salient object detection with recurrent fully convolutional networks[J]. IEEE transactions on pattern analysis and machine intelligence, 2018, 41(7):1734-1746.
[8] CHENG Mingming, ZHANG Guoxin, MITRA N, et al. Global contrast based salient region detection[C]//Proceedings of 2011 IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA, 2011:409-416.
[9] JIANG Zhuolin, DAVIS L S. Submodular salient region detection[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA, 2013:2043-2050.
[10] JUNG C, KIM C. A unified spectral-domain approach for saliency detection and its application to automatic object segmentation[J]. IEEE transactions on image processing, 2012, 21(3):1272-1283.
[11] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet:a deep convolutional encoder-decoder architecture for image segmentation[J]. Computer science, arXiv:1511.00561, 2015.
[12] REN Zhixiang, GAO Shenghua, CHIA L T, et al. Region-based saliency detection and its application in object recognition[J]. IEEE transactions on circuits and systems for video technology, 2014, 24(5):769-779.
[13] MU Nana, XU Xiaolong, ZHANG Xong, et al. Salient object detection using a covariance-based CNN model in low-contrast images[J]. Neural computing and applications, 2018, 29(8):181-192.
[14] ZHOU Li, YANG Zhaohui, ZHOU Zongtan, et al. Salient region detection using diffusion process on a two-layer sparse graph[J]. IEEE transactions on image processing, 2017, 26(12):5882-5894.
[15] LIU Tie, DUAN Haibin, SHANG Yuanyuan, et al. Automatic salient object sequence rebuilding for video segment analysis[J]. Science China information sciences, 2018, 61(1):012205.
[16] ZHANG Jing, FENG Shengwei, LI Da, et al. Image retrieval using the extended salient region[J]. Information sciences, 2017, 399:154-182.
[17] SINGH C, PREET KAUR K. A fast and efficient image retrieval system based on color and texture features[J]. Journal of visual communication and image representation, 2016, 41:225-238.
[18] XU Gongwen, XU Lina, LI Xiaomei, et al. An image retrieval method based on visual dictionary and saliency region[J]. International journal of signal processing, image processing and pattern recognition, 2016, 9(7):263-274.
[19] ZHAO Rui, OUYANG Wanli, LI Hongsheng, et al. Saliency detection by multi-context deep learning[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:1265-1274.
[20] DAI Jifeng, HE Kaiming, LI Yi, et al. Instance-sensitive fully convolutional networks[J]. Computer science, 2016.
[21] YANG Wei, OUYANG Wanli, LI Hongsheng, et al. End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA, 2016:3073-3082.
[22] HARIHARAN B, ARBELÁEZ P, GIRSHICK R, et al. Hypercolumns for object segmentation and fine-grained localization[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:447-456.
[23] LEE G, TAI Y W, KIM J. Deep saliency with encoded low level distance map and high level features[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA, 2016:660-668.
[24] LIU Nian, HAN Junwei. DHSNet:deep hierarchical saliency network for salient object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA, 2016:678-686.
[25] XIAO Fen, DENG Wenzheng, PENG Liangchan, et al. Multi-scale deep neural network for salient object detection[J]. IET image processing, 2018, 12(11):2036-2041.
[26] HOU Qibin, CHENG Mingming, HU Xiaowei, et al. Deeply supervised salient object detection with short connections[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA, 2017:5300-5309.
[27] ZHANG Pingping, WANG Dong, LU Huchuan, et al. Amulet:aggregating multi-level convolutional features for salient object detection[C]//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy, 2017:202-211.
[28] JIN Xiaojie, CHEN Yunpeng, FENG Jiashi, et al. Multi-path feedback recurrent neural network for scene parsing[J]. Computer science, arXiv:1608.07706, 2016.
[29] CHEN Long, ZHANG Hanwang, XIAO Jun, et al. SCA-CNN:spatial and channel-wise attention in convolutional networks for image captioning[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA, 2017:6298-6306.
[30] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:3431-3440.
[31] YAN Qiong, XU Li, SHI Jianping, et al. Hierarchical saliency detection[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA, 2013:1155-1162.
[32] YANG Zichao, HE Xiaodong, GAO Jianfeng, et al. Stacked attention networks for image question answering[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA, 2016:21-29.
[33] XU Huijuan, SAENKO K. Ask, attend and answer:exploring question-guided spatial attention for visual question answering[J]. Computer science, arXiv:1511.05234, 2015.
[34] WANG Fei, JIANG Mengqing, QIAN Chen, et al. Residual attention network for image classification[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA, 2017:6450-6458.
[35] XU K, BA J, KIROS R, et al. Show, attend and tell:neural image caption generation with visual attention[J]. Computer science, arXiv:1502.03044, 2015.
[36] SIMONYAN K, VEDALDI A, ZISSERMAN A. Deep inside convolutional networks:visualising image classification models and saliency maps[J]. Computer science, 2013.
[37] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[J]. Computer science, 2013.
[38] MAHENDRAN A, VEDALDI A. Understanding deep image representations by inverting them[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:5188-5196.
[39] WANG Lijun, OUYANG Wanli, WANG Xiaogang, et al. Visual tracking with fully convolutional networks[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile, 2015:3119-3127.
[40] ZHANG Pingping, WANG Dong, LU Huchuan, et al. Learning uncertain convolutional features for accurate saliency detection[C]//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy, 2017:212-221.
[41] YANG Chuan, ZHANG Lihe, LU Huchuan, et al. Saliency detection via graph-based manifold ranking[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA, 2013:3166-3173.
[42] LI Yin, HOU Xiaodi, KOCH C, et al. The secrets of salient object segmentation[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA, 2014:280-287.
[43] EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The Pascal visual object classes (VOC) challenge[J]. International journal of computer vision, 2010, 88(2):303-338.
[44] GLOROT X, BENGIO Y. Understanding the difficulty of training deep feedforward neural networks[C]//Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. Sardinia, Italy, 2010:249-256.
[45] TONG Na, LU Huchuan, RUAN Xiang, et al. Salient object detection via bootstrap learning[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA, 2015:1884-1892.
[46] WANG Tiantian, ZHANG Lihe, LU Huchuan, et al. Kernelized subspace ranking for saliency detection[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands, 2016:450-466.
[47] JIANG Huaizu, WANG Jingdong, YUAN Zejian, et al. Salient object detection:a discriminative regional feature integration approach[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA, 2013:2083-2090.
[48] LI Xi, ZHAO Liming, WEI Lina, et al. DeepSaliency:multi-task deep neural network model for salient object detection[J]. IEEE transactions on image processing, 2016, 25(8):3919-3930.
[49] LI Guanbin, YU Yizhou. Deep contrast learning for salient object detection[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA, 2016:478-487.
[50] ZHU Wangjiang, LIANG Shuang, WEI Yichen, et al. Saliency optimization from robust background detection[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA, 2014:2814-2821.

Similar articles:

[1] YIN Rui, SU Songzhi, LI Shaozi. Convolutional neural network's image moment regularizing strategy[J]. CAAI Transactions on Intelligent Systems, 2016, 11(1):43. [doi:10.11992/tis.201509018]
[2] GONG Zhenting, CHEN Guangxi, REN Xiali, et al. An image retrieval method based on a convolutional neural network and hash coding[J]. CAAI Transactions on Intelligent Systems, 2016, 11(3):391. [doi:10.11992/tis.201603028]
[3] LIU Shuaishi, CHENG Xi, GUO Wenyan, et al. Progress report on new research in deep learning[J]. CAAI Transactions on Intelligent Systems, 2016, 11(5):567. [doi:10.11992/tis.201511028]
[4] SHI Yating, LI Weijun, NING Xin, et al. A facial feature point locating algorithm based on mouth-state constraints[J]. CAAI Transactions on Intelligent Systems, 2016, 11(5):578. [doi:10.11992/tis.201602006]
[5] SONG Wanru, ZHAO Qingqing, CHEN Changhong, et al. Survey on pedestrian re-identification research[J]. CAAI Transactions on Intelligent Systems, 2017, 12(06):770. [doi:10.11992/tis.201706084]
[6] YANG Xiaolan, QIANG Yan, ZHAO Juanjuan, et al. Hashing retrieval for CT images of pulmonary nodules based on medical signs and convolutional neural networks[J]. CAAI Transactions on Intelligent Systems, 2017, 12(06):857. [doi:10.11992/tis.201706035]
[7] WANG Kejun, ZHAO Yandong, XING Xianglei. Deep learning in driverless vehicles[J]. CAAI Transactions on Intelligent Systems, 2018, 13(01):55. [doi:10.11992/tis.201609029]
[8] MO Lingfei, JIANG Hongliang, LI Xuanpeng. Review of deep learning-based video prediction[J]. CAAI Transactions on Intelligent Systems, 2018, 13(01):85. [doi:10.11992/tis.201707032]
[9] WANG Chengji, LUO Zhiming, ZHONG Zhun, et al. Face detection method fusing multi-layer features[J]. CAAI Transactions on Intelligent Systems, 2018, 13(01):138. [doi:10.11992/tis.201707018]
[10] GE Yuanyuan, XU Youjiang, ZHAO Shuai, et al. Detection of small and dense traffic signs in self-driving scenarios[J]. CAAI Transactions on Intelligent Systems, 2018, 13(03):366. [doi:10.11992/tis.201706040]

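Memo:
Received: 2018-12-04.
Funding: National Natural Science Foundation of China (61872231, 61703267); Graduate Innovation Fund of Shanghai Maritime University (2017ycx083).
Author biographies: SHEN Kai, male, born in 1996, master's student. His research interests include computer vision, image processing, and visual question answering. WANG Xiaofeng, male, born in 1958, professor and doctoral supervisor, editorial board member of the International Journal of Granular Computing, Rough Sets and Intelligent Systems (IJGCRSIS), standing committee member of the Machine Learning Technical Committee of the Chinese Association for Artificial Intelligence, and member of its Intelligent Transportation Technical Committee, among other roles. His research interests include artificial intelligence, data mining, and knowledge discovery. He has led or participated in one National 863 Program project and one key project of the National Natural Science Foundation of China, and has led 2 national cooperation projects, 2 Natural Science Foundation of Liaoning Province projects, and more than 30 research projects. He has published more than 70 academic papers. YANG Yadong, male, born in 1990, doctoral student. His research interests include computer vision and image processing.
Corresponding author: WANG Xiaofeng. E-mail: xfwang@shmtu.edu.cn
Last Update: 2019-12-25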