[1] DONG Junjie, LIU Huaping, XIE Jun, et al. Feedback attention mechanism and context fusion based amodal instance segmentation[J]. CAAI Transactions on Intelligent Systems, 2021, 16(4): 801-810. [doi:10.11992/tis.202007042]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
16
Issue:
2021, No. 4
Pages:
801-810
Column:
Wu Wenjun AI Science and Technology Award Forum
Publication date:
2021-07-05
- Title:
-
Feedback attention mechanism and context fusion based amodal instance segmentation
- Author(s):
-
DONG Junjie1, LIU Huaping2, XIE Jun1, XU Xinying3, SUN Fuchun2
-
1. College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China;
2. State Key Lab. of Intelligent Technology and Systems, Tsinghua University, Beijing 100084, China;
3. College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China
-
- Keywords:
-
amodal instance segmentation; occlusion prediction; feedback connection; attention mechanism; context information; deep learning; neural network; computer vision
- CLC number:
-
TP183
- DOI:
-
10.11992/tis.202007042
- Abstract:
-
Amodal instance segmentation is a recently proposed extension of instance segmentation. Its task is to predict both the visible and the occluded regions of each object instance, so as to perceive the object's complete physical structure and semantic concept. When the shape and semantics of the occluded part of an object are predicted, insufficient discriminative power of the feature representation and a lack of contextual information often lead to underfitting or even incorrect predictions in the occluded region. To address this problem, this paper proposes a context attention module and a feature pyramid structure with a feedback attention mechanism, and introduces feedback connections for relearning. The proposed method effectively captures global semantic information and fine spatial details. Trained and validated on the COCO-amodal dataset, it raises the average precision of the amodal instance segmentation mask from 8.4% to 14.3% and the average recall from 16.6% to 20.8%. The experimental results show that the method significantly improves the accuracy of predicting the occluded parts of objects and effectively addresses the underfitting problem.
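To make the task definition concrete, the following is a minimal sketch of how an amodal mask relates to its visible and occluded parts, and how overlap between a predicted and a reference mask can be scored. It is an illustration only and not the authors' method or evaluation code; the helper names (`union`, `area`, `iou`, `occlusion_rate`) and the toy masks are assumptions introduced here.

```python
from typing import List

# A binary mask as nested lists: 1 = pixel belongs to the region, 0 = background.
Mask = List[List[int]]


def union(a: Mask, b: Mask) -> Mask:
    """Pixel-wise union of two equally sized binary masks."""
    return [[x | y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def area(mask: Mask) -> int:
    """Number of foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)


def iou(pred: Mask, ref: Mask) -> float:
    """Intersection over union of two binary masks (0.0 if both are empty)."""
    inter = sum(p & r for row_p, row_r in zip(pred, ref) for p, r in zip(row_p, row_r))
    total = area(union(pred, ref))
    return inter / total if total else 0.0


def occlusion_rate(visible: Mask, amodal: Mask) -> float:
    """Fraction of the amodal (full) mask that is hidden from view."""
    full = area(amodal)
    return 1.0 - area(visible) / full if full else 0.0


# Hypothetical toy instance: a 2x3 object whose right column is occluded.
visible = [[1, 1, 0, 0],
           [1, 1, 0, 0]]
occluded = [[0, 0, 1, 0],
            [0, 0, 1, 0]]

# The amodal mask covers the visible and the occluded region together.
amodal = union(visible, occluded)
```

A model that predicts only the visible region would score `iou(visible, amodal)` against the amodal reference, which is below 1 whenever any part of the object is occluded; this gap is what amodal prediction aims to close.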
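Memo
Received: 2020-07-24.
Funding: Natural Science Foundation of Shanxi Province (201801D121144, 201801D221190); Joint Fund of the Department of Science and Technology of Liaoning Province and the State Key Laboratory of Robotics (2020-KF-22-06)
Author biographies: DONG Junjie, master's student, whose research interests are intelligent information processing, computer vision, and image recognition; LIU Huaping, associate professor and doctoral supervisor, IEEE Senior Member, council member of the Chinese Association for Artificial Intelligence (CAAI), and secretary-general of the CAAI Technical Committee on Cognitive Systems and Information Processing, whose research interests are robot perception, learning and control, and multimodal information fusion, and who has published more than 340 academic papers; XIE Jun, associate professor, whose research interests are rough sets, granular computing, data mining, and intelligent information processing.
Corresponding author: LIU Huaping. E-mail: hpliu@tsinghua.edu.cn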
Last Update:
1900-01-01