
[1] YANG Yuyu, YANG Xiao, PAN Zaiyu, et al. Domain adaptive semantic segmentation based on prototype-guided and adaptive feature fusion[J]. CAAI Transactions on Intelligent Systems, 2025, 20(1): 150-161. [doi:10.11992/tis.202403010]


Domain adaptive semantic segmentation based on prototype-guided and adaptive feature fusion


CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]; Volume: 20; Issue: 2025, No. 1; Pages: 150-161; Column: Academic Papers - Intelligent Systems; Publication date: 2025-01-05

Title:: Domain adaptive semantic segmentation based on prototype-guided and adaptive feature fusion

Author(s):: YANG Yuyu, YANG Xiao, PAN Zaiyu, WANG Jun; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China

Keywords:: deep learning; unsupervised learning; domain adaptation; semantic segmentation; attention mechanism; self-training learning; self-adaptive; transfer learning; prototype guidance

CLC number:: TP301

DOI:: 10.11992/tis.202403010

Abstract:: Unsupervised domain adaptation is important for reducing the data annotation workload in computer vision tasks, particularly pixel-level semantic segmentation. However, the dispersed feature distribution and class imbalance in the target domain, which show up as blurred class boundaries and too few samples for certain categories, make unsupervised domain adaptation difficult. To address these problems, this paper proposes a prototype-guided adaptive feature fusion model. A prototype-guided dual attention network fuses spatial and channel attention features to enhance intra-class compactness. In addition, an adaptive feature fusion module flexibly adjusts the importance of each feature, enabling the network to capture more class-discriminative features across spatial locations and channels and further improving semantic segmentation performance. Experimental results on two challenging synthetic-to-real benchmarks, GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes, demonstrate the effectiveness of the method and its ability to handle complex scenes and imbalanced data.
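
The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a class prototype is simply the mean feature vector of the pixels labeled with that class, and that adaptive fusion is a softmax-weighted sum of a spatial-attention feature and a channel-attention feature. The function names `class_prototypes`, `nearest_prototype`, and `adaptive_fuse`, the scalar gate scores, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def class_prototypes(features, labels):
    """Return {class_id: mean feature vector} over the pixels of each class.

    Assumption: a prototype is the plain per-class mean of pixel features.
    """
    sums = {}
    counts = defaultdict(int)
    for feat, cls in zip(features, labels):
        if cls not in sums:
            sums[cls] = [0.0] * len(feat)
        for i, value in enumerate(feat):
            sums[cls][i] += value
        counts[cls] += 1
    return {cls: [s / counts[cls] for s in vec] for cls, vec in sums.items()}


def nearest_prototype(feat, prototypes):
    """Assign a feature to the class whose prototype is closest (Euclidean)."""
    return min(prototypes, key=lambda cls: math.dist(feat, prototypes[cls]))


def adaptive_fuse(spatial_feat, channel_feat, scores):
    """Fuse two feature vectors with softmax-normalized weights.

    `scores` holds two hypothetical gate logits; in a real network they
    would be learned rather than passed in by hand.
    """
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # shift for numerical stability
    total = sum(exps)
    w_spatial, w_channel = (e / total for e in exps)
    return [w_spatial * a + w_channel * b
            for a, b in zip(spatial_feat, channel_feat)]


# Toy data: three 2-D pixel features, two classes.
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
labels = [0, 0, 1]

protos = class_prototypes(features, labels)
pred = nearest_prototype([1.0, 1.0], protos)
fused = adaptive_fuse([1.0, 3.0], [3.0, 1.0], scores=[0.0, 0.0])
```

With equal gate scores the fusion reduces to a plain average of the two branches; unequal scores shift the result toward one branch, which is the sense in which the importance of each feature is adjusted.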

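Memo

Received: 2024-03-05.
Funding: National Science and Technology Major Project for New Generation Artificial Intelligence (2020AAA0107300); Fundamental Research Funds for the Central Universities (2023QN1077).
About the authors: YANG Yuyu, master's student; main research interests are deep learning and domain adaptive semantic segmentation. E-mail: yyb904yyy@163.com. YANG Xiao, doctoral student; main research interests are computer vision and multimodal representation learning. E-mail: yangxiao523x@163.com. WANG Jun, professor and doctoral supervisor; main research interests are intelligent robots and unmanned systems, biometric recognition, and machine vision. He leads a National Science and Technology Major Project for New Generation Artificial Intelligence. E-mail: jrobot@126.com.
Corresponding author: WANG Jun. E-mail: jrobot@126.com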

Last Update: 2025-01-05
