[1]姜文涛,由卓丞,张晟翀.动态掩码卷积的图像分类网络[J].智能系统学报,2026,21(2):423-434.[doi:10.11992/tis.202503019]
　JIANG Wentao,YOU Zhuocheng,ZHANG Shengchong.Dynamic mask convolution for image classification networks[J].CAAI Transactions on Intelligent Systems,2026,21(2):423-434.[doi:10.11992/tis.202503019]

动态掩码卷积的图像分类网络

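CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP] Volume: 21; Issue: 2026, No. 2; Pages: 423-434; Column: Academic Papers, Machine Perception and Pattern Recognition; Publication date: 2026-03-05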

Title:: Dynamic mask convolution for image classification networks

作者:: 姜文涛¹, 由卓丞¹, 张晟翀²; 1. 辽宁工程技术大学软件学院, 辽宁葫芦岛 125105;
2. 光电信息控制和安全技术重点实验室, 天津 300308

Author(s):: JIANG Wentao¹, YOU Zhuocheng¹, ZHANG Shengchong²; 1. College of Software, Liaoning Technical University, Huludao 125105, China;
2. Science and Technology on Electro-Optical Information Security Control Laboratory, Tianjin 300308, China

关键词:: 图像分类; 掩码机制; 残差网络; 动态掩码卷积; 膨胀卷积; 注意力机制; 特征融合; 特征提取

Keywords:: image classification; masking mechanism; residual networks; dynamic mask convolution; dilated convolution; attention mechanism; feature fusion; feature extraction

分类号:: TP391

DOI:: 10.11992/tis.202503019

摘要:: 针对复杂场景下传统图像分类方法存在的特征适应性弱、多尺度信息捕捉能力有限以及细节特征表达能力不足的问题，提出了一种基于动态掩码卷积的图像分类网络。1）设计多分支掩码卷积融合模块，将多分支结构与动态掩码机制相结合，以实现不同尺度信息的融合，并根据输入图像的上下文信息动态选择和强化关键特征，从而提升网络的特征提取能力。2）在残差学习中引入自适应增强模块，通过整合像素级与通道级注意力机制自适应调整特征权重，精准地捕捉图像中重要的细节信息。在CIFAR-10、CIFAR-100、SVHN、Imagenette和Imagewoof数据集上的实验，分别达到了96.85%、82.39%、97.88%、93.35%、85.93%的分类准确率，显著优于传统图像分类方法，该网络能够在面对多样化的图像特征和复杂的场景时，表现出优异和稳定的分类性能，为深度学习在图像分类领域的应用提供了新的思路。

Abstract:: To address the weak feature adaptability, limited multi-scale information capture, and insufficient detail representation of traditional image classification methods in complex scenes, an image classification network based on dynamic mask convolution is proposed. First, a multi-branch mask convolution fusion module is designed. It combines a multi-branch structure with a dynamic mask mechanism to fuse information at different scales, and it dynamically selects and strengthens key features according to the context of the input image, thereby improving the feature extraction ability of the network. Second, an adaptive enhancement module is introduced into residual learning. It adaptively adjusts feature weights by integrating pixel-level and channel-level attention mechanisms, so that important details in the image are captured accurately. In experiments on the CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof datasets, the network achieves classification accuracies of 96.85%, 82.39%, 97.88%, 93.35%, and 85.93%, respectively, significantly outperforming traditional image classification methods. The network shows excellent and stable classification performance on diverse image features and complex scenes, and offers a new approach to applying deep learning in image classification.
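The two modules named in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and not the authors' implementation: the branches are taken to be dilated 3×3 convolutions (suggested only by the keyword "dilated convolution"), the dynamic mask is taken to be a sigmoid gate computed from the input by a 1×1 convolution, fusion is a plain sum, and the attention weights come from global average pooling (channel level) and a 1×1 projection (pixel level). All function and parameter names (`conv2d_same`, `masked_multibranch_fusion`, `adaptive_enhance`, and their arguments) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv2d_same(x, w, dilation=1):
    """Naive stride-1, zero-padded 2-D convolution that keeps H and W.

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) kernel with odd k.
    """
    c_out, _, k, _ = w.shape
    pad = dilation * (k // 2)
    height, width = x.shape[1:]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, height, width))
    for i in range(k):
        for j in range(k):
            top, left = i * dilation, j * dilation
            patch = xp[:, top:top + height, left:left + width]
            # Mix input channels with the kernel tap at position (i, j).
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def masked_multibranch_fusion(x, branch_kernels, mask_kernels, dilations):
    """Assumed form: each branch is gated by an input-dependent mask, then summed."""
    fused = 0.0
    for w, m, d in zip(branch_kernels, mask_kernels, dilations):
        features = conv2d_same(x, w, dilation=d)   # one scale per dilation rate
        mask = sigmoid(conv2d_same(x, m))          # 1x1 conv -> values in (0, 1)
        fused = fused + mask * features
    return fused


def adaptive_enhance(x, w_channel, w_pixel):
    """Assumed form: residual branch reweighted by channel and pixel attention."""
    channel_att = sigmoid(w_channel @ x.mean(axis=(1, 2)))       # shape (C,)
    pixel_att = sigmoid(np.einsum("c,chw->hw", w_pixel, x))      # shape (H, W)
    return x + x * channel_att[:, None, None] * pixel_att[None, :, :]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c, h, w = 4, 8, 8
    x = rng.standard_normal((c, h, w))
    dilations = (1, 2, 3)
    branch_kernels = [0.1 * rng.standard_normal((c, c, 3, 3)) for _ in dilations]
    mask_kernels = [0.1 * rng.standard_normal((c, c, 1, 1)) for _ in dilations]
    fused = masked_multibranch_fusion(x, branch_kernels, mask_kernels, dilations)
    out = adaptive_enhance(fused, 0.1 * rng.standard_normal((c, c)),
                           0.1 * rng.standard_normal(c))
    print(fused.shape, out.shape)
```

The mask depends on the input, so the same kernels emphasize different positions for different images; a fixed (static) mask would not have this property. In a trainable network these operations would be expressed in a deep learning framework, with the kernels learned by backpropagation.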

备注/Memo

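Received: 2025-03-12.
Funding: National Natural Science Foundation of China (61601213); Natural Science Foundation of Liaoning Province (20170540426); Key Fund Project of the Education Department of Liaoning Province (LJYL049).
About the authors: JIANG Wentao, associate professor, Ph.D.; main research interest: image and visual information computing; has led a pre-research fund project, a science and technology project of the Education Department of Liaoning Province, and a general project of the Natural Science Foundation of Liaoning Province; has published 35 academic papers. E-mail: lntuwulue@163.com. YOU Zhuocheng, master's degree; main research interests: deep learning and image processing, pattern recognition and artificial intelligence. E-mail: 1046491150@qq.com. ZHANG Shengchong, senior engineer, master's degree; main research interest: digital signal processing; has published more than 10 academic papers. E-mail: zsc417@126.com.
Corresponding author: JIANG Wentao. E-mail: lntuwulue@163.com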
