[1] JIANG Wentao, YOU Zhuocheng, ZHANG Shengchong. Dynamic mask convolution for image classification networks[J]. CAAI Transactions on Intelligent Systems, 2026, 21(2): 423-434. [doi:10.11992/tis.202503019]
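CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
21
Issue:
No. 2, 2026
Pages:
423-434
Column:
Academic Papers: Machine Perception and Pattern Recognition
Publication date:
2026-03-05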
- Title:
-
Dynamic mask convolution for image classification networks
- Author(s):
-
JIANG Wentao (姜文涛)1, YOU Zhuocheng (由卓丞)1, ZHANG Shengchong (张晟翀)2
-
1. College of Software, Liaoning Technical University, Huludao 125105, Liaoning, China;
2. Science and Technology on Electro-Optical Information Security Control Laboratory, Tianjin 300308, China
-
- Keywords:
-
image classification; masking mechanism; residual networks; dynamic mask convolution; dilated convolution; attention mechanism; feature fusion; feature extraction
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202503019
- Abstract:
-
To address the weak feature adaptability, limited multi-scale information capture, and insufficient detail representation of traditional image classification methods in complex scenes, an image classification network based on dynamic mask convolution is proposed. First, a multi-branch mask convolution fusion module is designed. It combines a multi-branch structure with a dynamic mask mechanism to fuse information at different scales, and it dynamically selects and strengthens key features according to the context of the input image, which improves the feature extraction ability of the network. Second, an adaptive enhancement module is introduced into residual learning. By integrating pixel-level and channel-level attention mechanisms, it adaptively adjusts feature weights to accurately capture important details in the image. In experiments on the CIFAR-10, CIFAR-100, SVHN, Imagenette, and Imagewoof datasets, the network achieves classification accuracies of 96.85%, 82.39%, 97.88%, 93.35%, and 85.93%, respectively, significantly outperforming traditional image classification methods. The network shows excellent and stable classification performance on diverse image features and complex scenes, and offers a new approach for applying deep learning to image classification.
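The abstract names two components but gives no formulas, so the following is only a minimal pure-Python sketch of the general ideas under stated assumptions. All function names, the above-mean masking rule, the dilation rates (1, 2, 3), the averaging fusion, and the sigmoid-of-mean attention weights are hypothetical choices made for illustration; they are not taken from the paper and should not be read as the authors' design.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dilated_conv3x3(img, kernel, dilation):
    """Single-channel 3x3 convolution with zero padding and a given dilation."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(3):
                for b in range(3):
                    y = i + (a - 1) * dilation
                    x = j + (b - 1) * dilation
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[a][b] * img[y][x]
            out[i][j] = acc
    return out

def dynamic_mask(img):
    """Input-dependent binary mask (assumed rule): keep above-mean positions."""
    h, w = len(img), len(img[0])
    mean = sum(sum(row) for row in img) / (h * w)
    return [[1.0 if v > mean else 0.0 for v in row] for row in img]

def multi_branch_mask_fusion(img, kernel, dilations=(1, 2, 3)):
    """Average the masked outputs of branches with different dilation rates."""
    h, w = len(img), len(img[0])
    mask = dynamic_mask(img)  # the mask is computed from the input itself
    fused = [[0.0] * w for _ in range(h)]
    for d in dilations:
        branch = dilated_conv3x3(img, kernel, d)
        for i in range(h):
            for j in range(w):
                fused[i][j] += mask[i][j] * branch[i][j]
    n = len(dilations)
    return [[v / n for v in row] for row in fused]

def channel_attention(feat):
    """One weight per channel: sigmoid of the channel's global average."""
    return [sigmoid(sum(sum(r) for r in ch) / (len(ch) * len(ch[0])))
            for ch in feat]

def pixel_attention(feat):
    """One weight per position: sigmoid of the mean across channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sigmoid(sum(feat[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]

def adaptive_enhance(feat):
    """Residual reweighting (assumed form): out = x + x * channel_w * pixel_w."""
    cw = channel_attention(feat)
    pw = pixel_attention(feat)
    return [[[v + v * cw[k] * pw[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for k, ch in enumerate(feat)]
```

In this sketch the mask depends only on the input, so different images select different positions, and the attention step is added on a residual path so that the unweighted features are always preserved. A trained network would learn the kernels and the mask and attention parameters rather than use fixed rules like these.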
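Memo
Received: 2025-03-12.
Funding: National Natural Science Foundation of China (61601213); Natural Science Foundation of Liaoning Province (20170540426); Key Fund Project of the Education Department of Liaoning Province (LJYL049).
About the authors: JIANG Wentao, associate professor, PhD. Main research interest: image and visual information computing. Has led a pre-research fund project, a science and technology project of the Education Department of Liaoning Province, and a general project of the Natural Science Foundation of Liaoning Province, and has published 35 academic papers. E-mail: lntuwulue@163.com; YOU Zhuocheng, master's degree. Main research interests: deep learning and image processing, pattern recognition and artificial intelligence. E-mail: 1046491150@qq.com; ZHANG Shengchong, senior engineer, master's degree. Main research interest: digital signal processing. Has published more than 10 academic papers. E-mail: zsc417@126.com.
Corresponding author: JIANG Wentao. E-mail: lntuwulue@163.com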