[1]姜文涛,王鑫杰,张晟翀.空间约束注意力机制的图像分类网络[J].智能系统学报,2025,20(6):1444-1460.[doi:10.11992/tis.202505025]
 JIANG Wentao,WANG Xinjie,ZHANG Shengchong.Spatially constrained attention mechanism for image classification network[J].CAAI Transactions on Intelligent Systems,2025,20(6):1444-1460.[doi:10.11992/tis.202505025]

空间约束注意力机制的图像分类网络 (Spatially constrained attention mechanism for image classification network)

参考文献/References:
[1] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[2] DING Xiaohan, ZHANG Xiangyu, HAN Jungong, et al. Scaling up your kernels to 31×31: revisiting large kernel design in CNNs[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 11953-11965.
[3] TAN Mingxing, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks[EB/OL]. (2019-05-28)[2020-09-11]. http://arxiv.org/pdf/1905.11946.pdf.
[4] LIU Xinyu, PENG Houwen, ZHENG Ningxin, et al. EfficientViT: memory efficient vision Transformer with cascaded group attention[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 14420-14430.
[5] 姜文涛, 张大鹏. 优化分类的弱目标孪生网络跟踪研究[J]. 智能系统学报, 2023, 18(5): 984-993.
JIANG Wentao, ZHANG Dapeng. Research on weak object tracking based on Siamese network with optimized classification[J]. CAAI transactions on intelligent systems, 2023, 18(5): 984-993.
[6] 刘晓敏, 余梦君, 乔振壮, 等. 面向多源遥感数据分类的尺度自适应融合网络[J]. 电子与信息学报, 2024, 46(9): 3693-3702.
LIU Xiaomin, YU Mengjun, QIAO Zhenzhuang, et al. Scale adaptive fusion network for multimodal remote sensing data classification[J]. Journal of electronics & information technology, 2024, 46(9): 3693-3702.
[7] 刘佳, 宋泓, 陈大鹏, 等. 非语言信息增强和对比学习的多模态情感分析模型[J]. 电子与信息学报, 2024, 46(8): 3372-3381.
LIU Jia, SONG Hong, CHEN Dapeng, et al. A multimodal sentiment analysis model enhanced with non-verbal information and contrastive learning[J]. Journal of electronics & information technology, 2024, 46(8): 3372-3381.
[8] 王柳, 梁铭炬. 融合深度信息的室内场景分割算法[J]. 计算机系统应用, 2024, 33(3): 111-117.
WANG Liu, LIANG Mingju. Indoor scene segmentation algorithm based on fusion of deep information[J]. Computer systems and applications, 2024, 33(3): 111-117.
[9] ZHAO Youpeng, TANG Huadong, JIANG Yingying, et al. Parameter-efficient vision Transformer with linear attention[C]//2023 IEEE International Conference on Image Processing. Kuala Lumpur: IEEE, 2023: 1275-1279.
[10] SARKAR R, LIANG Hanxue, FAN Zhiwen, et al. Edge-MoE: memory-efficient multi-task vision Transformer architecture with task-level sparsity via mixture-of-experts[C]//2023 IEEE/ACM International Conference on Computer Aided Design. San Francisco: IEEE, 2023: 1-9.
[11] WANG Wenxiao, CHEN Wei, QIU Qibo, et al. CrossFormer: a versatile vision Transformer hinging on cross-scale attention[J]. IEEE transactions on pattern analysis and machine intelligence, 2024, 46(5): 3123-3136.
[12] 姜文涛, 孟庆姣. 自适应时空正则化的相关滤波目标跟踪[J]. 智能系统学报, 2023, 18(4): 754-763.
JIANG Wentao, MENG Qingjiao. Correlation filter tracking for adaptive spatiotemporal regularization[J]. CAAI transactions on intelligent systems, 2023, 18(4): 754-763.
[13] YANG Jian, LI Chen, LI Xuelong. Underwater image restoration with light-aware progressive network[C]//2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Rhodes Island: IEEE, 2023: 1-5.
[14] LI Zixuan, WANG Yuangen. Optimizing Transformer for large-hole image inpainting[C]//2023 IEEE International Conference on Image Processing. Kuala Lumpur: IEEE, 2023: 1180-1184.
[15] CHEN Xiangyu, WANG Xintao, ZHANG Wenlong, et al. HAT: hybrid attention Transformer for image restoration[EB/OL]. (2023-09-11)[2025-10-01]. https://arxiv.org/abs/2309.05239.
[16] JI Jiahuan, ZHONG Baojiang, SONG Weigang, et al. Learning multi-scale features for JPEG image artifacts removal[C]//2023 IEEE International Conference on Image Processing. Kuala Lumpur: IEEE, 2023: 1565-1569.
[17] LIU Yifeng, TIAN Jing. Probabilistic attention map: a probabilistic attention mechanism for convolutional neural networks[J]. Sensors, 2024, 24(24): 8187.
[18] POLANSKY M G, HERRMANN C, HUR J, et al. Boundary attention: learning curves, corners, junctions and grouping[EB/OL]. (2024-01-01)[2025-10-01]. https://arxiv.org/abs/2401.00935.
[19] XIAO Da, MENG Qingye, LI Shengping, et al. Improving Transformers with dynamically composable multi-head attention[EB/OL]. (2024-05-17)[2025-10-01]. https://arxiv.org/abs/2405.08553.
[20] YU Xiang, GUO Hongbo, YUAN Ying, et al. An improved medical image segmentation framework with Channel-Height-Width-Spatial attention module[J]. Engineering applications of artificial intelligence, 2024, 136: 108751.
[21] ZAGORUYKO S, KOMODAKIS N. Wide residual networks[EB/OL]. (2016-05-23)[2025-10-01]. https://arxiv.org/abs/1605.07146.
[22] WANG Qilong, WU Banggu, ZHU Pengfei, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 11531-11539.
[23] PARMAR N, VASWANI A, USZKOREIT J, et al. Image Transformer[C]//International Conference on Machine Learning. Stockholm: PMLR, 2018: 4055-4064.
[24] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in neural information processing systems, 2017, 30: 5998-6008.
[25] CHIEN Y. Pattern classification and scene analysis[J]. IEEE transactions on automatic control, 1974, 19(4): 462-463.
[26] SHARMA N, JAIN V, MISHRA A. An analysis of convolutional neural networks for image classification[J]. Procedia computer science, 2018, 132: 377-384.
[27] NETZER Y, WANG T, COATES A, et al. The street view house numbers (SVHN) dataset[EB/OL]. (2011-12-12)[2023-05-04]. http://ufldl.stanford.edu/housenumbers/.
[28] STALLKAMP J, SCHLIPSING M, SALMEN J, et al. The German traffic sign recognition benchmark[EB/OL]. (2012-03-16)[2023-05-04]. http://benchmark.ini.rub.de/?section=gtsrb&subsection=news.
[29] HU Jie, SHEN Li, ALBANIE S, et al. Squeeze-and-excitation networks[J]. IEEE transactions on pattern analysis and machine intelligence, 2020, 42(8): 2011-2023.
[30] LI Xiang, WANG Wenhai, HU Xiaolin, et al. Selective kernel networks[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 510-519.
[31] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 3-19.
[32] CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 1800-1807.
[33] DAI Jifeng, QI Haozhi, XIONG Yuwen, et al. Deformable convolutional networks[C]//2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 764-773.
[34] YANG B, BENDER G, LE Q V, et al. CondConv: conditionally parameterized convolutions for efficient inference[EB/OL]. (2019-04-10)[2024-10-12]. https://arxiv.org/abs/1904.04971.
[35] LUU M L, HUANG Zeyi, XING E P, et al. Expeditious saliency-guided mix-up through random gradient thresholding[EB/OL]. (2022-12-09)[2024-10-12]. https://arxiv.org/abs/2212.04875.
[36] 郭玉荣, 张珂, 王新胜, 等. 端到端双通道特征重标定DenseNet图像分类[J]. 中国图象图形学报, 2020, 25(3): 486-497.
GUO Yurong, ZHANG Ke, WANG Xinsheng, et al. Image classification method based on end-to-end dual feature reweight DenseNet[J]. Journal of image and graphics, 2020, 25(3): 486-497.
[37] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2024-10-12]. https://arxiv.org/abs/1409.1556.
[38] HASSANI A, WALTON S, SHAH N, et al. Escaping the big data paradigm with compact Transformers[EB/OL]. (2022-06-07)[2024-10-12]. https://arxiv.org/abs/2104.05704.
[39] ZHOU C L, ZHANG H, ZHOU Z K, et al. QKFormer: hierarchical spiking Transformer using Q-K attention[EB/OL]. (2024-03-25)[2024-10-08]. https://arxiv.org/abs/2403.16552.
[40] CHOROMANSKI K, LIKHOSHERSTOV V, DOHAN D, et al. Rethinking attention with performers[EB/OL]. (2020-09-30)[2024-01-05]. https://arxiv.org/pdf/2009.14794.pdf.
[41] LAN Hai, WANG Xihao, SHEN Hao, et al. Couplformer: rethinking vision Transformer with coupling attention[C]//2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2023: 6464-6473.
[42] 谢奕涛, 苏鹭梅, 杨帆, 等. 面向目标类别分类的无数据知识蒸馏方法[J]. 中国图象图形学报, 2024, 29(11): 3401-3416.
XIE Yitao, SU Lumei, YANG Fan, et al. Data-free knowledge distillation for target class classification[J]. Journal of image and graphics, 2024, 29(11): 3401-3416.
[43] 柴智, 丁春涛, 郭慧, 等. CN2Conv: 面向物联网设备的强鲁棒CNN设计方法[J]. 计算机应用研究, 2025, 42(7): 2154-2160.
CHAI Zhi, DING Chuntao, GUO Hui, et al. Combined non-linearity convolution kernel generation: strong robust CNN design method based on IoT[J]. Application research of computers, 2025, 42(7): 2154-2160.
[44] 宫智宇, 王士同. 面向重尾噪声图像分类的残差网络学习方法[J/OL]. 计算机应用. [2025-10-02]. https://doi.org/10.11772/j.issn.1001-9081.2024101407.
GONG Zhiyu, WANG Shitong. Residual network learning method for image classification under heavy-tail noise[J/OL]. Journal of computer applications. [2025-10-02]. https://doi.org/10.11772/j.issn.1001-9081.2024101407.
[45] 杨育婷, 李玲玲, 刘旭, 等. 基于多尺度-多方向Transformer的图像识别[J]. 计算机学报, 2025, 48(2): 249-265.
YANG Yuting, LI Lingling, LIU Xu, et al. Multi-scale and multi-directional Transformer-based image recognition[J]. Chinese journal of computers, 2025, 48(2): 249-265.
[46] 朱秋慧, 杨靖, 黄若愚, 等. 基于部分卷积的多尺度特征卷积神经网络模型[J/OL]. 无线电通信技术. [2025-05-21]. http://kns.cnki.net/kcms/detail/13.1099.TN.20250310.1707.012.html.
ZHU Qiuhui, YANG Jing, HUANG Ruoyu, et al. Partial convolution-based multi-scale feature convolutional neural network model[J/OL]. Radio communications technology. [2025-05-21]. http://kns.cnki.net/kcms/detail/13.1099.TN.20250310.1707.012.html.
相似文献/Similar references:
[1]李海峰,杜军平.颜色特征的图像分类技术研究[J].智能系统学报,2008,3(2):65.[doi:CNKI:SUN:ZNXT.0.2008-02-017]
[2]李海峰,杜军平.颜色特征的图像分类技术研究[J].智能系统学报,2008,3(2):155.
 LI Hai-feng,DU Jun-ping.Image classification technology based on color features[J].CAAI Transactions on Intelligent Systems,2008,3(2):155.
[3]姚伏天,钱沄涛.高斯过程及其在高光谱图像分类中的应用[J].智能系统学报,2011,6(5):396.
 YAO Futian,QIAN Yuntao.Gaussian process and its applications in hyperspectral image classification[J].CAAI Transactions on Intelligent Systems,2011,6(5):396.
[4]尤雅萍,成运,苏松志,等.基于谱域-空域结合特征和图割原理的高光谱图像分类[J].智能系统学报,2015,10(2):201.[doi:10.3969/j.issn.1673-4785.201410040]
 YOU Yaping,CHENG Yun,SU Songzhi,et al.Hyperspectral image classification based on spectral-spatial combination features and graph cut[J].CAAI Transactions on Intelligent Systems,2015,10(2):201.[doi:10.3969/j.issn.1673-4785.201410040]
[5]赵骞,李敏,赵晓杰,等.基于感受野学习的特征词袋模型简化算法[J].智能系统学报,2016,11(5):663.[doi:10.11992/tis.201601001]
 ZHAO Qian,LI Min,ZHAO Xiaojie,et al.Learning receptive fields for compact bag-of-feature model[J].CAAI Transactions on Intelligent Systems,2016,11(5):663.[doi:10.11992/tis.201601001]
[6]费宇杰,吴小俊.一种局部聚合描述符和组显著编码相结合的编码方法[J].智能系统学报,2017,12(2):172.[doi:10.11992/tis.201602010]
 FEI Yujie,WU Xiaojun.A new feature coding algorithm based on the combination of group salient coding and VLAD[J].CAAI Transactions on Intelligent Systems,2017,12(2):172.[doi:10.11992/tis.201602010]
[7]杨梦铎,栾咏红,刘文军,等.基于自编码器的特征迁移算法[J].智能系统学报,2017,12(6):894.[doi:10.11992/tis.201706037]
 YANG Mengduo,LUAN Yonghong,LIU Wenjun,et al.Feature transfer algorithm based on an auto-encoder[J].CAAI Transactions on Intelligent Systems,2017,12(6):894.[doi:10.11992/tis.201706037]
[8]马忠丽,刘权勇,武凌羽,等.一种基于联合表示的图像分类方法[J].智能系统学报,2018,13(2):220.[doi:10.11992/tis.201611036]
 MA Zhongli,LIU Quanyong,WU Lingyu,et al.Syncretic representation method for image classification[J].CAAI Transactions on Intelligent Systems,2018,13(2):220.[doi:10.11992/tis.201611036]
[9]魏彩锋,孙永聪,曾宪华.图正则化字典对学习的轻度认知功能障碍预测[J].智能系统学报,2019,14(2):369.[doi:10.11992/tis.201709033]
 WEI Caifeng,SUN Yongcong,ZENG Xianhua.Dictionary pair learning with graph regularization for mild cognitive impairment prediction[J].CAAI Transactions on Intelligent Systems,2019,14(2):369.[doi:10.11992/tis.201709033]
[10]赵玉新,赵廷.海底声呐图像智能底质分类技术研究综述[J].智能系统学报,2020,15(3):587.[doi:10.11992/tis.202004026]
 ZHAO Yuxin,ZHAO Ting.Survey of the intelligent seabed sediment classification technology based on sonar images[J].CAAI Transactions on Intelligent Systems,2020,15(3):587.[doi:10.11992/tis.202004026]

备注/Memo

Received: 2025-05-27.
Foundation: National Natural Science Foundation of China (61601213); Natural Science Foundation of Liaoning Province (20170540426); Key Fund Project of the Department of Education of Liaoning Province (LJYL049).
About the authors: JIANG Wentao, associate professor, Ph.D. His main research interest is image and visual information computing. He has led a national defense pre-research fund project, a science and technology project of the Department of Education of Liaoning Province, and a general project of the Natural Science Foundation of Liaoning Province, and has published more than 35 academic papers. E-mail: lntuwulue@163.com. WANG Xinjie, master's student. His main research interests are deep learning and image processing, pattern recognition, and artificial intelligence. E-mail: 2585178999@qq.com. ZHANG Shengchong, master's student and senior engineer. His main research interest is digital signal processing. He has published more than 10 academic papers. E-mail: zsc417@126.com.
Corresponding author: JIANG Wentao. E-mail: lntuwulue@163.com
