[1] PENG Yutong, LIANG Fengmei. Tumor segmentation method for breast ultrasound images incorporating CNN and ViT[J]. CAAI Transactions on Intelligent Systems, 2024, 19(3): 556-564. [doi:10.11992/tis.202304046]

Tumor segmentation method for breast ultrasound images incorporating CNN and ViT

CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 19; Issue: 2024, No. 3; Pages: 556-564; Column: Academic Papers, Machine Learning; Publication date: 2024-05-05

Title:: Tumor segmentation method for breast ultrasound images incorporating CNN and ViT

Author(s):: PENG Yutong, LIANG Fengmei; College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Jinzhong 030600, China

Keywords:: convolutional neural network; breast ultrasound image segmentation; Swin Transformer; cross-attention mechanism; hybrid loss function; deformable convolution; multi-head skip attention; deep learning

CLC number:: TP391

DOI:: 10.11992/tis.202304046

Document code:: 2023-08-31

Abstract:: Tumor segmentation in breast ultrasound images is difficult because tumor regions vary widely in shape and size, convolutional neural networks (CNNs) are limited in modeling long-range dependencies and spatial correlations, and vision Transformers (ViTs) require very large amounts of training data. To address these problems, a segmentation method that fuses a CNN and a ViT is proposed. A modified Swin Transformer module extracts global features, and a CNN encoder module based on deformable convolution extracts local detail features. A cross-attention mechanism fuses the feature representations of the two scales, and training uses a hybrid loss that combines binary cross-entropy loss with a boundary loss, which improves segmentation accuracy. Experimental results on two public datasets show a significant improvement over existing classical algorithms, with the Dice coefficient improved by 3.8412%, which verifies the effectiveness and feasibility of the proposed method.
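The two quantities named in the abstract, the Dice coefficient and the binary cross-entropy loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat-list mask representation, and the `weight` parameter in `hybrid_loss` are assumptions made for illustration, and the boundary loss itself is left as an externally supplied value because the abstract does not specify its form or weighting.

```python
import math


def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks of 0/1 values."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    # eps keeps the ratio defined when both masks are empty
    return (2.0 * intersection + eps) / (total + eps)


def binary_cross_entropy(probs, true_mask, eps=1e-7):
    """Mean BCE between predicted foreground probabilities and 0/1 labels."""
    losses = []
    for p, t in zip(probs, true_mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


def hybrid_loss(bce_value, boundary_value, weight=0.5):
    """Hypothetical convex mix of the two terms; the weighting is an assumption."""
    return weight * bce_value + (1.0 - weight) * boundary_value
```

For example, `dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])` compares a two-pixel prediction with a one-pixel ground truth that overlaps it in one pixel, giving a value close to 2/3.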



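Memo

Received: 2023-04-24.
Funding: Key Research and Development Program of Shanxi Province (202102030201012).
Author biographies: PENG Yutong, master's student; main research interest: medical image processing. E-mail: pyt34567@163.com; LIANG Fengmei, associate professor, Ph.D.; main research interests: image processing and transmission, and intelligent information processing. Has led and completed one provincial natural science foundation project, one provincial scientific and technological achievement promotion project, and one provincial technological innovation project. Recipient of one second prize (as first contributor) and two third prizes of the Shanxi Provincial Science and Technology Progress Award. Has published more than 50 academic papers. E-mail: fm_liang@163.com
Corresponding author: LIANG Fengmei. E-mail: fm_liang@163.com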

Last Update: 1900-01-01
