[1] PENG Yutong, LIANG Fengmei. Tumor segmentation method for breast ultrasound images incorporating CNN and ViT[J]. CAAI Transactions on Intelligent Systems, 2024, 19(3): 556-564. [doi:10.11992/tis.202304046]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
19
Issue:
No. 3, 2024
Pages:
556-564
Section:
Academic Papers: Machine Learning
Publication date:
2024-05-05
- Title:
-
Tumor segmentation method for breast ultrasound images incorporating CNN and ViT
- Author(s):
-
PENG Yutong, LIANG Fengmei
-
College of Electronic Information and Optical Engineering, Taiyuan University of Technology, Jinzhong 030600, China
-
- Keywords:
-
convolutional neural network; breast ultrasound image segmentation; Swin Transformer; cross-attention mechanism; hybrid loss function; deformable convolution; multihead skip attention; deep learning
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202304046
- Document code:
-
2023-08-31
- Abstract:
-
Tumor regions in breast ultrasound images vary widely in shape and size, which makes segmentation difficult; convolutional neural networks (CNNs) are limited in modeling long-range dependencies and spatial correlations; and vision Transformers (ViTs) require very large amounts of training data. To address these problems, a segmentation method that fuses a CNN and a ViT is proposed. A modified Swin Transformer module extracts global features, and a CNN encoder module based on deformable convolution extracts local detail features. A cross-attention mechanism fuses the feature representations of the two scales, and training uses a hybrid loss that combines binary cross-entropy loss with a boundary loss. Experimental results on two public datasets show that the proposed method segments markedly better than existing classical algorithms, with a 3.8412% improvement in the Dice coefficient, which verifies its effectiveness and feasibility.
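For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of the Dice coefficient and of binary cross-entropy on a toy mask. It is not the authors' implementation. The function names, the `eps` smoothing term, the toy data, and the weight `alpha` in `hybrid_loss` are all assumptions made for illustration; the abstract does not state how the boundary loss is computed or weighted, so it appears here only as a value passed in by the caller.

```python
import math


def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |P and T| / (|P| + |T|) for flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps (an assumption) avoids division by zero when both masks are empty
    return (2.0 * intersection + eps) / (total + eps)


def binary_cross_entropy(probs, target, eps=1e-7):
    """Mean BCE between predicted foreground probabilities and a binary mask."""
    losses = []
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


def hybrid_loss(bce_value, boundary_value, alpha=0.5):
    """Hypothetical weighted sum; the abstract does not give the weighting."""
    return alpha * bce_value + (1.0 - alpha) * boundary_value


# Toy 2x3 mask, flattened row by row (made-up values, not from the paper)
target = [0, 1, 1, 0, 1, 0]
pred_mask = [0, 1, 0, 0, 1, 1]
probs = [0.1, 0.9, 0.4, 0.2, 0.8, 0.6]

dice = dice_coefficient(pred_mask, target)
bce = binary_cross_entropy(probs, target)
print(f"Dice: {dice:.4f}, BCE: {bce:.4f}")
```

In this toy case the prediction and the ground truth each contain three foreground pixels and share two of them, so the Dice coefficient is 2·2/(3+3), about 0.667. In practice a framework loss operating on tensors would replace the list-based loops.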