[1] MA Yuzhang, ZHANG Wei, SHAO Haochen. CNN-Transformer-based dual-encoder segmentation network model for glaucoma auxiliary diagnosis[J]. Chinese Journal of Medical Physics, 2026, 43(2): 268-275. [doi:10.3969/j.issn.1005-202X.2026.02.018]

CNN-Transformer-based dual-encoder segmentation network model for glaucoma auxiliary diagnosis

Chinese Journal of Medical Physics [ISSN:1005-202X/CN:44-1351/R]

Volume:
43
Issue:
2026, No. 2
Pages:
268-275
Section:
Medical Artificial Intelligence
Publication date:
2026-02-27

Article Info

Title:
CNN-Transformer-based dual-encoder segmentation network model for glaucoma auxiliary diagnosis
Article number:
1005-202X(2026)02-0268-08
Author(s):
MA Yuzhang, ZHANG Wei, SHAO Haochen
School of Medical Information Engineering, Gansu University of Chinese Medicine, Lanzhou 730000, China
Keywords:
optic cup and disc segmentation; auxiliary diagnosis of glaucoma; Transformer; feature fusion; attention mechanism
CLC number:
R318;TP391
DOI:
10.3969/j.issn.1005-202X.2026.02.018
Document code:
A
Abstract:
Accurate segmentation of the optic cup and optic disc is a critical step for calculating morphological parameters in the early screening of glaucoma. To address the boundary blurring and limited segmentation accuracy caused by inefficient local-global feature fusion and inadequate modeling of long-distance dependencies in existing methods, this study proposes a CNN-Transformer-based dual-encoder segmentation network model for the auxiliary diagnosis of glaucoma. Specifically, a dual-branch complementary feature fusion module is designed to replace the traditional skip connection, adopting a dynamic weight allocation strategy to jointly optimize the local details from the CNN and the global context from the Transformer, thereby improving the efficiency of feature fusion. Furthermore, a global attention enhancement module is introduced into the Transformer encoder, which models pixel-level long-distance dependencies with a multi-head self-attention mechanism and enhances the context awareness of boundary regions by integrating depthwise separable convolution, thus effectively alleviating edge discontinuities of the optic cup/disc. Experiments on the REFUGE dataset show that, compared with U-Net, the proposed method improves the Dice coefficient and IoU by 4.11% and 5.62%, respectively, for optic disc segmentation, and by 11.75% and 19.30%, respectively, for optic cup segmentation.
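The evaluation metrics reported above (Dice, IoU) and the morphological parameter that motivates the segmentation task (the cup-to-disc ratio) have standard definitions on binary masks. The sketch below is an illustrative pure-Python implementation of those definitions, not the authors' code; the toy 6×6 masks are invented for demonstration.

```python
# Illustrative metrics for optic cup/disc segmentation on binary masks:
# Dice coefficient, IoU, and vertical cup-to-disc ratio (vCDR).

def dice(pred, gt):
    """Dice = 2|P∩G| / (|P| + |G|) over flat binary masks (lists of 0/1)."""
    inter = sum(p & g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt)
    return 1.0 if total == 0 else 2.0 * inter / total

def iou(pred, gt):
    """IoU = |P∩G| / |P∪G| over flat binary masks."""
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    return 1.0 if union == 0 else inter / union

def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio: ratio of the vertical extents (number of
    rows containing any foreground pixel) of cup and disc (2D lists of 0/1)."""
    def height(mask):
        rows = [i for i, row in enumerate(mask) if any(row)]
        return (rows[-1] - rows[0] + 1) if rows else 0
    disc_h = height(disc_mask)
    return 0.0 if disc_h == 0 else height(cup_mask) / disc_h

# Toy example: a 4x4 disc region containing a 2x2 cup region on a 6x6 grid.
disc = [[1 if 1 <= i <= 4 and 1 <= j <= 4 else 0 for j in range(6)] for i in range(6)]
cup  = [[1 if 2 <= i <= 3 and 2 <= j <= 3 else 0 for j in range(6)] for i in range(6)]
flat = lambda m: [v for row in m for v in row]
print(round(dice(flat(cup), flat(disc)), 3))  # 0.4: cup (4 px) inside disc (16 px)
print(round(vertical_cdr(cup, disc), 2))      # 0.5: cup height 2 / disc height 4
```

A vCDR above roughly 0.6 is a common warning sign in glaucoma screening, which is why small boundary errors in the cup segmentation translate directly into diagnostic error.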

Memo

Memo:
[Received] 2025-07-25 [Fund] Innovation Fund of the Gansu Provincial Department of Education (2022B-113) [First author] MA Yuzhang, master's student, research interest: medical image processing, E-mail: myzhhh@outlook.com [Corresponding author] ZHANG Wei, associate professor, master's supervisor, research interests: medical signal processing and computer simulation, E-mail: 4865354@qq.com
Last Update: 2026-01-27