[1] MA Yuzhang, ZHANG Wei, SHAO Haochen. CNN-Transformer-based dual-encoder segmentation network model for glaucoma auxiliary diagnosis[J]. Chinese Journal of Medical Physics, 2026, 43(2): 268-275. [doi:10.3969/j.issn.1005-202X.2026.02.018]

CNN-Transformer-based dual-encoder segmentation network model for glaucoma auxiliary diagnosis

Chinese Journal of Medical Physics [ISSN:1005-202X/CN:44-1351/R]

Volume:
43
Issue:
2026, No. 2
Pages:
268-275
Section:
Medical Artificial Intelligence
Publication date:
2026-02-27

Article Info

Title:
CNN-Transformer-based dual-encoder segmentation network model for glaucoma auxiliary diagnosis
Article number:
1005-202X(2026)02-0268-08
Author(s):
MA Yuzhang, ZHANG Wei, SHAO Haochen
School of Medical Information Engineering, Gansu University of Chinese Medicine, Lanzhou 730000, China
Keywords:
optic cup and disc segmentation; auxiliary diagnosis of glaucoma; Transformer; feature fusion; attention mechanism
CLC number:
R318;TP391
DOI:
10.3969/j.issn.1005-202X.2026.02.018
Document code:
A
Abstract:
Accurate segmentation of the optic cup and optic disc is a critical step for calculating morphological parameters in the early screening of glaucoma. To address the boundary blurring and limited segmentation accuracy caused by inefficient local-global feature fusion and inadequate modeling of long-distance dependencies in existing methods, this study proposes a CNN-Transformer-based dual-encoder segmentation network model for the auxiliary diagnosis of glaucoma. Specifically, a dual-branch complementary feature fusion module is designed to replace the traditional skip connection, adopting a dynamic weight allocation strategy to jointly optimize the local details from the CNN and the global context from the Transformer, thereby improving the efficiency of feature fusion. Furthermore, a global attention enhancement module is introduced into the Transformer encoder, which models pixel-level long-distance dependencies with a multi-head self-attention mechanism and enhances the context awareness of boundary regions by integrating depthwise separable convolution, thus effectively alleviating edge discontinuities of the optic cup/disc. Experiments on the REFUGE dataset show that, compared with U-Net, the proposed method improves the Dice coefficient and IoU by 4.11% and 5.62%, respectively, for optic disc segmentation, and by 11.75% and 19.30%, respectively, for optic cup segmentation.
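The evaluation metrics reported above (Dice, IoU) and the morphological parameter that motivates the segmentation task (the cup-to-disc ratio) have standard definitions on binary masks. The sketch below is an illustrative pure-Python implementation of those definitions, not the authors' code; the toy 6×6 masks are invented for demonstration.

```python
# Illustrative metrics for optic cup/disc segmentation on binary masks:
# Dice coefficient, IoU, and vertical cup-to-disc ratio (vCDR).

def dice(pred, gt):
    """Dice = 2|P∩G| / (|P| + |G|) over flat binary masks (lists of 0/1)."""
    inter = sum(p & g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt)
    return 1.0 if total == 0 else 2.0 * inter / total

def iou(pred, gt):
    """IoU = |P∩G| / |P∪G| over flat binary masks."""
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    return 1.0 if union == 0 else inter / union

def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio: ratio of the vertical extents (number of
    rows containing any foreground pixel) of cup and disc (2D lists of 0/1)."""
    def height(mask):
        rows = [i for i, row in enumerate(mask) if any(row)]
        return (rows[-1] - rows[0] + 1) if rows else 0
    disc_h = height(disc_mask)
    return 0.0 if disc_h == 0 else height(cup_mask) / disc_h

# Toy example: a 4x4 disc region containing a 2x2 cup region on a 6x6 grid.
disc = [[1 if 1 <= i <= 4 and 1 <= j <= 4 else 0 for j in range(6)] for i in range(6)]
cup  = [[1 if 2 <= i <= 3 and 2 <= j <= 3 else 0 for j in range(6)] for i in range(6)]
flat = lambda m: [v for row in m for v in row]
print(round(dice(flat(cup), flat(disc)), 3))  # 0.4: cup (4 px) inside disc (16 px)
print(round(vertical_cdr(cup, disc), 2))      # 0.5: cup height 2 / disc height 4
```

A vCDR above roughly 0.6 is a common warning sign in glaucoma screening, which is why small boundary errors in the cup segmentation translate directly into diagnostic error.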

Memo

Memo:
[Received] 2025-07-25 [Fund] Innovation Fund of the Gansu Provincial Department of Education (2022B-113) [First author] MA Yuzhang, master's student, research interest: medical image processing, E-mail: myzhhh@outlook.com [Corresponding author] ZHANG Wei, associate professor, master's supervisor, research interests: medical signal processing and computer simulation, E-mail: 4865354@qq.com
Last Update: 2026-01-27