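Early esophageal cancer image segmentation based on contextual feature awareness and dual frequency upsampling - Chinese Journal of Medical Physics (《中国医学物理学杂志》)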

Article Info

Title:: Early esophageal cancer image segmentation based on contextual feature awareness and dual frequency upsampling

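Authors (Chinese names):: 孟延宗1; 李小霞1,2; 周颖玥1,2; 文黎明3; 秦佳敏3; 刘爽利1,2; 1. School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan 621000; 2. Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang, Sichuan 621000; 3. Department of Gastroenterology, Sichuan Mianyang 404 Hospital, Mianyang, Sichuan 621000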

Author(s):: MENG Yanzong1; LI Xiaoxia1,2; ZHOU Yingyue1,2; WEN Liming3; QIN Jiamin3; LIU Shuangli1,2; 1. School of Information Engineering, Southwest University of Science and Technology, Mianyang 621000, China; 2. Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang 621000, China; 3. Department of Gastroenterology, Sichuan Mianyang 404 Hospital, Mianyang 621000, China

Keywords:: early esophageal cancer; contextual feature awareness; attention mechanism; dilated convolution; dual frequency upsampling

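Abstract (translated from the Chinese):: Objective: To address the loss of detail such as lesion edges during early esophageal cancer image segmentation, a U-net-based segmentation network built on contextual feature awareness and dual frequency upsampling is proposed. Methods: The contextual feature awareness module is improved with an attention mechanism and separable dilated convolution to capture global contextual information and extract more feature detail. A dual frequency upsampling module is proposed that upsamples from the high-frequency and low-frequency parts separately. This effectively reduces the jagged-edge (aliasing) effect produced by pixel interpolation and the checkerboard effect caused by transposed convolution when a single upsampling method is used, and so reduces the loss of detail. Results: The mean intersection over union, sensitivity, and specificity of the proposed method reach 80.34%, 87.47%, and 91.53%, respectively. Conclusion: The proposed model outperforms mainstream semantic segmentation models such as nnU-Net, retains more detail, and improves the segmentation accuracy of early esophageal cancer images.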
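
As a rough illustration of the checkerboard effect mentioned in the abstract, the sketch below counts how many kernel taps of a 1-D transposed convolution land on each output position. It is not the authors' code: the function name `overlap_counts` and the kernel and stride values are assumptions chosen for illustration. When the kernel size is not divisible by the stride, interior positions receive unequal numbers of contributions, which is one commonly cited source of the artifact.

```python
def overlap_counts(n_in, kernel, stride):
    """Count kernel contributions per output position of a 1-D
    transposed convolution with no padding (hypothetical helper)."""
    n_out = (n_in - 1) * stride + kernel
    counts = [0] * n_out
    for i in range(n_in):           # each input sample ...
        for k in range(kernel):     # ... spreads over `kernel` outputs
            counts[i * stride + k] += 1
    return counts


# Kernel size 3 is not divisible by stride 2: interior counts alternate.
uneven = overlap_counts(4, kernel=3, stride=2)
# Kernel size 4 is divisible by stride 2: interior counts are uniform.
even = overlap_counts(4, kernel=4, stride=2)

print(uneven)  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
print(even)    # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

The alternating 2, 1, 2, 1 pattern in the first case is the 1-D analogue of the checkerboard; interpolation-based upsampling avoids it but can leave jagged edges, which is the trade-off the abstract describes.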

Abstract:: Objective: To propose a U-net-based network for early esophageal cancer image segmentation, with a contextual feature awareness module and a dual frequency upsampling module, that addresses the loss of detailed information such as lesion edges during segmentation. Methods: The contextual feature awareness module, improved with an attention mechanism and separable dilated convolution, was used to obtain global contextual information and extract more feature details. The dual frequency upsampling module upsampled from high frequency and low frequency separately, effectively reducing the aliasing effect caused by pixel interpolation and the checkerboard effect caused by transposed convolution when a single upsampling method is used, and thereby reducing the loss of detail information. Results: The mean intersection over union, sensitivity, and specificity of the proposed method reached 80.34%, 87.47%, and 91.53%, respectively. Conclusion: The proposed model is superior to mainstream semantic segmentation models such as nnU-Net: it retains more detailed information and improves the accuracy of early esophageal cancer image segmentation.
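
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how they are commonly computed for a binary lesion mask. It uses the standard confusion-matrix definitions; the abstract does not say how the mean intersection over union is averaged, so averaging the lesion and background IoU here is an assumption, as are the function names and the toy masks. The sketch also assumes both classes occur in the masks, so no denominator is zero.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two binary masks given as nested lists."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Standard definitions; mIoU averaged over the two classes (assumption)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    iou_lesion = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    return {
        "miou": (iou_lesion + iou_background) / 2,
        "sensitivity": tp / (tp + fn),   # fraction of lesion pixels found
        "specificity": tn / (tn + fp),   # fraction of background kept clean
    }


# Toy 2x4 masks, made up for illustration only.
pred = [[1, 1, 0, 0], [1, 0, 0, 0]]
truth = [[1, 0, 0, 0], [1, 1, 0, 0]]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

On these toy masks there are 2 true positives, 1 false positive, 1 false negative, and 4 true negatives, giving a sensitivity of 2/3 and a specificity of 4/5.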