[1] ZHANG Sen, REN Zhen. CNN-Lasso-XGBoost based prediction model for TCM formulas in the treatment of dysmenorrhea[J]. Chinese Journal of Medical Physics, 2026, 43(5): 695-700. [doi: 10.3969/j.issn.1005-202X.2026.05.019]

CNN-Lasso-XGBoost based prediction model for TCM formulas in the treatment of dysmenorrhea

Chinese Journal of Medical Physics [ISSN: 1005-202X / CN: 44-1351/R]

Volume:
43
Issue:
No. 5, 2026
Pages:
695-700
Section:
Medical Artificial Intelligence
Publication date:
2026-05-28

Article Info

Title:
CNN-Lasso-XGBoost based prediction model for TCM formulas in the treatment of dysmenorrhea
Article ID:
1005-202X(2026)05-0695-06
Author(s):
ZHANG Sen, REN Zhen
College of Medical Information Engineering, Gansu University of Chinese Medicine, Lanzhou 730101, China
Keywords:
formula for dysmenorrhea; feature selection; convolutional neural network; prediction model
Classification number:
R318;TP391.7
DOI:
10.3969/j.issn.1005-202X.2026.05.019
Document code:
A
Abstract:
Traditional Chinese medicine (TCM) diagnosis and treatment of dysmenorrhea faces two challenges: the features derived from the four diagnostic methods are numerous and complex, and a single machine learning model has limited accuracy on such multi-feature data. To address both, a CNN-Lasso-XGBoost hybrid model is constructed. The discrete four-diagnosis features are categorized, encoded, and reconstructed into a two-dimensional matrix that serves as the input of a convolutional neural network (CNN). The CNN applies multi-layer nonlinear transformations to the original four-diagnosis data through a three-layer convolution-pooling structure, capturing local correlation features among symptoms. Lasso regression then sparsifies the feature vectors with an L1 regularization penalty; 28 key features are selected according to the absolute values of the coefficients, which removes redundant information unrelated to formula classification and yields a low-dimensional, efficient feature space. Finally, the XGBoost ensemble learning model performs formula classification on the optimized feature subset. Experimental results show that the CNN-Lasso-XGBoost formula classification and prediction model achieves an accuracy of 93.93% on the dysmenorrhea dataset, 1.53% to 4.66% higher than other machine learning models. On a public lung cancer dataset the model reaches an accuracy of 98.39%, outperforming some existing models, which supports its effectiveness and generalization ability.
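The Lasso selection step described in the abstract (an L1 penalty drives coefficients to zero, and features are ranked by absolute coefficient value) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, and the coordinate-descent solver, the function names (`soft_threshold`, `lasso_coordinate_descent`, `select_top_k`), the value `alpha=0.2`, and `k=3` are all assumptions chosen for illustration. In the paper the inputs would be the feature vectors produced by the CNN, 28 features are retained, and the selected subset is passed to XGBoost.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step of the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def select_top_k(w, k):
    """Return indices of up to k nonzero coefficients, largest |w_j| first."""
    nonzero = [j for j, wj in enumerate(w) if wj != 0.0]
    return sorted(nonzero, key=lambda j: abs(w[j]), reverse=True)[:k]


# Synthetic stand-in for CNN feature vectors (hypothetical data):
# 10 features, of which only features 0, 3 and 7 influence the target.
rng = random.Random(0)
n_samples, n_features = 200, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[3] + 1.5 * row[7] + rng.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.2)
selected = select_top_k(weights, k=3)
print(sorted(selected))
```

A larger `alpha` zeroes out more coefficients, so the penalty strength and the cut-off `k` jointly determine how many features survive; the abstract does not state how the authors tuned either.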


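Memo:
Received: 2025-12-27. Funding: Gansu Provincial Joint Research Fund (23JRRA1528); Natural Science Foundation of Gansu Province (23JRRA1719). About the author: ZHANG Sen, master's student; research interest: medical artificial intelligence; E-mail: 13598312953@163.com. Corresponding author: REN Zhen, associate professor and master's supervisor; research interest: machine learning; E-mail: rz@gszy.edu.cn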
Last Update: 2026-05-29