CNN-Lasso-XGBoost based prediction model for TCM formulas in the treatment of dysmenorrhea
Chinese Journal of Medical Physics (《中国医学物理学杂志》) [ISSN: 1005-202X / CN: 44-1351/R]
- Issue:
- 2026, No. 5
- Page:
- 695-700
- Research Field:
- Medical artificial intelligence
- Publishing date:
- Title:
- CNN-Lasso-XGBoost based prediction model for TCM formulas in the treatment of dysmenorrhea
- Author(s):
- ZHANG Sen; REN Zhen
- College of Medical Information Engineering, Gansu University of Chinese Medicine, Lanzhou 730101, China
- Keywords:
- formula for dysmenorrhea; feature selection; convolutional neural network; prediction model
- PACS:
- R318;TP391.7
- DOI:
- 10.3969/j.issn.1005-202X.2026.05.019
- Abstract:
- A CNN-Lasso-XGBoost hybrid model is constructed to address two challenges in the traditional Chinese medicine (TCM) diagnosis of dysmenorrhea: the complex and diverse features produced by the four diagnostic methods, and the limited accuracy of single machine learning models on multi-feature data. The discrete four-diagnostic features are first categorized, encoded, and reconstructed into a two-dimensional matrix that serves as the input of a convolutional neural network (CNN). The CNN applies multi-layer nonlinear transformations to these data through a three-layer convolution-pooling structure, thereby capturing local correlations among symptoms. Lasso regression then applies an L1 regularization penalty to sparsify the feature vectors, and 28 key features are selected according to the absolute values of their coefficients. This step eliminates redundant information unrelated to formula classification and yields a compact, low-dimensional feature space. Finally, the XGBoost ensemble learning model performs formula classification on the optimized feature subset. Experimental results show that the CNN-Lasso-XGBoost formula classification and prediction model achieves an accuracy of 93.93% on the dysmenorrhea dataset, an improvement of 1.53% to 4.66% over other machine learning models. The proposed model also achieves an accuracy of 98.39% on a public lung cancer dataset, outperforming some existing models and thereby verifying its effectiveness and generalization ability.
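The Lasso selection stage described in the abstract (an L1 penalty that sparsifies the coefficients, followed by ranking features on absolute coefficient value) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the toy data, the penalty strength `alpha=0.1`, and the choice of `k=2` are all assumptions made for illustration (the paper keeps 28 features), and the CNN feature extraction and XGBoost classification stages are omitted.

```python
import random


def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    Assumes roughly centred features and target, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def select_top_k(weights, k):
    """Indices of up to k nonzero coefficients with the largest |w|, in ascending order."""
    ranked = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    return sorted(j for j in ranked[:k] if weights[j] != 0.0)


def make_toy_data(seed=0, n_samples=200, n_features=6):
    """Hypothetical data: the target depends only on features 0 and 3."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [3.0 * row[0] - 2.0 * row[3] + 0.1 * rng.gauss(0.0, 1.0) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    w = lasso_coordinate_descent(X, y, alpha=0.1)
    print("coefficients:", [round(v, 3) for v in w])
    print("selected features:", select_top_k(w, k=2))
```

In a pipeline like the one the abstract outlines, the columns of `X` would be the feature vectors produced by the CNN, and the selected column indices would define the input to the downstream classifier. A maintained Lasso implementation would normally replace the hand-written solver shown here.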
Last Update: 2026-05-29