
 Clinical data-based model for gastric cancer screening

Chinese Journal of Medical Physics [ISSN:1005-202X/CN:44-1351/R]

Issue:
2019, Issue 9
Page:
1095-1102
Research Field:
Other (laser medicine, etc.)
Publishing date:

Info

Title:
 Clinical data-based model for gastric cancer screening
Author(s):
 YANG Rong1, CHEN Yu2, GAO Hongmei1, CHEN Xianlai3,4
 1. Xiangya Hospital, Central South University, Changsha 410078, China; 2. Xiangya School of Medicine, Central South University, Changsha 410013, China; 3. Information Security and Big Data Research Institute, Central South University, Changsha 410083, China; 4. National Engineering Laboratory for Medical Big Data Application Technology, Central South University, Changsha 410083, China
Keywords:
 stomach neoplasms; model for disease screening; clinical data; decision tree
PACS:
R319
DOI:
10.3969/j.issn.1005-202X.2019.09.020
Abstract:
Objective: To establish an auxiliary screening model based on clinical data and machine learning to improve the early diagnosis of gastric cancer. Methods: A total of 5 585 cases of gastric cancer (ICD code: C16*, group A) were selected as research subjects. As controls, 6 000 cases (group B) were randomly selected from 57 657 cases of non-gastric malignant tumors (ICD code: C*, except C16*), and 6 000 cases without malignant tumors (group C) were randomly selected from 47 225 healthy persons. Demographic information (gender, age) and laboratory test results (routine blood test, blood lipid/liver function, tumor-related markers, Hp, etc.) were extracted from clinical data. Pearson's correlation analysis was used to analyze the relationship between each indicator and the diagnosis, and the independent-samples t test was used to detect differences in indicators between groups. A total of 53 indicators, such as gender, age, carcinoembryonic antigen (CEA) and fecal occult blood, were selected as decision variables. An auxiliary model for gastric cancer screening was established using the C5.0 decision tree algorithm. Results: Indicators such as age, CEA and CA153 were significantly correlated with gastric cancer (P<0.05). The indicators showing inter-group differences varied among the three pairwise comparisons (group A vs. B, group B vs. C, and group A vs. C). A model with 51 rules for gastric cancer screening was obtained by data mining. The top 10 indicators ranked by importance in the model included CA199, CA153 and CEA. The accuracy of the model was 89.58% on the training set and 89.14% on the test set. The area under the curve (AUC) of the model was 0.809. Conclusion: Analysis of clinical data can identify the important indicators for the early diagnosis of gastric cancer. An auxiliary model for gastric cancer screening can be established from clinical data using data mining. The established model has excellent auxiliary value for gastric cancer screening.
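
For readers unfamiliar with two of the statistics named in the abstract, the following is a minimal sketch of how Pearson's correlation coefficient and the area under the ROC curve can be computed. It is not the authors' code and does not implement C5.0; the CEA values, the diagnosis labels, and the function names `pearson_r` and `auc` are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one indicator value per patient and a binary
# diagnosis (1 = gastric cancer, 0 = control). Not taken from the study.
cea = [1.2, 2.0, 2.5, 3.1, 4.8, 6.0, 7.5, 9.3]
diagnosis = [0, 0, 1, 0, 1, 0, 1, 1]

r = pearson_r(cea, diagnosis)
toy_auc = auc(cea, diagnosis)
print(f"Pearson r = {r:.3f}, AUC = {toy_auc:.4f}")
```

In this sketch a single indicator is used directly as the score; in the study the score would instead come from the fitted decision tree model.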

Last Update: 2019-09-24