Contrastive learning-driven multi-omics molecular subtyping of ovarian cancer
Chinese Journal of Medical Physics (《中国医学物理学杂志》) [ISSN: 1005-202X / CN: 44-1351/R]
- Issue:
- 2026, Issue 4
- Page:
- 553-560
- Research Field:
- Medical artificial intelligence
- Publishing date:
Info
- Title:
- Contrastive learning-driven multi-omics molecular subtyping of ovarian cancer
- Author(s):
- HAN Xiaoxin1; LIU Qingchen1; HU Yuchen1; KUANG Jingfan1; WANG Jianlin2
- 1. School of Medical Information Engineering, Gansu University of Chinese Medicine, Lanzhou 730000, China; 2. Information Center, Lanzhou University First Hospital, Lanzhou 730013, China
- Keywords:
- ovarian cancer; molecular subtyping; multi-omics data; contrastive learning; deep clustering
- PACS:
- R318;R73
- DOI:
- 10.3969/j.issn.1005-202X.2026.04.020
- Abstract:
- Molecular subtyping of ovarian cancer is essential for personalized treatment and prognostic assessment, but high tumor heterogeneity and the high-dimensional, low-sample-size problem compromise the accuracy of traditional methods. To address these challenges, an end-to-end multi-omics model is proposed, the contrastive deep clustering model (CDCM), which integrates contrastive learning and deep clustering. CDCM fuses RNA-seq, CNV and DNA methylation data from TCGA-OV and feeds them into an autoencoder to capture complex nonlinear interactions in the data. Four data-augmentation strategies are then used to construct positive and negative sample pairs, and a contrastive learning mechanism improves the robustness and discriminability of the learned representations, thereby alleviating overfitting under the high-dimensional, low-sample-size condition. Finally, a clustering loss based on the Student's t-distribution is jointly optimized with the contrastive loss to drive samples directly toward cluster centers and obtain more separable, well-defined subtypes. Ablation experiments using XGBoost quantify the contribution of each omics modality and demonstrate that multi-omics integration can substantially improve subtyping performance. To enhance the model's biological interpretability, XGBoost and WGCNA are combined to identify 12 candidate biomarkers associated with ovarian cancer, 10 of which have been validated in existing literature. CDCM outperforms classical models such as K-Means, achieving a silhouette score of 0.579, a Calinski-Harabasz index of 344.85, and a survival-difference significance of -lg P = 1.771, providing a new methodological avenue for precision diagnosis and treatment of ovarian cancer.
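The abstract does not give the form of the Student's t-distribution clustering loss. As a rough illustration only, the sketch below assumes the formulation commonly used in deep embedded clustering: soft assignments from a t-kernel, a sharpened target distribution, and a KL divergence between the two. The function names, the `alpha` parameter, and the toy embeddings are hypothetical and are not taken from CDCM.

```python
import math

def soft_assign(z, centers, alpha=1.0):
    """Soft assignment q[i][j] of embedding i to cluster j via a Student's t kernel.

    Assumed form (not confirmed by the abstract):
    q_ij is proportional to (1 + ||z_i - mu_j||^2 / alpha) ** (-(alpha + 1) / 2).
    """
    q = []
    for zi in z:
        row = []
        for mu in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))  # squared distance
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalise over clusters
    return q

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, with f_j the soft cluster size."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        s = sum(w)
        p.append([v / s for v in w])
    return p

def kl_clustering_loss(p, q):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0
    )

# Toy 2-D embeddings standing in for autoencoder outputs (hypothetical data).
z = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centers = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(z, centers)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
```

In a joint objective of the kind the abstract describes, a term like `loss` would be added to the contrastive loss, typically with a weighting factor, and minimized with respect to both the encoder and the cluster centers. How CDCM weights the two terms is not stated here.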
Last Update: 2026-04-29