[1] XIONG Peng, ZHENG Haoran. SpecEmbedding: a deep learning based embedding approach for compound identification[J]. Chinese Journal of Medical Physics, 2025, 42(12): 1660-1667. [doi:10.3969/j.issn.1005-202X.2025.12.017]

SpecEmbedding: a deep learning based embedding approach for compound identification

Chinese Journal of Medical Physics [ISSN: 1005-202X / CN: 44-1351/R]

Volume:
42
Issue:
No. 12, 2025
Pages:
1660-1667
Section:
Medical Bioinformatics
Publication date:
2025-12-29

Article Info

Title:
SpecEmbedding: a deep learning based embedding approach for compound identification
Article ID:
1005-202X(2025)12-1660-08
Author(s):
XIONG Peng (熊鹏), ZHENG Haoran (郑浩然)
School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, Anhui, China
Keywords:
mass spectrum; compound identification; representation learning; contrastive learning
CLC number:
R318
DOI:
10.3969/j.issn.1005-202X.2025.12.017
Document code:
A
Abstract:
To address the heterogeneity of mass spectra caused by the structural diversity of compounds and by variations in experimental conditions, a representation method named SpecEmbedding is proposed to improve the comparability between mass spectra. SpecEmbedding combines sinusoidal embedding with a supervised contrastive learning strategy to transform high-dimensional, complex mass spectra into low-dimensional vector representations. The method is trained and evaluated on the public GNPS dataset and compared with mainstream methods. Experimental results show that SpecEmbedding achieves a Top-1 hit rate of 84.38% on the test set, a 6.3% improvement over CLERMS, the current state-of-the-art method. These results indicate that SpecEmbedding significantly improves the comparability between mass spectra and effectively enhances the accuracy and robustness of compound identification.
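The abstract names three ingredients without giving their formulas: sinusoidal embedding, supervised contrastive learning, and the Top-1 hit rate. The sketch below illustrates one common way each could be written, using only the Python standard library. It is not the authors' implementation: the Transformer-style frequency schedule, the intensity-weighted `embed_spectrum` stand-in for a learned encoder, the temperature value, and all function names are assumptions made for illustration.

```python
import math

def sinusoidal_embedding(mz, dim=8, max_wavelength=10000.0):
    """Encode a scalar m/z value as sin/cos pairs at geometrically spaced
    frequencies (assumed Transformer-style schedule, not taken from the paper)."""
    feats = []
    for i in range(dim // 2):
        freq = 1.0 / (max_wavelength ** (2 * i / dim))
        feats.append(math.sin(mz * freq))
        feats.append(math.cos(mz * freq))
    return feats

def embed_spectrum(peaks, dim=8):
    """Toy stand-in for a learned encoder (hypothetical): the intensity-weighted
    sum of the sinusoidal embeddings of each (mz, intensity) peak."""
    vec = [0.0] * dim
    for mz, intensity in peaks:
        for k, f in enumerate(sinusoidal_embedding(mz, dim)):
            vec[k] += intensity * f
    return vec

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(u, v):
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss in its commonly used form: for each anchor,
    pull same-label samples together and push all other samples away.
    Averaged over anchors that have at least one positive."""
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {j: cosine(embeddings[i], embeddings[j]) / temperature
                  for j in range(n) if j != i}
        # log-sum-exp over all non-anchor samples, shifted for stability
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def top1_hit_rate(query_embs, query_labels, library_embs, library_labels):
    """Fraction of queries whose most similar library entry (by cosine
    similarity) carries the same compound label."""
    hits = 0
    for q, y in zip(query_embs, query_labels):
        best = max(range(len(library_embs)),
                   key=lambda k: cosine(q, library_embs[k]))
        hits += library_labels[best] == y
    return hits / len(query_embs)
```

In this sketch, spectra of the same compound would share a label during training so that `supcon_loss` draws their embeddings together; at identification time, `top1_hit_rate` scores how often the nearest library embedding belongs to the correct compound.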

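Memo

Memo:
[Received] 2025-04-12 [Funding] Strategic Priority Research Program of the Chinese Academy of Sciences (XDB38000000) [Author biography] XIONG Peng, master's student, research interest: bioinformatics, E-mail: pengx@mail.ustc.edu.cn [Corresponding author] ZHENG Haoran, associate professor, master's supervisor, research interest: bioinformatics, E-mail: hrzheng@ustc.edu.cn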
Last Update: 2025-12-29