[1] LI Shengqing, SU Qianmin, HUANG Jihan. Named entity recognition of eligibility criteria for clinical trials based on BioBERT and BiLSTM[J]. Chinese Journal of Medical Physics, 2024, 41(1): 125-132. [doi:10.3969/j.issn.1005-202X.2024.01.018]

Named entity recognition of eligibility criteria for clinical trials based on BioBERT and BiLSTM

Chinese Journal of Medical Physics [ISSN:1005-202X/CN:44-1351/R]

Volume:
41
Issue:
2024, No. 1
Pages:
125-132
Section:
Medical Artificial Intelligence
Publication date:
2024-01-23

Article Info

Title:
Named entity recognition of eligibility criteria for clinical trials based on BioBERT and BiLSTM
Article ID:
1005-202X(2024)01-0125-08
Author(s):
LI Shengqing1, SU Qianmin1, HUANG Jihan2
1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China; 2. Center for Drug Clinical Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
Keywords:
eligibility criteria; named entity recognition; bidirectional long short-term memory network; conditional random field; clinical trial
CLC number:
R318
DOI:
10.3969/j.issn.1005-202X.2024.01.018
Document code:
A
Abstract:
Objective: To propose BioBERT-Att-BiLSTM-CRF, a named entity recognition method for eligibility criteria built on the BioBERT pretrained model. The method automatically extracts relevant information from clinical trials and supports the efficient formulation of eligibility criteria. Methods: Medical entity annotation rules were established by combining the UMLS medical semantic network with expert definitions, and a named entity recognition corpus was constructed to define the entity recognition task. BioBERT-Att-BiLSTM-CRF first converts the text into BioBERT vectors and feeds them into a bidirectional long short-term memory network to capture contextual semantic features; an attention mechanism is applied to extract key features; finally, a conditional random field decodes and outputs the optimal label sequence. Results: BioBERT-Att-BiLSTM-CRF outperformed the other baseline models on the eligibility criteria named entity recognition dataset. Conclusion: BioBERT-Att-BiLSTM-CRF extracts eligibility criteria-related information from clinical trials more efficiently, thereby enhancing the scientific validity of clinical trial registration data and assisting the formulation of eligibility criteria for clinical trials.
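
The final step described in the abstract, decoding the optimal label sequence with a conditional random field, can be illustrated with a minimal Viterbi sketch in plain Python. This is not the authors' implementation. The entity type `Disease`, the example tokens, and every score below are hypothetical values chosen for illustration. In the described model, the per-token (emission) scores would come from the BiLSTM and attention layers over BioBERT vectors, and the transition scores would be learned during training.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for one sentence."""
    if not emissions:
        return []
    # best[t][y]: best score of any label path that ends in label y at token t.
    best = [{y: emissions[0][y] for y in labels}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that maximizes path score + transition.
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final label to recover the full path.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical BIO label set with a single illustrative entity type.
labels = ["O", "B-Disease", "I-Disease"]

# Hypothetical transition scores: "O" -> "I-Disease" is effectively forbidden,
# and continuing an entity gets a small bonus.
transitions = {
    "O": {"O": 0.0, "B-Disease": 0.0, "I-Disease": -1e4},
    "B-Disease": {"O": 0.0, "B-Disease": 0.0, "I-Disease": 0.5},
    "I-Disease": {"O": 0.0, "B-Disease": 0.0, "I-Disease": 0.5},
}

# Hypothetical emission scores for the tokens: history / of / heart / failure.
emissions = [
    {"O": 2.0, "B-Disease": 0.1, "I-Disease": 0.0},
    {"O": 2.0, "B-Disease": 0.0, "I-Disease": 0.3},
    {"O": 0.5, "B-Disease": 1.0, "I-Disease": 1.2},
    {"O": 0.3, "B-Disease": 0.2, "I-Disease": 1.5},
]

print(viterbi_decode(emissions, transitions, labels))
# → ['O', 'O', 'B-Disease', 'I-Disease']
```

With these illustrative scores, choosing the best label for each token independently would tag "heart" as `I-Disease` directly after `O`, which is not a valid BIO sequence. Scoring whole sequences with transition terms yields `B-Disease` followed by `I-Disease` instead, which is the usual motivation for placing a CRF layer on top of per-token scores.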

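Memo

Received: 2023-08-20. About the author: LI Shengqing, master's student; research interest: artificial intelligence technology; E-mail: lsq1118@126.com. Corresponding author: SU Qianmin, PhD, associate professor; research interests: medical data mining and medical data analysis; E-mail: suqm@sues.edu.com
Last Update: 2024-01-23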