Leveraging Biomedical Ontological Knowledge to Improve Clinical Term Embeddings

Date
2023-05-01
Author
Abuzahra, Fuad Hatem
Department
Engineering
Advisor(s)
Rohit Kate
Abstract
Leveraging Biomedical Ontological Knowledge to Improve Clinical Term Embeddings, by Fuad Abu Zahra. The University of Wisconsin-Milwaukee, 2023. Under the supervision of Dr. Rohit J. Kate.

This research concerns obtaining and using word embeddings for natural language processing tasks in the biomedical domain. Word embeddings are vector representations of words, commonly obtained from large text corpora. This research leverages the biomedical ontology SNOMED CT as an alternative source of embeddings for clinical terms. Existing graph-based methods can only produce embeddings for the concepts of an ontology (i.e., the nodes of its graph), so we developed a novel method to derive embeddings for clinical words and terms from their concept embeddings. These embeddings were evaluated on benchmark datasets of clinical term similarity and on the clinical term normalization task, and were found to work better than corpus-based embeddings. However, unlike corpus-based embeddings, the embeddings obtained from SNOMED CT do not incorporate linguistic knowledge, because the method was not trained on text data. Therefore, we also developed two new methods to combine the two sources of embeddings: generating a synthetic corpus from the SNOMED CT ontology and using it for additional training with corpus-based methods, and fine-tuning a corpus-based system on SNOMED CT concept embeddings. The evaluation showed that the combined embeddings obtained using these methods perform better than either type of embeddings alone.
Subject
Bidirectional Encoder Representations from Transformers (BERT)
Clinical Ontology
Medical Ontology
Ontology Embeddings
SNOMED CT
Word Embeddings
Permanent Link
http://digital.library.wisc.edu/1793/93133
Type
dissertation