Introduction
The Semantic Web (SW) (Bonatti et al., 2019; d’Amato, 2020; Noura et al., 2019; Osman, 2021) has attracted growing attention from researchers because it makes it convenient for people to link and process diverse data. Ontology (Storey, 2017; Wand & Weber, 2017; Verdonck & Gailly, 2018; Kazi & Kazi, 2019; Kuster, 2020) is the SW’s kernel technique, and a biomedical ontology formally defines biomedical entities and their relationships. However, the same biomedical entity may be defined in different contexts or with different terms in different biomedical ontologies, which gives rise to the biomedical semantic heterogeneity problem. To solve this problem, it is vital to determine mappings among heterogeneous entities to bridge the semantic gaps, a process known as biomedical ontology matching.
Since it is unrealistic to determine the mappings manually when the ontologies are very large, various (semi)automatic ontology matching techniques (Xue & Chen, 2020a; Xue & Wang, 2015b) have been proposed. Evolutionary Algorithms (EAs) (Huang et al., 2011; Pan et al., 2020; Liu, 2020) and Machine Learning (ML) (Chen, 2018; Chen et al., 2020; Lin et al., 2020a; Lin et al., 2020c) have been applied successfully in a variety of applications. EA- and ML-based ontology matching techniques are also regarded as promising approaches, e.g., Compact Interactive Memetic Algorithm (CIMA) (Xue & Liu, 2017), Uniform Compact Genetic Algorithm (UCGA) (Jiang & Xue, 2021), Decision Tree (DT) (Amrouch et al., 2016), Logistic Regression (LR) (Alboukaey & Joukhadar, 2018), and Support Vector Machine (SVM) (Mao et al., 2011). Doan et al. first proposed an ML-based method for matching ontologies, in which the similarity measure is expressed as the joint probability distribution of the entities involved (Doan et al., 2004). Mao et al. treated ontology matching as a binary classification problem and addressed it with a non-instance learning-based ontology mapping approach built on SVM (Mao et al., 2011). Khoudja et al. used a neural network to integrate several top-ranked ontology matchers and thereby enhance the quality of the alignment (Khoudja et al., 2020). However, these matching techniques cannot determine high-quality alignments because biomedical entities carry rich semantic meaning and admit flexible representations. Also, several models (Santos et al., 2020; Harrow et al., 2020; Lin et al., 2020b), while effective for the ontology matching problem or for sequence labeling tasks, are neither designed for nor suited to matching biomedical ontologies. Hence, the biomedical ontology matching problem remains an open challenge in terms of alignment quality.
To address this challenge, we propose a matching technique based on an Attention-based Bidirectional Long Short-Term Memory Network (At-BLSTM), which uses the semantic relationships of entities to find the mappings. By introducing the attention mechanism and bidirectionality into LSTM, At-BLSTM can connect the future and past contexts of entity pairs and capture their most significant parts, which enhances the accuracy of the model. In addition, our proposal further improves the alignments’ quality by introducing the character embedding technique, which takes into account the semantic and context information of entities. With these components, At-BLSTM is able to determine superior mappings for the biomedical ontology matching problem.
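To make the two ideas behind At-BLSTM more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that per-token hidden states from a forward and a backward pass are already available, and it uses a common attention-pooling form (a score from a learned query vector applied to the tanh of each state, followed by a softmax). The function names `bidirectional_states` and `attention_pool`, the `query` vector, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bidirectional_states(forward, backward):
    """Join past and future context for every token position.

    Assumption: `backward` was produced by reading the sequence in
    reverse, so it is reversed here to line up with `forward`.
    """
    return [f + b for f, b in zip(forward, reversed(backward))]


def attention_pool(hidden_states, query):
    """Weight each position by its relevance and sum the states.

    `query` stands in for a learned parameter vector (hypothetical here).
    """
    scores = [
        sum(q * math.tanh(h) for q, h in zip(query, state))
        for state in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[i] for w, state in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, pooled


# Toy example: two tokens, one-dimensional state per direction.
states = bidirectional_states([[1.0], [2.0]], [[20.0], [10.0]])
weights, pooled = attention_pool(states, [1.0, 0.0])
print(weights, pooled)
```

In this sketch the attention weights decide how much each token position contributes to the final representation of an entity, which is the sense in which the model can "capture the most significant parts" of an entity pair. A real model would learn `query` and the recurrent states jointly during training.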