Title: Enhancing Ontology Matching by Using Machine Learning, Graph Matching and Information Retrieval Techniques
Abstract: In recent years, ontologies have attracted a lot of attention in the Computer Science community, especially in the Semantic Web field. They serve as explicit conceptual knowledge models and provide the semantic vocabularies that make domain knowledge available for exchange and interpretation among information systems. However, due to the decentralized nature of the semantic web, ontologies are highly heterogeneous. This heterogeneity mainly causes the problem of variation in meaning or ambiguity in entity interpretation and, consequently, it prevents domain knowledge from sharing. Therefore, ontology matching, which discovers correspondences between semantically related entities of ontologies, becomes a crucial task in semantic web applications.
Several challenges to the field of ontology matching have been outlined in recent research. Among them, selection of the appropriate similarity measures as well as configuration tuning of their combination are known as fundamental issues that the community should deal with. In addition, verifying the semantic coherent of the discovered alignment is also known as a crucial task. Furthermore, the difficulty of the problem grows with the size of the ontologies.
To deal with these challenges, in this thesis, we propose a novel matching approach which combines different techniques coming from the fields of machine learning, graph matching and information retrieval in order to enhance the ontology matching quality. Indeed, we make use of information retrieval techniques to design new effective similarity measures for comparing labels and context profiles of entities at element level. We also apply a graph matching method named similarity propagation at structure level that effectively discovers mappings by exploring structural information of entities in the input ontologies. In terms of combination similarity measures at element level, we transform the ontology matching task into a classification task in machine learning. Besides, we propose a dynamic weighted sum method to automatically combine the matching results obtained from the element and structure level matchers. In order to remove inconsistent mappings, we design a new fast semantic filtering method. Finally, to deal with large scale ontology matching task, we propose two candidate selection methods to reduce computational space.
All these contributions have been implemented in a prototype named YAM++. To evaluate our approach, we adopt various tracks namely Benchmark, Conference, Multifarm, Anatomy, Library and Large Biomedical Ontologies from the OAEI campaign. The experimental results show that the proposed matching methods work effectively. Moreover, in comparison to other participants in OAEI campaigns, YAM++ showed to be highly competitive and gained a high ranking position.