Exploiting non-taxonomic relations for measuring semantic similarity and relatedness in WordNet
Mohannad AlMousaa, Rachid Benlamria, Richard Khouryb
Department of Software Engineering, Lakehead University, Thunder Bay, ON, P7B 5E1, Canada.
Various applications in computational linguistics and artificial intelligence employ semantic similarity to solve challenging tasks, such as word sense disambiguation, text classification, information retrieval, machine translation, and document clustering. To our knowledge, research to date rely solely on the taxonomic relation ISA to evaluate semantic similarity and relatedness between terms. This paper explores the benefits of using all types of non-taxonomic relations in large linked data, such as WordNet knowledge graph, to enhance existing semantic similarity and relatedness measures. We propose a holistic poly-relational approach based on a new relation-based information content and non-taxonomic-based weighted paths to devise a comprehensive semantic similarity and relatedness measure. To demonstrate the benefits of exploiting non-taxonomic relations in a knowledge graph, we used three strategies to deploy non-taxonomic relations at different granularity levels. We conduct experiments on four well-known gold standard datasets. The results of our proposed method demonstrate an improvement over the benchmark semantic similarity methods, including the state-of-the-art knowledge graph embedding techniques, that ranged from 3.8%23.8%, 1.3%18.3%, 31.8%117.2%, and 19.1%111.1%, on all gold standard datasets MC, RG, WordSim, and Mturk, respectively. These results demonstrate the robustness and scalability of the proposed semantic similarity and relatedness measure, significantly improving existing similarity measures.
Keywords: Semantic similarity and relatedness; Knowledge graph; Information content; WordNet.