Semantic Distance in WordNet: A Simplified and Improved Measure of Semantic Relatedness
In this study we investigate a special kind of semantic distance, called semantic relatedness. Lexical semantic relatedness measures have proved to be useful for a number of applications, such as word sense disambiguation and real-word spelling error correction. Most relatedness measures rely on the observation that the shortest path between nodes in a semantic network provides a representation of the relationship between two concepts. The strength of relatedness is computed in terms of this path.
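As a rough illustration of the path-based idea described above, the following minimal sketch finds the shortest path between two concepts with a breadth-first search and turns its length into a score. The toy network, the function names, and the scoring formula 1 / (1 + path length) are all assumptions made for illustration; they are not the measure proposed in this thesis, and the edges are not taken from WordNet.

```python
from collections import deque

# Hypothetical toy semantic network (undirected edges); not real WordNet data.
TOY_EDGES = [
    ("car", "vehicle"),
    ("bicycle", "vehicle"),
    ("vehicle", "artifact"),
    ("fork", "artifact"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of (a, b) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Return the number of edges on the shortest path, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def toy_relatedness(graph, a, b):
    """Illustrative score only: 1 / (1 + path length), 0.0 if unconnected."""
    length = shortest_path_length(graph, a, b)
    return 0.0 if length is None else 1.0 / (1.0 + length)
```

In this toy network, "car" and "bicycle" are two edges apart (through "vehicle"), so they score higher than "car" and "fork", which are three edges apart. Any decreasing function of path length would show the same ordering.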
This dissertation makes several significant contributions to the study of semantic relatedness. We describe a new measure that calculates semantic relatedness as a function of the shortest path in a semantic network. The proposed measure achieves better results than other standard measures and yet is much simpler than previous models. The proposed measure is shown to achieve a correlation of r = 0.897 with the judgments of human test subjects using a standard benchmark data set, representing the best performance reported in the literature. We also provide a general formal description for a class of semantic distance measures: namely, those measures that compute semantic distance from the shortest path in a semantic network. Lastly, we suggest a new methodology for developing path-based semantic distance measures that would limit the possibility of unnecessary complexity in future measures.
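For readers unfamiliar with how a measure is compared against human judgments, the sketch below computes a Pearson correlation coefficient r between two equal-length lists of scores. It assumes that the reported r is a Pearson coefficient, which the abstract does not state explicitly; the function name is hypothetical, and no benchmark data is reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In an evaluation of this kind, one list would hold the measure's scores for a set of word pairs and the other the mean human ratings for the same pairs, in the same order.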
School: University of Waterloo
School Location: Canada - Ontario
Source Type: Master's Thesis
Keywords: computer science relatedness similarity distance lexical semantic computational measure wordnet
Date of Publication: 01/01/2006