Summary: | A <em>lexical analogy</em> is two pairs of words (<em>w</em><sub>1</sub>, <em>w</em><sub>2</sub>) and (<em>w</em><sub>3</sub>, <em>w</em><sub>4</sub>) such that the relation between <em>w</em><sub>1</sub> and <em>w</em><sub>2</sub> is identical or similar to the relation between <em>w</em><sub>3</sub> and <em>w</em><sub>4</sub>. For example, (<em>abbreviation</em>, <em>word</em>) forms a lexical analogy with (<em>abstract</em>, <em>report</em>), because in both cases the former is a shortened version of the latter. Lexical analogies are of theoretic interest because they represent a second order similarity measure: <em>relational similarity</em>. Lexical analogies are also of practical importance in many applications, including text-understanding and learning ontological relations. <BR> <BR> This thesis presents a novel system that generates lexical analogies from a corpus of text documents. The system is motivated by a well-established theory of analogy-making, and views lexical analogy generation as a series of three processes: identifying pairs of words that are semantically related, finding clues to characterize their relations, and generating lexical analogies by matching pairs of words with similar relations. The system uses a <em>dependency grammar</em> to characterize semantic relations, and applies machine learning techniques to determine their similarities. Empirical evaluation shows that the system performs remarkably well, generating lexical analogies at a precision of over 90%.
|