Summary: | Master's === National Tsing Hua University === Department of Computer Science === 105 === We introduce a new method for automatically generating phrasal paraphrases based on synonyms extracted from a monolingual corpus. In our approach, each content word in a given phrase is replaced with its synonyms, and the resulting candidates are validated using n-grams. The method involves extracting and filtering synonymous relations based on surface patterns and word embeddings. At run-time, content words in the given phrase are replaced with synonyms to derive candidate paraphrases, and the candidates are re-ranked based on synonym measures, word embeddings, and n-gram statistics. We present a prototype paraphrasing system, Rephraser2.0, available at http://ironman.nlpweb.org:13142/, that applies the method to a Web-scale corpus. Our methodology clearly supports combining surface patterns and word embeddings for generating paraphrases useful for language reference and second-language learning.
|
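The run-time step described in the summary (substitute synonyms for words in the phrase, discard candidates not attested as n-grams, then re-rank) can be illustrated with a minimal sketch. This is not the Rephraser2.0 implementation: the `SYNONYMS` table, the `NGRAM_COUNTS` table, their values, and the scoring formula (product of synonym similarities times the log of the n-gram count) are all hypothetical stand-ins chosen for illustration, and any word with a synonym entry is treated as a replaceable content word.

```python
from itertools import product
from math import log

# Hypothetical toy resources. A real system would derive synonym scores from
# surface patterns and word embeddings, and counts from a Web-scale corpus.
SYNONYMS = {
    "big": {"large": 0.9, "huge": 0.7},
    "issue": {"problem": 0.9, "matter": 0.6},
}
NGRAM_COUNTS = {
    "a big problem": 500,
    "a huge problem": 300,
    "a large problem": 120,
    "a large issue": 40,
    "a big matter": 5,
}


def paraphrase(phrase, top_k=3):
    """Return up to top_k candidate paraphrases of phrase, best first."""
    # For each word, the options are the word itself (similarity 1.0)
    # plus any synonyms listed for it.
    options = []
    for word in phrase.split():
        choices = [(word, 1.0)]
        choices += list(SYNONYMS.get(word, {}).items())
        options.append(choices)

    scored = []
    for combo in product(*options):
        candidate = " ".join(word for word, _ in combo)
        if candidate == phrase:
            continue  # skip the unchanged input phrase
        count = NGRAM_COUNTS.get(candidate, 0)
        if count == 0:
            continue  # validation: keep only candidates attested as n-grams
        similarity = 1.0
        for _, sim in combo:
            similarity *= sim
        # Assumed scoring: synonym similarity weighted by log n-gram frequency.
        scored.append((similarity * log(1 + count), candidate))

    scored.sort(reverse=True)
    return [candidate for _, candidate in scored[:top_k]]


print(paraphrase("a big issue"))
```

With these toy tables, unattested candidates such as "a huge matter" are filtered out before ranking, so only phrases with n-gram support are returned.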