Answering List and Other questions

Bibliographic Details
Main Author: Razmara, Majid
Format: Others
Published: 2008
Online Access: http://spectrum.library.concordia.ca/976071/1/MR45712.pdf
Razmara, Majid (2008) Answering List and Other questions. Masters thesis, Concordia University.
Description
Summary: The importance of Question Answering is growing with the expansion of information and text documents on the web. Techniques in Question Answering have improved significantly during the last decade, especially after the introduction of the TREC Question Answering track. Most work in this field has been done on answering Factoid questions. In this thesis, however, we present and evaluate two approaches to answering List and Other questions, which are just as important but have not been investigated as much as Factoid questions.

Although answering List questions is not a new research area, answering them automatically still remains a challenge. The median F-score of systems that participated in the TREC-2007 Question Answering track is still very low (0.085), and 74% of the questions had a median F-score of 0. In this thesis, we propose a novel approach to answering List questions. This approach is based on the hypothesis that the answer instances to a List question co-occur within sentences of the documents related to the question and the topic. We use a clustering method to group the candidate answers that co-occur more often. To pinpoint the right cluster, we use the target and the question keywords as "spies". Using this approach, our system placed fourth among 21 teams in the TREC-2007 QA track with an F-score of 0.145.

Other questions have been introduced in the TREC QA track to retrieve other interesting facts about a topic. In our thesis, Other questions are answered using the notion of interest-marking terms. To answer this type of question, our system extracts, from Wikipedia articles, a list of interest-marking terms related to the topic and uses them to extract and score sentences from the document collection where the answer should be found. Sentences are then re-ranked using universal interest markers that are not specific to the topic. The top sentences are then returned as possible answers. To evaluate our approach, we participated in the TREC-2006 and TREC-2007 QA tracks. Using this approach, our system placed third in both years, with F-scores of 0.199 and 0.281, respectively.
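To make the List-question idea concrete, the following is a minimal sketch, not the thesis implementation, of clustering candidate answers by sentence-level co-occurrence and picking the cluster that co-occurs most with the "spy" terms (the target and question keywords). The function name, the greedy single-link clustering, the threshold, and the example data are all illustrative assumptions.

    from collections import defaultdict
    from itertools import combinations

    def cluster_answers_by_cooccurrence(sentences, candidates, spies, threshold=2):
        # Count how often pairs of candidate answers appear in the same sentence,
        # and how often each candidate appears alongside a spy keyword.
        cooc = defaultdict(int)
        spy_hits = defaultdict(int)
        for sent in sentences:
            s = sent.lower()
            present = [c for c in candidates if c.lower() in s]
            for a, b in combinations(sorted(present), 2):
                cooc[(a, b)] += 1
            if any(k.lower() in s for k in spies):
                for c in present:
                    spy_hits[c] += 1

        # Greedy single-link clustering via union-find: candidates whose
        # co-occurrence count meets the threshold end up in one cluster.
        parent = {c: c for c in candidates}
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for (a, b), count in cooc.items():
            if count >= threshold:
                parent[find(a)] = find(b)

        clusters = defaultdict(list)
        for c in candidates:
            clusters[find(c)].append(c)

        # Return the cluster whose members co-occur most with the spy terms.
        return max(clusters.values(), key=lambda cl: sum(spy_hits[c] for c in cl))

    # Hypothetical example: "What countries are members of NAFTA?"
    sents = ["Canada, Mexico and the United States signed the NAFTA agreement.",
             "The NAFTA members are Canada, the United States and Mexico.",
             "Brazil is the largest country in South America."]
    print(cluster_answers_by_cooccurrence(
        sents, ["Canada", "Mexico", "United States", "Brazil"],
        spies=["NAFTA", "members"], threshold=1))
    # -> ['Canada', 'Mexico', 'United States']

In this toy run, "Brazil" never co-occurs with the other candidates or with the spy terms, so it falls into its own cluster and is discarded.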
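For the Other-question approach, the sketch below illustrates sentence scoring with topic-specific interest-marking terms plus re-ranking by universal markers. The weights, the term lists, and the simple word-overlap scoring are assumptions for illustration; the thesis mines its topic terms from Wikipedia articles.

    import re

    def score_sentences(sentences, topic_terms, universal_markers,
                        topic_weight=2.0, universal_weight=1.0):
        # Score each sentence by overlap with topic-specific interest-marking
        # terms and with universal markers (e.g. "first", "only", "largest"),
        # then return the sentences from most to least interesting.
        def words(text):
            return set(re.findall(r"[a-z]+", text.lower()))

        topic_set = {t.lower() for t in topic_terms}
        universal_set = {m.lower() for m in universal_markers}

        scored = []
        for sent in sentences:
            toks = words(sent)
            score = (topic_weight * len(toks & topic_set)
                     + universal_weight * len(toks & universal_set))
            scored.append((score, sent))

        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [sent for _, sent in scored]

The top-ranked sentences from such a scorer would then be returned as candidate nuggets for the Other question.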