Summary: | Master's === National Chiao Tung University === Institute of Information Management === 100 === As the number of speech and video documents on the Internet and on portable devices grows, speech summarization has become increasingly important in recent years. Prior research has focused mainly on the broadcast news domain. Unfortunately, the automatic summarization methods used in the past may not suit other speech domains (e.g., lecture speech). This thesis therefore focuses on the lecture speech domain. We analyze the features used in past research, select suitable features through experiments, and propose a three-phase Real-Time Speech Summarizer (RTSS). Phase one selects independent features (e.g., centrality, resemblance to the title, sentence length, term frequency, and thematic words) and calculates the independent feature scores; phase two calculates dependent features, such as position, using the independent feature scores; phase three compares the feature scores and takes their weighted average to find the top-scoring sentences, which form the summary. In the experiments, RTSS is evaluated by comparing the set of summary sentences selected by RTSS with the sets selected by five experts. RTSS achieves a Macro F-Measure of 52% and a Macro Accuracy of 70%, indicating that it can help users get the key information in a speech.
|
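The three-phase scoring described in the summary can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the thesis's actual method: the function names (`independent_scores`, `dependent_scores`, `summarize`), the whitespace tokenizer, the feature formulas, the linear position prior, and the equal default weights are all hypothetical. Only three of the independent features named above (sentence length, term frequency, resemblance to the title) are included, to keep the sketch short.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def independent_scores(sentences, title):
    """Phase one: score each sentence on features that ignore its position."""
    tokenized = [tokenize(s) for s in sentences]
    term_freq = Counter(tok for toks in tokenized for tok in toks)
    title_terms = set(tokenize(title))
    max_len = max((len(toks) for toks in tokenized), default=0)
    max_tf = max(term_freq.values(), default=0)
    scores = []
    for toks in tokenized:
        if not toks:
            # Empty sentences get zero on every feature.
            scores.append({"length": 0.0, "tf": 0.0, "title": 0.0})
            continue
        length = len(toks) / max_len
        # Mean document-level term frequency of the sentence's tokens, scaled to [0, 1].
        tf = sum(term_freq[t] for t in toks) / (len(toks) * max_tf)
        # Fraction of title terms that also occur in the sentence.
        overlap = len(set(toks) & title_terms)
        title_sim = overlap / len(title_terms) if title_terms else 0.0
        scores.append({"length": length, "tf": tf, "title": title_sim})
    return scores


def dependent_scores(scores):
    """Phase two: derive a position score from the phase-one scores.

    Hypothetical rule: a linear prior favouring earlier sentences,
    multiplied by the mean of the independent feature scores.
    """
    n = len(scores)
    result = []
    for i, s in enumerate(scores):
        prior = 1.0 - i / n
        base = (s["length"] + s["tf"] + s["title"]) / 3
        result.append({**s, "position": prior * base})
    return result


def summarize(sentences, title, weights=None, k=1):
    """Phase three: weighted average of all feature scores, then keep the top k."""
    if not sentences:
        return []
    # Equal weights are an assumption; the thesis selects features experimentally.
    weights = weights or {"length": 1.0, "tf": 1.0, "title": 1.0, "position": 1.0}
    scores = dependent_scores(independent_scores(sentences, title))
    total_weight = sum(weights.values())
    combined = [
        sum(weights[f] * s[f] for f in weights) / total_weight for s in scores
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: combined[i], reverse=True)
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(sentences, title, k=3)` on a list of transcript sentences would return the three highest-scoring sentences in transcript order. Changing the `weights` dictionary shifts the balance between features, which is where experimentally chosen weights would plug in.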