Summary: | Master's === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 92 === Automatic summarization of spoken documents is a useful technology for many applications, such as information extraction and semantic compression. A good summarization system imitates human listening and comprehension. The procedure comprises speech collection, speech recognition, semantic analysis and understanding, and speech summarization. Several problems remain, in speech recognition, key information extraction, and sentence grammar.
This thesis proposes a new approach to these problems. Automatic summarization proceeds in three steps: speech recognition, speech summarization, and speech concatenation. Speech recognition transcribes the spoken documents into text with segment information. For speech summarization, we combine five knowledge scores using a dynamic programming (DP) technique; the scores include a confidence score, a word significance score, and scores derived from probabilistic context-free grammars and semantic dependency grammars. To preserve the original voice, we extract the audio of the summarized units and concatenate them. The speech concatenation method focuses on spectral fluency, measured by spectral centroid, spectral flux, spectral rolloff, zero-crossing rate, and mel-frequency cepstral coefficients. DP is also used to find the sequence of concatenated units with minimum cost. The experiments show that our method extracts key information and produces fluent concatenated speech.
|
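The DP-based summarization step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the knowledge scores are combined or what the recurrence looks like, so everything below is an assumption: each recognized word is given a single precombined score (`word_scores`), adjacent summary words receive a pairwise score (`pair_score`, standing in for grammar-style scores), and the summary length is fixed in advance. The function name `summarize_dp` and all parameter names are hypothetical, not taken from the thesis.

```python
from typing import Callable, List, Sequence, Tuple


def summarize_dp(
    word_scores: Sequence[float],
    pair_score: Callable[[int, int], float],
    summary_len: int,
) -> Tuple[float, List[int]]:
    """Pick `summary_len` words, in original order, with maximum total score.

    word_scores[j]   : assumed precombined per-word score (e.g. confidence
                       plus significance); the weighting is hypothetical.
    pair_score(i, j) : assumed score for placing word j right after word i
                       in the summary (i < j).
    """
    n = len(word_scores)
    if not 0 < summary_len <= n:
        raise ValueError("summary_len must be between 1 and len(word_scores)")

    neg_inf = float("-inf")
    # best[m][j]: best score of a summary of m + 1 words whose last word is j.
    best = [[neg_inf] * n for _ in range(summary_len)]
    back = [[-1] * n for _ in range(summary_len)]

    for j in range(n):
        best[0][j] = word_scores[j]

    for m in range(1, summary_len):
        for j in range(m, n):          # word j needs at least m predecessors
            for i in range(m - 1, j):  # i is the previous summary word
                cand = best[m - 1][i] + pair_score(i, j) + word_scores[j]
                if cand > best[m][j]:
                    best[m][j] = cand
                    back[m][j] = i

    # Choose the best final word, then follow back-pointers to the start.
    last = summary_len - 1
    end = max(range(last, n), key=lambda j: best[last][j])
    total = best[last][end]
    picked = [end]
    for m in range(last, 0, -1):
        end = back[m][end]
        picked.append(end)
    picked.reverse()
    return total, picked


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.8, 0.2, 0.7]
    # Hypothetical pairwise score: small penalty per skipped word.
    total, picked = summarize_dp(scores, lambda i, j: -0.05 * (j - i - 1), 3)
    print(picked, round(total, 2))
```

The indices returned by `summarize_dp` identify which recognized units to keep; in a pipeline like the one described, the corresponding audio segments would then be passed to the concatenation step, where a second DP of similar shape could minimize a spectral join cost instead of maximizing a score.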