Summary: | Doctoral === National Cheng Kung University === Department of Computer Science and Information Engineering (Master's and Doctoral Program) === 100 === Sentence correction has become an important emerging issue in computer-assisted language learning and in post-editing for automatic speech recognition (ASR). However, existing approaches, such as correction grammars and templates or statistical machine translation, are still not robust enough to handle the common errors in sentences produced by second-language learners and in speech recognition outputs. In this dissertation, techniques based on language models, prosodic information, and contextual information are proposed to address the error correction problem for these two kinds of erroneous text in natural language processing.
For non-native sentence correction, we present an approach that uses a proposed language modeling method based on relative positional information, which is suited to the errors made by learners of Chinese as a Second Language. The four error types considered for correction in this dissertation are Lexical Choice, Redundancy, Omission, and Word Order. Methods for generating correction candidates are proposed for each of these four error types. Dynamic programming is then applied to select the best corrected sentence from the generated candidates.
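As a rough illustration of what candidate generation for the four error types might look like, the following is a minimal Python sketch. It is not the dissertation's method: the function name `generate_candidates`, the `confusions` table, and the `insertable` word list are all hypothetical, and the edit operations (substitute, delete, insert, swap adjacent words) are simply one plausible reading of Lexical Choice, Redundancy, Omission, and Word Order.

```python
from typing import Dict, List, Sequence, Set, Tuple


def generate_candidates(
    words: Sequence[str],
    confusions: Dict[str, Sequence[str]],  # hypothetical: word -> likely intended words
    insertable: Sequence[str],             # hypothetical: words learners often omit
) -> List[List[str]]:
    """Return single-edit correction candidates for a segmented sentence."""
    words = list(words)
    original = tuple(words)
    seen: Set[Tuple[str, ...]] = set()
    out: List[List[str]] = []

    def add(candidate: List[str]) -> None:
        key = tuple(candidate)
        # Skip duplicates and candidates identical to the input sentence.
        if key != original and key not in seen:
            seen.add(key)
            out.append(candidate)

    n = len(words)
    for i in range(n):
        # Lexical Choice: replace a word with an alternative from the table.
        for alt in confusions.get(words[i], []):
            add(words[:i] + [alt] + words[i + 1:])
        # Redundancy: delete a possibly superfluous word.
        if n > 1:
            add(words[:i] + words[i + 1:])
        # Word Order: swap two adjacent words.
        if i + 1 < n:
            add(words[:i] + [words[i + 1], words[i]] + words[i + 2:])
    # Omission: insert a possibly missing word at every position.
    for i in range(n + 1):
        for w in insertable:
            add(words[:i] + [w] + words[i:])
    return out
```

In a full system each candidate would then be scored, for example by a language model, before the best sentence is chosen; the sketch stops at generation.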
For speech recognition outputs, a prosodic-word-based method for generating correction candidates is proposed. The prosodic words and their corresponding misrecognized word fragments are obtained from a speech database to construct a misrecognized word fragment table for the extracted prosodic words. For each word fragment in a recognized word sequence, the prosodic words that are likely to have been misrecognized as that input fragment are retrieved from the table for prosodic word candidate expansion. Prosodic-word-based contextual information, comprising a substitution score, a concatenation score, and a fitness score, is then used with dynamic programming to find the best word fragment sequence over the whole sentence as the corrected output.
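The search over expanded candidates can be pictured as a Viterbi-style dynamic program over a candidate lattice. The sketch below is a generic illustration under stated assumptions, not the dissertation's scoring model: `best_sequence`, `unit_score` (standing in for per-candidate terms such as a substitution or fitness score), and `concat_score` (standing in for a concatenation score between neighbors) are hypothetical names, and the scores are assumed to be additive.

```python
from typing import Callable, List, Sequence, Tuple


def best_sequence(
    candidates: Sequence[Sequence[str]],
    unit_score: Callable[[int, str], float],   # hypothetical per-candidate score
    concat_score: Callable[[str, str], float], # hypothetical score between neighbors
) -> Tuple[List[str], float]:
    """Pick one candidate per position so that the summed score is maximal."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[j]: best total score of any path ending in candidates[i][j].
    best = [unit_score(0, c) for c in candidates[0]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor index at position i-1

    for i in range(1, len(candidates)):
        prev = candidates[i - 1]
        new_best: List[float] = []
        pointers: List[int] = []
        for cand in candidates[i]:
            # Choose the predecessor that maximizes the accumulated score.
            k = max(range(len(prev)),
                    key=lambda p: best[p] + concat_score(prev[p], cand))
            new_best.append(best[k] + concat_score(prev[k], cand)
                            + unit_score(i, cand))
            pointers.append(k)
        best = new_best
        back.append(pointers)

    # Trace back from the best final candidate.
    j = max(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total
```

The running time is proportional to the sentence length times the square of the number of candidates per position, which is why the candidate expansion step needs to stay selective.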
Specifically for substitution errors in ASR outputs, the distances between the ASR outputs and the potentially correct alternatives are estimated from a weighted, context-dependent, syllable-cluster-based kernel feature matrix, followed by distance rescaling based on multidimensional scaling (MDS). These distances are then used to construct a lattice of alternative syllables, and dynamic programming is used to obtain the most likely corrections of the substitution errors with respect to the original ASR results.
Experimental results show that the proposed approaches significantly improve error correction performance over a state-of-the-art phrase-based statistical machine translation method for non-native sentences and over a correction-pairs method for ASR outputs.
|