Speech Recognition for People with Dysphasia in Chinese

Bibliographic Details
Main Authors: LIN, BO-YU, 林柏榆
Other Authors: CHANG, YUE-SHAN
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/569rpn
Description
Summary: Master's === National Taipei University === Department of Computer Science and Information Engineering === 106 === With the advance of technology, speech recognition tools on mobile devices, such as Google Voice and Apple Siri, have become widely used and achieve a high recognition rate on typical speech. However, these tools do not work well for people with dysphasia, for whom the recognition rate is very low, so developing a speech recognition tool for people with dysphasia is an important issue. Because tonal Chinese syllables cannot be recognized effectively in dysphasic speech, we integrate the speech data using the base syllable as the main reference, combining all tones of each base syllable into the same data set. Many open-source deep learning tools for speech recognition are now available, and we use the popular open-source framework TensorFlow to develop our acoustic model. We take Google's speech recognition example, a convolutional neural network for keyword spotting (CNN-KWS), as the basis and improve the network with shortcut connections; experiments show that this effectively increases accuracy. Because speech data from people with dysphasia is scarce, the acoustic model (AM) has a low recognition rate. We therefore use data augmentation in training the acoustic model and, based on this training method, create a novel model called the "Syllable Stratification Acoustic Model (SSAM)". Finally, we design a complete Chinese speech recognition system for people with dysphasia. SSAM replaces the traditional acoustic model and generates the recognized syllables, which addresses the problem of insufficient data. Within the system, a sentence-making module creates the candidate sentences, and an adjustment module allocates weights between words. Clients communicate with the server through a RESTful API to use the speech recognition system.
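The tone-merging step described in the summary can be illustrated with a minimal sketch. It assumes, purely for illustration, that each clip is labelled in numbered pinyin (for example `ma1`, `ma3`); the record does not state the thesis's actual label format, and the names `base_syllable` and `merge_tones` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: labels are numbered pinyin, with a trailing tone digit 1-5.
_TONE_SUFFIX = re.compile(r"[1-5]$")


def base_syllable(label: str) -> str:
    """Strip a trailing tone digit from a numbered-pinyin label."""
    return _TONE_SUFFIX.sub("", label.strip().lower())


def merge_tones(samples):
    """Group (label, clip_path) pairs under their toneless base syllable."""
    merged = defaultdict(list)
    for label, clip in samples:
        merged[base_syllable(label)].append(clip)
    return dict(merged)


if __name__ == "__main__":
    # Hypothetical clip names, used only to show the grouping.
    samples = [("ma1", "a.wav"), ("ma3", "b.wav"), ("shi4", "c.wav")]
    print(merge_tones(samples))
```

Grouping this way gives each base syllable a larger pool of training clips, at the cost of leaving tone disambiguation to a later stage of the system.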