Improving The Feature Representation of Speech Signals by Self-Organization

Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 82 === This thesis describes a continuous Mandarin speech recognition system in which neural networks are applied to the design of specific modules. Fifteen-five Mel-scale spectral coefficients are used to...


Bibliographic Details
Main Authors:	Chien, Shuen-Der, 簡順德
Other Authors:	Liou, Cheng-Yuan
Format:	Others
Language:	zh-TW
Published:	1994
Online Access:	http://ndltd.ncl.edu.tw/handle/27455552395855493397

id	ndltd-TW-082NTU00392055
record_format	oai_dc
spelling	ndltd-TW-082NTU00392055 2016-07-18T04:09:33Z http://ndltd.ncl.edu.tw/handle/27455552395855493397 Improving The Feature Representation of Speech Signals by Self-Organization 語音特徵的時序扭曲校正方法 (A Time-Warping Correction Method for Speech Features) Chien, Shuen-Der 簡順德 Master's, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, 82 Liou, Cheng-Yuan 劉長遠 1994 學位論文 ; thesis 59 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 82 === This thesis describes a continuous Mandarin speech recognition system in which neural networks are applied to the design of specific modules. Fifteen-five Mel-scale spectral coefficients are used to represent the spectral features of spoken utterances. The prototype for each word is modeled by a one-dimensional self-organizing feature map consisting of 100 equally spaced neurons (cells). Because a topological map develops on the linear array of neurons, the precedence relations among the sequential spectral features are preserved, so the linear array copes with the time-alignment problem implicitly. Two perception energies, E1 and E2, are designed experimentally for pattern matching. The first, E1, evaluates the similarity in distance between a prototype and a word utterance; it is obtained by accumulating the total excitation on the feature map during a word utterance. The second, E2, evaluates the similarity in timing between a prototype and a time-warped word utterance; it is obtained by fitting a precedence curve to the sequential excitation patterns of the feature map over the duration of an utterance. Furthermore, two novel self-organizing algorithms, relaxation and topological adjustment of a one-dimensional prototype in two-dimensional space, are proposed to improve the resolution of E1 and E2. The relaxation process improves the resolution of E1 through more finely tuned training. Because key neurons are selected from the fine-tuned 2-D map to form a new prototype, the computational load does not increase. The topological adjustment process improves the resolution of E2 by narrowing the slope range of the sequential excitation curve. The concepts and methods presented are simulated on a personal computer with a DSP board, and the results are reported as quite satisfactory.
author2	Liou, Cheng-Yuan
author_facet	Liou, Cheng-Yuan Chien, Shuen-Der 簡順德
author	Chien, Shuen-Der 簡順德
spellingShingle	Chien, Shuen-Der 簡順德 Improving The Feature Representation of Speech Signals by Self-Organization
author_sort	Chien, Shuen-Der
title	Improving The Feature Representation of Speech Signals by Self-Organization
publishDate	1994
url	http://ndltd.ncl.edu.tw/handle/27455552395855493397
_version_	1718352402059362304
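
The description field above outlines a word prototype built from a one-dimensional self-organizing map and two matching scores, E1 (distance) and E2 (timing). The following is a minimal sketch of that idea only, not the thesis's implementation. Every name here (train_linear_som, e1_distance_energy, e2_timing_energy), the exp(-squared distance) excitation, the learning-rate and neighbourhood schedules, and the straight-line fit standing in for the "precedence curve" are assumptions chosen for illustration; the record does not specify any of them.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_match(weights, frame):
    """Index of the neuron whose weight vector is closest to the frame."""
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j], frame))


def train_linear_som(frames, n_neurons=100, epochs=20, seed=0):
    """Train a 1-D SOM on one utterance (a time-ordered list of frames)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # Assumption: seed the neurons along the utterance so the linear
    # array starts out roughly time-ordered.
    weights = []
    for i in range(n_neurons):
        src = frames[i * (len(frames) - 1) // max(n_neurons - 1, 1)]
        weights.append([x + rng.uniform(-0.01, 0.01) for x in src])
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                        # decaying learning rate
        radius = max(1.0, (n_neurons / 4) * (1.0 - frac))  # shrinking neighbourhood
        for frame in frames:
            win = best_match(weights, frame)
            for j in range(n_neurons):
                # Gaussian neighbourhood on the linear array of neurons.
                h = math.exp(-((j - win) ** 2) / (2 * radius ** 2))
                w = weights[j]
                for k in range(dim):
                    w[k] += lr * h * (frame[k] - w[k])
    return weights


def e1_distance_energy(weights, frames):
    """Stand-in for E1: mean excitation of the winning neuron per frame."""
    total = 0.0
    for f in frames:
        total += math.exp(-sq_dist(weights[best_match(weights, f)], f))
    return total / len(frames)


def e2_timing_energy(weights, frames):
    """Stand-in for E2: least-squares line through (time, winner index).

    Returns (slope, rms_residual). Requires at least two frames.
    """
    ys = [best_match(weights, f) for f in frames]
    n = len(ys)
    xs = list(range(n))
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rms = math.sqrt(
        sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / n
    )
    return slope, rms
```

In this sketch, an utterance that matches the prototype should give a higher E1 than an unrelated one, and the fitted slope in E2 indicates whether the winning neurons advance along the array in step with time; a recognizer along these lines would compare both scores across the prototypes of all candidate words.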

