Hidden Markov Models for Audio to Visual Mapping

Master's === National Dong Hwa University === Department of Computer Science and Information Engineering === 93 === Hidden Markov Models (HMMs) are a powerful class of statistical models. They have recently been applied to speech and pattern recognition, and some researchers have developed Audio-Visual Speech Recognition (AV-ASR) systems. In contrast, we consider using HMMs ma...

Bibliographic Details
Main Authors:	Guang-Yi Wang, 王光一
Other Authors:	Mau-Tsuen Yang
Format:	Others
Language:	en_US
Published:	2005
Online Access:	http://ndltd.ncl.edu.tw/handle/30753519038304625119

id	ndltd-TW-093NDHU5392065
record_format	oai_dc
spelling	ndltd-TW-093NDHU5392065 2016-06-06T04:11:19Z http://ndltd.ncl.edu.tw/handle/30753519038304625119 Hidden Markov Models for Audio to Visual Mapping 隱藏馬可夫模型在音訊到視訊特徵對應之應用 Guang-Yi Wang 王光一 Master's National Dong Hwa University Department of Computer Science and Information Engineering 93 Hidden Markov Models (HMMs) are a powerful class of statistical models. They have recently been applied to speech and pattern recognition, and some researchers have developed Audio-Visual Speech Recognition (AV-ASR) systems. This thesis instead uses HMMs to map between different kinds of signals. We show how to translate audio signals into Facial Animation Parameters (FAPs) and how to adjust the model parameters for different languages and talking styles using a virtual talking head. The system has three parts: signal processing, model training, and synthesis. First, in signal processing, Mel-scale Frequency Cepstral Coefficients (MFCCs) and FAPs are used to extract feature vectors from the audio and the video, respectively. Second, in training, we discuss the parameters of both the HMMs and the Gaussian Mixture Models (GMMs). Finally, once the FAPs corresponding to the audio have been obtained, they are fed into a Facial Animation Engine (FAE) to create new video. In the experiments, we aim for a talking head that can not only imitate talking and singing but also simulate speech in multiple languages. This work can be applied to e-learning, online guides, real-time virtual conferencing, and similar applications. Mau-Tsuen Yang 楊茂村 2005 thesis 55 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Dong Hwa University === Department of Computer Science and Information Engineering === 93 === Hidden Markov Models (HMMs) are a powerful class of statistical models. They have recently been applied to speech and pattern recognition, and some researchers have developed Audio-Visual Speech Recognition (AV-ASR) systems. This thesis instead uses HMMs to map between different kinds of signals. We show how to translate audio signals into Facial Animation Parameters (FAPs) and how to adjust the model parameters for different languages and talking styles using a virtual talking head. The system has three parts: signal processing, model training, and synthesis. First, in signal processing, Mel-scale Frequency Cepstral Coefficients (MFCCs) and FAPs are used to extract feature vectors from the audio and the video, respectively. Second, in training, we discuss the parameters of both the HMMs and the Gaussian Mixture Models (GMMs). Finally, once the FAPs corresponding to the audio have been obtained, they are fed into a Facial Animation Engine (FAE) to create new video. In the experiments, we aim for a talking head that can not only imitate talking and singing but also simulate speech in multiple languages. This work can be applied to e-learning, online guides, real-time virtual conferencing, and similar applications.
author2	Mau-Tsuen Yang
author_facet	Mau-Tsuen Yang Guang-Yi Wang 王光一
author	Guang-Yi Wang 王光一
spellingShingle	Guang-Yi Wang 王光一 Hidden Markov Models for Audio to Visual Mapping
author_sort	Guang-Yi Wang
title	Hidden Markov Models for Audio to Visual Mapping
title_short	Hidden Markov Models for Audio to Visual Mapping
title_full	Hidden Markov Models for Audio to Visual Mapping
title_fullStr	Hidden Markov Models for Audio to Visual Mapping
title_full_unstemmed	Hidden Markov Models for Audio to Visual Mapping
title_sort	hidden markov models for audio to visual mapping
publishDate	2005
url	http://ndltd.ncl.edu.tw/handle/30753519038304625119
work_keys_str_mv	AT guangyiwang hiddenmarkovmodelsforaudiotovisualmapping AT wángguāngyī hiddenmarkovmodelsforaudiotovisualmapping AT guangyiwang yǐncángmǎkěfūmóxíngzàiyīnxùndàoshìxùntèzhēngduìyīngzhīyīngyòng AT wángguāngyī yǐncángmǎkěfūmóxíngzàiyīnxùndàoshìxùntèzhēngduìyīngzhīyīngyòng
_version_	1718295944109228032
