Hidden Markov Models for Audio to Visual Mapping
Master's === National Dong Hwa University === Department of Computer Science and Information Engineering === 93 === Hidden Markov Models (HMMs) are powerful statistical models. They have recently been applied to speech and pattern recognition, and some researchers have developed Audio-Visual Speech Recognition (AV-ASR) systems. This thesis, by contrast, uses HMMs ma...
Main Authors: | Guang-Yi Wang 王光一 |
---|---|
Other Authors: | Mau-Tsuen Yang 楊茂村 |
Format: | Others |
Language: | en_US |
Published: | 2005 |
Online Access: | http://ndltd.ncl.edu.tw/handle/30753519038304625119 |
id |
ndltd-TW-093NDHU5392065 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-093NDHU5392065 2016-06-06T04:11:19Z http://ndltd.ncl.edu.tw/handle/30753519038304625119 Hidden Markov Models for Audio to Visual Mapping 隱藏馬可夫模型在音訊到視訊特徵對應之應用 Guang-Yi Wang 王光一 Master's National Dong Hwa University Department of Computer Science and Information Engineering 93 Hidden Markov Models (HMMs) are powerful statistical models. They have recently been applied to speech and pattern recognition, and some researchers have developed Audio-Visual Speech Recognition (AV-ASR) systems. This thesis, by contrast, uses HMMs to map between different kinds of signals. We show how to translate audio signals into Facial Animation Parameters (FAPs) and how to adjust the model parameters for different languages and talking styles using a virtual talking head. The system has three parts: signal processing, model training, and synthesis. First, in signal processing, Mel-scale Frequency Cepstral Coefficients (MFCCs) and FAPs are extracted as feature vectors from the audio and video, respectively. Second, in training, we discuss the parameters of both the HMMs and the Gaussian Mixture Model (GMM). Finally, once the FAPs corresponding to the audio have been obtained, they are fed to the Facial Animation Engine (FAE) to create new video. In the experiments, we aim for a talking head that can not only imitate talking and singing but also simulate talking in many languages. This work can be applied to e-learning, online guides, real-time virtual conferencing, and so on. Mau-Tsuen Yang 楊茂村 2005 學位論文 ; thesis 55 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Dong Hwa University === Department of Computer Science and Information Engineering === 93 === Hidden Markov Models (HMMs) are powerful statistical models. They have recently been applied to speech and pattern recognition, and some researchers have developed Audio-Visual Speech Recognition (AV-ASR) systems. This thesis, by contrast, uses HMMs to map between different kinds of signals. We show how to translate audio signals into Facial Animation Parameters (FAPs) and how to adjust the model parameters for different languages and talking styles using a virtual talking head. The system has three parts: signal processing, model training, and synthesis. First, in signal processing, Mel-scale Frequency Cepstral Coefficients (MFCCs) and FAPs are extracted as feature vectors from the audio and video, respectively. Second, in training, we discuss the parameters of both the HMMs and the Gaussian Mixture Model (GMM). Finally, once the FAPs corresponding to the audio have been obtained, they are fed to the Facial Animation Engine (FAE) to create new video. In the experiments, we aim for a talking head that can not only imitate talking and singing but also simulate talking in many languages. This work can be applied to e-learning, online guides, real-time virtual conferencing, and so on.
|
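The abstract does not give the thesis's actual algorithm, so the following is only a minimal sketch of one common way an HMM can map audio features to visual parameters: decode the most likely hidden-state sequence from the audio frames with the Viterbi algorithm, then emit a representative FAP vector for each decoded state. Everything here is an assumption made for illustration: the function names (`log_gauss`, `viterbi`, `audio_to_fap`), the single diagonal Gaussian per state (the thesis mentions GMMs), and the toy one-dimensional "audio" values and "mouth opening" FAP values, which stand in for real MFCC and FAP vectors.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def viterbi(obs, log_start, log_trans, means, variances):
    """Return the most likely hidden-state sequence for the frames in obs."""
    n_states = len(log_start)
    # score[s] = best log probability of any state path ending in state s
    score = [log_start[s] + log_gauss(obs[0], means[s], variances[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for x in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s]
                         + log_gauss(x, means[s], variances[s]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def audio_to_fap(obs, log_start, log_trans, means, variances, fap_means):
    """Map audio frames to FAP vectors through the decoded state sequence."""
    path = viterbi(obs, log_start, log_trans, means, variances)
    return path, [fap_means[s] for s in path]

# Hypothetical toy model: two states with 1-D audio features.
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
means = [[0.0], [5.0]]       # audio feature mean per state (made up)
variances = [[1.0], [1.0]]   # audio feature variance per state (made up)
fap_means = [[0.0], [1.0]]   # illustrative "mouth opening" value per state

frames = [[0.1], [-0.2], [4.8], [5.3]]
path, faps = audio_to_fap(frames, log_start, log_trans, means, variances, fap_means)
print(path)  # [0, 0, 1, 1]
print(faps)  # [[0.0], [0.0], [1.0], [1.0]]
```

In a real system the per-state parameters would be estimated from paired audio and video training data rather than written by hand, and the piecewise-constant FAP output would typically be smoothed before being passed to an animation engine.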
author2 |
Mau-Tsuen Yang |
author_facet |
Mau-Tsuen Yang Guang-Yi Wang 王光一 |
author |
Guang-Yi Wang 王光一 |
spellingShingle |
Guang-Yi Wang 王光一 Hidden Markov Models for Audio to Visual Mapping |
author_sort |
Guang-Yi Wang |
title |
Hidden Markov Models for Audio to Visual Mapping |
publishDate |
2005 |
url |
http://ndltd.ncl.edu.tw/handle/30753519038304625119 |
work_keys_str_mv |
AT guangyiwang hiddenmarkovmodelsforaudiotovisualmapping AT wángguāngyī hiddenmarkovmodelsforaudiotovisualmapping AT guangyiwang yǐncángmǎkěfūmóxíngzàiyīnxùndàoshìxùntèzhēngduìyīngzhīyīngyòng AT wángguāngyī yǐncángmǎkěfūmóxíngzàiyīnxùndàoshìxùntèzhēngduìyīngzhīyīngyòng |
_version_ |
1718295944109228032 |