Automatic Singer Identification Systems Trained Using Speech Data

Master's thesis === National Taipei University of Technology === Department of Electronic Engineering === 106 === Intuitively, a singer identification (SID) system is trained using the singing voice of each singer to be identified. However, in most cases, it is difficult to obtain a singer's a cappella recordings in advance, especially for popular singers, because popular music usually...

Bibliographic Details
Main Author: Zhe-Ping Su (蘇哲平)
Other Authors: Wei-Ho Tsai (蔡偉和)
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/ucd57z
id ndltd-TW-106TIT05427063
record_format oai_dc
spelling ndltd-TW-106TIT05427063 2019-07-04T06:00:00Z http://ndltd.ncl.edu.tw/handle/ucd57z Automatic Singer Identification Systems Trained Using Speech Data 以說話聲建立歌手識別系統 Zhe-Ping Su 蘇哲平 Master's thesis National Taipei University of Technology Department of Electronic Engineering 106 Wei-Ho Tsai 蔡偉和 2018 thesis 52 zh-TW
collection NDLTD
sources NDLTD
description Master's thesis === National Taipei University of Technology === Department of Electronic Engineering === 106 === Intuitively, a singer identification (SID) system is trained using the singing voice of each singer to be identified. However, in most cases, it is difficult to obtain a singer's a cappella recordings in advance, especially for popular singers, because popular music usually contains background accompaniment during most or all vocal passages, which makes SID difficult. Still, it may be possible to acquire a popular singer's speech data, for example from a concert or a press conference, where the speech usually has no background accompaniment and can therefore be used directly to analyze the singer's voice characteristics. Accordingly, this thesis studies whether an SID system built from a singer's speaking voice can still work when he or she sings. Our experimental results show that the SID accuracy achieved with Gaussian Mixture Modeling (GMM) on Mel-Frequency Cepstral Coefficients (MFCC) is quite low, only 38.67%, when the system is trained on speech. Using i-Vectors with Linear Discriminant Analysis (LDA) improves the SID accuracy to 64%.