Study and Analysis of Speech Model-Based Voice Conversion

Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 107 === The goal of speech model-based voice conversion is to transform a source speaker's voice into a target speaker's voice. A training process produces the parameters of the conversion function; the source speech is then made to sound like the target speaker through acoustic feature an...

Bibliographic Details
Main Authors:	Chang, Wen-Han, 張文瀚
Other Authors:	Lin, Tay-Jyi
Format:	Others
Language:	zh-TW
Published:	2019
Online Access:	http://ndltd.ncl.edu.tw/handle/kerybc

id	ndltd-TW-107CCU00392019
record_format	oai_dc
spelling	ndltd-TW-107CCU00392019 2019-11-01T05:28:07Z http://ndltd.ncl.edu.tw/handle/kerybc Study and Analysis of Speech Model-Based Voice Conversion 基於語音模型之語音轉換探討與分析 Chang, Wen-Han 張文瀚 Master's, National Chung Cheng University, Graduate Institute of Computer Science and Information Engineering, 107 Lin, Tay-Jyi 林泰吉 2019 Degree thesis ; thesis 50 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 107 === The goal of speech model-based voice conversion is to transform a source speaker's voice into a target speaker's voice. A training process produces the parameters of the conversion function; the source speech is then made to sound like the target speaker through acoustic feature analysis, mapping, and synthesis, in that order. This thesis analyzes how the speech model-based method achieves voice conversion by examining Sprocket, the baseline system of the Voice Conversion Challenge 2018. The voice conversion part of Sprocket can be divided into three stages. First, acoustic feature analysis extracts the main factors that distinguish one person's voice from another's. Second, the source acoustic features are processed and converted into the target speaker's acoustic features. Finally, synthesis generates the target speaker's speech from the converted features. The thesis focuses on the main algorithms, which show how the acoustic features and other important information in speech are extracted, and it compares the execution time of different models. After carefully studying the Sprocket architecture and its related algorithms and functions, we design experiments on analysis, conversion, and synthesis, choosing alternative algorithms that reduce execution time without degrading speech quality. Starting from the original Python implementation of Sprocket, we develop a C implementation and then replace and optimize the algorithms and functions on that basis. After many optimizations and adjustments, the overall execution time of Sprocket improves by a factor of 2.785. Voice conversion is thus sped up without a significant difference in conversion quality as judged by subjective listening.
author2	Lin, Tay-Jyi
author_facet	Lin, Tay-Jyi Chang, Wen-Han 張文瀚
author	Chang, Wen-Han 張文瀚
spellingShingle	Chang, Wen-Han 張文瀚 Study and Analysis of Speech Model-Based Voice Conversion
author_sort	Chang, Wen-Han
title	Study and Analysis of Speech Model-Based Voice Conversion
publishDate	2019
url	http://ndltd.ncl.edu.tw/handle/kerybc

Similar Items