A Voice Separation System Based on Median Filtering and a few Improvements

Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 102 === This thesis studies voice separation performed by subtracting the music spectrum from the mixed spectrum. To extract the music spectrogram from the mixed spectrogram, we combine two concepts: searching for nearest-neighbor frames and median filtering....
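
The two concepts named in the abstract, nearest-neighbor frame search and median filtering, can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the Euclidean distance, the default `k`, and the plain subtraction (used here in place of the mask the abstract mentions but does not specify) are all assumptions made for illustration. A spectrogram is represented as a list of magnitude frames.

```python
import math
from statistics import median


def frame_distance(a, b, use_log=True, eps=1e-10):
    """Euclidean distance between two magnitude frames.

    use_log=True compares log magnitudes, use_log=False linear magnitudes.
    The distance form and eps are illustrative assumptions.
    """
    if use_log:
        a = [math.log(x + eps) for x in a]
        b = [math.log(x + eps) for x in b]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_music(spectrogram, k=3, use_log=True):
    """Estimate the music magnitude of each frame from its k nearest frames."""
    if len(spectrogram) < 2:
        raise ValueError("need at least two frames to search for neighbors")
    music = []
    for i, frame in enumerate(spectrogram):
        # Rank every other frame by spectral distance to the current frame.
        others = sorted(
            (j for j in range(len(spectrogram)) if j != i),
            key=lambda j: frame_distance(frame, spectrogram[j], use_log),
        )
        neighbors = [spectrogram[j] for j in others[:k]]
        # Per-bin median across the neighbors approximates the repeating music.
        med = [median(bins) for bins in zip(*neighbors)]
        # The music estimate cannot exceed the mixture magnitude in any bin.
        music.append([min(m, x) for m, x in zip(med, frame)])
    return music


def subtract_music(spectrogram, music):
    """Voice magnitude estimate: mixture magnitude minus music magnitude."""
    return [[x - m for x, m in zip(f, g)] for f, g in zip(spectrogram, music)]
```

For example, if every frame of a toy spectrogram contains the same accompaniment and one frame has extra energy in a single bin, `subtract_music(mix, estimate_music(mix))` leaves only that extra energy in the voice estimate.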

Bibliographic Details
Main Authors:	Yu-Min Jiang, 姜育民
Other Authors:	Hung-yan Gu
Format:	Others
Language:	zh-TW
Published:	2014
Online Access:	http://ndltd.ncl.edu.tw/handle/q8e9vq

id	ndltd-TW-102NTUS5392019
record_format	oai_dc
spelling	ndltd-TW-102NTUS5392019 2019-05-15T21:13:20Z http://ndltd.ncl.edu.tw/handle/q8e9vq A Voice Separation System Based on Median Filtering and a few Improvements 基於中值濾波及數項改進之語音分離系統 Yu-Min Jiang 姜育民 Master's, National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, 102 Hung-yan Gu 古鴻炎 2014 thesis 63 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 102 === This thesis studies voice separation performed by subtracting the music spectrum from the mixed spectrum. To extract the music spectrogram from the mixed spectrogram, we combine two concepts: searching for nearest-neighbor frames and median filtering. We propose several methods that improve separation performance and also implement an on-line voice separation system. First, we ran calibration experiments on the number of nearest-neighbor frames to keep and on the mask parameter value; using the best values raises the average SDR (source-to-distortion ratio) by 0.94 dB. Next, when selecting nearest-neighbor frames, we compute the spectral distance between two frames on logarithmic rather than linear spectral magnitudes. We also tried equalizing each spectrum by its average magnitude. In the experiments, computing the spectral distance from logarithmic magnitudes raises the average SDR by 0.97 dB. In addition, a spectral-flatness measure is used to detect drum-sound frames, and the spectral bins of these frames are reassigned to the music spectrogram. This removes drum interference from the separated voice and raises the average SDR by 0.02 dB. Whether or not the emptied bins in the drum-sound frames are filled makes no noticeable difference. Finally, we remove the low-frequency bins of the spectrum to reduce interference from low-frequency music signals, which raises the average SDR by a further 1.01 dB. Overall, computing spectral distance from the logarithmic magnitude spectrum, removing drum sound, and removing low-frequency bins considerably improve the quality of the separated voice, and the average SDR rises from 2.48 dB to 5.42 dB.
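
The drum-frame detection and low-frequency bin removal described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: spectral flatness is taken in its common form (geometric mean over arithmetic mean of the power spectrum), and the function names, the threshold of 0.5, and the cutoff bin index are hypothetical values that the abstract does not give.

```python
import math


def spectral_flatness(frame, eps=1e-10):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 1 indicate a flat, noise-like frame (as drum hits tend to be);
    values near 0 indicate a peaky, tonal frame.
    """
    power = [x * x + eps for x in frame]
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def reassign_drum_frames(mixture, voice, music, threshold=0.5):
    """Move all bins of frames judged to be drum sound into the music estimate.

    threshold is a hypothetical value chosen for illustration only.
    Returns new (voice, music) lists; the inputs are not modified.
    """
    new_voice, new_music = [], []
    for x, v, m in zip(mixture, voice, music):
        if spectral_flatness(x) > threshold:
            new_voice.append([0.0] * len(x))  # voice bins are emptied
            new_music.append(list(x))         # whole frame goes to music
        else:
            new_voice.append(list(v))
            new_music.append(list(m))
    return new_voice, new_music


def remove_low_bins(voice, cutoff_bin):
    """Zero the voice bins below cutoff_bin (a hypothetical bin index)."""
    return [
        [0.0 if b < cutoff_bin else x for b, x in enumerate(frame)]
        for frame in voice
    ]
```

A flat frame such as `[1.0, 1.0, 1.0, 1.0]` has flatness close to 1 and is treated as drum sound under this threshold, while a frame with a single dominant bin has flatness close to 0 and is left unchanged.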
author2	Hung-yan Gu
author_facet	Hung-yan Gu Yu-Min Jiang 姜育民
author	Yu-Min Jiang 姜育民
title	A Voice Separation System Based on Median Filtering and a few Improvements
publishDate	2014
url	http://ndltd.ncl.edu.tw/handle/q8e9vq
