Blind Source Separation Using a Fast Time Frequency Mask Technique
Master's === National Central University === Department of Electrical Engineering === 102 === The goal of blind source separation (BSS) is to solve the cocktail party problem. Imagine a room with several people and several microphones recording them. When people speak at the same time, each microphone registers a different mixture of the individual speakers' audio signals. And the...
Main Authors: | Pei-yun Liu, 劉佩昀 |
---|---|
Other Authors: | Tsung-han Tsai |
Format: | Others |
Language: | en_US |
Published: | 2014 |
Online Access: | http://ndltd.ncl.edu.tw/handle/70298060585414257731 |
id |
ndltd-TW-102NCU05442084 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102NCU05442084 2015-10-13T23:55:41Z http://ndltd.ncl.edu.tw/handle/70298060585414257731 Blind Source Separation Using a Fast Time Frequency Mask Technique 一個加速時頻域遮罩之盲訊號分離演算法 Pei-yun Liu 劉佩昀 Master's National Central University Department of Electrical Engineering 102 Tsung-han Tsai 蔡宗漢 2014 thesis 66 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Central University === Department of Electrical Engineering === 102 === The goal of blind source separation (BSS) is to solve the cocktail party problem. Imagine a room with several people and several microphones recording them. When people speak at the same time, each microphone registers a different mixture of the individual speakers' audio signals, and the task of BSS is to untangle these mixtures into their sources. Applications include mobile telephony, multiuser communication systems, and voice reinforcement in noisy environments.
The mixtures recorded by the microphones are transformed to the frequency domain with the short-time Fourier transform (STFT). Owing to two characteristics of the source signals, sparseness and disjointness, we can obtain features from the mixtures during the feature extraction step. The features are represented as complex numbers. Afterwards, using the K-means algorithm, we divide the features into N groups, where N is the number of sources. Before transforming the separated signals back to the time domain, we design a mask that labels the target signal: for example, if the target signal is a speech signal, we label it one, and otherwise zero.
To solve the convolutive BSS problem, this thesis presents a new method that uses a fast time-frequency mask technique. We first define two features, the normalized level ratio and the phase difference. Next, we reduce the variance of the features so that K-means clustering needs fewer iterations. Then, with the K-means algorithm, we cluster the features by assigning each one to the nearest group. Finally, a time-frequency mask is generated from the clustered features. The method is not only simple but also faster, without reducing the quality of the target signal. We also evaluate the separated signals in a real environment in terms of SDR (signal-to-distortion ratio) and SIR (signal-to-interference ratio).
|
author2 |
Tsung-han Tsai |
author_facet |
Tsung-han Tsai Pei-yun Liu 劉佩昀 |
author |
Pei-yun Liu 劉佩昀 |
spellingShingle |
Pei-yun Liu 劉佩昀 Blind Source Separation Using a Fast Time Frequency Mask Technique |
author_sort |
Pei-yun Liu |
title |
Blind Source Separation Using a Fast Time Frequency Mask Technique |
title_short |
Blind Source Separation Using a Fast Time Frequency Mask Technique |
title_full |
Blind Source Separation Using a Fast Time Frequency Mask Technique |
title_fullStr |
Blind Source Separation Using a Fast Time Frequency Mask Technique |
title_full_unstemmed |
Blind Source Separation Using a Fast Time Frequency Mask Technique |
title_sort |
blind source separation using a fast time frequency mask technique |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/70298060585414257731 |
work_keys_str_mv |
AT peiyunliu blindsourceseparationusingafasttimefrequencymasktechnique AT liúpèiyún blindsourceseparationusingafasttimefrequencymasktechnique AT peiyunliu yīgèjiāsùshípínyùzhēzhàozhīmángxùnhàofēnlíyǎnsuànfǎ AT liúpèiyún yīgèjiāsùshípínyùzhēzhàozhīmángxùnhàofēnlíyǎnsuànfǎ |
_version_ |
1718088073479192576 |
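
The clustering and masking steps named in the description can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the exact feature normalization, the variance-reduction step, or the K-means initialization, so the formulas and all function names below (`extract_features`, `kmeans`, `binary_masks`, `apply_mask`) are assumptions chosen for illustration. The sketch takes two channels of already-computed STFT bins (flattened lists of complex values), forms a level ratio and a phase difference per bin, clusters them, and builds one binary mask per cluster.

```python
import cmath
import math


def extract_features(x1, x2, eps=1e-12):
    """Return one (level ratio, phase difference) pair per time-frequency bin.

    x1, x2: equal-length lists of complex STFT values from two microphones.
    The normalization used here is an assumption, not taken from the thesis.
    """
    feats = []
    for a, b in zip(x1, x2):
        norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2) + eps
        level = abs(a) / norm                    # level ratio, in [0, 1]
        phase = cmath.phase(b * a.conjugate())   # phase of x2 relative to x1
        feats.append((level, phase))
    return feats


def kmeans(points, k, max_iter=50):
    """Plain Lloyd's K-means with a deterministic farthest-first start.

    Euclidean distance ignores the wrap-around of the phase feature,
    which is a simplification made for this sketch.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each feature goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def binary_masks(labels, k):
    """One 0/1 mask per cluster: 1 where the bin belongs to that cluster."""
    return [[1 if lab == j else 0 for lab in labels] for j in range(k)]


def apply_mask(mask, x):
    """Keep only the bins selected by the mask; zero out the rest."""
    return [m * v for m, v in zip(mask, x)]
```

In a full pipeline the masked bins would be reshaped to the STFT grid and passed through an inverse STFT to return to the time domain; that step is omitted here. Tighter feature clusters let the assignment step stabilize in fewer passes, which is the effect the description attributes to reducing feature variance.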