Blind Source Separation Using a Fast Time Frequency Mask Technique

Master's === National Central University === Department of Electrical Engineering === 102 === The goal of blind source separation (BSS) is to solve the cocktail party problem. Imagine a room with several people and several microphones recording them. When people speak at the same time, each microphone registers a different mixture of the individual speakers' audio signals. And the...

Bibliographic Details
Main Authors: Pei-yun Liu, 劉佩昀
Other Authors: Tsung-han Tsai
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/70298060585414257731
id ndltd-TW-102NCU05442084
record_format oai_dc
spelling ndltd-TW-102NCU05442084 2015-10-13T23:55:41Z http://ndltd.ncl.edu.tw/handle/70298060585414257731 Blind Source Separation Using a Fast Time Frequency Mask Technique 一個加速時頻域遮罩之盲訊號分離演算法 Pei-yun Liu 劉佩昀 Master's, National Central University, Department of Electrical Engineering, 102 Tsung-han Tsai 蔡宗漢 2014 thesis 66 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Central University === Department of Electrical Engineering === 102 === The goal of blind source separation (BSS) is to solve the cocktail party problem. Imagine a room with several people and several microphones recording them. When people speak at the same time, each microphone registers a different mixture of the individual speakers' audio signals, and the task of BSS is to untangle these mixtures into their sources. Applications include mobile telephony, multiuser communication systems, and voice reinforcement in noisy environments. The mixtures recorded by the microphones are transformed to the frequency domain with the STFT (short-time Fourier transform). Owing to the sparseness and disjointness of the source signals, we can obtain features from the mixtures during the feature extraction step. The features are represented as complex numbers. Afterwards, using the K-means algorithm, we divide the features into N groups, where N is the number of sources. Before transforming the separated signals back to the time domain, we design a mask that labels the target signal: for example, if the target signal is a speech signal, we label it one, otherwise zero. To solve the convolutive BSS problem, this thesis presents a new method that uses a fast time-frequency mask technique. We first define two features, the normalized level ratio and the phase difference. Next, we reduce the variance of the features so that K-means clustering needs fewer iterations. Afterwards, with the K-means algorithm, we cluster the features by assigning each one to the nearest group. Finally, a time-frequency mask is generated from the clustered features. The method is not only simple but also faster, without reducing the quality of the target signal. We also evaluate the separated signals in a real environment in terms of SDR (signal-to-distortion ratio) and SIR (signal-to-interference ratio).
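The pipeline described in the abstract (features per time-frequency point, K-means clustering, binary mask) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes two microphones, skips the STFT by generating synthetic time-frequency coefficients directly, and uses one common form of the two features (magnitude at microphone 1 divided by the norm over both microphones, and the phase of microphone 2 relative to microphone 1). The function names, the mixing gains and phase shifts, and the farthest-first initialization are all hypothetical choices made for the example. The variance-reduction step that the abstract credits for the speed-up is not reproduced here, since the abstract does not say how it is done.

```python
import cmath
import math
import random


def extract_features(x1, x2):
    """Map each time-frequency point to (normalized level ratio, phase difference)."""
    feats = []
    for a, b in zip(x1, x2):
        norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2) or 1.0  # guard against silent bins
        level = abs(a) / norm                    # level of mic 1 relative to both mics
        phase = cmath.phase(b * a.conjugate())   # phase of mic 2 relative to mic 1
        feats.append((level, phase))
    return feats


def kmeans(points, k, max_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


def binary_masks(labels, k):
    """One 0/1 mask per cluster: 1 where the point was assigned to that cluster."""
    return [[1 if lab == j else 0 for lab in labels] for j in range(k)]


def synthetic_mixture(n_points=400, seed=0):
    """Toy data: each time-frequency point is dominated by exactly one source."""
    rng = random.Random(seed)
    # (gain, phase shift) of mic 2 relative to mic 1 per source; illustrative values.
    mixing = [(0.5, 0.8), (2.0, -0.8)]
    x1, x2, truth = [], [], []
    for _ in range(n_points):
        src = rng.randrange(len(mixing))
        s = cmath.rect(rng.uniform(0.5, 1.5), rng.uniform(-math.pi, math.pi))
        gain, shift = mixing[src]
        noise = complex(rng.gauss(0, 0.02), rng.gauss(0, 0.02))
        x1.append(s)
        x2.append(gain * cmath.rect(1.0, shift) * s + noise)
        truth.append(src)
    return x1, x2, truth


x1, x2, truth = synthetic_mixture()
labels, centroids = kmeans(extract_features(x1, x2), k=2)
masks = binary_masks(labels, k=2)

# Cluster indices are arbitrary, so score against the better of the two labelings.
agree = sum(lab == t for lab, t in zip(labels, truth)) / len(truth)
accuracy = max(agree, 1.0 - agree)

# Applying a mask keeps only the coefficients assigned to that cluster.
separated_1 = [m * x for m, x in zip(masks[0], x1)]
```

In a real system the lists `x1` and `x2` would hold the STFT coefficients of the two microphone recordings, and each masked set of coefficients would be passed through an inverse STFT to recover a time-domain signal.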
author2 Tsung-han Tsai
author_facet Tsung-han Tsai
Pei-yun Liu
劉佩昀
author Pei-yun Liu
劉佩昀
spellingShingle Pei-yun Liu
劉佩昀
Blind Source Separation Using a Fast Time Frequency Mask Technique
author_sort Pei-yun Liu
title Blind Source Separation Using a Fast Time Frequency Mask Technique
publishDate 2014
url http://ndltd.ncl.edu.tw/handle/70298060585414257731