Noise Robust Pitch Tracking by Subband Autocorrelation Classification

Bibliographic Details
Main Author:	Lee, Byung Suk
Language:	English
Published:	2012
Subjects:	Electrical engineering Computer science
Online Access:	https://doi.org/10.7916/D8SJ1SPJ

id	ndltd-columbia.edu-oai-academiccommons.columbia.edu-10.7916-D8SJ1SPJ
record_format	oai_dc
collection	NDLTD
language	English
sources	NDLTD
topic	Electrical engineering Computer science
description	Speech pitch tracking is one of the elementary tasks of Computational Auditory Scene Analysis (CASA). While a human can easily follow the voiced pitch in highly noisy recordings, the performance of automatic speech pitch tracking degrades in unknown noisy audio conditions. Traditional pitch trackers use either autocorrelation or the Fourier transform to calculate periodicity, which works well for clean recordings. For noisy recordings, however, the accuracy of these pitch trackers generally degrades. For example, the information in parts of the frequency spectrum may be lost due to analog radio band transmission and/or contain additive noise of various kinds. Instead of explicitly using the most obvious features of autocorrelation, we propose a trained classifier-based approach, which we call Subband Autocorrelation Classification (SAcC). A multi-layer perceptron (MLP) classifier is trained on the principal components of the autocorrelations of subbands from an auditory filterbank. The output of the MLP classifier is temporally smoothed to produce the pitch track by finding the Viterbi path of a Hidden Markov Model (HMM). Training on various types of noisy speech recordings leads to a great increase in performance over state-of-the-art algorithms, according to both the traditional Gross Pitch Error (GPE) measure and a proposed novel Pitch Tracking Error (PTE), which more fully reflects the accuracy of both pitch estimation/extraction and voicing detection in a single measure. To verify the generalization and specificity of SAcC, we test SAcC on a real-world problem that has a large-scale noisy speech corpus. The data is from the DARPA Robust Automatic Transcription of Speech (RATS) program. The experiments on the performance evaluation of SAcC pitch tracking confirm the generalization power of SAcC across various unknown noise conditions and distinct speech corpora. We also report that the use of SAcC output adds a significant improvement to a Speaker Identification (SID) system for RATS, suggesting the potential contribution of SAcC pitch tracking to higher-level tasks.
author	Lee, Byung Suk
title	Noise Robust Pitch Tracking by Subband Autocorrelation Classification
publishDate	2012
url	https://doi.org/10.7916/D8SJ1SPJ
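
Illustrative note on the smoothing step named in the description: the abstract states that the MLP output is smoothed by finding the Viterbi path of an HMM, but this record gives no parameters. The sketch below is therefore only a minimal illustration under stated assumptions, not the thesis implementation. The function name `viterbi_path`, the state layout (state 0 as unvoiced, the rest as quantized pitch bins), the single `stay_prob` transition parameter, and the use of classifier posteriors directly as emission scores are all hypothetical choices made for this example.

```python
import math


def viterbi_path(posteriors, stay_prob=0.9):
    """Return the most likely state sequence for per-frame posteriors.

    posteriors: list of frames, each a list of K probabilities
                (hypothetical classifier outputs; state 0 = unvoiced,
                states 1..K-1 = quantized pitch bins). Requires K >= 2.
    stay_prob:  assumed probability of remaining in the same state; the
                remainder is shared equally among the other states.
    """
    n_states = len(posteriors[0])
    tiny = 1e-12  # avoids log(0) for zero-valued posteriors
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n_states - 1))

    # score[j] = best log-probability of any path ending in state j
    score = [math.log(p + tiny) for p in posteriors[0]]
    backptrs = []

    for frame in posteriors[1:]:
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor of state j under the assumed transitions.
            best_i = max(
                range(n_states),
                key=lambda i: score[i] + (log_stay if i == j else log_move),
            )
            trans = log_stay if best_i == j else log_move
            new_score.append(score[best_i] + trans + math.log(frame[j] + tiny))
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy input (made-up numbers): the frame-wise argmax is 1, 1, 2, 1, 1.
toy = [
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.1, 0.3, 0.6],
    [0.1, 0.8, 0.1],
    [0.1, 0.9, 0.0],
]
print(viterbi_path(toy))  # the isolated jump to state 2 is smoothed away
```

With a high `stay_prob`, a single-frame excursion costs two unlikely transitions, so the decoded path stays in one pitch bin; lowering `stay_prob` makes the decoder follow the frame-wise maxima more closely.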

Similar Items