A Pitch Detection Algorithm for Continuous Speech Signals Using Viterbi Traceback with Temporal Forgetting

Bibliographic Details
Main Author: J. Bartošek
Format: Article
Language: English
Published: CTU Central Library 2011-01-01
Series: Acta Polytechnica
Subjects:
PDA
Online Access: https://ojs.cvut.cz/ojs/index.php/ap/article/view/1422
id doaj-dadd66f701984d54996468fd6e76e3df
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author J. Bartošek
title A Pitch Detection Algorithm for Continuous Speech Signals Using Viterbi Traceback with Temporal Forgetting
publisher CTU Central Library
series Acta Polytechnica
issn 1210-2709
1805-2363
publishDate 2011-01-01
description This paper presents a pitch-detection algorithm (PDA) for signals containing continuous speech. The core of the method is merged normalized forward-backward correlation (MNFBC), which works in the time domain and can make basic voicing decisions. In addition, the Viterbi traceback procedure is used to post-process the MNFBC output, considering the three best fundamental frequency (F0) candidates in each step. This should make the final pitch contour smoother and should also prevent octave errors. In computing the transition probabilities between F0 candidates, two major improvements were made over existing post-processing methods. Firstly, pitch distance is compared in musical cent units. Secondly, temporal forgetting is applied in order to avoid penalizing pitch jumps after prosodic pauses of one speaker or changes in pitch connected with turn-taking in dialogs. Results computed on a pitch-reference database clearly show the benefit of the first improvement, but they have not yet demonstrated any benefit of the temporal modification. We assume this is due to the nature of the reference corpus, which had a small amount of suprasegmental content.
topic PDA
fundamental frequency
Viterbi
temporal forgetting
MNFBC
speech processing
url https://ojs.cvut.cz/ojs/index.php/ap/article/view/1422
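
The post-processing idea in the description (Viterbi traceback over a few F0 candidates per frame, pitch distance measured in cents, and a penalty that fades after pauses) can be illustrated with a minimal Python sketch. This record does not give the paper's formulas, so everything specific here is an assumption made for illustration: the additive cost form, the exponential forgetting factor, the parameters `scale` and `tau`, the per-candidate local costs, and all function names. It is not the author's implementation.

```python
import math


def cents(f1, f2):
    """Interval from f1 to f2 in musical cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f2 / f1)


def transition_cost(f_prev, f_cur, gap_frames, scale=100.0, tau=20.0):
    """Hypothetical transition cost between two F0 candidates.

    The absolute cent distance is attenuated by exp(-gap_frames / tau),
    so a jump after a long pause is penalized less (assumed form of
    "temporal forgetting"; scale and tau are illustrative values).
    """
    forgetting = math.exp(-gap_frames / tau)
    return forgetting * abs(cents(f_prev, f_cur)) / scale


def viterbi_track(frames, gaps=None):
    """Pick one F0 per frame by minimizing local plus transition costs.

    frames: list of candidate lists, each candidate is (f0_hz, local_cost).
    gaps[t]: number of unvoiced frames between frame t-1 and frame t.
    """
    if gaps is None:
        gaps = [0] * len(frames)
    cost = [local for _, local in frames[0]]
    back = []  # back[t-1][j] = best predecessor index for candidate j at frame t
    for t in range(1, len(frames)):
        new_cost, pointers = [], []
        for f_cur, local in frames[t]:
            totals = [
                cost[i] + transition_cost(f_prev, f_cur, gaps[t])
                for i, (f_prev, _) in enumerate(frames[t - 1])
            ]
            best = min(range(len(totals)), key=totals.__getitem__)
            new_cost.append(totals[best] + local)
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Traceback from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [frames[t][j][0] for t, j in enumerate(path)]


# Three frames with three candidates each; in the middle frame the
# locally cheapest candidate (202 Hz) is an octave error.
example_frames = [
    [(100.0, 0.10), (200.0, 0.30), (50.0, 0.50)],
    [(202.0, 0.10), (101.0, 0.15), (50.5, 0.50)],
    [(102.0, 0.10), (204.0, 0.30), (51.0, 0.50)],
]
smoothed = viterbi_track(example_frames)
```

In this toy input the middle frame's locally best candidate is one octave too high, yet the tracked contour stays near 100 Hz because an octave jump costs about 1200 cents in each direction. Measuring distance in cents makes that penalty the same for low and high voices, which a distance in Hz would not.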