A Pitch Detection Algorithm for Continuous Speech Signals Using Viterbi Traceback with Temporal Forgetting
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: | CTU Central Library 2011-01-01 |
Series: | Acta Polytechnica |
Subjects: | |
Online Access: | https://ojs.cvut.cz/ojs/index.php/ap/article/view/1422 |
Summary: | This paper presents a pitch-detection algorithm (PDA) for signals containing continuous speech. The core of the method is merged normalized forward-backward correlation (MNFBC), which works in the time domain and can make basic voicing decisions. In addition, a Viterbi traceback procedure post-processes the MNFBC output, considering the three best fundamental frequency (F0) candidates in each step. This should make the final pitch contour smoother and should also prevent octave errors. In computing the transition probabilities between F0 candidates, two major improvements were made over existing post-processing methods. Firstly, pitch distance is compared in musical cent units. Secondly, temporal forgetting is applied in order to avoid penalizing pitch jumps after prosodic pauses of one speaker or changes in pitch connected with turn-taking in dialogs. Results computed on a pitch-reference database clearly show the benefit of the first improvement, but they have not yet demonstrated any benefit of the temporal modification. We assume this is due to the nature of the reference corpus, which contained little suprasegmental content. |
---|---|
ISSN: | 1210-2709 1805-2363 |
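
The post-processing idea in the summary (a Viterbi search over the three best F0 candidates per frame, with transition costs measured in cents and relaxed after pauses) can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no formulas, so the exponential form of the forgetting factor, the names `transition_cost` and `viterbi_track`, and the parameters `cost_per_cent` and `forget_rate` are all assumptions made for illustration. Only the cent definition (1200 cents per octave) is standard. The MNFBC front end is not reproduced; candidates and their scores are taken as given.

```python
import math

def cents(f1, f2):
    """Interval from f1 to f2 in musical cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f2 / f1)

def transition_cost(f_prev, f_cur, gap_frames=0,
                    cost_per_cent=0.01, forget_rate=0.1):
    """Hypothetical transition cost between two F0 candidates.

    The cost grows with the absolute cent distance and is discounted
    exponentially by the number of unvoiced frames since the previous
    voiced frame (an assumed form of "temporal forgetting").
    """
    forgetting = math.exp(-forget_rate * gap_frames)
    return cost_per_cent * abs(cents(f_prev, f_cur)) * forgetting

def viterbi_track(frames, gaps=None):
    """Pick one F0 per voiced frame by Viterbi search with traceback.

    frames: list of candidate lists, each candidate a (f0_hz, score) pair.
    gaps:   gaps[t] is the number of unvoiced frames between voiced
            frames t-1 and t (defaults to 0 everywhere).
    """
    if gaps is None:
        gaps = [0] * len(frames)
    # Accumulated cost per candidate; a higher score means a lower cost.
    best = [-score for _, score in frames[0]]
    backpointers = []
    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for f_cur, score in frames[t]:
            costs = [best[i] + transition_cost(f_prev, f_cur, gaps[t])
                     for i, (f_prev, _) in enumerate(frames[t - 1])]
            i_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[i_min] - score)
            pointers.append(i_min)
        best = new_best
        backpointers.append(pointers)
    # Traceback from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [frames[t][j][0] for t, j in enumerate(path)]

# Three voiced frames; in the middle frame the locally best candidate
# (202 Hz) is an octave above the true pitch.
frames = [
    [(100.0, 0.90), (200.0, 0.50), (50.0, 0.30)],
    [(202.0, 0.92), (101.0, 0.90), (50.5, 0.30)],
    [(102.0, 0.90), (204.0, 0.60), (51.0, 0.20)],
]
track = viterbi_track(frames)
```

With these toy values the tracker keeps the contour near 100 Hz instead of following the locally strongest 202 Hz candidate, because an octave jump costs about 1200 cents each way. Passing a large entry in `gaps` shrinks the penalty for a jump at that frame, which is the intended effect of forgetting after a pause or a speaker change.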