Three factors are critical in order to synthesize intelligible noise-vocoded Japanese speech

Bibliographic Details
Main Authors: Takuya Kishida, Yoshitaka Nakajima, Kazuo Ueda, Gerard Bastiaan Remijn
Format: Article
Language: English
Published: Frontiers Media S.A. 2016-04-01
Series: Frontiers in Psychology
Subjects:
Online Access: http://journal.frontiersin.org/Journal/10.3389/fpsyg.2016.00517/full
Description
Summary: Factor analysis (principal component analysis followed by varimax rotation) had shown that three common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages/dialects [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech, which was used in an intelligibility test. The modified factor analysis ensured that the resynthesized speech sounds were not accompanied by steady background noise caused by the data-reduction process. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, essentially the same 3 to 4 factors were found to be common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded speech stimuli, and the intelligibility of these stimuli was tested with twelve native speakers of Japanese. Japanese mora (syllable-like phonological unit) identification performance was determined for 1 to 9 factors. A statistically significant improvement in intelligibility was observed as the number of factors was increased stepwise up to 6. The twelve listeners identified 92.1% of the morae correctly on average in the 6-factor condition. Intelligibility improved sharply when the number of factors changed from 2 to 3. In this step, the cumulative contribution ratio of the factors increased by only 10.6 percentage points, from 37.3% to 47.9%, but the average mora identification rate leaped from 6.9% to 69.1%. The results indicated that, if the number of factors is 3 or more, elementary linguistic information is preserved in noise-vocoded speech.
ISSN: 1664-1078
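
Illustrative note: the summary names principal component analysis followed by varimax rotation and reports a cumulative contribution ratio. The sketch below shows one conventional way to compute those quantities with NumPy. It is not the authors' modified factor analysis, whose details are not given in this record. The data are synthetic, and the helper names (`make_toy_band_power`, `pca_loadings`, `varimax`) are hypothetical, introduced only for this illustration.

```python
import numpy as np


def make_toy_band_power(n_frames=2000, n_bands=20, n_sources=3, seed=0):
    """Synthetic stand-in for band power fluctuations (frames x bands).

    Assumption: real input would be power envelopes of 20 critical bands;
    here a few random sources are mixed into 20 channels plus noise.
    """
    rng = np.random.default_rng(seed)
    sources = rng.standard_normal((n_frames, n_sources))
    mixing = rng.standard_normal((n_sources, n_bands))
    noise = 0.3 * rng.standard_normal((n_frames, n_bands))
    return sources @ mixing + noise


def pca_loadings(X, n_factors):
    """PCA on the correlation matrix of X.

    Returns the loadings of the first n_factors components (bands x factors)
    and the cumulative contribution ratio over all components.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each band
    corr = (Z.T @ Z) / Z.shape[0]              # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)    # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    cumulative_ratio = np.cumsum(eigvals) / eigvals.sum()
    return loadings, cumulative_ratio


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (SVD-based iteration).

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    n_rows, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target of the varimax criterion
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_rows
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement
        criterion = new_criterion
    return loadings @ rotation, rotation


X = make_toy_band_power()
loadings, cumulative_ratio = pca_loadings(X, n_factors=3)
rotated, rotation = varimax(loadings)
print("cumulative contribution ratio, first 3 components:",
      np.round(cumulative_ratio[:3], 3))
```

Because the rotation is orthogonal, it redistributes variance among the retained factors without changing the total they explain, so the cumulative contribution ratio for a given number of factors is the same before and after rotation.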