Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder
WaveNet, which learns directly from speech waveform samples, has been used as an alternative to vocoders and achieved very high-quality synthetic speech in terms of both naturalness and speaker similarity even in multi-speaker text-to-speech synthesis systems. However, the WaveNet vocoder uses acous...
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
IEEE
2018-01-01
|
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/8471179/ |