Using semantic embeddings for initiating and planning articulatory speech synthesis

Abstract:

We present a method to both resynthesize and produce speech with the articulatory speech synthesizer VocalTractLab. We extend the recurrent, gradient-based motor inference model for speech resynthesis with two generative adversarial networks (GANs). As a result, we are able to synthesize articulatory speech starting from the semantic level, using distributed word embedding vectors from fastText.


Year: 2022
In session: Articulatory Synthesis
Pages: 32–42