ESSV Konferenz Elektronische Sprachsignalverarbeitung

Title: Informationsstruktur in der Sprachsynthese: Früher Fokus und postfokale Gegebenheit (Information Structure in Speech Synthesis: Early Focus and Post-focal Givenness)

Authors: Frank Kügler, Bernadett Smolibocki, Manfred Stede, Sebastian Varges


Even though speech synthesis is now of acceptable quality for many purposes, straightforward text-to-speech (TTS) systems do not produce optimal results in cases where contextual and other pragmatic factors play an important role in prosodic realization. For instance, in systems giving product comparisons and recommendations, appropriate intonation is required to signal contrasting entities, and in longer discourse, given and new entities need to be distinguished prosodically. In our project, such notions of information structure (IS) are used to extend an existing text generator for product comparison and recommendation with a speech synthesis component (MARY TTS). In this paper, we concentrate on one particular IS phenomenon: post-focal givenness. The purpose of the paper is twofold: first, we explain the architecture of our system and the IS extensions we made to MARY TTS (MARY+IS); second, we show that appropriate prosodic marking of post-focal givenness indeed leads to increased hearer acceptability ratings.

Year: 2013
In session: Sprachsynthese (Speech Synthesis)
Pages: 56 to 63