Multilingual Voice Analysis: Towards Prosodic Correlates of Voice Preference


Finding an appropriate corporate voice is usually a time-consuming and laborious task. Additional constraints come into play if the recorded voice is to serve as the basis of a TTS system. Hence a corporate voice approach should also meet conventional requirements such as high intelligibility and naturalness of the synthetic speech signal. The speaker selection is performed in several steps. After a preselection from several candidates, the most promising speakers are recorded in a professional studio. These recordings, together with resynthesised samples, are then ranked by different listener groups. The results of the subjective ranking are compared to objective measurements. First investigations show correlations between prosodic features and human judgement. Another conclusion is that, although voice preference is a highly subjective and language-dependent decision, there are cross-language skills among native and non-native listeners, as well as among listeners without any skill in a specific language.

Year: 2009
In session: Sprachsynthese und Emotionsmodellierung
Pages: 215 to 221