ESSV Konferenz Elektronische Sprachsignalverarbeitung

Title: Prosodic Correlates of Voice Preference in Mandarin Chinese and German: A Cross-linguistic Comparison

Authors: Hongwei Ding, Rüdiger Hoffmann, Oliver Jokisch


Finding a pleasant voice is of vital importance to a speech synthesis system. In this study, we investigated the influence of prosodic parameters on voice preference from a cross-linguistic perspective. We conducted two voice preference tests: 24 German and 50 Chinese listeners took part in the first test, on German speaker selection, while 25 Chinese and 10 German listeners participated in the second, on Chinese speaker selection. We then employed the Momel algorithm to extract 18 purely acoustic parameters intended to capture the prosodic pitch change patterns of these speakers. The results showed strong correlations between the ranking scores of German and Chinese listeners for both German and Chinese speakers. However, the voice preferences of the Chinese listeners correlated with more of the melody metrics than those of the German listeners, and the Chinese listeners preferred larger pitch changes in Mandarin Chinese. The experimental results suggest that a pleasant voice can have prosodic characteristics that make an agreeable impression on both native and non-native listeners. Listeners may, however, also rely on other prosodic parameters (such as duration, speech rate, and intensity), as well as on phonation types and formant spaces, when selecting a preferred voice; these factors will be investigated in future work. The current study may shed some light on voice talent selection for speech synthesis systems.
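
The comparison of ranking scores between two listener groups can be illustrated with a rank correlation. The abstract does not state which correlation measure was used, so the sketch below assumes Spearman's rank correlation; the function names and the score values are hypothetical and serve only as an illustration, not as data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical mean preference scores for five speakers (made-up values).
german_listener_scores = [3.1, 4.2, 2.5, 3.8, 1.9]
chinese_listener_scores = [2.9, 3.4, 2.7, 4.0, 2.0]

rho = spearman(german_listener_scores, chinese_listener_scores)
print(f"Spearman rho between listener groups: {rho:.2f}")
```

The same function could be applied to a listener group's preference scores and any single melody metric, which is one way to read the reported link between preference and pitch change.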

Year: 2017
In session: Poster
Pages: 83 to 90