ESSV Konferenz Elektronische Sprachsignalverarbeitung

Title: Towards ordinal classification of voice quality features with acoustic parameters

Authors: Felix Schaeffler, Matthias Eichner, Janet Beck


The human voice is capable of fine-grained variation that results in listener attributions of various psychological, social and biological factors. The complexity of this process is reflected in the number and richness of terms that are used to describe human voices. In this paper we argue that any application that attempts a mapping of the acoustic voice signal onto voice descriptor labels would benefit from an intermediate auditory-phonetic level. As a point of departure we explore the relationships between acoustic parameters and some specific perceptual features derived from Vocal Profile Analysis (VPA), a phonetically motivated voice quality analysis scheme. Perceptual analysis of voice samples from 133 speakers was carried out using VPA for three key phonation features (creakiness, whisperiness, harshness). We extracted eleven acoustic parameters from the samples and used stepwise linear regression to identify acoustic parameters with predictive value. Samples from female speakers were used to derive regression equations which were then used to predict VPA ratings of male voices. Results show significant predictors for all three phonation features and indicate that predictions for the three phonation types rely mainly on different parameters. If a tolerance of ±1 scalar degree for the perceptual analysis is accepted, then classification accuracy lies at or above 90% for all three phonation features.
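The tolerance-based accuracy mentioned at the end of the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: the function name `tolerance_accuracy` is hypothetical, continuous regression outputs are assumed to be rounded to the nearest scalar degree before comparison, and the numbers are invented for illustration and are not data from the paper.

```python
def tolerance_accuracy(predicted, rated, tolerance=1):
    """Share of samples whose rounded prediction lies within
    `tolerance` scalar degrees of the perceptual rating.

    Assumption: predictions are rounded to the nearest integer degree;
    the abstract does not state how this mapping was done.
    """
    if len(predicted) != len(rated):
        raise ValueError("predicted and rated must have the same length")
    if not rated:
        raise ValueError("need at least one sample")
    hits = sum(
        1 for p, r in zip(predicted, rated)
        if abs(round(p) - r) <= tolerance
    )
    return hits / len(rated)


# Hypothetical values for illustration only, not data from the paper.
predicted = [0.4, 1.6, 2.2, 3.1, 0.9]  # regression outputs
rated = [0, 1, 2, 1, 3]                # perceptual scalar degrees

exact = tolerance_accuracy(predicted, rated, tolerance=0)
within_one = tolerance_accuracy(predicted, rated, tolerance=1)
```

Setting `tolerance=0` gives strict agreement, so comparing the two values shows how much of the reported accuracy depends on accepting neighbouring scalar degrees as correct.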

Year: 2019
In session: Prosodie
Pages: 288 to 295