Characterization and Prediction of Dialogue Acts Using Prosodic Features


This study investigates the classification of dialogue acts using prosodic features. Prosodic elements carry information about the type of dialogue act, and can therefore be used to improve the performance of speech recognition and speech synthesis systems. In the present study, two feature sets and two classifiers were compared. The first feature set comprised standard prosodic features, namely the F0 arithmetic mean and its standard deviation. The overall dialogue-act-related F0 mean values support Ohala’s Frequency Code hypothesis. For the second feature set, intonation features accounting for F0 shapes were derived by polynomial intonation stylization. To compare the usefulness of both feature sets for classification, the sample of dialogue acts was classified using k-nearest neighbor and random tree classifiers. For both feature sets and both classifiers, accuracies of around 77% were achieved. We found no significant difference between the classifiers or between the feature sets. One can conclude that standard F0 features already contain a reasonable amount of information about dialogue acts, and that a more elaborate F0 analysis is not beneficial for the given data.
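The two feature sets and the nearest-neighbor step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the polynomial order, the time normalization to [-1, 1], the value of k, the label names, and the synthetic F0 contours are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from collections import Counter

def standard_features(f0):
    """Feature set 1: F0 arithmetic mean and standard deviation (Hz)."""
    f0 = np.asarray(f0, dtype=float)
    return np.array([f0.mean(), f0.std()])

def polynomial_features(f0, order=3):
    """Feature set 2 (assumed form): coefficients of a polynomial fitted
    to the F0 contour over normalized time; highest order comes first."""
    f0 = np.asarray(f0, dtype=float)
    t = np.linspace(-1.0, 1.0, len(f0))
    return np.polyfit(t, f0, order)

def knn_predict(train_X, train_y, x, k=3):
    """Majority label among the k nearest training vectors (Euclidean)."""
    dists = np.linalg.norm(np.asarray(train_X) - np.asarray(x), axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Synthetic toy contours (hypothetical, not data from the study).
t = np.linspace(-1.0, 1.0, 20)
contours = [200 + 30 * t, 210 + 25 * t, 205 + 35 * t,
            120 - 20 * t, 125 - 15 * t, 115 - 25 * t]
labels = ["question"] * 3 + ["statement"] * 3

train_X = np.array([standard_features(c) for c in contours])
prediction = knn_predict(train_X, labels, standard_features(190 + 28 * t))

# A first-order fit recovers slope and offset of a linear contour.
line_coefs = polynomial_features(100 + 10 * t, order=1)
```

Swapping `standard_features` for `polynomial_features` when building `train_X` gives the second feature set under the same classifier, which mirrors the kind of comparison the study reports.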

Year: 2016
In session: Phonetik und Prosodie
Pages: 160 to 167