Machine Learning-Assisted Affect Labelling of Speech Data


This paper addresses the assisted annotation of emotions in affective speech data recorded in natural, “in the wild” surroundings. Affective states in such data often show low expressiveness, which makes manual annotation difficult and very time-consuming, even for expert human annotators. Furthermore, training an automatic emotion recognition system in such a setting requires large amounts of annotated data. We present a machine-learning-assisted, semi-automatic annotation procedure adopted from speech recognition. We give annotation time estimates and evaluate our approach on data of real-life in-vehicle emotions, which are prototypical for natural surroundings. The time required to annotate the complete data set could be reduced substantially, to around 80% of the time needed for fully manual annotation. At the same time, the quality of the resulting annotation remains the same as that of the fully manual approach, in contrast to other currently available approaches such as Active Learning or Semi-Supervised Learning. Given this time saving, our approach is generally useful for annotation processes that involve high annotation effort.

Year: 2020
In session: Poster
Pages: 199 to 205