Spoken Language Identification by Means of Acoustic Mid-level Descriptors


We introduce a set of acoustic mid-level descriptors (MLDs) derived from openSMILE low-level descriptors for the purpose of language characterisation and identification. The four languages targeted in this study are Georgian, Pashto, Kurmanji Kurdish, and Turkish. Language-dependent differences in these features are discussed in terms of language typology. Furthermore, language identification by feed-forward neural networks is comparatively evaluated for the MLDs and for openSMILE functionals, as well as for varying analysis segment lengths. The best result, an unweighted average recall (UAR) of 76.3%, was achieved for a joint feature set and a minimum speech chunk length of 8 seconds.
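
UAR is commonly read as unweighted average recall: the recall is computed separately for each class and then averaged, so every language counts equally regardless of how many segments it contributes. The following is a minimal sketch of that metric in plain Python. The function name and the toy labels are illustrative assumptions; they are not the study's evaluation code or data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of the per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labelled segment is required")

    totals = defaultdict(int)  # number of reference segments per class
    hits = defaultdict(int)    # correctly identified segments per class
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1

    # Average the recalls over the classes that occur in the reference labels.
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy, deliberately imbalanced labels (hypothetical, not results from the paper).
y_true = ["Georgian"] * 4 + ["Pashto"] * 2 + ["Kurmanji", "Turkish"]
y_pred = ["Georgian", "Georgian", "Georgian", "Pashto",
          "Pashto", "Georgian", "Kurmanji", "Kurmanji"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.4f}, accuracy = {accuracy:.4f}")
```

In this toy case the per-class recalls are 0.75, 0.5, 1.0 and 0.0, so the UAR differs from the plain accuracy; this is why UAR is often preferred when the classes are not equally represented.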

Year: 2020
In session: Acoustic Signals
Pages: 125 to 132