Automatic Detection of Disfluencies in L1 and L2 Child Speech

Abstract:

Language Proficiency Assessments (LPAs) for preschool children often rely on manual evaluation methods that suffer from subjectivity and low inter-rater reliability. Moreover, current approaches typically overlook speech fluency, which is an important indicator of language proficiency and is linked to overall linguistic development. To address these limitations, we propose a more consistent and comprehensive LPA framework that incorporates automatic speech fluency assessment through disfluency detection. Disfluencies such as repetitions, repairs, restarts, partial words, and filler particles significantly influence perceived fluency and thus serve as key modeling targets. Using spontaneous speech data collected from 167 German-speaking preschoolers via the “Wuschel-App,” we annotated a subset of utterances with fine-grained disfluency labels and fine-tuned a BERT-based token-level sequence labeling model inspired by the architecture of Romana et al. [1]. Despite pronounced class imbalance in the dataset, our best model achieves a token-level accuracy of 96.0% and a macro F1 score of 65.7%, substantially outperforming the majority-class baseline (accuracy = 56.86%). Error analysis reveals that misclassifications primarily affect the rarest disfluency categories. We anticipate that expanding the annotated corpus will improve performance on these underrepresented classes, enabling finer-grained modeling and, ultimately, a robust, fluency-aware LPA system that combines text and signal information.
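
The gap between token-level accuracy and macro F1 reported above is typical of imbalanced labeling tasks. The following minimal sketch illustrates how the three quantities (token-level accuracy, macro F1, and a majority-class baseline) can be computed. The label names (`O`, `FILLER`, `REP`) and the toy sequences are hypothetical and are not taken from the annotated corpus or from the paper's evaluation code.

```python
from collections import Counter

def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def majority_baseline(gold):
    """Predict the most frequent gold label for every token."""
    label, _ = Counter(gold).most_common(1)[0]
    return [label] * len(gold)

# Hypothetical toy data: "O" marks fluent tokens, the rest are disfluency classes.
gold = ["O", "O", "O", "O", "FILLER", "O", "REP", "O", "O", "O"]
pred = ["O", "O", "O", "O", "FILLER", "O", "O", "O", "O", "O"]  # misses the rare REP token
base = majority_baseline(gold)

print(f"model:    acc={token_accuracy(gold, pred):.3f}  macro-F1={macro_f1(gold, pred):.3f}")
print(f"baseline: acc={token_accuracy(gold, base):.3f}  macro-F1={macro_f1(gold, base):.3f}")
```

In this toy case a single error on a rare class barely changes accuracy but sets that class's F1 to zero, which pulls the macro average down sharply. This is consistent with the observation that errors concentrate in the rarest disfluency categories.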


Year: 2026
In session: Posters
Pages: 208 to 215