Cross lingual transfer learning does not improve aphasic speech recognition

Abstract:

Aphasia is a language disorder, and the speech of patients who have it poses particular linguistic challenges for automatic speech recognition. This paper proposes a fine-tuning approach to improve the ability of existing models to recognize aphasic speech. Because the aphasic research data available in German is highly limited, we investigate a cross-lingual transfer approach that uses English data to improve performance in German. The work is intended to support the development of a therapy platform tailored to patients with aphasia. As the base speech recognition model we use OpenAI’s Whisper, and for fine-tuning we use TalkBank’s AphasiaBank. Our experiments show that Whisper transcribes aphasic audio less successfully than non-aphasic audio. Fine-tuning the model on aphasic speech in the target language improved transcription quality. In contrast, fine-tuning on aphasic speech in another language, in the expectation that the learned properties of aphasic speech would transfer, degraded transcription quality.


Year: 2025
In session: Recognition in HMI and Therapeutic Applications
Pages: 77 to 84