Enhancing ASR for German Medical Domain without Fine-Tuning

Abdullah Al Foysal; Ronald Böck

Abstract:

Speech recognition in medical contexts is important but challenging. In particular, adapting speech models is a concern, since it directly influences model performance and thus the applicability of such technology in medical workflows. The issue is tied to the availability of speech samples for fine-tuning, which is often problematic due to regulatory constraints. Since speech processing nevertheless helps medical personnel optimise their workflows, we propose a pipeline that allows adaptation of speech processing as well as automatic output formatting. We establish a post-processing approach that combines pre-trained (not necessarily medically adapted) speech models with lexicon-based and processing techniques to allow adaptation to medical technical terms. Furthermore, the pipeline handles spoken formatting commands. The entire system operates in (close to) real time. We also demonstrate our approach in a first study.
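
The post-processing idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, the command table, the similarity cutoff, and the function names `correct_term` and `postprocess` are all hypothetical, and fuzzy matching via Python's standard `difflib` module is an assumed stand-in for whatever lexicon technique the paper actually uses. The sketch takes a raw transcript from any pre-trained ASR model, maps near-miss tokens onto medical terms, and turns spoken formatting commands into punctuation and line breaks.

```python
import difflib

# Hypothetical lexicon of German medical terms (illustrative, not from the paper).
MEDICAL_LEXICON = ["Appendizitis", "Pneumonie", "Anamnese", "Thorax"]

# Hypothetical spoken formatting commands and the text they produce.
FORMAT_COMMANDS = {
    ("neue", "zeile"): "\n",
    ("punkt",): ".",
    ("komma",): ",",
}


def correct_term(token, lexicon=MEDICAL_LEXICON, cutoff=0.8):
    """Return the closest lexicon entry if it is similar enough, else the token."""
    lowered = {entry.lower(): entry for entry in lexicon}
    matches = difflib.get_close_matches(
        token.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else token


def postprocess(transcript):
    """Apply formatting commands and lexicon correction to a raw transcript."""
    tokens = transcript.split()
    pieces = []
    i = 0
    while i < len(tokens):
        command = None
        # Try the two-word commands before the one-word commands.
        for length in (2, 1):
            key = tuple(t.lower() for t in tokens[i:i + length])
            if key in FORMAT_COMMANDS:
                command = key
                break
        if command is not None:
            pieces.append(FORMAT_COMMANDS[command])
            i += len(command)
        else:
            # Separate words with a space, except at the start of a line.
            if pieces and not pieces[-1].endswith("\n"):
                pieces.append(" ")
            pieces.append(correct_term(tokens[i]))
            i += 1
    return "".join(pieces)
```

For example, `postprocess("Verdacht auf Apendizitis Punkt neue Zeile Anamnese unauffällig")` returns `"Verdacht auf Appendizitis.\nAnamnese unauffällig"`. Because the correction runs purely on the text output, the speech model itself stays untouched, which is the property the abstract emphasises.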

Year: 2026
In session: Speech Signal Recognition and Enhancement
Pages: 24 to 31