An Automatic Method for Speech Breathing Annotation


Breathing is central to speech planning and production; however, speech breathing is difficult to monitor and quantify without laborious and subjective manual annotation. Here, we describe a method for automatically detecting the beginning and end time points of speech-associated inhalations measured with inductive plethysmography, or breath belts. Unlike simpler approaches to breath detection, the technique introduced here employs slope analysis to improve temporal precision. First, inhalation events are identified by searching for roughly continuous, positively sloping segments. Inhalations are then rejected or modified based on slope height, duration, and grade, as well as contextual factors, such as the height or duration of neighbouring breaths. Finally, the respiratory time series can optionally be corroborated with acoustic recordings to further improve results. This approach is validated by two independent annotators using spontaneous and read English speech contributed by 10 individual speakers, including relatively noisy data. From a signal detection perspective, we estimate performance at 95% on average. The mean median error of detected breaths, when compared to human annotation, is 22.50 ms (IQR 37.71 ms). By comparison, a peak-finding method without acoustic calibration yields 91% accuracy with substantially larger errors (mean median 167.90 ms, IQR 381.45 ms). In conclusion, the proposed automatic method provides robust and temporally accurate annotation of the speech breathing time series.
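The first two steps described above (finding roughly continuous rising segments, then rejecting candidates by height and duration) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `detect_inhalations`, its parameters, and all threshold values are hypothetical choices made for illustration, and the grade criterion, the contextual criteria based on neighbouring breaths, and the optional acoustic corroboration are omitted.

```python
from typing import List, Sequence, Tuple


def detect_inhalations(
    signal: Sequence[float],
    fs: float,
    max_gap_s: float = 0.05,       # assumed tolerance for brief non-rising stretches
    min_duration_s: float = 0.15,  # assumed minimum inhalation duration (seconds)
    min_height: float = 0.2,       # assumed minimum rise, in signal units
) -> List[Tuple[float, float]]:
    """Return (onset_s, offset_s) pairs for candidate inhalations.

    An inhalation is taken to be a rising stretch of the breath-belt
    signal. Short interruptions of up to max_gap_s are tolerated, so the
    segments are only "roughly" continuous.
    """
    max_gap = int(round(max_gap_s * fs))
    segments = []
    start = None      # sample index where the current rising segment began
    last_rise = None  # sample index reached by the most recent rising step

    for i in range(len(signal) - 1):
        if signal[i + 1] > signal[i]:
            if start is None:
                start = i
            last_rise = i + 1
        elif start is not None and i + 1 - last_rise > max_gap:
            # The signal has stopped rising for longer than the tolerance:
            # close the segment at the last rising sample.
            segments.append((start, last_rise))
            start = None
    if start is not None:
        segments.append((start, last_rise))

    # Reject candidates that are too short or too shallow.
    kept = []
    for onset, offset in segments:
        duration = (offset - onset) / fs
        height = signal[offset] - signal[onset]
        if duration >= min_duration_s and height >= min_height:
            kept.append((onset / fs, offset / fs))
    return kept
```

Calling `detect_inhalations(samples, fs=100.0)` on a list of breath-belt samples returns onset and offset times in seconds. In practice the thresholds would need tuning to the amplitude scale and sampling rate of the recording, and raw data would likely need smoothing first, since sample-level noise produces many spurious sign changes in the slope.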

Year: 2023
In session: Phonetics
Pages: 103 to 110