Artificial Bandwidth Extension using a Glottal Excitation Model

Abstract:

The historical bandwidth of telephone speech (0.3 kHz to 3.4 kHz), which is still used today for speech transmission (e.g. in the AMR codec [1]), reduces the intelligibility and naturalness of the transmitted speech. New mobile devices may use artificial bandwidth extension (ABE) to improve the quality of the received narrow-band (NB) speech. ABE aims to reconstruct the missing frequency components of NB speech at the receiving end, and it often adopts the source-filter model of human speech so that the excitation and the spectral envelope of the speech signal can be reconstructed separately. For the extension of the excitation, no existing method exploits the fact that the wide-band (WB) excitation of vowel sounds can be modeled by parametric functions with almost no perceptible difference [2]. This work investigated whether optimal model parameters can be extracted from the NB speech and used for high-quality ABE of the excitation for vowels. In objective measures, the proposed algorithm matches or outperforms a state-of-the-art reference algorithm, but in subjective evaluation it is currently rated slightly lower.
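
To make the idea of a parametric excitation for vowels concrete, the following is a minimal sketch and not the algorithm proposed in the paper. The abstract does not name the glottal model that is used, so a Rosenberg-style trigonometric glottal pulse is assumed here as a stand-in. The function names, the parameters open_frac and close_frac, and the 16 kHz sampling rate in the usage note are all illustrative assumptions.

```python
import math


def rosenberg_pulse(n_samples, open_frac=0.4, close_frac=0.16):
    """One pitch period of a Rosenberg-style glottal flow pulse.

    ASSUMPTION: this pulse shape is a stand-in; the abstract does not say
    which parametric function is used. open_frac and close_frac are
    hypothetical shape parameters (fractions of the pitch period).
    """
    n_open = int(round(open_frac * n_samples))    # opening phase length
    n_close = int(round(close_frac * n_samples))  # closing phase length
    pulse = []
    for n in range(n_samples):
        if n < n_open:
            # Opening phase: raised-cosine rise from 0 towards 1.
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_open)))
        elif n < n_open + n_close:
            # Closing phase: quarter-cosine fall from 1 towards 0.
            pulse.append(math.cos(math.pi * (n - n_open) / (2.0 * n_close)))
        else:
            # Closed phase: no glottal flow.
            pulse.append(0.0)
    return pulse


def voiced_excitation(f0_hz, fs_hz, n_periods):
    """Concatenate pulses and take the first difference of the flow.

    The first difference is a crude approximation of the glottal flow
    derivative, which is commonly used as the source-filter excitation.
    """
    period = int(round(fs_hz / f0_hz))  # samples per pitch period
    flow = rosenberg_pulse(period) * n_periods
    return [flow[i] - flow[i - 1] for i in range(1, len(flow))]
```

As a usage example, voiced_excitation(100.0, 16000.0, 3) produces three pitch periods of a 100 Hz excitation at an assumed WB sampling rate of 16 kHz. Because the waveform is fully determined by the fundamental frequency and two shape parameters, such a model can be described by a small parameter set, which is the property the abstract refers to.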


Year: 2021
In session: Postersession 1
Pages: 95 to 103