ESSV Konferenz Elektronische Sprachsignalverarbeitung

Title: Multi-condition Deep Neural Network Training

Authors: Matthew Gibson, Christian Plahl, Puming Zhan, Gary Cook


Multi-condition training (MCT) aims to deliver robust acoustic models by incorporating data associated with conditions that are weakly represented in the training dataset. In acoustic modelling for speech recognition, transcribed speech covering a diverse range of conditions is often unavailable. This lack of availability is addressed by corrupting existing ‘clean’ speech. This work examines the relationships between the details of the corruption technique and the effectiveness of the resulting MCT process. It also demonstrates that MCT can be very effective when a large mismatch exists between training-set and test-set conditions, but that its impact is limited when the mismatch is smaller.
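The abstract does not specify which corruption technique the paper uses. A common approach in MCT pipelines is to add a noise recording to the clean utterance at a chosen signal-to-noise ratio; the sketch below illustrates that idea. The function name `corrupt_with_noise` and the fixed-SNR mixing scheme are illustrative assumptions, not the authors' method.

```python
import numpy as np

def corrupt_with_noise(clean, noise, snr_db):
    """Mix a noise signal into a clean utterance at a target SNR in dB.

    Illustrative sketch: the noise is tiled/cropped to the clean
    signal's length, then scaled so that the ratio of clean power to
    scaled-noise power equals 10**(snr_db / 10).
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Tile or crop the noise to match the utterance length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    # Average power of each signal.
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale so that 10*log10(p_clean / p_scaled_noise) == snr_db.
    scale = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise
```

In a multi-condition setup, such a routine would be applied to each clean training utterance with noise types and SNR levels drawn to cover the target conditions; the paper's findings concern how these corruption details affect the resulting models.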

Year: 2018
In session: Poster
Pages: 77 to 84