DNN Online Adaptation for Automatic Speech Recognition

Abstract:

Although DNN-HMM based ASR systems generally provide better accuracy than GMM-HMM based ASR systems, their performance still suffers from mismatches between training and testing conditions. Online adaptation is a very effective way to make an ASR system more robust to a variety of environments and speaker characteristics. However, given the large number of DNN parameters and only a limited amount of adaptation data, performing DNN online adaptation effectively is very challenging. In this paper, we propose two methods, namely i-vector and KL-divergence regularized Linear Hidden Network, for performing DNN online adaptation in real-time speech recognition systems. The proposed methods were evaluated on a voice search data set. Each method alone achieved a relative word error rate reduction (WERR) of over 3%, and combining them yielded a further relative WERR of over 2%.
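
The relative WERR figures quoted above are ratios against a baseline WER, not absolute percentage-point differences. A minimal sketch of that arithmetic follows; the function name `relative_werr` and the WER values are hypothetical and chosen only for illustration, they are not results from the paper.

```python
def relative_werr(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative word error rate reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical WERs (in percent) for illustration only; not taken from the paper.
baseline = 20.0
adapted = 19.4
print(f"relative WERR: {relative_werr(baseline, adapted):.1%}")
```

Under this definition, a relative WERR of 3% on a 20% baseline corresponds to an absolute drop of 0.6 percentage points.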


Year: 2018
In session: Signal Processing
Pages: 46 to 53