In the last few years, language modeling techniques based on exponential models have consistently outperformed traditional n-gram models. Such techniques include L1-Regularized Maximum Entropy (L1-MaxEnt) models and both Feedforward and Recurrent Neural Network Language Models (RNNLMs). While more accurate, these models are also much more expensive to train and use. This presents a problem for low-latency applications, where it is desirable to find n-gram approximations that can be used in the first pass of a speech recognition system. In this talk, I will present Bottleneck Neural Network Language Models, a novel feedforward architecture designed to achieve low perplexity while allowing for n-gram approximations. This model is similar to MaxEnt models in the sense that its input is a rich set of features; however, these features are processed through a non-linear hidden layer to encourage generalization. In the talk, I will compare this architecture to other exponential models and present an effective algorithm for creating n-gram approximations. Results will be presented on standard data sets and on a state-of-the-art voicemail-to-text ASR system.
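To make the architectural idea concrete, here is a minimal sketch of one plausible reading of such a model: sparse MaxEnt-style context features are projected through a small non-linear "bottleneck" hidden layer before the softmax over the vocabulary. All dimensions and names (num_features, bottleneck_dim, vocab_size) are illustrative assumptions, not details from the talk.

```python
import numpy as np

# Hypothetical sketch of a bottleneck feedforward LM.
# Sparse binary context features (e.g. n-gram history indicators, as in a
# MaxEnt model) feed a small non-linear hidden layer that encourages
# generalization, followed by a softmax over the output vocabulary.

rng = np.random.default_rng(0)

num_features = 10_000   # size of the sparse feature space (assumed)
bottleneck_dim = 50     # small non-linear hidden layer (assumed)
vocab_size = 5_000      # output vocabulary size (assumed)

W_in = rng.normal(scale=0.01, size=(num_features, bottleneck_dim))
W_out = rng.normal(scale=0.01, size=(bottleneck_dim, vocab_size))
b_out = np.zeros(vocab_size)

def next_word_distribution(active_feature_ids):
    """Return P(next word | context) given the IDs of the active features."""
    # Summing the selected rows equals multiplying a sparse binary feature
    # vector by W_in.
    h = np.tanh(W_in[active_feature_ids].sum(axis=0))  # bottleneck activation
    logits = h @ W_out + b_out
    logits -= logits.max()                             # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Example: suppose features 3, 17, and 250 fire for some history.
p = next_word_distribution([3, 17, 250])
print(p.shape, p.sum())
```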
Bottleneck Neural Network Language Models
March 24, 2015
1:00 pm