Bottleneck Neural Network Language Models
March 24, 2015
1:00 pm
Diamantino Caseiro

In the last few years, language modeling techniques based on exponential models have consistently outperformed traditional n-gram models. Such techniques include L1-Regularized Maximum Entropy (L1-MaxEnt) and both Feedforward and Recurrent Neural Network Language Models (RNNLM). While more accurate, these models are also much more expensive to train and use. This presents a problem for low-latency applications, where it is desirable to find n-gram approximations that can be used in the first pass of a speech recognition system. In this talk I will present Bottleneck Neural Network Language Models, a novel feedforward architecture designed to achieve low perplexity while allowing for n-gram approximations. This model is similar to MaxEnt models in the sense that its input is a rich set of features; however, these features are processed through a non-linear hidden layer to encourage generalization. In the talk, I will compare this architecture to other exponential models and present an effective algorithm for creating n-gram approximations. Results will be presented on standard data sets and on a state-of-the-art voicemail-to-text ASR system.
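To make the described architecture concrete, the following is a minimal sketch of a forward pass through such a model: a sparse, MaxEnt-style feature vector is projected into a narrow bottleneck and then through a non-linear hidden layer before a softmax over the vocabulary. All names, layer sizes, and the exact layer layout here are illustrative assumptions, not the model presented in the talk, and training is omitted entirely.

    # Minimal sketch of a feedforward "bottleneck" language model forward pass.
    # Assumes a MaxEnt-style sparse feature encoding of the history; all sizes
    # and parameter names are hypothetical.
    import numpy as np

    rng = np.random.default_rng(0)

    V = 10_000   # vocabulary size (assumed)
    F = 50_000   # number of sparse n-gram/skip features (assumed)
    B = 32       # narrow bottleneck width (assumed)
    H = 256      # hidden layer width (assumed)

    # Randomly initialized parameters; a real model would be trained.
    W_in  = rng.normal(0, 0.01, (F, B))   # sparse features -> bottleneck
    W_h   = rng.normal(0, 0.01, (B, H))   # bottleneck -> hidden
    W_out = rng.normal(0, 0.01, (H, V))   # hidden -> vocabulary logits

    def next_word_probs(active_features):
        """P(w | history) for a history given as a list of active feature ids."""
        # Summing rows of W_in over the active features is equivalent to
        # multiplying W_in by a 0/1 sparse feature vector.
        bottleneck = W_in[active_features].sum(axis=0)
        hidden = np.tanh(bottleneck @ W_h)   # non-linear hidden layer
        logits = hidden @ W_out
        logits -= logits.max()               # numerical stability for softmax
        p = np.exp(logits)
        return p / p.sum()

    probs = next_word_probs([12, 4711, 30001])  # hypothetical feature ids
    print(probs.shape, probs.sum())             # (10000,) 1.0

Because every history maps to a probability distribution over the next word, a model of this form can in principle be queried on a fixed set of n-gram contexts, which is what makes first-pass n-gram approximations plausible; the specific approximation algorithm is the subject of the talk and is not reproduced here.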
Diamantino Caseiro received an M.Sc. degree in electrical and computer engineering from Instituto Superior Técnico (IST), Lisbon, Portugal, in 1998, and a Ph.D. in computer science, also from IST, in 2003. He was an assistant professor in the computer engineering department of IST from 2004 to 2007 and a member of the Spoken Language Systems Laboratory of INESC-ID from 1996 to 2007, where he specialized in weighted finite-state transducers and search algorithms for automatic speech recognition (ASR). From 2008 to 2014, he was a Principal Research Scientist at AT&T Labs Research, where he was responsible for language modeling, finite-state transducers, and search for ASR. Since 2014 he has been a Senior Research Scientist at Google, working on search for ASR and on massive-scale maximum entropy language modeling.