Semi-Supervised Learning of Sequence Models with the Method of Moments

May 2, 2017

1:00 pm

In this talk I will present work published at EMNLP 2016 on a fast and scalable method for semi-supervised learning of sequence models.
The proposed method uses anchor words and moment-matching techniques to recover the hidden assignments in a Hidden Markov structure, and it can handle feature-based observations. Unlike other semi-supervised methods, our approach is more efficient: it requires no additional decoding passes over the unlabeled data and no graph construction, and a single pass suffices to collect the moment statistics.
We demonstrate the effectiveness of this approach on Twitter part-of-speech tagging experiments and show that our method can learn from very few annotated sentences.

Zita Marinho

Zita Marinho is a PhD student at the Robotics Institute, under the CMU/Portugal doctoral program. She is affiliated with the Institute for Systems and Robotics and Instituto de Telecomunicações at IST. She received an M.S. degree in Robotics from CMU in 2015 and an M.S. degree in Physics Engineering from Instituto Superior Tecnico, Universidade de Lisboa, Portugal, in 2010. Prior to her PhD, she was an intern at the European Space Agency, ESOC Darmstadt, Germany, from 2011 to 2012. She is jointly advised by Andre Martins at Unbabel/Instituto de Telecomunicações, Geoffrey Gordon at the Machine Learning Department/CMU, and Siddhartha Srinivasa at the Robotics Institute/CMU. Her research focuses on semi-supervised machine learning methods, in particular algorithms for learning from large amounts of data with little supervised information. Her PhD thesis focuses on spectral methods for learning in Natural Language and Robotics.

IST/ISR, CMU/RI

Seminars

Latest seminars

Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding
June 17, 2025
Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they often fall behind specialized neural machine…
Speech as a Biomarker for Disease Detection
May 20, 2025
Today’s overburdened health systems face numerous challenges, exacerbated by an aging population. Speech emerges as a ubiquitous biomarker with strong…
Enhancing Uncertainty Estimation in Neural Networks
May 6, 2025
Neural networks are often overconfident about their predictions, which undermines their reliability and trustworthiness. In this presentation, I will present…
Improving Evaluation Metrics for Vision-and-Language Models
April 22, 2025
Evaluating image captions is essential for ensuring both linguistic fluency and accurate semantic alignment with visual content. While reference-free metrics…
