In this talk I will introduce Recurrent Predictive State Policy (RPSP) networks, which combine moment-based predictive models with deep reinforcement learning architectures. A predictive state serves as an equivalent representation of a belief state. Therefore, the policy component of the RPSP-network can be purely reactive, simplifying training while still allowing optimal behaviour. We show the efficacy of RPSP-networks under partial observability on a set of robotic control tasks from OpenAI Gym. We empirically show that RPSP-networks perform well compared with memory-preserving networks such as GRUs, as well as finite-memory models. This work was done in collaboration with Ahmed Hefny at CMU.
Kernel and Moment Based Prediction and Planning
March 6, 2018
1:00 pm
Zita Marinho
Zita Marinho is a PhD finalist at the Robotics Institute, under the CMU/Portugal doctoral program. She is affiliated with the Institute for Robotics and Systems and Instituto de Telecomunicações at IST. She is currently working at Sacoor Brothers as a Data Scientist. She received an M.S. degree in Robotics from CMU in 2015, and an M.S. degree in Physics Engineering from Instituto Superior Técnico, Universidade de Lisboa, Portugal, in 2010. As a PhD student, she was jointly advised by André Martins at Unbabel/Instituto de Telecomunicações, and by Geoffrey Gordon and Siddhartha Srinivasa at CMU. Her research interests focus on machine learning methods using semi-supervision. She is interested in studying algorithms for learning with large amounts of data and little supervised information. Her PhD thesis focuses on spectral methods for learning in Natural Language and Robotics.