Exploring uncertainty in MT tasks with Conformal Prediction

As (large) language models find applications across an increasingly broad spectrum of tasks, reliable confidence estimates on their predictions, i.e. uncertainty quantification (UQ), become critical. However, selecting appropriate and efficient UQ methods is challenging, particularly when access to the model’s parameters and training process is restricted. In this presentation, we will delve into conformal prediction, a method that computes confidence intervals in the form of prediction sets, with established guarantees on coverage of the ground truth. I will discuss both regression and generation tasks within Natural Language Processing (NLP), using machine translation (MT) evaluation and MT as the respective paradigms. With a focus on MT evaluation, we will explore how conformal prediction can guide the selection of suitable UQ methods, yielding meaningful confidence intervals that help identify and address biases inherent in these approaches. Turning to generation, we face an additional challenge due to the sequential nature of these tasks, where the dependence on preceding tokens must be taken into account. This talk will address how we adapt our approach to accommodate such dependencies, and show how our method ‘Non-Exchangeable Conformal Language Generation with Nearest Neighbors’ can be applied post-hoc to an arbitrary model, without extra training, to supply token-level, calibrated prediction sets. Finally, we will see how such prediction sets can be used for sampling in machine translation and language modelling, with encouraging results in generation quality.
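As background for the regression setting mentioned above (e.g. predicting an MT quality score), the basic split conformal recipe can be sketched as follows. This is a generic illustration, not the speaker's method: the function name and the absolute-residual nonconformity score are assumptions chosen for simplicity, and the coverage guarantee holds under exchangeability of calibration and test points.

```python
import numpy as np

def conformal_interval(cal_preds, cal_labels, test_preds, alpha=0.1):
    """Split conformal prediction for regression (illustrative sketch).

    Uses the absolute residual on a held-out calibration set as the
    nonconformity score. The returned intervals cover the true label
    with probability >= 1 - alpha, assuming calibration and test
    points are exchangeable.
    """
    # Nonconformity scores on the calibration set.
    scores = np.abs(np.asarray(cal_labels) - np.asarray(cal_preds))
    n = len(scores)
    # Finite-sample-corrected quantile level.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(scores, q_level, method="higher")
    # Symmetric prediction interval around each test prediction.
    test_preds = np.asarray(test_preds)
    return test_preds - qhat, test_preds + qhat
```

The interval width is governed by the calibration residuals, so a poorly calibrated underlying UQ method shows up directly as overly wide (or systematically shifted) intervals, which is one way such biases can be diagnosed.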

Chryssa Zerva

Chrysoula (Chryssa) Zerva is an Assistant Professor in Artificial Intelligence at Instituto Superior Técnico in Lisbon, Portugal. She is a member of the European Laboratory for Learning and Intelligent Systems (ELLIS) and of LUMLIS, the Lisbon ELLIS unit. She is a co-Principal Investigator in the NextGenAI (Center for Responsible AI) project, which aims to advance trustworthy, sustainable, fair and transparent artificial intelligence, and she also participates in the UTTER project. Her work and research interests converge on the elucidation of uncertainty in machine learning, with a focus on NLP and language generation. Beyond uncertainty, she is interested in several aspects that promote trustworthy and responsible AI and NLP applications, namely explainability, context-awareness, fairness and transparency. She obtained her PhD in 2019 from the University of Manchester, where she conducted research on her thesis "Automated Identification of Textual Uncertainty" under the supervision of Prof. Sophia Ananiadou. She was subsequently awarded the EPSRC Doctoral Prize Fellowship, dedicated to investigating information propagation and misinformation detection. In 2021 she joined Instituto de Telecomunicações in Lisbon as a post-doc on the DeepSPIN project, under the supervision of Prof. André Martins, where she worked on a range of facets of Natural Language Processing (NLP), including uncertainty quantification, machine translation, and quality estimation.