Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they often fall behind specialized neural machine translation systems in addressing discourse phenomena, such as pronoun resolution and lexical cohesion at the document level. In the seminar, …
S16 (2024-2025)

Speech as a Biomarker for Disease Detection
Today’s overburdened health systems face numerous challenges, exacerbated by an aging population. Speech emerges as a ubiquitous biomarker with strong potential for the development of low-cost, remote testing tools for several diseases. In fact, speech encodes information about a plethora …

Enhancing Uncertainty Estimation in Neural Networks
Neural networks are often overconfident about their predictions, which undermines their reliability and trustworthiness. In this talk, I will present our work entitled Error-Driven Uncertainty Aware Training (EUAT), which aims to enhance the ability of neural classifiers to estimate their …

Improving Evaluation Metrics for Vision-and-Language Models
Evaluating image captions is essential for ensuring both linguistic fluency and accurate semantic alignment with visual content. While reference-free metrics such as CLIPScore have advanced automated caption evaluation, most existing work on learned evaluation metrics remains limited to pointwise English-centric …
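
For readers less familiar with reference-free evaluation, the sketch below illustrates the general idea behind a CLIPScore-style metric: a caption is scored by the rescaled cosine similarity between CLIP's image and text embeddings, with no reference captions involved. The checkpoint name and the rescaling constant follow the original CLIPScore formulation (Hessel et al., 2021); this is a minimal illustration, not the learned metrics discussed in the talk.

```python
# Minimal sketch of a CLIPScore-style, reference-free caption score.
# Assumes the public "openai/clip-vit-base-patch32" checkpoint from Hugging Face transformers.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clipscore(image: Image.Image, caption: str, w: float = 2.5) -> float:
    """Score a caption by the rescaled cosine similarity of CLIP image/text embeddings."""
    inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    cosine = (img * txt).sum(dim=-1).item()
    return w * max(cosine, 0.0)  # rescaling constant w = 2.5 as in Hessel et al. (2021)
```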

Pushing the Limits of Sparse Attention: From Theory to Practical Efficiency
Adaptive sparse attention mechanisms have emerged as a powerful alternative to dense attention in transformers, offering more interpretability for sequence modeling. Despite these advantages, their widespread adoption has been limited by computational inefficiencies and insufficient understanding of their theoretical properties compared …
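
As a rough point of reference for the dense-versus-sparse contrast, the sketch below implements a simple top-k sparse attention step in PyTorch, where each query attends only to its k highest-scoring keys. This is a generic baseline for illustration, not the adaptive mechanisms analyzed in the talk; the function name top_k_attention and its parameters are hypothetical.

```python
# Minimal sketch of top-k sparse attention (a generic baseline, not the talk's method).
import torch
import torch.nn.functional as F

def top_k_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, top_k: int = 8) -> torch.Tensor:
    """q, k, v: (batch, seq_len, dim). Each query attends only to its top_k keys."""
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5     # (batch, q_len, k_len)
    kth = scores.topk(top_k, dim=-1).values[..., -1:]         # k-th largest score per query
    masked = scores.masked_fill(scores < kth, float("-inf"))  # mask out all other keys
    return F.softmax(masked, dim=-1) @ v                      # sparse attention output

# Example: 2 sequences of length 16 with 64-dimensional heads
q, k, v = (torch.randn(2, 16, 64) for _ in range(3))
out = top_k_attention(q, k, v, top_k=4)  # shape (2, 16, 64)
```
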
From Llama 3 to Deepseek R1 and beyond: a year of LLMs in retrospective
The year 2023 was rich in open-source LLMs, and the community managed to surpass the original ChatGPT model. The wave continued in 2024-2025, and the gap to the best closed-source models is now reduced to a few months. This talk …

Did AI See This? Detecting Copyrighted Data in Large-Scale Models’ Training
Large-scale models are trained on massive amounts of data, yet the secrecy surrounding training datasets makes it difficult to determine whether specific content was included. In this talk, I introduce two novel approaches for addressing this challenge in the context …

xCOMET, Tower, EuroLLM: Open & Multilingual LLMs for Europe
Today, LLMs are Swiss Army knives, and machine translation (MT) is one of their tools. Is this the end of MT research? In this talk, I argue that the connection between LLM and MT research is two-way. I present some of our …