From Llama 3 to Deepseek R1 and beyond: a year of LLMs in retrospective
March 25, 2025
João Gante

2023 was a landmark year for open-source LLMs: the community managed to surpass the original ChatGPT model. The momentum carried into 2024-2025, and the gap to the best closed-source models has narrowed to a few months. This talk covers the major model architecture, training, and inference changes that pushed the state of the art in LLMs and VLMs over the past year.
João Gante is a Machine Learning Engineer on the Open-Source team at Hugging Face, leading text generation in the "transformers" library. João has 7 years of experience in the AI industry, as well as a PhD in AI applied to telecommunications from Instituto Superior Técnico.