Speech-to-text summarization is a time-saving technique for filtering and keeping pace with the daily influx of broadcast news uploaded online. The emergence of powerful deep learning-based language models with impressive text generation capabilities has directed research attention towards summarization systems that produce concise, paraphrased versions of document content, commonly referred to as abstractive summaries. End-to-end modeling for speech-to-text abstractive summarization shows promise because it can generate rich latent representations that directly exploit non-verbal and acoustic information extracted from the audio source. Nevertheless, the lack of large, publicly available corpora of paired audio and summary data for the broadcast news domain poses a challenge for fully supervised approaches to end-to-end modeling. In this presentation, the speaker will discuss his work on a strategy that leverages external data through transfer learning from a pre-trained text-to-text abstractive summarizer.
Towards End-to-end Speech-to-text Abstractive Summarization
June 6, 2023
1:00 pm
Raul Monteiro
Raul Monteiro is an NLP researcher at Priberam Labs. He obtained a Master's degree (MSc) in Engineering Physics from Instituto Superior Técnico in 2023. He conducted his master's thesis in collaboration with Priberam, concentrating on the domain of Speech-to-text Summarization. His research interests primarily revolve around Deep Learning and Speech Processing, with particular focus on Speech Summarization and Spoken Named Entity Recognition.

