Towards End-to-end Speech-to-text Abstractive Summarization

Speech-to-text summarization is a time-saving technique for filtering and keeping pace with the daily influx of broadcast news uploaded online. The emergence of powerful deep learning-based language models with impressive text generation capabilities has directed research attention towards summarization systems capable of producing concise paraphrased versions of document content, commonly referred to as abstractive summaries. End-to-end modeling for speech-to-text abstractive summarization shows promise by enabling the generation of rich latent representations that directly exploit non-verbal and acoustic information extracted from the audio source. Nevertheless, the unavailability of publicly accessible, extensive corpora specific to the broadcast news domain, containing paired audio and summary data, poses a challenge for fully supervised approaches to end-to-end modeling. In this presentation, the speaker will discuss his work on a strategy that leverages external data through transfer learning from a pre-trained text-to-text abstractive summarizer.
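One common way to realize this kind of transfer learning (a hypothetical sketch, not necessarily the speaker's exact method) is to map acoustic features into the embedding space of a pre-trained encoder-decoder text summarizer via a small adapter, so the text model's weights can be reused. The PyTorch snippet below illustrates the idea with a randomly initialized stand-in for the pre-trained backbone; all dimensions and module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpeechAdapter(nn.Module):
    """Downsamples acoustic frames and projects them to the text model's hidden size.

    Illustrative assumption: strided 1-D convolutions shorten the long
    acoustic sequence (~4x here) before it enters the text backbone.
    """
    def __init__(self, n_mels=80, d_model=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, d_model, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
            nn.Conv1d(d_model, d_model, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
        )

    def forward(self, mel):  # mel: (batch, frames, n_mels)
        x = self.conv(mel.transpose(1, 2))  # (batch, d_model, frames/4)
        return x.transpose(1, 2)            # (batch, frames/4, d_model)

class SpeechToTextSummarizer(nn.Module):
    """Speech adapter feeding an encoder-decoder summarizer backbone.

    In actual transfer learning, `backbone`, `embed`, and `lm_head` would be
    initialized from a pre-trained text-to-text summarizer (e.g. BART-style);
    here they are random stand-ins so the sketch runs without downloads.
    """
    def __init__(self, vocab_size=1000, d_model=256):
        super().__init__()
        self.adapter = SpeechAdapter(d_model=d_model)
        self.backbone = nn.Transformer(d_model=d_model, nhead=4,
                                       num_encoder_layers=2,
                                       num_decoder_layers=2,
                                       batch_first=True)
        self.embed = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, mel, summary_ids):
        src = self.adapter(mel)              # acoustic features -> text space
        tgt = self.embed(summary_ids)        # teacher-forced summary tokens
        out = self.backbone(src, tgt)
        return self.lm_head(out)             # (batch, summary_len, vocab_size)

model = SpeechToTextSummarizer()
mel = torch.randn(2, 400, 80)                # 2 utterances, 400 frames, 80 mel bins
summary_ids = torch.randint(0, 1000, (2, 16))
logits = model(mel, summary_ids)
print(logits.shape)  # torch.Size([2, 16, 1000])
```

Only the lightweight adapter must be trained from scratch; the pre-trained backbone supplies the language-generation ability that paired audio-summary data alone cannot, which is the motivation for leveraging external text data.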

Raul Monteiro

Raul Monteiro is an NLP researcher at Priberam Labs. He obtained a Master's degree (MSc) in Engineering Physics from Instituto Superior Técnico in 2023. He conducted his master's thesis in collaboration with Priberam, concentrating on the domain of Speech-to-text Summarization. His research interests primarily revolve around Deep Learning and Speech Processing, with particular focus on Speech Summarization and Spoken Named Entity Recognition.