Speech-to-text summarization is a time-saving technique for filtering and keeping pace with the daily influx of broadcast news uploaded online. The emergence of powerful deep learning-based language models with impressive text generation capabilities has directed research attention towards summarization systems that produce concise, paraphrased versions of document content, commonly referred to as abstractive summaries. End-to-end modeling for speech-to-text abstractive summarization shows promise because it enables rich latent representations that directly exploit non-verbal and acoustic information extracted from the audio source. However, the lack of large, publicly available broadcast news corpora containing paired audio and summary data poses a challenge for fully supervised approaches to end-to-end modeling. In this presentation, the speaker will discuss his work on a strategy that leverages external data through transfer learning from a pre-trained text-to-text abstractive summarizer.
Towards End-to-end Speech-to-text Abstractive Summarization
June 6, 2023
1:00 pm