Speech as a Biomarker for Disease Detection

May 20, 2025

11:00 am

Today’s overburdened health systems face numerous challenges, exacerbated by an aging population. Speech emerges as a ubiquitous biomarker with strong potential for the development of low-cost, remote testing tools for several diseases. In fact, speech encodes information about a plethora of diseases, which go beyond the so-called speech and language disorders, and include neurodegenerative, psychiatric, and respiratory diseases.

Recent advances in speech processing and machine learning have enabled the automatic detection of these diseases. Despite promising results, this research area faces challenges, primarily due to dataset limitations and the overlap of speech-affecting diseases, which often coexist and produce similar speech manifestations.

These challenges guide our latest research, where we discuss the characterization of normative speech. Similar to common blood tests, we explore reference intervals for interpretable speech features (acoustic and linguistic) as a first step toward adopting speech analysis for multidisease screening. We leverage deviations from these references to detect Alzheimer’s and Parkinson’s diseases using different classifiers, namely Neural Additive Models for enhanced interpretability.
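To make the blood-test analogy concrete, here is a minimal sketch of a reference interval for a single speech feature and of a deviation score relative to it. It is an illustration under stated assumptions, not necessarily the method presented in the talk: the interval is taken as the central 95% (2.5th to 97.5th percentile) of a healthy reference group, a common convention for laboratory reference intervals, and the function names, the example feature, and all values are hypothetical.

```python
import statistics


def reference_interval(healthy_values):
    """Central 95% interval of one feature in a healthy reference group.

    Assumption: 2.5th and 97.5th percentiles, as is common for lab tests.
    """
    # n=40 splits the data into 40 equal-probability bins; the first and
    # last cut points are the 2.5th and 97.5th percentiles.
    cuts = statistics.quantiles(healthy_values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def deviation(value, low, high):
    """Signed distance outside the interval, in units of interval width.

    Returns 0.0 when the value lies inside the reference interval.
    """
    width = high - low
    if value < low:
        return (value - low) / width
    if value > high:
        return (value - high) / width
    return 0.0


if __name__ == "__main__":
    # Hypothetical feature values (e.g. a speaking-rate measure) for
    # healthy reference speakers; synthetic numbers for illustration only.
    healthy = [3.8, 4.1, 4.4, 4.0, 4.6, 3.9, 4.2, 4.3, 4.5, 4.1, 4.0, 4.7]
    low, high = reference_interval(healthy)
    for speaker_value in (4.2, 2.9):
        print(speaker_value, deviation(speaker_value, low, high))
```

Deviation scores of this kind, computed per feature, could then serve as the inputs to a classifier; how the deviations are actually defined and combined in the work is not specified in this abstract.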

Additionally, we explore bridging black-box models and interpretability by using large language models to annotate high-level, low-dimensional, interpretable characteristics of speech transcriptions, such as text coherence and lexical diversity, which we term macro-descriptors. Using only four macro-descriptors, we outperformed conventional text-based Alzheimer’s disease detection methods.

Catarina Botelho

Catarina Botelho received her B.Sc. and M.Sc. degrees in Biomedical Engineering from Instituto Superior Técnico (IST), University of Lisbon, in 2018, and completed her Ph.D. in Electrical and Computer Engineering at IST and INESC-ID in 2024. Her M.Sc. and Ph.D. work has been distinguished by two awards from IST and the University of Lisbon. Currently, she is a researcher at INESC-ID, contributing to the Accelerat.AI project. She has held positions as a research intern at Google AI, Toronto, and as a visiting researcher at the Cognitive Systems Lab, University of Bremen. She served on the student advisory committee of the International Speech Communication Association (ISCA-SAC) from 2020 to 2023, acting as Coordinator in 2022. Her scientific interests lie in speech and language technology for healthcare.
