Clinical documents and textual annotations within electronic health records contain rich information for clinical research and medical practice. Natural Language Processing (NLP) can play an important role in unlocking patient information from clinical narratives. Specifically, the International Classification of Diseases (ICD) coding system has been adopted worldwide as a universal standard for precise documentation in the healthcare domain. Since manual ICD coding is very expensive, time-consuming, and error-prone, deep learning algorithms have been proposed to automate this task. In this presentation, we present a practical application of automatic ICD coding for assigning codes for causes of death, by analyzing free-text descriptions in death certificates, together with the associated autopsy reports and clinical bulletins, from the Portuguese Ministry of Health. We also present a novel approach for ICD coding. Experiments in the most used public available dataset show that the proposed approach outperforms the current state-of-the-art models in ICD coding and also leads to properly calibrated classification results, which can effectively inform downstream tasks such as text quantification.
Deep Models for ICD Coding and Quantification from Clinical Text
May 21, 2024
10:07 am
Isabel Coutinho
Isabel Coutinho received a MSc degree in Biomedical Engineering, from Instituto Superior Técnico, Universidade de Lisboa. She is currently a third-year PhD student at the same institution and a junior researcher at the Human Language Technologies Lab of INESC-ID. Her research interests focus in Natural Language Processing, specifically applied to the health domain.Seminários
Últimos seminários
Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding
June 17, 2025Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they often fall behind specialized neural machine…
Speech as a Biomarker for Disease Detection
May 20, 2025Today’s overburdened health systems face numerous challenges, exacerbated by an aging population. Speech emerges as a ubiquitous biomarker with strong…
Enhancing Uncertainty Estimation in Neural Networks
May 6, 2025Neural networks are often overconfident about their predictions, which undermines their reliability and trustworthiness. In this presentation, I will present…
Improving Evaluation Metrics for Vision-and-Language Models
April 22, 2025Evaluating image captions is essential for ensuring both linguistic fluency and accurate semantic alignment with visual content. While reference-free metrics…



