Priberam

Deep Models for ICD Coding and Quantification from Clinical Text

Clinical documents and textual annotations within electronic health records contain rich information for clinical research and medical practice. Natural Language Processing (NLP) can play an important role in unlocking patient information from clinical narratives. In particular, the International Classification of Diseases (ICD) coding system has been adopted worldwide as a universal standard for precise documentation in the healthcare domain. Because manual ICD coding is expensive, time-consuming, and error-prone, deep learning algorithms have been proposed to automate this task. In this talk, we present a practical application of automatic ICD coding: assigning codes for causes of death by analyzing free-text descriptions in death certificates, together with the associated autopsy reports and clinical bulletins, from the Portuguese Ministry of Health. We also present a novel approach to ICD coding. Experiments on the most widely used publicly available dataset show that the proposed approach outperforms current state-of-the-art ICD coding models and also produces properly calibrated classification results, which can effectively inform downstream tasks such as text quantification.

Isabel Coutinho

Isabel Coutinho received an MSc degree in Biomedical Engineering from Instituto Superior Técnico, Universidade de Lisboa. She is currently a third-year PhD student at the same institution and a junior researcher at the Human Language Technologies Lab of INESC-ID. Her research interests focus on Natural Language Processing, specifically applied to the health domain.