The Explanation Game: Towards Prediction Explainability through Sparse Communication

June 23, 2020

1:00 pm

Explainability is a topic of growing importance in NLP. In this work, we provide a unified perspective on explainability as a communication problem between an explainer and a layperson about a classifier’s decision. We use this framework to compare several prior approaches for extracting explanations, including gradient methods, representation erasure, and attention mechanisms, in terms of their communication success. In addition, we reinterpret these methods in light of classical feature selection, and we use this as inspiration to propose new embedded methods for explainability through the use of selective, sparse attention. Experiments in text classification and natural language inference, using different configurations of explainers and laypeople (including both machines and humans), reveal an advantage of attention-based explainers over gradient and erasure methods. Human experiments show promising results on text classification with post-hoc explainers trained to optimize communication success.

Marcos Treviso

Marcos is a Ph.D. student in the DeepSPIN Project, supervised by André Martins. His main interests include semi-parametric models and explainability of neural networks. Previously, he obtained an M.Sc. degree in Computer Science and Computational Mathematics at the University of São Paulo (USP), where he worked on NLP and machine learning for sentence segmentation and disfluency detection. Marcos was also an AI research intern at Unbabel in 2018, where he contributed to the OpenKiwi project.

DeepSPIN/IT

Seminars

Latest seminars

Cost-Sensitive Learning to Defer to Multiple Experts
March 2, 2026
Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they often fall behind specialized neural machine…
Fair Federated Learning under Group-Specific Distributed Concept Drift
February 24, 2026
Machine learning models can become unfair when different groups experience changes in data over time, a phenomenon called group-specific concept…
Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding
June 17, 2025
Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they often fall behind specialized neural machine…
Speech as a Biomarker for Disease Detection
May 20, 2025
Today’s overburdened health systems face numerous challenges, exacerbated by an aging population. Speech emerges as a ubiquitous biomarker with strong…
