Priberam

Causal DiConStruct

Model interpretability plays a central role in human-AI decision-making systems. Ideally, explanations should be expressed through semantic concepts and their causal relations, in a form that human experts can interpret. Additionally, explanation methods should be efficient and should not compromise the performance of the prediction task. Despite rapid advances in AI explainability in the last few years, these problems remain unaddressed in practice. Furthermore, mainstream methods for local concept explainability do not produce causal explanations and incur a trade-off between explainability and prediction performance. To fill this gap in AI explainability, we present Causal DiConStruct, an explanation method that is both concept-based and causal, with the goal of creating more interpretable local explanations in the form of structural causal models and concept attributions. Our explainer learns exogenous variables and structural assignments in two separate components, and works for any black-box machine learning model by approximating its predictions while producing the respective explanations. Because Causal DiConStruct is a distillation model, it generates explanations efficiently without affecting the black-box prediction task. We validate our method on an image dataset and a tabular dataset and show that Causal DiConStruct approximates the black-box models with higher fidelity while obtaining more diverse concept attributions than other concept explainability baselines.

Ricardo Moreira

Ricardo Moreira studied mechanical engineering at Instituto Superior Técnico, Lisboa, where he also spent one year doing research on image processing with medical applications. He then joined Feedzai's customer success team, where he worked for two years developing machine learning models for fraud detection. In the last two years, he started a PhD in Causality for Machine Learning and joined Feedzai's research team, where he has been involved in developing novel methods for feature monitoring and model explainability.

Feedzai