IteROAR: Quantifying the Interpretation of Feature Importance Methods

S. M. Palacio, F. Raue, T. Karayil, J. Hees, A. Dengel

Technical Report
Tags: Explainability · Deep Learning · Feature Importance

Abstract

We present IteROAR, a method for quantifying how well feature importance methods explain the predictions of deep learning models. It provides a systematic framework for evaluating and comparing explanation techniques in a principled manner.

Overview

IteROAR addresses the challenge of evaluating feature importance methods in deep learning. As models become more complex, understanding which features drive their predictions is crucial for trust and transparency.

Key Contributions

We propose a systematic framework for quantifying the interpretation quality of feature importance methods, enabling researchers to compare and evaluate different explanation techniques in a principled manner.
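The name IteROAR suggests an iterative variant of the ROAR (RemOve And Retrain) protocol, in which the features an attribution method ranks as most important are removed and the model is refit to measure the resulting performance drop: a faithful importance ranking should cause the largest drops first. As a rough, hedged illustration of that general evaluation loop (not the paper's actual procedure), the sketch below uses a synthetic linear-regression task, mean-imputation as "removal", and absolute least-squares coefficients as the importance scores; all of these choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression task: feature 0 matters most, feature 1 a little,
# the remaining features are pure noise. (Illustrative setup, not from the paper.)
n, d = 500, 5
X = rng.normal(size=(n, d))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=n)

def fit_and_mse(X, y):
    """Least-squares fit; return coefficients and in-sample MSE."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w, float(np.mean((X @ w - y) ** 2))

def iterative_remove_and_retrain(X, y, importance):
    """Remove features from most to least important (mean-imputation),
    refitting the model after each removal; return the MSE curve."""
    Xc = X.copy()
    order = np.argsort(importance)[::-1]          # most important first
    curve = [fit_and_mse(Xc, y)[1]]
    for j in order:
        Xc[:, j] = Xc[:, j].mean()                # "remove" feature j
        curve.append(fit_and_mse(Xc, y)[1])
    return curve

# Importance scores: absolute fitted coefficients (a simple stand-in for
# any feature importance method under evaluation).
w, _ = fit_and_mse(X, y)
curve = iterative_remove_and_retrain(X, y, np.abs(w))
print(curve)  # error should jump most when the truly informative features go
```

A better importance method yields a curve that rises steeply at the start (the truly informative features are removed first) and flattens afterwards, which is what makes such curves a quantitative basis for comparing explanation techniques.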