The explanation game: a formal framework for interpretable machine learning

David S. Watson; Luciano Floridi

Synthese 198 (10):1–32 (2020)

Abstract

We propose a formal framework for interpretable machine learning. Combining elements from statistical learning, causal interventionism, and decision theory, we design an idealised explanation game in which players collaborate to find the best explanation for a given algorithmic prediction. Through an iterative procedure of questions and answers, the players establish a three-dimensional Pareto frontier that describes the optimal trade-offs between explanatory accuracy, simplicity, and relevance. Multiple rounds are played at different levels of abstraction, allowing the players to explore overlapping causal patterns of variable granularity and scope. We characterise the conditions under which such a game is almost surely guaranteed to converge on an optimal explanation surface in polynomial time, and highlight obstacles that will tend to prevent the players from advancing beyond certain explanatory thresholds. The game serves a descriptive and a normative function, establishing a conceptual space in which to analyse and compare existing proposals, as well as design new and improved solutions.

Author Profiles

Luciano Floridi

Yale University

David Watson

University College London

Archival history

Archival date: 2021-06-08

Keywords

Algorithmic explainability, Explanation game, Interpretable machine learning, Pareto frontier, Relevance

DOI

10.1007/s11229-020-02629-9

Analytics

Added to PP: 2020-04-03
Downloads: 725 (#38,658)
6 months: 278 (#9,805)
