Abstract
Since its emergence in the 1960s, Artificial Intelligence (AI) has grown to conquer many
technology products and their fields of application. Machine learning, a major component of
current AI solutions, can learn from data and experience to reach high
performance on various tasks. This growing success of AI algorithms has led to a need
for interpretability to understand opaque models such as deep neural networks. Various requirements have emerged from different domains, together with numerous tools
to debug models, justify their outcomes, and establish their safety, fairness, and reliability. This variety of tasks has led to inconsistencies in the terminology, with,
for instance, terms such as interpretable, explainable, and transparent often being used interchangeably in methodology papers. These words, however, convey different meanings and are
"weighted" differently across domains, for example in the technical and social sciences.
In this paper, we propose an overarching terminology of interpretability of AI systems that
technical developers and the social sciences community alike can refer to
in pursuit of clarity and efficiency in the definition of regulations for ethical and reliable AI
development. We show how our taxonomy and definition of interpretable AI differ from
those in previous research and how they apply with high versatility to several domains
and use cases, proposing a much-needed standard for communication across interdisciplinary areas of AI.