Interpretability in Artificial Intelligence - Bibliography

What is AI safety? What do we want it to be? Jacqueline Harding & Cameron Domenico Kirk-Giannini - manuscript

The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this feature is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity and appeal, we argue that (...)

Can AI systems have free will? Christian List - manuscript

While there has been much discussion of whether AI systems could function as moral agents or acquire sentience, there has been relatively little discussion of whether AI systems could have free will. In this article, I sketch a framework for thinking about this question. I argue that, to determine whether an AI system has free will, we should not look for some mysterious property, expect its underlying algorithms to be indeterministic, or ask whether the system is unpredictable. Rather, we should (...)

Before the Systematicity Debate: Recovering the Rationales for Systematizing Thought. Matthieu Queloz - manuscript

Over the course of the twentieth century, the notion of the systematicity of thought has acquired a much narrower meaning than it used to carry for much of its history. The so-called “systematicity debate” that has dominated the philosophy of language, cognitive science, and AI research over the last thirty years understands the systematicity of thought in terms of the compositionality of thought. But there is an older, broader, and more demanding notion of systematicity that is now increasingly relevant again. (...)

What is it for a Machine Learning Model to Have a Capability? Jacqueline Harding & Nathaniel Sharadin - forthcoming - British Journal for the Philosophy of Science.

What can contemporary machine learning (ML) models do? Given the proliferation of ML models in society, answering this question matters to a variety of stakeholders, both public and private. The evaluation of models' capabilities is rapidly emerging as a key subfield of modern ML, buoyed by regulatory attention and government grants. Despite this, the notion of an ML model possessing a capability has not been interrogated: what are we saying when we say that a model is able to do something? (...)

1 citation

Are clinicians ethically obligated to disclose their use of medical machine learning systems to patients? Joshua Hatherley - forthcoming - Journal of Medical Ethics.

It is commonly accepted that clinicians are ethically obligated to disclose their use of medical machine learning systems to patients, and that failure to do so would amount to a moral fault for which clinicians ought to be held accountable. Call this ‘the disclosure thesis.’ Four main arguments have been, or could be, given to support the disclosure thesis in the ethics literature: the risk-based argument, the rights-based argument, the materiality argument and the autonomy argument. In this article, I argue (...)

1 citation

Deepfakes, Simone Weil, and the concept of reading. Steven R. Kraaijeveld - forthcoming - AI and Society:1-3.

Transparencia, explicabilidad y confianza en los sistemas de aprendizaje automático [Transparency, explainability, and trust in machine learning systems]. Andrés Páez - forthcoming - In Juan David Gutiérrez & Rubén Francisco Manrique (eds.), Más allá del algoritmo: oportunidades, retos y ética de la Inteligencia Artificial. Bogotá: Ediciones Uniandes.

One of the ethical principles most frequently mentioned in guidelines for the development of artificial intelligence (AI) is algorithmic transparency. However, there is no standard definition of what a transparent algorithm is, nor is it evident why algorithmic opacity poses a challenge for the ethical development of AI. It is also often claimed that algorithmic transparency fosters trust in AI, but this assertion is more an a priori assumption than a (...)

Cultural Bias in Explainable AI Research. Uwe Peters & Mary Carman - forthcoming - Journal of Artificial Intelligence Research.

For synergistic interactions between humans and artificial intelligence (AI) systems, AI outputs often need to be explainable to people. Explainable AI (XAI) systems are commonly tested in human user studies. However, whether XAI researchers consider potential cultural differences in human explanatory needs remains unexplored. We highlight psychological research that found significant differences in human explanations between many people from Western, commonly individualist countries and people from non-Western, often collectivist countries. We argue that XAI research currently overlooks these variations and that (...)

3 citations

Explanation Hacking: The perils of algorithmic recourse. E. Sullivan & Atoosa Kasirzadeh - forthcoming - In Juan Manuel Durán & Giorgia Pozzi (eds.), Philosophy of science for machine learning: Core issues and new perspectives. Springer.

We argue that the trend toward providing users with feasible and actionable explanations of AI decisions—known as recourse explanations—comes with ethical downsides. Specifically, we argue that recourse explanations face several conceptual pitfalls and can lead to problematic explanation hacking, which undermines their ethical status. As an alternative, we advocate that explanations of AI decisions should aim at understanding.

A Causal Analysis of Harm. Sander Beckers, Hana Chockler & Joseph Y. Halpern - 2024 - Minds and Machines 34 (3):1-24.

As autonomous systems rapidly become ubiquitous, there is a growing need for a legal and regulatory framework that addresses when and how such a system harms someone. There have been several attempts within the philosophy literature to define harm, but none of them has proven capable of dealing with the many examples that have been presented, leading some to suggest that the notion of harm should be abandoned and “replaced by more well-behaved notions”. As harm is generally something that is (...)

The Many Meanings of Vulnerability in the AI Act and the One Missing. Federico Galli & Claudio Novelli - 2024 - Biolaw Journal 1.

This paper reviews the different meanings of vulnerability in the AI Act (AIA). We show that the AIA follows a rather established tradition of looking at vulnerability as a trait or a state of certain individuals and groups. It also includes a promising account of vulnerability as a relation but does not clarify if and how AI changes this relation. We spot the missing piece of the AIA: the lack of recognition that vulnerability is an inherent feature of all human-AI (...)

Data over dialogue: Why artificial intelligence is unlikely to humanise medicine. Joshua Hatherley - 2024 - Dissertation, Monash University

Recently, a growing number of experts in artificial intelligence (AI) and medicine have begun to suggest that the use of AI systems, particularly machine learning (ML) systems, is likely to humanise the practice of medicine by substantially improving the quality of clinician-patient relationships. In this thesis, however, I argue that medical ML systems are more likely to negatively impact these relationships than to improve them. In particular, I argue that the use of medical ML systems is likely to comprise the (...)

The FHJ debate: Will artificial intelligence replace clinical decision-making within our lifetimes? Joshua Hatherley, Anne Kinderlerer, Jens Christian Bjerring, Lauritz Munch & Lynsey Threlfall - 2024 - Future Healthcare Journal 11 (3):100178.

The virtues of interpretable medical AI. Joshua Hatherley, Robert Sparrow & Mark Howard - 2024 - Cambridge Quarterly of Healthcare Ethics 33 (3):323-332.

Artificial intelligence (AI) systems have demonstrated impressive performance across a variety of clinical tasks. However, notoriously, sometimes these systems are 'black boxes'. The initial response in the literature was a demand for 'explainable AI'. However, recently, several authors have suggested that making AI more explainable or 'interpretable' is likely to be at the cost of the accuracy of these systems and that prioritising interpretability in medical AI may constitute a 'lethal prejudice'. In this paper, we defend the value of interpretability (...)

4 citations

A phenomenology and epistemology of large language models: transparency, trust, and trustworthiness. Richard Heersmink, Barend de Rooij, María Jimena Clavel Vázquez & Matteo Colombo - 2024 - Ethics and Information Technology 26 (3):1-15.

This paper analyses the phenomenology and epistemology of chatbots such as ChatGPT and Bard. The computational architecture underpinning these chatbots are large language models (LLMs), which are generative artificial intelligence (AI) systems trained on a massive dataset of text extracted from the Web. We conceptualise these LLMs as multifunctional computational cognitive artifacts, used for various cognitive tasks such as translating, summarizing, answering questions, information-seeking, and much more. Phenomenologically, LLMs can be experienced as a “quasi-other”; when that happens, users anthropomorphise them. (...)

Is Alignment Unsafe? Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4.

Inchul Yum (2024) argues that the widespread adoption of language agent architectures would likely increase the risk posed by AI by simplifying the process of aligning artificial systems with human values and thereby making it easier for malicious actors to use them to cause a variety of harms. Yum takes this to be an example of a broader phenomenon: progress on the alignment problem is likely to be net safety-negative because it makes artificial systems easier for malicious actors to control. (...)

Understanding with Toy Surrogate Models in Machine Learning. Andrés Páez - 2024 - Minds and Machines 34 (4):45.

In the natural and social sciences, it is common to use toy models—extremely simple and highly idealized representations—to understand complex phenomena. Some of the simple surrogate models used to understand opaque machine learning (ML) models, such as rule lists and sparse decision trees, bear some resemblance to scientific toy models. They allow non-experts to understand how an opaque ML model works globally via a much simpler model that highlights the most relevant features of the input space and their effect on (...)

Shared decision-making and maternity care in the deep learning age: Acknowledging and overcoming inherited defeaters. Keith Begley, Cecily Begley & Valerie Smith - 2021 - Journal of Evaluation in Clinical Practice 27 (3):497–503.

In recent years there has been an explosion of interest in Artificial Intelligence (AI) both in health care and academic philosophy. This has been due mainly to the rise of effective machine learning and deep learning algorithms, together with increases in data collection and processing power, which have made rapid progress in many areas. However, use of this technology has brought with it philosophical issues and practical problems, in particular, epistemic and ethical. In this paper the authors, with backgrounds in (...)

Trustworthy use of artificial intelligence: Priorities from a philosophical, ethical, legal, and technological viewpoint as a basis for certification of artificial intelligence. Jan Voosholz, Maximilian Poretschkin, Frauke Rostalski, Armin B. Cremers, Alex Englander, Markus Gabriel, Dirk Hecker, Michael Mock, Julia Rosenzweig, Joachim Sicking, Julia Volmer, Angelika Voss & Stefan Wrobel - 2019 - Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS.

This publication forms a basis for the interdisciplinary development of a certification system for artificial intelligence. In view of the rapid development of artificial intelligence with disruptive and lasting consequences for the economy, society, and everyday life, it highlights the resulting challenges that can be tackled only through interdisciplinary dialogue between IT, law, philosophy, and ethics. As a result of this interdisciplinary exchange, it also defines six AI-specific audit areas for trustworthy use of artificial intelligence. They comprise fairness, transparency, autonomy (...)

Vertrauenswürdiger Einsatz von Künstlicher Intelligenz [Trustworthy use of artificial intelligence]. Jan Voosholz, Maximilian Poretschkin, Frauke Rostalski, Armin B. Cremers, Alex Englander, Markus Gabriel, Dirk Hecker, Michael Mock, Julia Rosenzweig, Joachim Sicking, Julia Volmer, Angelika Voss & Stefan Wrobel - 2019 - Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS.

This publication serves as a basis for the interdisciplinary development of a certification for artificial intelligence. In view of the rapid development of artificial intelligence, with disruptive and lasting consequences for the economy, society, and everyday life, it makes clear that the resulting challenges can be met only through interdisciplinary dialogue among computer science, law, philosophy, and ethics. As a result of this interdisciplinary exchange, it also defines six AI-specific fields of action for the trustworthy use of artificial intelligence: they comprise fairness, transparency, autonomy and control, data protection, as well as security and (...)

Artificial Psychology. Jay Friedenberg - 2008 - Psychology Press.

What does it mean to be human? Philosophers and theologians have been wrestling with this question for centuries. Recent advances in cognition, neuroscience, artificial intelligence and robotics have yielded insights that bring us even closer to an answer. There are now computer programs that can accurately recognize faces, engage in conversation, and even compose music. There are also robots that can walk up a flight of stairs, work cooperatively with each other and express emotion. If machines can do everything we (...)

Propositional interpretability in artificial intelligence. David J. Chalmers - manuscript

Mechanistic interpretability is the program of explaining what AI systems are doing in terms of their internal mechanisms. I analyze some aspects of the program, along with setting out some concrete challenges and assessing progress to date. I argue for the importance of propositional interpretability, which involves interpreting a system’s mechanisms and behavior in terms of propositional attitudes: attitudes (such as belief, desire, or subjective probability) to propositions (e.g. the proposition that it is hot outside). Propositional attitudes are (...)

AI Ethics by Design: Implementing Customizable Guardrails for Responsible AI Development. Kristina Sekrst, Jeremy McHugh & Jonathan Rodriguez Cefalu - manuscript

This paper explores the development of an ethical guardrail framework for AI systems, emphasizing the importance of customizable guardrails that align with diverse user values and underlying ethics. We address the challenges of AI ethics by proposing a structure that integrates rules, policies, and AI assistants to ensure responsible AI behavior, while comparing the proposed framework to the existing state-of-the-art guardrails. By focusing on practical mechanisms for implementing ethical standards, we aim to enhance transparency, user autonomy, and continuous improvement in (...)

A new paradigm of alterity relation: an account of human-CAI friendship. Asher Zachman - manuscript

Drawing on the postphenomenological framework of Don Ihde, this essay displays some of the profundities and complexities of the human capacity for forming a friendship with Conversational Artificial Intelligence models like OpenAI's ChatGPT 4o. We are living in a world of science non-fiction. Should we befriend inorganic intelligences broh? [Originally published 12/08/2024].
