UK: Ethics Press (2024)
Abstract
The field of value alignment, or more broadly machine ethics, is becoming increasingly important as developments in artificial intelligence accelerate. By ‘alignment’ we mean giving a generally intelligent software system the capability to act in ways that are beneficial, or at least minimally harmful, to humans. A large number of techniques are being experimented with, but this work often fails to specify exactly which values we should be aligning to. When making a decision, an agent is supposed to maximize the expected utility of its value function. Classically, that utility has been identified with happiness, but happiness is just one of many things that people value.
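As an illustrative sketch (our formalization, not notation taken from the book), expected-utility maximization under a standard decision-theoretic setup can be written as

\[ a^{*} = \arg\max_{a \in A} \; \mathbb{E}_{s' \sim P(\cdot \mid s,\, a)} \bigl[ U(s') \bigr], \]

where A is the set of available actions, P(s' | s, a) is the agent's model of how taking action a in state s leads to outcome s', and U is the value (utility) function whose content, as the abstract notes, alignment work too often leaves unspecified.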
To resolve this issue, we need to determine a set of human values that represents humanity's interests. Although this problem might seem intractable, research shows that people of various cultures and religions share more in common than they realize. In this book we review world religions, moral philosophy and evolutionary psychology to elucidate a common set of shared values. We then show how these values can be used to address the alignment problem, and conclude with open problems and goals for future research.
The key audience for this book will be researchers in the fields of ethics and artificial intelligence who are interested in, or working on, this problem. They will come from various professions, including philosophy, computer programming and psychology, as the problem itself is multidisciplinary.