In Vincent C. Müller (ed.), Philosophy and Theory of Artificial Intelligence 2021. pp. 119-135 (2022)
Abstract
The “Value Alignment Problem” is the challenge of how to align the values of artificial intelligence with human values, whatever they may be, such that AI does not pose a risk to the existence of humans. A fundamental feature of how the problem is currently understood is that AI systems do not take the same things to be relevant as humans do, whether turning humans into paperclips in order to “make more paperclips” or eradicating the human race to “solve climate change”. Specifically, existing approaches to alignment appear to be concerned with how AI might *solve* problems in the relevant way. This paper presents and explores an approach to alignment rooted in the Enactive Theory of mind. It offers an alternative conception of alignment as the question “how do we make relevant to AI what is relevant to humans?” On this conception, alignment is concerned with building AI so that it *discerns* and defines the problem in the relevant way. The Alignment Problem is thereby shown to be the same problem as the Frame Problem. The paper concludes with a consideration of tradeoffs between these conceptions of the alignment problem.