Machines learning values

In S. Matthew Liao (ed.), Ethics of Artificial Intelligence. New York, USA: Oxford University Press (2020)
Whether it would take one decade or several centuries, many agree that it is possible to create a *superintelligence*---an artificial intelligence with a godlike ability to achieve its goals. And many who have reflected carefully on this fact agree that our best hope for a "friendly" superintelligence is to design it to *learn* values like ours, since our values are too complex to program or hardwire explicitly. But the value learning approach to AI safety faces three particularly philosophical puzzles: First, it is unclear how any intelligent system could learn its final values, since to judge one supposedly "final" value against another seems to require a further background standard for judging. Second, it is unclear how to determine the content of a system's values based on its physical or computational structure. Finally, there is the distinctly ethical question of which values we should best aim for the system to learn. I outline a potential answer to these interrelated puzzles, centering on a "miktotelic" proposal for blending a complex, learnable final value out of many simpler ones.
Archival date: 2020-06-19