Machines learning values

In S. Matthew Liao (ed.), Ethics of Artificial Intelligence. Oxford University Press (2020)

Abstract

Whether it would take one decade or several centuries, many agree that it is possible to create a *superintelligence*---an artificial intelligence with a godlike ability to achieve its goals. And many who have reflected carefully on this fact agree that our best hope for a "friendly" superintelligence is to design it to *learn* values like ours, since our values are too complex to program or hardwire explicitly. But the value learning approach to AI safety faces three particularly philosophical puzzles: first, it is unclear how any intelligent system could learn its final values, since to judge one supposedly "final" value against another seems to require a further background standard for judging. Second, it is unclear how to determine the content of a system's values based on its physical or computational structure. Finally, there is the distinctly ethical question of which values we should best aim for the system to learn. I outline a potential answer to these interrelated puzzles, centering on a "miktotelic" proposal for blending a complex, learnable final value out of many simpler ones.

Author

Steve Petersen
Niagara University
