Machines learning values

In S. Matthew Liao (ed.), Ethics of Artificial Intelligence. New York, USA: Oxford University Press (2020)


Whether it would take one decade or several centuries, many agree that it is possible to create a *superintelligence*---an artificial intelligence with a godlike ability to achieve its goals. And many who have reflected carefully on this fact agree that our best hope for a "friendly" superintelligence is to design it to *learn* values like ours, since our values are too complex to program or hardwire explicitly. But the value learning approach to AI safety faces three distinctly philosophical puzzles. First, it is unclear how any intelligent system could learn its final values, since judging one supposedly "final" value against another seems to require a further background standard for judging. Second, it is unclear how to determine the content of a system's values based on its physical or computational structure. Finally, there is the distinctly ethical question of which values we should best aim for the system to learn. I outline a potential answer to these interrelated puzzles, centering on a "miktotelic" proposal for blending a complex, learnable final value out of many simpler ones.

Steve Petersen
Niagara University

