Machines learning values

In S. Matthew Liao (ed.), Ethics of Artificial Intelligence. New York, USA: Oxford University Press (2020)
Whether it would take one decade or several centuries, many agree that it is possible to create a *superintelligence*---an artificial intelligence with a godlike ability to achieve its goals. And many who have reflected carefully on this fact agree that our best hope for a "friendly" superintelligence is to design it to *learn* values like ours, since our values are too complex to program or hardwire explicitly. But the value learning approach to AI safety faces three particularly philosophical puzzles: First, it is unclear how any intelligent system could learn its final values, since to judge one supposedly "final" value against another seems to require a further background standard for judging. Second, it is unclear how to determine the content of a system's values based on its physical or computational structure. Finally, there is the distinctly ethical question of which values we should best aim for the system to learn. I outline a potential answer to these interrelated puzzles, centering on a "miktotelic" proposal for blending a complex, learnable final value out of many simpler ones.
Archival date: 2020-06-19