Machines learning values

In S. Matthew Liao (ed.), Ethics of Artificial Intelligence. New York, USA: Oxford University Press (2020)
Whether it would take one decade or several centuries, many agree that it is possible to create a *superintelligence*---an artificial intelligence with a godlike ability to achieve its goals. And many who have reflected carefully on this fact agree that our best hope for a "friendly" superintelligence is to design it to *learn* values like ours, since our values are too complex to program or hardwire explicitly. But the value learning approach to AI safety faces three particularly philosophical puzzles: First, it is unclear how any intelligent system could learn its final values, since to judge one supposedly "final" value against another seems to require a further background standard for judging. Second, it is unclear how to determine the content of a system's values based on its physical or computational structure. Finally, there is the distinctly ethical question of which values we should best aim for the system to learn. I outline a potential answer to these interrelated puzzles, centering on a "miktotelic" proposal for blending a complex, learnable final value out of many simpler ones.
Archival date: 2020-06-19