Instrumental Divergence

Philosophical Studies:1-27 (2024)

Abstract

The thesis of instrumental convergence holds that a wide range of ends have common means: for instance, self-preservation, desire preservation, self-improvement, and resource acquisition. Bostrom contends that instrumental convergence gives us reason to think that "the default outcome of the creation of machine superintelligence is existential catastrophe". I use the tools of decision theory to investigate whether this thesis is true. I find that, even if intrinsic desires are randomly selected, instrumental rationality induces biases towards certain kinds of choices: first, a bias towards choices which leave less up to chance; second, a bias towards desire preservation, in line with Bostrom's conjecture; and third, a bias towards choices which afford more choices later on. I do not find biases towards any of the other convergent instrumental means on Bostrom's list. I conclude that the biases induced by instrumental rationality at best weakly support Bostrom's conclusion that machine superintelligence is likely to lead to existential catastrophe.

Author's Profile

J. Dmitri Gallow
University of Southern California

Analytics

Added to PP
2024-02-14
