  1. Off-Switching Not Guaranteed. Sven Neth - forthcoming - Philosophical Studies:1-13.
    Hadfield-Menell et al. (2017) propose the Off-Switch Game, a model of Human-AI cooperation in which AI agents always defer to humans because they are uncertain about our preferences. I explain two reasons why AI agents might not defer. First, AI agents might not value learning. Second, even if AI agents value learning, they might not be certain to learn our actual preferences.
  2. Promotionalism, Orthogonality, and Instrumental Convergence. Nathaniel Sharadin - forthcoming - Philosophical Studies:1-31.
    Suppose there are no in-principle restrictions on the contents of arbitrarily intelligent agents’ goals. According to “instrumental convergence” arguments, potentially scary things follow. I do two things in this paper. First, focusing on the influential version of the instrumental convergence argument due to Nick Bostrom, I explain why such arguments require an account of “promotion,” i.e., an account of what it is to “promote” a goal. Then, I consider whether extant accounts of promotion in the literature -- in particular, probabilistic (...)