Search results for `shutdownability`

Order:

Order

More results on PhilPapers

1018
The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists.Elliott Thornley - forthcoming - Philosophical Studies:1-28.details
I explain the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems show that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. And (...)
Download

Export citation

Bookmark 2 citations
771
The Shutdown Problem: Incomplete Preferences as a Solution.Elliott Thornley - manuscriptdetails
I explain and motivate the shutdown problem: the problem of creating artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I then propose a solution: train agents to have incomplete preferences. Specifically, I propose that we train agents to lack a preference between every pair of different-length trajectories. I suggest a way to train such agents using reinforcement learning: (...)
Download

Export citation

Bookmark 1 citation
181
Towards Shutdownable Agents via Stochastic Choice.Elliott Thornley, Alexander Roman, Christos Ziakas, Leyton Ho & Louis Thomson - 2024 - Global Priorities Institute Working Paper.details
Some worry that advanced artificial agents may resist being shut down. The Incomplete Preferences Proposal (IPP) is an idea for ensuring that doesn't happen. A key part of the IPP is using a novel 'Discounted REward for Same-Length Trajectories (DREST)' reward function to train agents to (1) pursue goals effectively conditional on each trajectory-length (be 'USEFUL'), and (2) choose stochastically between different trajectory-lengths (be 'NEUTRAL' about trajectory-lengths). In this paper, we propose evaluation metrics for USEFULNESS and NEUTRALITY. We use a (...)
Download

Export citation

Bookmark
41
Energy & throughput tradeoff in WSN with network coding.Nastooh Taheri Javan - 2013 - 2013 International Conference on Ict Convergence (Ictc) 1 (1):304-309.details
Recently, network coding emerged as a promising technology that can provide significant improvements in throughput and energy efficiency of wireless networks. Many implementations of network coding in wireless networks, such as COPE, encourage nodes to overhear to improve the coding opportunities so that they can create better opportunities for coding at the transmitter node through overhearing more packages. In this paper, we have shown that all overheard packets are not necessarily useful for coding; thus, a node can go to sleep (...)
Download

Export citation

Bookmark

Off-campus access

Using PhilArchive from home?

Create an account to enable off-campus access through your institution's proxy server or OpenAthens.

Monitor this page

Be alerted of all new items appearing on this page. Choose how you want to monitor it:

RSS feed

About us

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

	show categories
	categorization shortcuts
	hide abstracts
	open articles in new windows

	show categories
	categorization shortcuts
	hide abstracts
	open articles in new windows

Applied ethics	Epistemology	History of Western Philosophy	Meta-ethics	Metaphysics	Normative ethics
Philosophy of biology	Philosophy of language	Philosophy of mind	Philosophy of religion	Science Logic and Mathematics	More ...

Results for 'shutdownability'