Citations of:

Language Agents Reduce the Risk of Existential Catastrophe

Simon Goldstein & Cameron Domenico Kirk-Giannini

AI and Society:1-11 (2023)

Artificial Intelligence: Arguments for Catastrophic Risk. Adam Bales, William D'Alessandro & Cameron Domenico Kirk-Giannini - 2024 - Philosophy Compass 19 (2):e12964. Recent progress in artificial intelligence (AI) has drawn attention to the technology’s transformative potential, including what some see as its prospects for causing large-scale harm. We review two influential arguments purporting to show how AI could pose catastrophic risks. The first argument, the Problem of Power-Seeking, claims that, under certain assumptions, advanced AI systems are likely to engage in dangerous power-seeking behavior in pursuit of their goals. We review reasons for thinking that AI systems might seek power, that (...)
What is AI safety? What do we want it to be? Jacqueline Harding & Cameron Domenico Kirk-Giannini - manuscript. The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this feature is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity and appeal, we argue that (...)
Risk and artificial general intelligence. Federico L. G. Faroldi - forthcoming - AI and Society:1-9. Artificial General Intelligence (AGI) is said to pose many risks, be they catastrophic, existential and otherwise. This paper discusses whether the notion of risk can apply to AGI, both descriptively and in the current regulatory framework. The paper argues that current definitions of risk are ill-suited to capture supposed AGI existential risks, and that the risk-based framework of the EU AI Act is inadequate to deal with truly general, agential systems.
Language Agents and Malevolent Design. Inchul Yum - 2024 - Philosophy and Technology 37 (104):1-19. Language agents are AI systems capable of understanding and responding to natural language, potentially facilitating the process of encoding human goals into AI systems. However, this paper argues that if language agents can achieve easy alignment, they also increase the risk of malevolent agents building harmful AI systems aligned with destructive intentions. The paper contends that if training AI becomes sufficiently easy or is perceived as such, it enables malicious actors, including rogue states, terrorists, and criminal organizations, to create powerful (...)
Deception and manipulation in generative AI. Christian Tarsney - forthcoming - Philosophical Studies. Large language models now possess human-level linguistic abilities in many contexts. This raises the concern that they can be used to deceive and manipulate on unprecedented scales, for instance spreading political misinformation on social media. In future, agentic AI systems might also deceive and manipulate humans for their own purposes. In this paper, first, I argue that AI-generated content should be subject to stricter standards against deception and manipulation than we ordinarily apply to humans. Second, I offer new characterizations of (...)
Is Alignment Unsafe? Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4. Inchul Yum (2024) argues that the widespread adoption of language agent architectures would likely increase the risk posed by AI by simplifying the process of aligning artificial systems with human values and thereby making it easier for malicious actors to use them to cause a variety of harms. Yum takes this to be an example of a broader phenomenon: progress on the alignment problem is likely to be net safety-negative because it makes artificial systems easier for malicious actors to control. (...)
AI takeover and human disempowerment. Adam Bales - forthcoming - Philosophical Quarterly. Some take seriously the possibility of artificial intelligence (AI) takeover, where AI systems seize power in a way that leads to human disempowerment. Assessing the likelihood of takeover requires answering empirical questions about the future of AI technologies and the context in which AI will operate. In many cases, philosophers are poorly placed to answer these questions. However, some prior questions are more amenable to philosophical techniques. What does it mean to speak of AI empowerment and human disempowerment? And what (...)
Multiple unnatural attributes of AI undermine common anthropomorphically biased takeover speculations. Preston W. Estep - forthcoming - AI and Society:1-16. Accelerating advancements in artificial intelligence (AI) have increased concerns about serious risks, including potentially catastrophic risks to humanity. Prevailing trends of AI R&D are leading to increasing humanization of AI, to the emergence of concerning behaviors, and toward possible recursive self-improvement. There has been increasing speculation that these factors increase the risk of an AI takeover of human affairs, and possibly even human extinction. The most extreme of such speculations result at least partly from anthropomorphism, but since AIs are being (...)
