Switch to: References

Add citations

You must login to add citations.
  1. Artificial Intelligence: Arguments for Catastrophic Risk.Adam Bales, William D'Alessandro & Cameron Domenico Kirk-Giannini - 2024 - Philosophy Compass 19 (2):e12964.
    Recent progress in artificial intelligence (AI) has drawn attention to the technology’s transformative potential, including what some see as its prospects for causing large-scale harm. We review two influential arguments purporting to show how AI could pose catastrophic risks. The first argument — the Problem of Power-Seeking — claims that, under certain assumptions, advanced AI systems are likely to engage in dangerous power-seeking behavior in pursuit of their goals. We review reasons for thinking that AI systems might seek power, that (...)
    Download  
     
    Export citation  
     
    Bookmark   9 citations  
  • What is AI safety? What do we want it to be?Jacqueline Harding & Cameron Domenico Kirk-Giannini - manuscript
    The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this feature is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity and appeal, we argue that (...)
    Download  
     
    Export citation  
     
    Bookmark  
  • Risk and artificial general intelligence.Federico L. G. Faroldi - forthcoming - AI and Society:1-9.
    Artificial General Intelligence (AGI) is said to pose many risks, be they catastrophic, existential and otherwise. This paper discusses whether the notion of risk can apply to AGI, both descriptively and in the current regulatory framework. The paper argues that current definitions of risk are ill-suited to capture supposed AGI existential risks, and that the risk-based framework of the EU AI Act is inadequate to deal with truly general, agential systems.
    Download  
     
    Export citation  
     
    Bookmark   2 citations  
  • Language Agents and Malevolent Design.Inchul Yum - 2024 - Philosophy and Technology 37 (104):1-19.
    Language agents are AI systems capable of understanding and responding to natural language, potentially facilitating the process of encoding human goals into AI systems. However, this paper argues that if language agents can achieve easy alignment, they also increase the risk of malevolent agents building harmful AI systems aligned with destructive intentions. The paper contends that if training AI becomes sufficiently easy or is perceived as such, it enables malicious actors, including rogue states, terrorists, and criminal organizations, to create powerful (...)
    Download  
     
    Export citation  
     
    Bookmark   1 citation  
  • Deception and manipulation in generative AI.Christian Tarsney - forthcoming - Philosophical Studies.
    Large language models now possess human-level linguistic abilities in many contexts. This raises the concern that they can be used to deceive and manipulate on unprecedented scales, for instance spreading political misinformation on social media. In future, agentic AI systems might also deceive and manipulate humans for their own purposes. In this paper, first, I argue that AI-generated content should be subject to stricter standards against deception and manipulation than we ordinarily apply to humans. Second, I offer new characterizations of (...)
    Download  
     
    Export citation  
     
    Bookmark  
  • Is Alignment Unsafe?Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4.
    Inchul Yum (2024) argues that the widespread adoption of language agent architectures would likely increase the risk posed by AI by simplifying the process of aligning artificial systems with human values and thereby making it easier for malicious actors to use them to cause a variety of harms. Yum takes this to be an example of a broader phenomenon: progress on the alignment problem is likely to be net safety-negative because it makes artificial systems easier for malicious actors to control. (...)
    Download  
     
    Export citation  
     
    Bookmark  
  • AI takeover and human disempowerment.Adam Bales - forthcoming - Philosophical Quarterly.
    Some take seriously the possibility of artificial intelligence (AI) takeover, where AI systems seize power in a way that leads to human disempowerment. Assessing the likelihood of takeover requires answering empirical questions about the future of AI technologies and the context in which AI will operate. In many cases, philosophers are poorly placed to answer these questions. However, some prior questions are more amenable to philosophical techniques. What does it mean to speak of AI empowerment and human disempowerment? And what (...)
    Download  
     
    Export citation  
     
    Bookmark  
  • Multiple unnatural attributes of AI undermine common anthropomorphically biased takeover speculations.Preston W. Estep - forthcoming - AI and Society:1-16.
    Accelerating advancements in artificial intelligence (AI) have increased concerns about serious risks, including potentially catastrophic risks to humanity. Prevailing trends of AI R&D are leading to increasing humanization of AI, to the emergence of concerning behaviors, and toward possible recursive self-improvement. There has been increasing speculation that these factors increase the risk of an AI takeover of human affairs, and possibly even human extinction. The most extreme of such speculations result at least partly from anthropomorphism, but since AIs are being (...)
    Download  
     
    Export citation  
     
    Bookmark