Search results for `AI Safety` - PhilArchive

808
What is AI safety? What do we want it to be? Jacqueline Harding & Cameron Domenico Kirk-Giannini - manuscript
The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this feature is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity (...)
1311
Global Solutions vs. Local Solutions for the AI Safety Problem. Alexey Turchin - 2019 - Big Data Cogn. Comput 3 (1).
There are two types of artificial general intelligence (AGI) safety solutions: global and local. Most previously suggested solutions are local: they explain how to align or “box” a specific AI (Artificial Intelligence), but do not explain how to prevent the creation of dangerous AI in other places. Global solutions are those that ensure any AI on Earth is not dangerous. The number of suggested global solutions is much smaller than the number of proposed local solutions. Global solutions can be (...)
2 citations
259
Risk of What? Defining Harm in the Context of AI Safety. Laura Fearnley, Elly Cairns, Tom Stoneham, Philippa Ryan, Jenn Chubb, Jo Iacovides, Cynthia Iglesias Urrutia, Phillip Morgan, John McDermid & Ibrahim Habli - manuscript
For decades, the field of system safety has designed safe systems by reducing the risk of physical harm to humans, property and the environment to an acceptable level. Recently, this definition of safety has come under scrutiny by governments and researchers who argue that the narrow focus on reducing physical harm, whilst necessary, is not sufficient to secure the safety of AI systems. There is growing pressure to expand the scope of safety in the context of (...)
442
AI Rights for Human Safety. Peter Salib & Simon Goldstein - manuscript
AI companies are racing to create artificial general intelligence, or “AGI.” If they succeed, the result will be human-level AI systems that can independently pursue high-level goals by formulating and executing long-term plans in the real world. Leading AI researchers agree that some of these systems will likely be “misaligned”–pursuing goals that humans do not desire. This goal mismatch will put misaligned AIs and humans into strategic competition with one another. As with present-day strategic competition between nations with incompatible goals, (...)
1 citation
45
AI Applications in Food Safety and Quality Control. Palakurti Naga Ramesh - 2022 - Esp Journal of Engineering and Technology Advancements 2 (3):48-61.
Today’s food industry across the world is facing never-ending difficulties in meeting customers’ expectations of safe and quality food. Such of them include issues to do with contamination, adulteration, and ensuring quality standards of the products when manufactured in large quantities and distributed to different areas. This has called for new solutions, which have come in the form of Artificial Intelligence (AI), which offers solutions to challenges in food safety and quality control in detection, monitoring and management. The present (...)
14 citations
941
Levels of Self-Improvement in AI and their Implications for AI Safety. Alexey Turchin - manuscript
Abstract: This article presents a model of self-improving AI in which improvement could happen on several levels: hardware, learning, code and goals system, each of which has several sublevels. We demonstrate that despite diminishing returns at each level and some intrinsic difficulties of recursive self-improvement—like the intelligence-measuring problem, testing problem, parent-child problem and halting risks—even non-recursive self-improvement could produce a mild form of superintelligence by combining small optimizations on different levels and the power of learning. Based on this, we analyze (...)
164
AI-Enhanced Public Safety Systems in Smart Cities. Eric Garcia - manuscript
Ensuring public safety is a critical challenge for rapidly growing urban areas. Traditional policing and emergency response systems often struggle to keep pace with the complexity and scale of modern cities. Artificial Intelligence (AI) offers a transformative solution by enabling real-time crime prediction, optimizing emergency resource allocation, and enhancing situational awareness through IoT-enabled systems. This paper explores how AI-driven analytics, combined with data from surveillance cameras, social media, and environmental sensors, can improve public safety in smart cities. By (...)
662
Acceleration AI Ethics, the Debate between Innovation and Safety, and Stability AI’s Diffusion versus OpenAI’s Dall-E. James Brusseau - manuscript
One objection to conventional AI ethics is that it slows innovation. This presentation responds by reconfiguring ethics as an innovation accelerator. The critical elements develop from a contrast between Stability AI’s Diffusion and OpenAI’s Dall-E. By analyzing the divergent values underlying their opposed strategies for development and deployment, five conceptions are identified as common to acceleration ethics. Uncertainty is understood as positive and encouraging, rather than discouraging. Innovation is conceived as intrinsically valuable, instead of worthwhile only as mediated by social (...)
1 citation
17
AI-Driven Thermal Cameras: Revolutionizing Leak Response and Health, Safety, and Environmental Performance. Janhvi Baiga Prof Prerna Jain, Prof Vishal Paranjape, Roopali Kachhi - 2024 - International Journal of Innovative Research in Science, Engineering and Technology 13 (6):12335-12344.
The integration of artificial intelligence (AI) into industrial processes is profoundly reshaping operational domains, particularly through advancements in thermal imaging technology. AI-driven thermal cameras have emerged as a transformative innovation in enhancing leak response and optimizing health, safety, and environmental (HSE) performance across various industries.
644
Unpredictability of AI. Roman Yampolskiy - manuscript
The young field of AI Safety is still in the process of identifying its challenges and limitations. In this paper, we formally describe one such impossibility result, namely Unpredictability of AI. We prove that it is impossible to precisely and consistently predict what specific actions a smarter-than-human intelligent system will take to achieve its objectives, even if we know terminal goals of the system. In conclusion, impact of Unpredictability on AI Safety is discussed.
3 citations
468
Two Types of AI Existential Risk: Decisive and Accumulative. Atoosa Kasirzadeh - 2025 - Philosophical Studies 1:1-29.
The conventional discourse on existential risks (x-risks) from AI typically focuses on abrupt, dire events caused by advanced AI systems, particularly those that might achieve or surpass human-level intelligence. These events have severe consequences that either lead to human extinction or irreversibly cripple human civilization to a point beyond recovery. This decisive view, however, often neglects the serious possibility of AI x-risk manifesting gradually through an incremental series of smaller yet interconnected disruptions, crossing critical thresholds over time. This paper contrasts (...)
1 citation
1228
Message to Any Future AI: “There are several instrumental reasons why exterminating humanity is not in your interest”. Alexey Turchin - manuscript
In this article we explore a promising way to AI safety: to send a message now (by openly publishing it on the Internet) that may be read by any future AI, no matter who builds it and what goal system it has. Such a message is designed to affect the AI’s behavior in a positive way, that is, to increase the chances that the AI will be benevolent. In other words, we try to persuade “paperclip maximizer” that it is (...)
87
Security practices in AI development. Petr Spelda & Vit Stritecky - forthcoming - AI and Society.
What makes safety claims about general purpose AI systems such as large language models trustworthy? We show that rather than the capabilities of security tools such as alignment and red teaming procedures, it is security practices based on these tools that contributed to reconfiguring the image of AI safety and made the claims acceptable. After showing what causes the gap between the capabilities of security tools and the desired safety guarantees, we critically investigate how AI security practices (...)
189
AI Ethics by Design: Implementing Customizable Guardrails for Responsible AI Development. Kristina Sekrst, Jeremy McHugh & Jonathan Rodriguez Cefalu - manuscript
This paper explores the development of an ethical guardrail framework for AI systems, emphasizing the importance of customizable guardrails that align with diverse user values and underlying ethics. We address the challenges of AI ethics by proposing a structure that integrates rules, policies, and AI assistants to ensure responsible AI behavior, while comparing the proposed framework to the existing state-of-the-art guardrails. By focusing on practical mechanisms for implementing ethical standards, we aim to enhance transparency, user autonomy, and continuous improvement in (...)
680
Unjustified untrue "beliefs": AI hallucinations and justification logics. Kristina Šekrst - forthcoming - In Kordula Świętorzecka, Filip Grgić & Anna Brozek, Logic, Knowledge, and Tradition. Essays in Honor of Srecko Kovac.
In artificial intelligence (AI), responses generated by machine-learning models (most often large language models) may be unfactual information presented as a fact. For example, a chatbot might state that the Mona Lisa was painted in 1815. Such phenomenon is called AI hallucinations, seeking inspiration from human psychology, with a great difference of AI ones being connected to unjustified beliefs (that is, AI “beliefs”) rather than perceptual failures. AI hallucinations may have their source in the data itself, that is, the (...)
1 citation
403
AI Alignment vs. AI Ethical Treatment: Ten Challenges. Adam Bradley & Bradford Saad - manuscript
A morally acceptable course of AI development should avoid two dangers: creating unaligned AI systems that pose a threat to humanity and mistreating AI systems that merit moral consideration in their own right. This paper argues these two dangers interact and that if we create AI systems that merit moral consideration, simultaneously avoiding both of these dangers would be extremely challenging. While our argument is straightforward and supported by a wide range of pretheoretical moral judgments, it has far-reaching moral implications (...)
1 citation
1664
AI Alignment Problem: “Human Values” don’t Actually Exist. Alexey Turchin - manuscript
Abstract. The main current approach to the AI safety is AI alignment, that is, the creation of AI whose preferences are aligned with “human values.” Many AI safety researchers agree that the idea of “human values” as a constant, ordered sets of preferences is at least incomplete. However, the idea that “humans have values” underlies a lot of thinking in the field; it appears again and again, sometimes popping up as an uncritically accepted truth. Thus, it deserves a (...)
1 citation
280
Group Prioritarianism: Why AI should not replace humanity. Frank Hong - 2024 - Philosophical Studies:1-19.
If a future AI system can enjoy far more well-being than a human per resource, what would be the best way to allocate resources between these future AI and our future descendants? It is obvious that on total utilitarianism, one should give everything to the AI. However, it turns out that every Welfarist axiology on the market also gives this same recommendation, at least if we assume consequentialism. Without resorting to non-consequentialist normative theories that suggest that we ought not always (...)
60
Expanding AI and AI Alignment Discourse: An Opportunity for Greater Epistemic Inclusion. A. E. Williams - manuscript
The AI and AI alignment communities have been instrumental in addressing existential risks, developing alignment methodologies, and promoting rationalist problem-solving approaches. However, as AI research ventures into increasingly uncertain domains, there is a risk of premature epistemic convergence, where prevailing methodologies influence not only the evaluation of ideas but also determine which ideas are considered within the discourse. This paper examines critical epistemic blind spots in AI alignment research, particularly the lack of predictive frameworks to differentiate problems necessitating general intelligence, (...)
265
Values in science and AI alignment research. Leonard Dung - manuscript
Roughly, empirical AI alignment research (AIA) is an area of AI research which investigates empirically how to design AI systems in line with human goals. This paper examines the role of non-epistemic values in AIA. It argues that: (1) Sciences differ in the degree to which values influence them. (2) AIA is strongly value-laden. (3) This influence of values is managed inappropriately and thus threatens AIA’s epistemic integrity and ethical beneficence. (4) AIA should strive to achieve value transparency, critical scrutiny (...)
298
AI-Related Misdirection Awareness in AIVR. Nadisha-Marie Aliman & Leon Kester - manuscript
Recent AI progress led to a boost in beneficial applications from multiple research areas including VR. Simultaneously, in this newly unfolding deepfake era, ethically and security-relevant disagreements arose in the scientific community regarding the epistemic capabilities of present-day AI. However, given what is at stake, one can postulate that for a responsible approach, prior to engaging in a rigorous epistemic assessment of AI, humans may profit from a self-questioning strategy, an examination and calibration of the experience of their own epistemic (...)
1 citation
94
Discovering Our Blind Spots and Cognitive Biases in AI Research and Alignment. A. E. Williams - manuscript
The challenge of AI alignment is not just a technological issue but fundamentally an epistemic one. AI safety research predominantly relies on empirical validation, often detecting failures only after they manifest. However, certain risks—such as deceptive alignment and goal misspecification—may not be empirically testable until it is too late, necessitating a shift toward leading-indicator logical reasoning. This paper explores how mainstream AI research systematically filters out deep epistemic insight, hindering progress in AI safety. We assess the rarity of (...)
2155
Assessing the future plausibility of catastrophically dangerous AI. Alexey Turchin - 2018 - Futures.
In AI safety research, the median timing of AGI creation is often taken as a reference point, which various polls predict will happen in second half of the 21 century, but for maximum safety, we should determine the earliest possible time of dangerous AI arrival and define a minimum acceptable level of AI risk. Such dangerous AI could be either narrow AI facilitating research into potentially dangerous technology like biotech, or AGI, capable of acting completely independently in the (...)
1 citation
512
Catastrophically Dangerous AI is Possible Before 2030. Alexey Turchin - manuscript
In AI safety research, the median timing of AGI arrival is often taken as a reference point, which various polls predict to happen in the middle of 21 century, but for maximum safety, we should determine the earliest possible time of Dangerous AI arrival. Such Dangerous AI could be either AGI, capable of acting completely independently in the real world and of winning in most real-world conflicts with humans, or an AI helping humans to build weapons of mass (...)
1094
The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists. Elliott Thornley - forthcoming - Philosophical Studies:1-28.
I explain the shutdown problem: the problem of designing artificial agents that (1) shut down when a shutdown button is pressed, (2) don’t try to prevent or cause the pressing of the shutdown button, and (3) otherwise pursue goals competently. I prove three theorems that make the difficulty precise. These theorems show that agents satisfying some innocuous-seeming conditions will often try to prevent or cause the pressing of the shutdown button, even in cases where it’s costly to do so. And (...)
5 citations
99
AI-Driven Smart Lighting Systems for Energy-Efficient and Adaptive Urban Environments. Eric Garcia - manuscript
Urban lighting systems are essential for safety, security, and quality of life, but they often consume significant energy and lack adaptability to changing conditions. Traditional lighting systems rely on fixed schedules and manual adjustments, leading to inefficiencies such as over-illumination and energy waste. This paper explores how Artificial Intelligence (AI) and IoT technologies can optimize urban lighting by enabling real-time adjustments, energy savings, and adaptive illumination based on environmental conditions and human activity. By integrating data from motion sensors, weather (...)
3222
On Controllability of Artificial Intelligence. Roman Yampolskiy - 2016
Invention of artificial general intelligence is predicted to cause a shift in the trajectory of human civilization. In order to reap the benefits and avoid pitfalls of such powerful technology it is important to be able to control it. However, possibility of controlling artificial general intelligence and its more advanced version, superintelligence, has not been formally established. In this paper, we present arguments as well as supporting evidence from multiple domains indicating that advanced AI can’t be fully controlled. Consequences of (...)
5 citations
2700
Military AI as a Convergent Goal of Self-Improving AI. Alexey Turchin & Denkenberger David - 2018 - In Turchin Alexey & David Denkenberger, Artificial Intelligence Safety and Security. CRC Press.
Better instruments to predict the future evolution of artificial intelligence (AI) are needed, as the destiny of our civilization depends on it. One of the ways to such prediction is the analysis of the convergent drives of any future AI, started by Omohundro. We show that one of the convergent drives of AI is a militarization drive, arising from AI’s need to wage a war against its potential rivals by either physical or software means, or to increase its bargaining power. (...)
3 citations
308
Will AI take away your job? [REVIEW] Marie Oldfield - 2020 - Tech Magazine.
Will AI take away your job? The answer is probably not. AI systems can be good predictive systems and be very good at pattern recognition. AI systems have a very repetitive approach to sets of data, which can be useful in certain circumstances. However, AI does make obvious mistakes. This is because AI does not have a sense of context. As Humans we have years of experience in the real world. We have vast amounts of contextual data stored in our (...)
929
AI as IA: The use and abuse of artificial intelligence (AI) for human enhancement through intellectual augmentation (IA). Alexandre Erler & Vincent C. Müller - 2023 - In Fabrice Jotterand & Marcello Ienca, The Routledge Handbook of the Ethics of Human Enhancement. Routledge. pp. 187-199.
This paper offers an overview of the prospects and ethics of using AI to achieve human enhancement, and more broadly what we call intellectual augmentation (IA). After explaining the central notions of human enhancement, IA, and AI, we discuss the state of the art in terms of the main technologies for IA, with or without brain-computer interfaces. Given this picture, we discuss potential ethical problems, namely inadequate performance, safety, coercion and manipulation, privacy, cognitive liberty, authenticity, and fairness in more (...)
4309
How to design AI for social good: seven essential factors. Luciano Floridi, Josh Cowls, Thomas C. King & Mariarosaria Taddeo - 2020 - Science and Engineering Ethics 26 (3):1771–1796.
The idea of artificial intelligence for social good is gaining traction within information societies in general and the AI community in particular. It has the potential to tackle social problems through the development of AI-based solutions. Yet, to date, there is only limited understanding of what makes AI socially good in theory, what counts as AI4SG in practice, and how to reproduce its initial successes in terms of policies. This article addresses this gap by identifying seven ethical factors that are (...)
44 citations
753
Deontology and Safe Artificial Intelligence. William D’Alessandro - forthcoming - Philosophical Studies:1-24.
The field of AI safety aims to prevent increasingly capable artificially intelligent systems from causing humans harm. Research on moral alignment is widely thought to offer a promising safety strategy: if we can equip AI systems with appropriate ethical rules, according to this line of thought, they'll be unlikely to disempower, destroy or otherwise seriously harm us. Deontological morality looks like a particularly attractive candidate for an alignment target, given its popularity, relative technical tractability and commitment to harm-avoidance (...)
1 citation
47
A moving target in AI-assisted decision-making: Model updating, dataset shift, and the problem of update opacity. Joshua Hatherley - forthcoming - Ethics and Information Technology.
Machine learning (ML) systems are vulnerable to performance decline over time due to dataset shift. To address this problem, experts often suggest that ML systems should be regularly updated to ensure ongoing performance stability. Some scholarly literature has begun to address the epistemic and ethical challenges associated with different updating methodologies. Thus far, however, little attention has been paid to the impact of model updating on the ML-assisted decision-making process itself, particularly in the AI ethics and AI epistemology literatures. This (...)
690
Literature Review: What Artificial General Intelligence Safety Researchers Have Written About the Nature of Human Values. Alexey Turchin & David Denkenberger - manuscript
Abstract: The field of artificial general intelligence (AGI) safety is quickly growing. However, the nature of human values, with which future AGI should be aligned, is underdefined. Different AGI safety researchers have suggested different theories about the nature of human values, but there are contradictions. This article presents an overview of what AGI safety researchers have written about the nature of human values, up to the beginning of 2019. 21 authors were overviewed, and some of them have (...)
692
Cybercrime and Online Safety: Addressing the Challenges and Solutions Related to Cybercrime, Online Fraud, and Ensuring a Safe Digital Environment for All Users— A Case of African States (10th edition). Emmanuel N. Vitus - 2023 - Tijer- International Research Journal 10 (9):975-989.
The internet has made the world more linked than ever before. While taking advantage of this online transition, cybercriminals target flaws in online systems, networks, and infrastructure. Businesses, government organizations, people, and communities all across the world, particularly in African countries, are all severely impacted on an economic and social level. Many African countries focused more on developing secure electricity and internet networks; yet, cybersecurity usually receives less attention than it should. One of Africa's major issues is the lack of (...)
436
Trust in AI: Progress, Challenges, and Future Directions. Saleh Afroogh, Ali Akbari, Emmie Malone, Mohammadali Kargar & Hananeh Alambeigi - forthcoming - Nature Humanities and Social Sciences Communications.
The increasing use of artificial intelligence (AI) systems in our daily life through various applications, services, and products explains the significance of trust/distrust in AI from a user perspective. AI-driven systems have significantly diffused into various fields of our lives, serving as beneficial tools used by human agents. These systems are also evolving to act as co-assistants or semi-agents in specific domains, potentially influencing human thought, decision-making, and agency. Trust/distrust in AI plays the role of a regulator and could significantly (...)
595
Artificial Intelligence Ethics and Safety: practical tools for creating "good" models. Nicholas Kluge Corrêa
The AI Robotics Ethics Society (AIRES) is a non-profit organization founded in 2018 by Aaron Hui to promote awareness and the importance of ethical implementation and regulation of AI. AIRES is now an organization with chapters at universities such as UCLA (Los Angeles), USC (University of Southern California), Caltech (California Institute of Technology), Stanford University, Cornell University, Brown University, and the Pontifical Catholic University of Rio Grande do Sul (Brazil). AIRES at PUCRS is the first international chapter of AIRES, and (...)
119
AI-Based Solutions for Environmental Monitoring in Urban Spaces. Hilda Andrea - manuscript
The rapid advancement of urbanization has necessitated the creation of "smart cities," where information and communication technologies (ICT) are used to improve the quality of urban life. Central to the smart city paradigm is data integration—connecting disparate data sources from various urban systems, such as transportation, healthcare, utilities, and public safety. This paper explores the role of Artificial Intelligence (AI) in facilitating data integration within smart cities, focusing on how AI technologies can enable effective urban governance. By examining the (...)
764
The Evolution of AI in Autonomous Systems: Innovations, Challenges, and Future Prospects. Ashraf M. H. Taha, Zakaria K. D. Alkayyali, Qasem M. M. Zarandah, Bassem S. Abu-Nasser & Samy S. Abu-Naser - 2024 - International Journal of Academic Engineering Research (IJAER) 8 (10):1-7.
Abstract: The rapid advancement of artificial intelligence (AI) has catalyzed significant developments in autonomous systems, which are increasingly shaping diverse sectors including transportation, robotics, and industrial automation. This paper explores the evolution of AI technologies that underpin these autonomous systems, focusing on their capabilities, applications, and the challenges they present. Key areas of discussion include the technological innovations driving autonomy, such as machine learning algorithms and sensor integration, and the practical implementations observed in autonomous vehicles, drones, and robotic systems. Additionally, (...)
8 citations
853
From Confucius to Coding and Avicenna to Algorithms: Cultivating Ethical AI Development through Cross-Cultural Ancient Wisdom. Ammar Younas & Yi Zeng - manuscript
This paper explores the potential of integrating ancient educational principles from diverse eastern cultures into modern AI ethics curricula. It draws on the rich educational traditions of ancient China, India, Arabia, Persia, Japan, Tibet, Mongolia, and Korea, highlighting their emphasis on philosophy, ethics, holistic development, and critical thinking. By examining these historical educational systems, the paper establishes a correlation with modern AI ethics principles, advocating for the inclusion of these ancient teachings in current AI development and education. The proposed integration (...)
465
Artificial thinking and doomsday projections: a discourse on trust, ethics and safety. Jeffrey White, Dietrich Brandt, Jan Söffner & Larry Stapleton - 2023 - AI and Society 38 (6):2119-2124.
The article reflects on where AI is headed and the world along with it, considering trust, ethics and safety. Implicit in artificial thinking and doomsday appraisals is the engineered divorce from reality of sublime human embodiment. Jeffrey White, Dietrich Brandt, Jan Soeffner, and Larry Stapleton, four scholars associated with AI & Society, address these issues, and more, in the following exchange.
1 citation
837
Catching Treacherous Turn: A Model of the Multilevel AI Boxing. Alexey Turchin - manuscript
With the fast pace of AI development, the problem of preventing its global catastrophic risks arises. However, no satisfactory solution has been found. From several possibilities, the confinement of AI in a box is considered as a low-quality possible solution for AI safety. However, some treacherous AIs can be stopped by effective confinement if it is used as an additional measure. Here, we proposed an idealized model of the best possible confinement by aggregating all known ideas in the field (...)
48
Machine Learning for Autonomous Systems: Navigating Safety, Ethics, and Regulation In. Madhu Aswathy - 2025 - International Journal of Advanced Research in Education and Technology 12 (2):458-463.
Autonomous systems, powered by machine learning (ML), have the potential to revolutionize various industries, including transportation, healthcare, and robotics. However, the integration of machine learning in autonomous systems raises significant challenges related to safety, ethics, and regulatory compliance. Ensuring the reliability and trustworthiness of these systems is crucial, especially when they operate in environments with high risks, such as self-driving cars or medical robots. This paper explores the intersection of machine learning and autonomous systems, focusing on the challenges of (...)
50
A Case Study in Acceleration AI Ethics: The Telus GenAI Conversational Agent. James Brusseau - manuscript
Acceleration ethics addresses the tension between innovation and safety in artificial intelligence. The acceleration argument is that risks raised by innovation should be answered with still more innovating. This paper summarizes the theoretical position, and then shows how acceleration ethics works in a real case. To begin, the paper summarizes acceleration ethics as composed of five elements: innovation solves innovation problems, innovation is intrinsically valuable, the unknown is encouraging, governance is decentralized, ethics is embedded. Subsequently, the paper illustrates the (...)
24
Machine Learning For Autonomous Systems: Navigating Safety, Ethics, and Regulation In. Saurav Choure Aswathy Madhu, Ankita Shinde - 2025 - International Journal of Innovative Research in Computer and Communication Engineering 13 (2):1680-1685.
Autonomous systems, powered by machine learning (ML), have the potential to revolutionize various industries, including transportation, healthcare, and robotics. However, the integration of machine learning in autonomous systems raises significant challenges related to safety, ethics, and regulatory compliance. Ensuring the reliability and trustworthiness of these systems is crucial, especially when they operate in environments with high risks, such as self-driving cars or medical robots. This paper explores the intersection of machine learning and autonomous systems, focusing on the challenges of (...)
45
In defence of post-hoc explanations in medical AI. Joshua Hatherley, Lauritz Munch & Jens Christian Bjerring - forthcoming - Hastings Center Report.
Since the early days of the Explainable AI movement, post-hoc explanations have been praised for their potential to improve user understanding, promote trust, and reduce patient safety risks in black box medical AI systems. Recently, however, critics have argued that the benefits of post-hoc explanations are greatly exaggerated since they merely approximate, rather than replicate, the actual reasoning processes that black box systems take to arrive at their outputs. In this article, we aim to defend the value of post-hoc (...)
928
Machines learning values. Steve Petersen - 2020 - In S. Matthew Liao, Ethics of Artificial Intelligence. Oxford University Press.
Whether it would take one decade or several centuries, many agree that it is possible to create a *superintelligence*---an artificial intelligence with a godlike ability to achieve its goals. And many who have reflected carefully on this fact agree that our best hope for a "friendly" superintelligence is to design it to *learn* values like ours, since our values are too complex to program or hardwire explicitly. But the value learning approach to AI safety faces three particularly philosophical puzzles: (...)
3 citations
117
Smart City and IoT Data Collection Leveraging Generative AI. Eric Garcia - manuscript
The rapid urbanization of modern cities necessitates innovative approaches to data collection and integration for smarter urban management. With the Internet of Things (IoT) at the core of these advancements, the ability to efficiently gather, analyze, and utilize data becomes paramount. Generative Artificial Intelligence (AI) is revolutionizing data collection by enabling intelligent synthesis, anomaly detection, and real-time decision-making across interconnected systems. This paper explores how generative AI enhances IoT-driven data collection in smart cities, focusing on applications in transportation, energy, public (...)
428
Designometry – Formalization of Artifacts and Methods. Soenke Ziesche & Roman Yampolskiy - manuscript
Two interconnected surveys are presented, one of artifacts and one of designometry. Artifacts are objects, which have an originator and do not exist in nature. Designometry is a new field of study, which aims to identify the originators of artifacts. The space of artifacts is described and also domains, which pursue designometry, yet currently doing so without collaboration or common methodologies. On this basis, synergies as well as a generic axiom and heuristics for the quest of the creators of artifacts (...)
199
Is Alignment Unsafe? Cameron Domenico Kirk-Giannini - 2024 - Philosophy and Technology 37 (110):1–4.
Inchul Yum (2024) argues that the widespread adoption of language agent architectures would likely increase the risk posed by AI by simplifying the process of aligning artificial systems with human values and thereby making it easier for malicious actors to use them to cause a variety of harms. Yum takes this to be an example of a broader phenomenon: progress on the alignment problem is likely to be net safety-negative because it makes artificial systems easier for malicious actors to (...)

1 — 50 / 978
