Deontology and Safe Artificial Intelligence

William D'Alessandro

Deontology and Safe Artificial Intelligence

Philosophical Studies (forthcoming) Copy BIBT_EX

Abstract

The field of AI safety aims to prevent increasingly capable artificially intelligent systems from causing humans harm. Research on moral alignment is widely thought to offer a promising safety strategy: if we can equip AI systems with appropriate ethical rules, according to this line of thought, they'll be unlikely to disempower, destroy or otherwise seriously harm us. Deontological morality looks like a particularly attractive candidate for an alignment target, given its popularity, relative technical tractability and commitment to harm-avoidance principles. I argue that the connection between moral alignment and safe behavior is more tenuous than many have hoped. In general, AI systems can possess either of these properties in the absence of the other, and we should favor safety when the two conflict. In particular, advanced AI systems governed by standard versions of deontology need not be especially safe.

Cite

Plain text

BibTeX

Formatted text

Zotero

EndNote

Reference Manager

RefWorks

Options

Mark as duplicate

Find it on Scholar

Request removal from index

Revision history

Edit

Author's Profile

William D'Alessandro

University of Oxford

My notes

Similar books and articles

An Enactive Approach to Value Alignment in Artificial Intelligence: A Matter of Relevance.Michael Cannon - 2021 - In Vincent C. Müller (ed.), Philosophy and Theory of Artificial Intelligence 2021. Springer Cham. pp. 119-135.

Robustness to Fundamental Uncertainty in AGI Alignment.G. G. Worley Iii - 2020 - Journal of Consciousness Studies 27 (1-2):225-241.

An Enactive Approach to Value Alignment in Artificial Intelligence: A Matter of Relevance.Michael Cannon - 2022 - In Vincent C. Müller (ed.), Philosophy and Theory of Artificial Intelligence 2021. pp. 119-135.

How does Artificial Intelligence Pose an Existential Risk?Karina Vold & Daniel R. Harris - 2023 - In Carissa Véliz (ed.), The Oxford Handbook of Digital Ethics. Oxford University Press.

Facing Janus: An Explanation of the Motivations and Dangers of AI Development.Aaron Graifman - manuscript

Existentialist risk and value misalignment.Ariela Tubert & Justin Tiehen - forthcoming - Philosophical Studies.

Value alignment, human enhancement, and moral revolutions.Ariela Tubert & Justin Tiehen - forthcoming - Inquiry: An Interdisciplinary Journal of Philosophy.

Superintelligence as a Cause or Cure for Risks of Astronomical Suffering.Kaj Sotala & Lukas Gloor - 2017 - Informatica: An International Journal of Computing and Informatics 41 (4):389-400.

Robustness to fundamental uncertainty in AGI alignment.I. I. I. G. Gordon Worley - manuscript

Ethical Aspects of Artificial Intelligence Functioning in the XXIst Century.Vira Dodonova, Roman Dodonov & Kateryna Gorbenko - 2023 - Studia Universitatis Babeş-Bolyai Philosophia 68 (1):161-173.

Value Alignment for Advanced Artificial Judicial Intelligence.Christoph Winter, Nicholas Hollman & David Manheim - 2023 - American Philosophical Quarterly 60 (2):187-203.

Aligning artificial intelligence with human values: reflections from a phenomenological perspective.Shengnan Han, Eugene Kelly, Shahrokh Nikou & Eric-Oluf Svee - 2022 - AI and Society 37 (4):1383-1395.

Provably Safe Artificial General Intelligence via Interactive Proofs.Kristen Carlson - 2021 - Philosophies 6 (4):83.

Domesticating Artificial Intelligence.Luise Müller - 2022 - Moral Philosophy and Politics 9 (2):219-237.

Artificial Intelligence, Values, and Alignment.Iason Gabriel - 2020 - Minds and Machines 30 (3):411-437.

Analytics

Added to PP
2024-05-06

Downloads
0

6 months
0

Historical graph of downloads

How can I increase my downloads?

Author's Profile

William D'Alessandro

University of Oxford

Citations of this work

No citations found.

Add more citations

References found in this work

No references found.

Add more references

Applied ethics	Epistemology	History of Western Philosophy	Meta-ethics	Metaphysics	Normative ethics
Philosophy of biology	Philosophy of language	Philosophy of mind	Philosophy of religion	Science Logic and Mathematics	More ...

Deontology and Safe Artificial Intelligence

Abstract

Author's Profile

Categories

Keywords

Reprint years

Links

PhilArchive

External links

Through your library

My notes

Similar books and articles

Analytics

Author's Profile

Citations of this work

References found in this work