Minds and Machines 30 (3):411-437 (2020)

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment that receive reflective endorsement despite widespread variation in people’s moral beliefs. The final part of the paper explores three ways in which fair principles for AI alignment could potentially be identified.
DOI 10.1007/s11023-020-09539-2



