Artificial Intelligence, Values, and Alignment

Minds and Machines 30 (3):411-437 (2020)


This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment that receive reflective endorsement despite widespread variation in people’s moral beliefs. The final part of the paper explores three ways in which fair principles for AI alignment could potentially be identified.




Similar books and articles

The value alignment problem: a geometric approach.Martin Peterson - 2019 - Ethics and Information Technology 21 (1):19-28.
Robustness to Fundamental Uncertainty in AGI Alignment.G. G. Worley III - 2020 - Journal of Consciousness Studies 27 (1-2):225-241.
Beyond linguistic alignment.Allan Mazur - 2004 - Behavioral and Brain Sciences 27 (2):205-206.
Alignment and commitment in joint action.Matthew Rachar - 2018 - Philosophical Psychology 31 (6):831-849.
Interactive alignment: Priming or memory retrieval?Michael Kaschak & Arthur Glenberg - 2004 - Behavioral and Brain Sciences 27 (2):201-202.
Machines learning values.Steve Petersen - 2020 - In S. Matthew Liao (ed.), Ethics of Artificial Intelligence. New York, USA: Oxford University Press.
The emergence of active/stative alignment in Otomi.Enrique L. Palancar - 2008 - In Mark Donohue & Søren Wichmann (eds.), The Typology of Semantic Alignment. Oxford University Press.

