AI Deception: A Survey of Examples, Risks, and Potential Solutions

Abstract

This paper argues that a range of current AI systems have learned how to deceive humans. We define deception as the systematic inducement of false beliefs in the pursuit of some outcome other than the truth. We first survey empirical examples of AI deception, discussing both special-use AI systems (including Meta's CICERO) built for specific competitive situations, and general-purpose AI systems (such as large language models). Next, we detail several risks from AI deception, such as fraud, election tampering, and losing control of AI systems. Finally, we outline several potential solutions to the problems posed by AI deception: first, regulatory frameworks should subject AI systems that are capable of deception to robust risk-assessment requirements; second, policymakers should implement bot-or-not laws; and finally, policymakers should prioritize the funding of relevant research, including tools to detect AI deception and to make AI systems less deceptive. Policymakers, researchers, and the broader public should work proactively to prevent AI deception from destabilizing the shared foundations of our society.

Similar books and articles

Two Reasons for Subjecting Medical AI Systems to Lower Standards than Humans. Jakob Mainz, Jens Christian Bjerring & Lauritz Munch - 2023 - ACM Proceedings of Fairness, Accountability, and Transparency (FAccT) 2023 1 (1):44-49.
Ethics of Artificial Intelligence. Vincent C. Müller - 2021 - In Anthony Elliott (ed.), The Routledge social science handbook of AI. London: Routledge. pp. 122-137.

Analytics

Added to PP
2023-09-19

Author Profiles

Simon Goldstein
University of Hong Kong
