Extinction Risks from AI: Invisible to Science?

Vojtech Kovarik; Christiaan van Merwijk; Ida Mattsson

Abstract

In an effort to inform the discussion surrounding existential risks from AI, we formulate Extinction-level Goodhart’s Law as “Virtually any goal specification, pursued to the extreme, will result in the extinction of humanity”, and we aim to understand which formal models are suitable for investigating this hypothesis. Note that we remain agnostic as to whether Extinction-level Goodhart’s Law holds or not. As our key contribution, we identify a set of conditions that are necessary for a model that aims to be informative for evaluating specific arguments for Extinction-level Goodhart’s Law. Since each of the conditions seems to significantly contribute to the complexity of the resulting model, formally evaluating the hypothesis might be exceedingly difficult. This raises the possibility that whether the risk of extinction from artificial intelligence is real or not, the underlying dynamics might be invisible to current scientific methods.

Author's Profile

Vojtech Kovarik

Carnegie Mellon University

Keywords

AI; extinction risk; existential risk; Goodhart's law; machine learning; computer science; optimisation

Similar books and articles

Superintelligence as a Cause or Cure for Risks of Astronomical Suffering. Kaj Sotala & Lukas Gloor - 2017 - Informatica: An International Journal of Computing and Informatics 41 (4):389-400.

Existential Risks: Exploring a Robust Risk Reduction Strategy. Karim Jebari - 2015 - Science and Engineering Ethics 21 (3):541-554.

Is Extinction Risk Mitigation Uniquely Cost-Effective? Not in Standard Population Models. Gustav Alexandrie & Maya Eden - forthcoming - In Jacob Barrett, Hilary Greaves & David Thorstad (eds.), Essays on Longtermism. Oxford University Press.

Existential risks: a philosophical analysis. Phil Torres - 2023 - Inquiry: An Interdisciplinary Journal of Philosophy 66 (4):614-639.

If now isn't the most influential time ever, when is? [REVIEW] Kritika Maheshwari - 2020 - The Philosopher 108:94-101.

Existential risk pessimism and the time of perils. David Thorstad - manuscript

Offsetting the harms of extinction. Michael Da Silva - 2015 - Law, Ethics and Philosophy 3:8-29.

Autonomy and Machine Learning as Risk Factors at the Interface of Nuclear Weapons, Computers and People. S. M. Amadae & Shahar Avin - 2019 - In Vincent Boulanin (ed.), The Impact of Artificial Intelligence on Strategic Stability and Nuclear Risk: Euro-Atlantic Perspectives. Stockholm, Sweden: pp. 105-118.

Reducing the Risk of Human Extinction. Jason G. Matheny - unknown

Extinction as a function of the spacing of extinction trials. Walter C. Stanley - 1952 - Journal of Experimental Psychology 43 (4):249.

Respect for others’ risk attitudes and the long-run future. Andreas Mogensen - manuscript

Bioethics as an Ethics of Extinction. Luca Lo Sapio - 2023 - Scienza E Filosofia 29:15-35.

Welcome to the Machine: AI, Existential Risk, and the Iron Cage of Modernity. Jay A. Gupta - 2023 - Telos: Critical Theory of the Contemporary 2023 (203):163-169.

Existential risks: analyzing human extinction scenarios and related hazards. Nick Bostrom - 2002 - J Evol Technol 9 (1).

Mistakes in the moral mathematics of existential risk. David Thorstad - forthcoming - Ethics.

Analytics

Added to PP: 2024-02-20

Downloads: 220 (#90,587)

6 months: 220 (#11,909)
