Evaluating approaches for reducing catastrophic risks from AI

AI and Ethics (2024)

Abstract

According to a growing number of researchers, AI may pose catastrophic – or even existential – risks to humanity. Catastrophic risks may be understood as risks of 100 million human deaths, or a similarly bad outcome. I argue that such risks – while contested – are sufficiently likely to demand rigorous discussion of potential societal responses. I then propose four desiderata for approaches to the reduction of catastrophic risks from AI: the quality of such approaches can be assessed by their chance of success, degree of beneficence, degree of non-maleficence, and beneficent side effects. Employing these desiderata, I evaluate the promises, limitations, and risks of alignment research, timelines research, policy research, halting or slowing down AI research, and compute governance for tackling catastrophic AI risks. While more research is needed, this investigation shows that several approaches for dealing with catastrophic AI risks are available, and it clarifies where their respective strengths and weaknesses lie. Many of these approaches turn out to be complementary, and they bear a nuanced relationship to approaches for addressing present AI harms: some are similarly useful against catastrophic risks and present harms, but this is not always the case.

Links

PhilArchive





Author's Profile

Leonard Dung
Universität Erlangen-Nürnberg
