Unmonitorability of Artificial Intelligence

Roman Yampolskiy

Unmonitorability of Artificial Intelligence

Abstract

Artificially Intelligent (AI) systems have ushered in a transformative era across various domains, yet their inherent traits of unpredictability, unexplainability, and uncontrollability have given rise to concerns surrounding AI safety. This paper aims to demonstrate the infeasibility of accurately monitoring advanced AI systems to predict the emergence of certain capabilities prior to their manifestation. Through an analysis of the intricacies of AI systems, the boundaries of human comprehension, and the elusive nature of emergent behaviors, we argue for the impossibility of reliably foreseeing some capabilities. By investigating these impossibility results, we shed light on their potential implications for AI safety research and propose potential strategies to overcome these limitations.

Cite

Plain text

BibTeX

Formatted text

Zotero

EndNote

Reference Manager

RefWorks

Options

Edit

Mark as duplicate

Find it on Scholar

Request removal from index

Revision history

Author's Profile

Roman Yampolskiy

University of Louisville

Keywords

Monitoring AI Audit Observability AI Safety

Reprint years

My notes

Analytics

Added to PP
2023-06-11

Downloads
692 (#25,430)

6 months
420 (#4,131)

Historical graph of downloads

How can I increase my downloads?

Author's Profile

Roman Yampolskiy

University of Louisville

Citations of this work

No citations found.

Add more citations

References found in this work

No references found.

Add more references

Applied ethics	Epistemology	History of Western Philosophy	Meta-ethics	Metaphysics	Normative ethics
Philosophy of biology	Philosophy of language	Philosophy of mind	Philosophy of religion	Science Logic and Mathematics	More ...

Unmonitorability of Artificial Intelligence

Abstract

Author's Profile

Categories

Keywords

Reprint years

Links

PhilArchive

External links

Through your library

My notes

Similar books and articles

Analytics

Author's Profile

Citations of this work

References found in this work