Works by Tom Everitt

Discovering agents. Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott & Tom Everitt - 2023 - Artificial Intelligence 322 (C):103963.
Causal models of agents have been used to analyse the safety aspects of machine learning systems. But identifying agents is non-trivial -- often the causal model is just assumed by the modeler without much justification -- and modelling failures can lead to mistakes in the safety analysis. This paper proposes the first formal causal definition of agents -- roughly that agents are systems that would adapt their policy if their actions influenced the world in a different way. From this we (...)
Computer Science in Formal Sciences

Reasoning about causality in games. Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate & Michael Wooldridge - 2023 - Artificial Intelligence 320 (C):103919.
Causal reasoning and game-theoretic reasoning are fundamental topics in artificial intelligence, among many other disciplines: this paper is concerned with their intersection. Despite their importance, a formal framework that supports both these forms of reasoning has, until now, been lacking. We offer a solution in the form of (structural) causal games, which can be seen as extending Pearl's causal hierarchy to the game-theoretic domain, or as extending Koller and Milch's multi-agent influence diagrams to the causal domain. We then consider three (...)
Science, Logic, and Mathematics

Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective. Tom Everitt, Marcus Hutter, Ramana Kumar & Victoria Krakovna - 2021 - Synthese 198 (Suppl 27):6435-6467.
Can humans get arbitrarily capable reinforcement learning agents to do their bidding? Or will sufficiently capable RL agents always find ways to bypass their intended objectives by shortcutting their reward signal? This question impacts how far RL can be scaled, and whether alternative paradigms must be developed in order to build safe artificial general intelligence. In this paper, we study when an RL agent has an instrumental goal to tamper with its reward process, and describe design principles that prevent instrumental (...)
Machine Learning in Philosophy of Cognitive Science

Philosophy of Action, Misc in Philosophy of Action

Classification by decomposition: a novel approach to classification of symmetric 2 × 2 games. Mikael Böörs, Tobias Wängberg, Tom Everitt & Marcus Hutter - 2022 - Theory and Decision 93 (3):463-508.
In this paper, we provide a detailed review of previous classifications of 2 × 2 games and suggest a mathematically simple way to classify the symmetric 2 × 2 games based on a decomposition of the payoff matrix into a cooperative and a zero-sum part. We argue that differences in the interaction between the parts is what makes games interesting in different ways. Our claim is supported by evolutionary computer experiments and findings in previous literature. In addition, we provide a (...)