  1.
    Discovering agents. Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott & Tom Everitt - 2023 - Artificial Intelligence 322 (C):103963.
    Causal models of agents have been used to analyse the safety aspects of machine learning systems. But identifying agents is non-trivial -- often the causal model is just assumed by the modeler without much justification -- and modelling failures can lead to mistakes in the safety analysis. This paper proposes the first formal causal definition of agents -- roughly that agents are systems that would adapt their policy if their actions influenced the world in a different way. From this we (...)
  2.
    Reasoning about causality in games. Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate & Michael Wooldridge - 2023 - Artificial Intelligence 320 (C):103919.
    Causal reasoning and game-theoretic reasoning are fundamental topics in artificial intelligence, among many other disciplines: this paper is concerned with their intersection. Despite their importance, a formal framework that supports both these forms of reasoning has, until now, been lacking. We offer a solution in the form of (structural) causal games, which can be seen as extending Pearl's causal hierarchy to the game-theoretic domain, or as extending Koller and Milch's multi-agent influence diagrams to the causal domain. We then consider three (...)
  3.
    Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective. Tom Everitt, Marcus Hutter, Ramana Kumar & Victoria Krakovna - 2021 - Synthese 198 (Suppl 27):6435-6467.
    Can humans get arbitrarily capable reinforcement learning agents to do their bidding? Or will sufficiently capable RL agents always find ways to bypass their intended objectives by shortcutting their reward signal? This question impacts how far RL can be scaled, and whether alternative paradigms must be developed in order to build safe artificial general intelligence. In this paper, we study when an RL agent has an instrumental goal to tamper with its reward process, and describe design principles that prevent instrumental (...)
  4.
    Classification by decomposition: a novel approach to classification of symmetric 2 × 2 games. Mikael Böörs, Tobias Wängberg, Tom Everitt & Marcus Hutter - 2022 - Theory and Decision 93 (3):463-508.
    In this paper, we provide a detailed review of previous classifications of 2 × 2 games and suggest a mathematically simple way to classify the symmetric 2 × 2 games based on a decomposition of the payoff matrix into a cooperative and a zero-sum part. We argue that differences in the interaction between the parts is what makes games interesting in different ways. Our claim is supported by evolutionary computer experiments and findings in previous literature. In addition, we provide a (...)