Extending Environments To Measure Self-Reflection In Reinforcement Learning

Journal of Artificial General Intelligence 13 (1) (2022)

Abstract

We consider an extended notion of reinforcement learning in which the environment can simulate the agent and base its outputs on the agent's hypothetical behavior. Since good performance usually requires paying attention to whatever the environment's outputs depend on, we argue that for an agent to achieve good performance on average across many such extended environments, the agent must self-reflect. Weighted-average performance over the space of all suitably well-behaved extended environments could therefore be considered a measure of how self-reflective an agent is. We give examples of extended environments and introduce a simple transformation that experimentally appears to improve some standard RL agents' performance in a certain type of extended environment.
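
To make the "extended environment" idea concrete, here is a minimal Python sketch. It is illustrative only, not the paper's formalism: the ExtendedEnv and ConstantAgent names, the act(history) interface, and the factory the environment calls to simulate a fresh copy of the agent are all assumptions made for this example.

class ExtendedEnv:
    # Toy extended environment (illustrative): unlike a standard RL
    # environment, it holds a factory for the agent, so it can simulate
    # a fresh copy of the agent and base rewards on the agent's
    # hypothetical behavior.
    def __init__(self, make_agent):
        self.make_agent = make_agent  # callable returning a fresh agent
        self.history = []             # (observation, action) pairs so far

    def step(self, action, observation=0):
        # Ask a simulated copy of the agent what it *would* do on a
        # hypothetical history (here simply the empty history).
        simulated = self.make_agent()
        hypothetical_action = simulated.act(history=[])

        # Reward depends on the agent's hypothetical behavior: the agent
        # is rewarded iff its actual action matches its own hypothetical
        # first move. Two agents emitting identical action sequences can
        # thus receive different rewards.
        reward = 1.0 if action == hypothetical_action else 0.0
        self.history.append((observation, action))
        return observation, reward

class ConstantAgent:
    # Trivially self-consistent agent: always plays action 0.
    def act(self, history):
        return 0

env = ExtendedEnv(ConstantAgent)
obs, reward = env.step(action=0)
print(reward)  # 1.0: the action matches what a simulated copy would do

In a standard environment the reward is a function of the actual interaction history alone; here it also depends on what the agent would do in counterfactual situations, which is why good average performance across many such environments plausibly requires the agent to track its own behavior.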

Links

PhilArchive

Similar books and articles

The growth of learning during non-differential reinforcement. Allen D. Calvin - 1953 - Journal of Experimental Psychology 46 (4):248.
Some determinants of rigidity in discrimination-reversal learning. Arnold H. Buss - 1952 - Journal of Experimental Psychology 44 (3):222.
Determinants of the effects of vicarious reinforcement. Albert R. Marston - 1966 - Journal of Experimental Psychology 71 (4):550.
