Abstract
Sender–receiver games, first introduced by David Lewis ([1969]), have received increased attention in recent years as a formal model for the emergence of communication. Skyrms ([2010]) showed that simple models of reinforcement learning often succeed in forming efficient, albeit not necessarily minimal, signalling systems for a large family of games. Later, Alexander et al. ([2012]) showed that reinforcement learning, combined with forgetting, frequently produced both efficient and minimal signalling systems. In this article, I define a ‘dynamic’ sender–receiver game in which the state–action pairs are not held constant over time and show that neither of these models of learning learns to signal in this environment. However, a model of reinforcement learning with discounting of the past does learn to signal; it also gives rise to the phenomenon of linguistic drift.

1 Introduction
2 Dynamic Signalling Games with Reinforcement Learning
2.1 Introducing new states
2.2 Swapping state–action pairs
3 Discounting the Past
3.1 Learning to signal in a dynamic world
3.2 An unexpected outcome: linguistic drift
4 Conclusion
Appendix: A Markov Chain Analysis
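The kind of urn-based reinforcement with discounting that the article studies can be sketched in a toy two-state Lewis signalling game. This is a minimal illustration under my own assumptions, not the article's model: the discount factor `GAMMA`, the number of rounds, and the urn initialization are all choices made here for the sake of the example.

```python
import random

random.seed(0)

N_STATES = N_SIGNALS = N_ACTS = 2
GAMMA = 0.99     # discount on past reinforcements (assumed value)
ROUNDS = 20000
WINDOW = 5000    # final window used to gauge learned success

# Urn weights: the sender maps states to signals,
# the receiver maps signals to acts.
sender = [[1.0] * N_SIGNALS for _ in range(N_STATES)]
receiver = [[1.0] * N_ACTS for _ in range(N_SIGNALS)]

def draw(weights):
    """Sample an index with probability proportional to its weight."""
    return random.choices(range(len(weights)), weights=weights)[0]

successes = 0
recent = 0
for t in range(ROUNDS):
    state = random.randrange(N_STATES)
    signal = draw(sender[state])
    act = draw(receiver[signal])

    # Discount every past reinforcement before paying out this round.
    for row in sender + receiver:
        for i in range(len(row)):
            row[i] *= GAMMA

    if act == state:  # payoff 1 on a match, 0 otherwise
        sender[state][signal] += 1.0
        receiver[signal][act] += 1.0
        successes += 1
        if t >= ROUNDS - WINDOW:
            recent += 1

print(f"success rate, last {WINDOW} rounds: {recent / WINDOW:.2f}")
```

Because unreinforced weights decay toward zero while a single fresh success adds a full unit, recent experience dominates the urns; this is what lets discounting learners keep up when state–action pairs change, and it is also why an established signalling system can occasionally be displaced, producing drift.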