Abstract
Neuroscientists commonly assume that the brain generates representations of a scene in various non-retinotopic 3D coordinate frames, for example in 'egocentric' and 'allocentric' frames. Although neurons in early visual cortex might be described as representing a scene in an eye-centred frame, using two dimensions of visual direction and one of binocular disparity, there is no convincing evidence either of similarly organized cortical areas using non-retinotopic 3D coordinate frames or of any systematic transfer of information from one frame to another. We propose that perception and action in a 3D world could be achieved without generating ego- or allocentric 3D coordinate frames. Instead, we suggest that the fundamental operation the brain carries out is to compare a long state vector with a matrix of weights in order to choose an output. The processes involved in perception of a 3D scene and action within it depend, we suggest, on successive iterations of this basic operation. One advantage of this proposal is that it relies on computationally well-defined operations that correspond to well-established neural processes. We also argue that, from a philosophical perspective, it is at least as plausible as theories postulating 3D coordinate frames. Finally, we suggest a variety of experiments that would falsify our claim.
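The basic operation described above, comparing a long state vector with a matrix of weights to choose an output, can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' model: the vector and matrix sizes, the random values, and the winner-take-all (argmax) readout are all assumptions introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper): a long
# state vector and a weight matrix with one row per candidate output.
n_state, n_outputs = 1000, 50
state = rng.standard_normal(n_state)                  # current state vector
weights = rng.standard_normal((n_outputs, n_state))   # one row per output

# Compare the state with every row of the weight matrix and choose the
# best-matching output (a winner-take-all readout, assumed here).
similarity = weights @ state
chosen = int(np.argmax(similarity))

# Successive iterations of this operation would feed the chosen output
# back into an updated state vector and repeat the comparison.
print(chosen)
```

The point of the sketch is only that output selection here requires no ego- or allocentric 3D coordinate frame: the computation is a vector-matrix comparison followed by a choice.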