Cognition 80 (1-2):127-158 (2001)
Abstract
This paper argues that a theory of situated vision, suited for the dual purposes of object recognition and the control of action, will have to provide something more than a system that constructs a conceptual representation from visual stimuli: it will also need to provide a special kind of direct (preconceptual, unmediated) connection between elements of a visual representation and certain elements in the world. Like natural language demonstratives (such as 'this' or 'that'), this direct connection allows entities to be referred to without being categorized or conceptualized. Several reasons are given for why we need such a preconceptual mechanism which individuates and keeps track of several individual objects in the world. One is that early vision must pick out and compute the relation among several individual objects while ignoring their properties. Another is that incrementally computing and updating representations of a dynamic scene requires keeping track of token individuals despite changes in their properties or locations. It is then noted that a mechanism meeting these requirements has already been proposed in order to account for a number of disparate empirical phenomena, including subitizing, search-subset selection and multiple object tracking (Pylyshyn et al., Canadian Journal of Experimental Psychology 48(2) (1994) 260). This mechanism, called a visual index or FINST, is brie