Some things look more complex than others. For example, a crenulate and richly organized leaf may seem more complex than a plain stone. What is the nature of this experience, and why do we have it in the first place? Here, we explore how object complexity serves as an efficiently extracted visual signal that the object merits further exploration. We algorithmically generated a library of geometric shapes and determined their complexity by computing the cumulative surprisal of their internal skeletons, essentially quantifying the “amount of information” within each shape, and then used this approach to ask new questions about the perception of complexity. Experiments 1–3 asked what kind of mental process extracts visual complexity: a slow, deliberate, reflective process (as when we decide that an object is expensive or popular) or a fast, effortless, and automatic process (as when we see that an object is big or blue)? We placed simple and complex objects in visual search arrays and discovered that complex objects were easier to find among simple distractors than simple objects were among complex distractors, a classic search asymmetry indicating that complexity is prioritized in visual processing. Next, we explored the function of complexity: Why do we represent object complexity in the first place? Experiments 4–5 asked subjects to study serially presented objects in a self-paced manner (for a later memory test); subjects dwelled longer on complex objects than on simple objects, even when object shape was completely task-irrelevant, suggesting a connection between visual complexity and exploratory engagement. Finally, Experiment 6 connected these implicit measures of complexity to explicit judgments. Collectively, these findings suggest that visual complexity is extracted efficiently and automatically, and even arouses a kind of “perceptual curiosity” about objects that encourages subsequent attentional engagement.
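To make the complexity measure concrete, here is a minimal sketch of how a cumulative-surprisal score might be computed from a shape’s skeleton. This is an illustration under stated assumptions, not the paper’s actual pipeline: the turning-angle representation of the skeleton, the Gaussian prior over angles, and the names `angle_prior` and `cumulative_surprisal` are all hypothetical.

```python
import numpy as np

def angle_prior(angles, sigma=np.pi / 8):
    """Assumed prior over skeletal turning angles that favors straight
    continuations: a discretized, renormalized Gaussian centered at 0 rad."""
    p = np.exp(-(angles ** 2) / (2 * sigma ** 2))
    return p / p.sum()

def cumulative_surprisal(turn_angles, n_bins=64):
    """Sum -log2 p(angle) over a skeleton's turning angles, i.e., the
    'amount of information' in the shape under the assumed prior."""
    bins = np.linspace(-np.pi, np.pi, n_bins + 1)
    centers = (bins[:-1] + bins[1:]) / 2
    p = angle_prior(centers)
    idx = np.clip(np.digitize(turn_angles, bins) - 1, 0, n_bins - 1)
    return float(np.sum(-np.log2(p[idx])))

# A nearly straight skeleton (a simple shape) scores lower than a jagged one:
rng = np.random.default_rng(1)
simple_turns = rng.normal(0.0, 0.05, 20)   # small turning angles
complex_turns = rng.normal(0.0, 1.2, 20)   # large, varied turning angles
assert cumulative_surprisal(simple_turns) < cumulative_surprisal(complex_turns)
```

On this construal, straight continuations are unsurprising (high prior probability, low surprisal), while sharp or erratic turns are surprising, so richly articulated skeletons accumulate more bits.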
Does the human mind resemble the machines that can behave like it? Biologically inspired machine-learning systems approach “human-level” accuracy in an astounding variety of domains, and even predict human brain activity, raising the exciting possibility that such systems represent the world as we do. However, even seemingly intelligent machines fail in strange and “unhumanlike” ways, threatening their status as models of our minds. How can we know when human–machine behavioral differences reflect deep disparities in underlying capacities, and when such failures are merely superficial or peripheral? This article draws on a foundational insight from cognitive science, the distinction between performance and competence, to encourage “species-fair” comparisons between humans and machines. The performance/competence distinction urges us to consider whether the failure of a system to behave as ideally hypothesized, or the failure of one creature to behave like another, arises not because the system lacks the relevant knowledge or internal capacities (“competence”), but instead because of superficial constraints on demonstrating that knowledge (“performance”). I argue that this distinction has been neglected by research comparing human and machine behavior, and that it should be essential to any such comparison. Focusing on the domain of image classification, I identify three factors contributing to the species-fairness of human–machine comparisons, extracted from recent work that equates such constraints. Species-fair comparisons level the playing field between natural and artificial intelligence, so that we can separate more superficial differences from those that may be deep and enduring.
What is the relationship between complexity in the world and complexity in the mind? Intuitively, increasingly complex objects and events should give rise to increasingly complex mental representations (or perhaps a plateau in complexity after a certain point). However, a counterintuitive possibility with roots in information theory is an inverted U-shaped relationship between the “objective” complexity of some stimulus and the complexity of its mental representation, because excessively complex patterns might be characterized by surprisingly short computational descriptions (e.g., if they are represented as having been generated “randomly”). Here, we demonstrate that this is the case, using a novel approach that takes the notion of “description” literally. Subjects saw static and dynamic visual stimuli whose objective complexity could be carefully manipulated, and they described these stimuli in their own words, giving freeform spoken descriptions of them. Across three experiments totaling over 10,000 speech clips, spoken descriptions of shapes (Experiment 1), dot-arrays (Experiment 2), and dynamic motion-paths (Experiment 3) revealed a striking quadratic relationship between the raw complexity of these stimuli and the length of their spoken descriptions. In other words, the simplest and most complex stimuli received the shortest descriptions, while stimuli of “medium” complexity received the longest descriptions. Follow-up analyses explored the particular words subjects used, allowing us to further probe how such stimuli were represented. We suggest that the mind engages in a kind of lossy compression for overly complex stimuli, and we discuss the utility of such freeform responses for exploring foundational questions about mental representation.
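To illustrate the shape of that relationship, here is a small sketch of the kind of analysis the abstract describes: fit description length as a quadratic function of stimulus complexity and check that the leading coefficient is negative (an inverted U). The data below are synthetic placeholders, and none of the variable names come from the paper.

```python
import numpy as np

# Hypothetical data: complexity scores for stimuli and the word counts of
# their spoken descriptions. Synthetic, for illustration only.
rng = np.random.default_rng(0)
complexity = rng.uniform(0, 10, size=200)
# Built-in inverted U: medium-complexity stimuli get the longest descriptions.
words = 5 + 6 * complexity - 0.6 * complexity**2 + rng.normal(0, 2, 200)

# Fit description length as a quadratic function of complexity.
c2, c1, c0 = np.polyfit(complexity, words, deg=2)

# An inverted-U relationship shows up as a negative second-order coefficient,
# with description length peaking at intermediate complexity.
print(f"quadratic coefficient = {c2:.3f} (negative => inverted U)")
print(f"peak near complexity = {-c1 / (2 * c2):.2f}")
```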
Arguably the most foundational principle in perception research is that our experience of the world goes beyond the retinal image; we perceive the distal environment itself, not the proximal stimulation it causes. Shape may be the paradigm case of such “unconscious inference”: When a coin is rotated in depth, we infer the circular object it truly is, discarding the perspectival ellipse projected on our eyes. But is this really the fate of such perspectival shapes? Or does a tilted coin retain an elliptical appearance even when we know it’s circular? This question has generated heated debate from Locke and Hume to the present; but whereas extant arguments rely primarily on introspection, this problem is also open to empirical test. If tilted coins bear a representational similarity to elliptical objects, then a circular coin should, when rotated, impair search for a distal ellipse. Here, nine experiments demonstrate that this is so, suggesting that perspectival shapes persist in the mind far longer than traditionally assumed. Subjects saw search arrays of three-dimensional “coins,” and simply had to locate a distally elliptical coin. Surprisingly, rotated circular coins slowed search for elliptical targets, even when subjects clearly knew the rotated coins were circular. This pattern arose with static and dynamic cues, couldn’t be explained by strategic responding or unfamiliarity, generalized across shape classes, and occurred even with sustained viewing. Finally, these effects extended beyond artificial displays to real-world objects viewed in naturalistic, full-cue conditions. We conclude that objects have a remarkably persistent dual character: their objective shape “out there,” and their perspectival shape “from here.”
The recent emergence of machine-manipulated media raises an important societal question: How can we know whether a video that we watch is real or fake? In two online studies with 15,016 participants, we present authentic videos and deepfakes and ask participants to identify which is which. We compare the performance of ordinary human observers with the leading computer-vision deepfake detection model and find that the two are similarly accurate while making different kinds of mistakes. Together, participants with access to the model’s prediction are more accurate than either alone, but inaccurate model predictions often decrease participants’ accuracy. To probe the relative strengths and weaknesses of humans and machines as detectors of deepfakes, we examine human and machine performance across video-level features, and we evaluate the impact of preregistered randomized interventions on deepfake detection. We find that manipulations designed to disrupt visual processing of faces hinder human participants’ performance while mostly not affecting the model’s performance, suggesting a role for specialized cognitive capacities in explaining human deepfake detection performance.
Resource rationality may explain suboptimal patterns of reasoning; but what of “anti-Bayesian” effects where the mind updates in a direction opposite the one it should? We present two phenomena — belief polarization and the size-weight illusion — that are not obviously explained by performance- or resource-based constraints, nor by the authors’ brief discussion of reference repulsion. Can resource rationality accommodate them?
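For readers who want the Bayesian benchmark spelled out: evidence that favors a hypothesis should always move the posterior toward that hypothesis, so an “anti-Bayesian” effect is an update in the opposite direction. A minimal worked example, with arbitrary numbers chosen for illustration:

```python
# Bayes' rule for a binary hypothesis H, with evidence E that favors H.
prior = 0.5                      # P(H) before seeing the evidence
likelihood_ratio = 3.0           # P(E | H) / P(E | not-H) > 1, so E supports H
posterior = (prior * likelihood_ratio) / (prior * likelihood_ratio + (1 - prior))
print(posterior)                 # 0.75 > 0.5: credence in H should increase
# Belief polarization is "anti-Bayesian" in this sense: observers shown the
# same evidence can move *away* from H rather than toward it.
```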
What is the purpose of perception? And how might the answer to this question help distinguish perception from other mental processes? Block’s landmark book, The Border between Seeing and Thinking, investigates the nature of perception, how perception differs from cognition, and why the distinction matters. It is, as one would expect, wide-ranging, deeply informed by relevant science, and hugely stimulating. Here, we explore a central project of the book (Block’s attempts to identify the features of perception that distinguish it from higher-level cognition) by focusing on his suggestion that such features closely relate to perception’s purpose. As well as offering detailed critical discussion of these proposals, our more general aim is to advertise both the promise and pitfalls of asking: What is perception for?
“What is the structure of thought?” is as central a question as any in cognitive science. A classic answer to this question has appealed to a Language of Thought (LoT). We point to emerging research from disparate branches of the field that supports the LoT hypothesis, but also uncovers diversity in LoTs across cognitive systems, stages of development, and species. Our letter formulates open research questions for cognitive science concerning the varieties of rules and representations that underwrite different LoT-based systems, and how these variations can help researchers taxonomize cognitive systems.
When a circular coin is rotated in depth, is there any sense in which it comes to resemble an ellipse? While this question is at the center of a rich and divided philosophical tradition (with some scholars answering affirmatively and some negatively), Morales et al. (2020, 2021) took an empirical approach, reporting 10 experiments whose results favor such perspectival similarity. Recently, Burge and Burge (2022) offered a vigorous critique of this work, objecting to its approach and conclusions on both philosophical and empirical grounds. Here, we answer these objections on both fronts. We show that Burge and Burge’s critique rests on misunderstandings of Morales et al.’s claims; of the relation between the data and conclusions; and of the philosophical context in which the work appears. Specifically, Burge and Burge attribute to us a much stronger (and stranger) view than we hold, involving the introduction of “a new entity” located “in some intermediate position(s) between the distal shape and the retinal image.” We do not hold this view. Indeed, once properly understood, most of Burge and Burge’s objections favor Morales et al.’s claims rather than oppose them. Finally, we discuss several questions that remain unanswered, and reflect on a productive path forward on these issues of foundational scientific and philosophical interest.