New efforts are using head cameras and eye-trackers worn by infants to capture everyday visual environments from the point of view of the infant learner. From this vantage point, the training sets for statistical learning develop as the sensorimotor abilities of the infant develop, yielding a series of ordered datasets for visual learning that differ in content and structure between timepoints but are highly selective at each timepoint. These changing environments may constitute a developmentally ordered curriculum that optimizes learning across many domains. Future advances in computational models will be necessary to connect the developmentally changing content and statistics of infant experience to the internal machinery that does the learning.
Cross-situational word learning, like any statistical learning problem, involves tracking the regularities in the environment. However, the information that learners pick up from these regularities is dependent on their learning mechanism. This article investigates the role of one type of mechanism in statistical word learning: competition. Competitive mechanisms would allow learners to find the signal in noisy input and would help to explain the speed with which learners succeed in statistical learning tasks. Because cross-situational word learning provides information at multiple scales (both within and across trials/situations), learners could implement competition at either or both of these scales. A series of four experiments demonstrates that cross-situational learning involves competition at both levels of scale, and that these mechanisms interact to support rapid learning. The impact of both of these mechanisms is considered from the perspective of a process-level understanding of cross-situational learning.
Previous research shows that people can use the co-occurrence of words and objects in ambiguous situations (i.e., containing multiple words and objects) to learn word meanings during a brief passive training period (Yu & Smith, 2007). However, learners in the world are not completely passive but can affect how their environment is structured by moving their heads, eyes, and even objects. These actions can indicate attention to a language teacher, who may then be more likely to name the attended objects. Using a novel active learning paradigm in which learners choose which four objects they would like to see named on each successive trial, this study asks whether active learning is superior to passive learning in a cross-situational word learning context. Finding that learners perform better in active learning, we investigate the strategies they use and discover that most learners use immediate repetition to disambiguate pairings. Unexpectedly, we find that learners who repeat only one pair per trial (an easy way to infer this pair) perform worse than those who repeat multiple pairs per trial. Using a working memory extension to an associative model of word learning with uncertainty and familiarity biases, we investigate individual differences that correlate with these assorted strategies.
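The co-occurrence idea behind cross-situational learning can be illustrated with a minimal sketch. It is not the associative model described in the abstract (it has no working memory, uncertainty, or familiarity bias); it only counts how often each word appears alongside each object across ambiguous trials and picks the most frequent partner. The function name `learn_mappings` and the toy words and objects are hypothetical.

```python
from collections import defaultdict


def learn_mappings(trials):
    """Map each word to the object it co-occurred with most often.

    `trials` is a list of (words, objects) pairs; within a trial the
    pairing between individual words and objects is unknown.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for words, objects in trials:
        for word in words:
            for obj in objects:
                counts[word][obj] += 1  # every word-object pair in the trial is a candidate
    # Pick the object with the highest co-occurrence count for each word.
    return {word: max(objs, key=objs.get) for word, objs in counts.items()}


# Toy data (hypothetical): each trial is ambiguous on its own,
# but the correct pairs co-occur more often across trials.
trials = [
    (["bosa", "gasser"], ["ball", "dog"]),
    (["bosa", "manu"], ["ball", "cup"]),
    (["gasser", "manu"], ["dog", "cup"]),
]
mappings = learn_mappings(trials)
```

No single trial identifies a pairing here, yet each word ends up with one object that it has met twice and two that it has met once, which is the sense in which the regularity only exists across situations.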
Two experiments were conducted to examine adult learners' ability to extract multiple statistics in simultaneously presented visual and auditory input. Experiment 1 used a cross-situational learning paradigm to test whether English speakers were able to use co-occurrences to learn word-to-object mappings and concurrently form object categories based on the commonalities across training stimuli. Experiment 2 replicated the first experiment and further examined whether speakers of Mandarin, a language in which final syllables of object names are more predictive of category membership than in English, were able to learn words and form object categories when trained with the same type of structures. The results indicate that both groups of learners successfully extracted multiple levels of co-occurrence and used them to learn words and object categories simultaneously. However, marked individual differences in performance were also found, suggesting possible interference and competition in processing the two concurrent streams of regularities.
Joint attention has been extensively studied in the developmental literature because of overwhelming evidence that the ability to socially coordinate visual attention to an object is essential to healthy developmental outcomes, including language learning. The goal of this study was to understand the complex system of sensory-motor behaviors that may underlie the establishment of joint attention between parents and toddlers. In an experimental task, parents and toddlers played together with multiple toys. We objectively measured joint attention, and the sensory-motor behaviors that underlie it, using a dual head-mounted eye-tracking system and frame-by-frame coding of manual actions. By tracking the momentary visual fixations and hand actions of each participant, we determined precisely how often they fixated on the same object at the same time, the visual behaviors that preceded joint attention, and the manual behaviors that preceded and co-occurred with joint attention. We found that multiple sequential sensory-motor patterns lead to joint attention. In addition, there are developmental changes in this multi-pathway system, evident as variations in strength among multiple routes. We propose that coordinated visual attention between parents and toddlers is primarily a sensory-motor behavior. Skill in achieving coordinated visual attention in social settings, like skills in other sensory-motor domains, emerges from multiple pathways to the same functional end.
This paper presents an adaptive neural output feedback control scheme for uncertain robot manipulators with input saturation, using a radial basis function neural network (RBFNN) and a disturbance observer. First, the RBFNN is used to approximate the system uncertainty, and the unknown approximation error of the RBFNN and the time-varying unknown external disturbance of robot manipulators are integrated as a compounded disturbance. Then, the state observer and the disturbance observer are proposed to estimate the unmeasured system state and the unknown compounded disturbance based on the RBFNN. At the same time, the adaptation technique is employed to tackle the control input saturation problem. Utilizing the estimated outputs of the RBFNN, the state observer, and the disturbance observer, the adaptive neural output feedback control scheme is developed for robot manipulators using the backstepping technique. The convergence of all closed-loop signals is rigorously proved via Lyapunov analysis, and the asymptotically convergent tracking error is obtained under the integrated effect of the system uncertainty, the unmeasured system state, the unknown external disturbance, and the input saturation. Finally, numerical simulation results are presented to illustrate the effectiveness of the proposed adaptive neural output feedback control scheme for uncertain robot manipulators.
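As a rough illustration of the approximation step only, the sketch below fits a Gaussian radial basis function network to an unknown scalar function. It is not the paper's controller: there is no observer, no saturation handling, and no backstepping, and the weights are trained with plain gradient descent on sampled data rather than the Lyapunov-based adaptation law the abstract refers to. The target function (a sine), the centers, the width, and all function names are assumptions chosen for the example.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian basis activations for a scalar input x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def predict(x, weights, centers, width):
    """Network output: weighted sum of the basis activations."""
    return sum(w * p for w, p in zip(weights, rbf_features(x, centers, width)))


def fit(samples, centers, width, lr=0.1, epochs=200):
    """Train the output weights by per-sample gradient descent (LMS rule)."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, y in samples:
            phi = rbf_features(x, centers, width)
            err = y - sum(w * p for w, p in zip(weights, phi))
            weights = [w + lr * err * p for w, p in zip(weights, phi)]
    return weights


def mean_squared_error(samples, weights, centers, width):
    """Average squared prediction error over the samples."""
    total = sum((y - predict(x, weights, centers, width)) ** 2 for x, y in samples)
    return total / len(samples)


# Stand-in for the unknown system uncertainty (assumed): sin(x) on [-pi, pi].
centers = [-math.pi + i * math.pi / 4 for i in range(9)]
width = 0.8
samples = [(-math.pi + i * math.pi / 10, math.sin(-math.pi + i * math.pi / 10))
           for i in range(21)]

weights = fit(samples, centers, width)
mse_before = mean_squared_error(samples, [0.0] * len(centers), centers, width)
mse_after = mean_squared_error(samples, weights, centers, width)
```

Whatever error remains after fitting plays the role of the approximation error that the abstract folds, together with the external disturbance, into the compounded disturbance.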
Perceptual tasks such as object matching, mammogram interpretation, mental rotation, and satellite imagery change detection often require the assignment of correspondences to fuse information across views. We apply techniques developed for machine translation to the gaze data recorded from a complex perceptual matching task modeled after fingerprint examinations. The gaze data provide temporal sequences that the machine translation algorithm uses to estimate the subjects' assumptions of corresponding regions. Our results show that experts and novices have similar surface behavior, such as the number of fixations made or the duration of fixations. However, the approach applied to data from experts is able to identify more corresponding areas between two prints. The fixations that are associated with clusters that map with high probability to corresponding locations on the other print are likely to have greater utility in a visual matching task. These techniques address a fundamental problem in eye tracking research with perceptual matching tasks: Given that the eyes always point somewhere, which fixations are the most informative and therefore are likely to be relevant for the comparison task?
Forensic evidence often involves an evaluation of whether two impressions were made by the same source, such as whether a fingerprint from a crime scene has detail in agreement with an impression taken from a suspect. Human experts currently outperform computer-based comparison systems, but the strength of the evidence exemplified by the observed detail in agreement must be evaluated against the possibility that some other individual may have created the crime scene impression. Therefore, the strongest evidence comes from features in agreement that are also not shared with other impressions from other individuals. We characterize the nature of human expertise by applying two extant metrics to the images used in a fingerprint recognition task and use eye gaze data from experts to both tune and validate the models. The Attention via Information Maximization (AIM) model quantifies the rarity of regions in the fingerprints to determine diagnosticity for purposes of excluding alternative sources. The CoVar model captures relationships between low-level features, mimicking properties of the early visual system. Both models produced classification and generalization performance in the 75%–80% range when classifying where experts tend to look. A validation study using regions identified by the AIM model as diagnostic demonstrates that human experts perform better when given regions of high diagnosticity. The computational nature of the metrics may help guard against wrongful convictions, as well as provide a quantitative measure of the strength of evidence in casework.
Our computational studies of infant language learning estimate the inherent difficulty of Arbib's proposal. We show that body language provides a strikingly helpful scaffold for learning language that may be necessary but not sufficient, given the absence of sophisticated language in other species. The extraordinary language abilities of Homo sapiens must have evolved from other pressures, such as sexual selection.
Culture is surely important in human learning. But the relation between culture and psychological mechanism needs clarification in three areas: (1) All learning takes place in real time and through real-time mechanisms; (2) Social correlations are just one kind of learnable correlation; and (3) The proper frame of reference for cognitive theories is the perspective of the learner.
When humans are addressing multiple robots with informative speech acts, their cognitive resources are shared between all the participating robot agents. For each moment, the user’s behavior is not only determined by the actions of the robot that they are directly gazing at, but also shaped by the behaviors from all the other robots in the shared environment. We define cooperative behavior as the action performed by the robots that are not capturing the user’s direct attention. In this paper, we are interested in how the human participants adjust and coordinate their own behavioral cues when the robot agents are performing different cooperative gaze behaviors. A novel gaze-contingent platform was designed and implemented. The robots’ behaviors were triggered by the participant’s attentional shifts in real time. Results showed that the human participants were highly sensitive when the robot agents were performing different cooperative gazing behaviors. Keywords: human-robot interaction; multi-robot interaction; multiparty interaction; eye gaze cue; embodied conversational agent.
The present study explored heterogeneity in the association between engaged living and problematic Internet use (PIU). This study included 641 adolescents from four junior-senior high schools in Guangzhou, China. Besides the standard linear regression analysis, mixture regression analysis was conducted to detect subgroups of adolescents based on their divergent associations between engaged living and PIU. Sex, age, and psychological needs were further compared among the latent subgroups. The results showed that a mixture regression model could account for more variance in PIU than a traditional linear regression model, and identified three subgroups based on their class-specific regression of PIU on engaged living. For the High-PIU class, lower social integration and higher absorption were associated with increased PIU; for the Medium-PIU class, only high social integration was linked with an increase in PIU. For the Low-PIU class, no relation between engaged living and PIU was found. Additionally, being male or having a lower level of satisfied psychological needs increased the link between engaged living and PIU. The results indicated a heterogeneous relationship between engaged living and PIU among adolescents, and prevention or intervention programs should be tailored specifically to subgroups with moderate or high levels of PIU and to those with lower levels of psychological needs’ satisfaction, as identified by the mixture regression model.