Cognitive Science 35 (1):119-155 (2011)
Abstract
This paper reconsiders the diphone-based word segmentation model of Cairns, Shillcock, Chater, and Levy (1997) and Hockema (2006), previously thought to be unlearnable. A statistically principled learning model is developed using Bayes’ theorem and reasonable assumptions about infants’ implicit knowledge. The ability to recover phrase-medial word boundaries is tested using phonetic corpora derived from spontaneous interactions with children and adults. The (unsupervised and semi-supervised) learning models are shown to exhibit several crucial properties. First, only a small amount of language exposure is required to achieve the model’s ceiling performance, equivalent to between 1 day and 1 month of caregiver input. Second, the models are robust to variation, both in the free parameter and the input representation. Finally, both the learning and baseline models exhibit undersegmentation, argued to have significant ramifications for speech processing as a whole.
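The core idea of a diphone-based segmenter can be illustrated with a minimal sketch. It is not the paper's learning model: it assumes word boundaries are observed in the training data (closer in spirit to a supervised baseline), treats letters as stand-ins for phones, and uses hypothetical names (`train`, `p_boundary`, `segment`, `threshold`). It applies Bayes' theorem, p(# | xy) = p(xy | #) p(#) / p(xy), to each phrase-medial diphone xy and posits a boundary when the posterior exceeds a threshold, which plays the role of the free parameter.

```python
from collections import Counter


def train(phrases):
    """Count phrase-medial diphones, and those that span a word boundary.

    phrases: list of phrases, each a list of words (strings of phone symbols).
    Assumption: boundaries are given, so this is a supervised toy estimator.
    """
    spanning = Counter()  # diphone xy seen with a word boundary between x and y
    total = Counter()     # every occurrence of diphone xy inside a phrase
    for words in phrases:
        phones = "".join(words)
        # Indices i such that a word boundary falls just before phones[i].
        boundaries = set()
        pos = 0
        for w in words[:-1]:
            pos += len(w)
            boundaries.add(pos)
        for i in range(1, len(phones)):
            diphone = phones[i - 1:i + 1]
            total[diphone] += 1
            if i in boundaries:
                spanning[diphone] += 1
    return spanning, total


def p_boundary(diphone, spanning, total):
    """Posterior p(# | xy) via Bayes' theorem, from relative frequencies."""
    n_total = sum(total.values())
    n_span = sum(spanning.values())
    if total[diphone] == 0 or n_span == 0:
        return 0.0  # unseen diphone: this toy simply posits no boundary
    p_b = n_span / n_total                      # prior p(#)
    p_xy_given_b = spanning[diphone] / n_span   # likelihood p(xy | #)
    p_xy = total[diphone] / n_total             # evidence p(xy)
    return p_xy_given_b * p_b / p_xy


def segment(phones, spanning, total, threshold=0.5):
    """Insert a space wherever p(# | xy) exceeds the threshold."""
    if not phones:
        return ""
    out = [phones[0]]
    for i in range(1, len(phones)):
        if p_boundary(phones[i - 1:i + 1], spanning, total) > threshold:
            out.append(" ")
        out.append(phones[i])
    return "".join(out)
```

For example, after training on the toy corpus `[["the", "dog"], ["the", "cat"], ["a", "dog"]]`, calling `segment("thedog", spanning, total)` returns `"the dog"`, because the diphone `ed` occurs only across a boundary in that corpus. An unseen diphone yields no boundary, which is one simple way such a segmenter can undersegment.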
Similar books and articles
Which Came First: Infants Learning Language or Motherese? Heather Bortfeld - 2004 - Behavioral and Brain Sciences 27 (4):505-506.
The Effect of Sonority on Word Segmentation: Evidence for the Use of a Phonological Universal. Marc Ettlinger, Amy S. Finn & Carla L. Hudson Kam - 2012 - Cognitive Science 36 (4):655-673.
Effects of Visual Information on Adults' and Infants' Auditory Statistical Learning. Erik D. Thiessen - 2010 - Cognitive Science 34 (6):1093-1106.
Human Semi-Supervised Learning. Bryan R. Gibson, Timothy T. Rogers & Xiaojin Zhu - 2013 - Topics in Cognitive Science 5 (1):132-172.
iMinerva: A Mathematical Model of Distributional Statistical Learning. Erik D. Thiessen & Philip I. Pavlik - 2013 - Cognitive Science 37 (2):310-343.
Mechanisms of Implicit Learning: Connectionist Models of Sequence Processing. Axel Cleeremans - 1993 - MIT Press.
Editors' Introduction: Why Formal Learning Theory Matters for Cognitive Science. Sean Fulop & Nick Chater - 2013 - Topics in Cognitive Science 5 (1):3-12.
Bayesian Model Learning Based on Predictive Entropy. Jukka Corander & Pekka Marttinen - 2006 - Journal of Logic, Language and Information 15 (1-2):5-20.
Automatic Phonetic Segmentation of Hindi Speech Using Hidden Markov Model. Archana Balyan, S. S. Agrawal & Amita Dev - 2012 - AI and Society 27 (4):543-549.
Applying Forward Models to Sequence Learning: A Connectionist Implementation. Axel Cleeremans - unknown
Using Variability to Guide Dimensional Weighting: Associative Mechanisms in Early Word Learning. Keith S. Apfelbaum & Bob McMurray - 2011 - Cognitive Science 35 (6):1105-1138.
Spontaneous Coordination and Evolutionary Learning Processes in an Agent-Based Model. Pierre Barbaroux & Gilles Enée - 2005 - Mind and Society 4 (2):179-195.
Citations of this work
Cognitive Science in the Era of Artificial Intelligence: A Roadmap for Reverse-Engineering the Infant Language-Learner. Emmanuel Dupoux - 2018 - Cognition 173 (C):43-59.
Learning Phonemes With a Proto-Lexicon. Andrew Martin, Sharon Peperkamp & Emmanuel Dupoux - 2013 - Cognitive Science 37 (1):103-124.
Linguistic Constraints on Statistical Word Segmentation: The Role of Consonants in Arabic and English. Itamar Kastner & Frans Adriaans - 2018 - Cognitive Science 42 (S2):494-518.
Infants' Developing Sensitivity to Native Language Phonotactics: A Meta-Analysis. Megha Sundara, Z. L. Zhou, Canaan Breiss, Hironori Katsuda & Jeremy Steffman - 2022 - Cognition 221 (C):104993.
Using Predictability for Lexical Segmentation. Çağrı Çöltekin - 2017 - Cognitive Science 41 (7):1988-2021.
References found in this work
Shortlist B: A Bayesian Model of Continuous Speech Recognition. Dennis Norris & James M. McQueen - 2008 - Psychological Review 115 (2):357-395.
A Bayesian Framework for Word Segmentation: Exploring the Effects of Context. Sharon Goldwater, Thomas L. Griffiths & Mark Johnson - 2009 - Cognition 112 (1):21-54.
Distributional Regularity and Phonotactic Constraints Are Useful for Segmentation. Michael R. Brent & Timothy A. Cartwright - 1996 - Cognition 61 (1-2):93-125.
Frequent Frames as a Cue for Grammatical Categories in Child Directed Speech. Toben H. Mintz - 2003 - Cognition 90 (1):91-117.