Linguistic Constraints on Statistical Word Segmentation: The Role of Consonants in Arabic and English

Cognitive Science 42 (S2):494-518 (2018)

Abstract

Statistical learning is often taken to lie at the heart of many cognitive tasks, including the acquisition of language. One particular task in which probabilistic models have achieved considerable success is the segmentation of speech into words. However, these models have mostly been tested against English data, and as a result little is known about how a statistical learning mechanism copes with input regularities that arise from the structural properties of different languages. This study focuses on statistical word segmentation in Arabic, a Semitic language in which words are built around consonantal roots. We hypothesize that segmentation in such languages is facilitated by tracking consonant distributions independently from intervening vowels. Previous studies have shown that human learners can track consonant probabilities across intervening vowels in artificial languages, but it is unknown to what extent this ability would be beneficial in the segmentation of natural language. We assessed the performance of a Bayesian segmentation model on English and Arabic, comparing consonant-only representations with full representations. In addition, we examined to what extent structurally different proto-lexicons reflect adult language. The results suggest that for a child learning a Semitic language, separating consonants from vowels is beneficial for segmentation. These findings indicate that probabilistic models require appropriate linguistic representations in order to effectively meet the challenges of language acquisition.

Similar books and articles

Which came first: Infants learning language or motherese? Heather Bortfeld - 2004 - Behavioral and Brain Sciences 27 (4):505-506.
Using Predictability for Lexical Segmentation. Çağrı Çöltekin - 2017 - Cognitive Science 41 (7):1988-2021.