EARSHOT: A Minimal Neural Network Model of Incremental Human Speech Recognition

Cognitive Science 44 (4):e12823 (2020)

Abstract

Despite the lack of invariance problem (the many‐to‐many mapping between acoustics and percepts), human listeners experience phonetic constancy and typically perceive what a speaker intends. Most models of human speech recognition (HSR) have side‐stepped this problem, working with abstract, idealized inputs and deferring the challenge of working with real speech. In contrast, carefully engineered deep learning networks allow robust, real‐world automatic speech recognition (ASR). However, the complexities of deep learning architectures and training regimens make it difficult to use them to provide direct insights into mechanisms that may support HSR. In this brief article, we report preliminary results from a two‐layer network that borrows one element from ASR, long short‐term memory nodes, which provide dynamic memory for a range of temporal spans. This allows the model to learn to map real speech from multiple talkers to semantic targets with high accuracy, with human‐like timecourse of lexical access and phonological competition. Internal representations emerge that resemble phonetically organized responses in human superior temporal gyrus, suggesting that the model develops a distributed phonological code despite no explicit training on phonetic or phonemic targets. The ability to work with real speech is a major advance for cognitive models of HSR.
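
The architecture described in the abstract (a recurrent layer of long short-term memory nodes feeding an output layer of semantic targets) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. The class name `TinyLSTM`, the layer sizes, the sigmoid output layer, the random untrained weights, and the random input frames are all assumptions chosen for illustration; the abstract specifies none of them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Hypothetical sketch: one LSTM hidden layer plus a sigmoid output layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [input frame; previous hidden state].
        self.W = {g: rand_matrix(n_hidden, n_in + n_hidden, rng) for g in "ifog"}
        self.b = {g: [0.0] * n_hidden for g in "ifog"}
        self.W_out = rand_matrix(n_out, n_hidden, rng)
        self.b_out = [0.0] * n_out
        self.n_hidden = n_hidden

    def step(self, x, h, c):
        """Advance the hidden state h and cell state c by one input frame x."""
        z = x + h  # list concatenation
        pre = {
            g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
            for g in "ifog"
        }
        i_gate = [sigmoid(v) for v in pre["i"]]
        f_gate = [sigmoid(v) for v in pre["f"]]
        o_gate = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["g"]]
        # The cell state carries memory across frames; the gates decide
        # how much to keep, how much to write, and how much to expose.
        c_new = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
        h_new = [o * math.tanh(cv) for o, cv in zip(o_gate, c_new)]
        return h_new, c_new

    def run(self, frames):
        """Return one output vector per input frame (incremental processing)."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        outputs = []
        for x in frames:
            h, c = self.step(list(x), h, c)
            y = [sigmoid(v + b) for v, b in zip(matvec(self.W_out, h), self.b_out)]
            outputs.append(y)
        return outputs


# Illustrative sizes only: 8 input channels, 4 LSTM nodes, 6 output units.
rng = random.Random(1)
frames = [[rng.random() for _ in range(8)] for _ in range(5)]
model = TinyLSTM(n_in=8, n_hidden=4, n_out=6)
outputs = model.run(frames)
```

Because `run` emits an output vector after every frame, the activation of each output unit can be tracked over time, which is the kind of incremental readout that a timecourse analysis of lexical access would need. Training is omitted here, so the outputs of this sketch carry no meaning beyond showing the data flow.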

Similar books and articles

Human Skin Color Detection Using Neural Networks. Arvin Agah & Mohammadreza Hajiarbabi - 2015 - Journal of Intelligent Systems 24 (4):425-436.
Diabetes Prediction Using Artificial Neural Network. Nesreen Samer El_Jerjawi & Samy S. Abu-Naser - 2018 - International Journal of Advanced Science and Technology 121:54-64.
Representing Types as Neural Events. Robin Cooper - 2019 - Journal of Logic, Language and Information 28 (2):131-155.
Merging information versus speech recognition. Irene Appelbaum - 2000 - Behavioral and Brain Sciences 23 (3):325-326.

Analytics

Added to PP
2020-04-11

Author's Profile

Paul Allopenna
Brown University