Abstract
A central tenet of cognitive linguistics is that adults’ knowledge of language consists of a structured inventory of constructions, including various two-argument constructions such as the active, the passive and “fronting” constructions. But how do speakers choose which construction to use for a particular utterance, given constraints such as discourse/information structure and the semantic fit between verb and construction? The goal of the present study was to build a computational model of this phenomenon for two-argument constructions in Mandarin. First, we conducted a grammaticality judgment study with 60 native speakers, which demonstrated that, across 57 verbs, semantic affectedness – as determined by a further 16 native speakers – predicted each verb’s relative acceptability in the bei-passive and ba-active constructions, but not in the Notional Passive or SVO Active constructions. Second, in order to simulate acquisition of these competing constraints, we built a computational model that learns to map from corpus-derived input to an output representation corresponding to these four constructions. The model was able to predict the judgments obtained in Study 1 of the relative acceptability of the test verbs in the ba-active and bei-passive constructions, with model-human correlations in the region of r = 0.5 and r = 0.3, respectively. Surprisingly, these correlations increased when lexical verb identity was removed, perhaps because this information leads to over-fitting of the training set. These findings suggest the intriguing possibility that acquiring constructions involves forgetting as a mechanism for abstracting across certain fine-grained lexical details and idiosyncrasies.
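As an illustration of how a model-human correlation of this kind can be computed, the following is a minimal sketch of Pearson's r over per-verb scores. It is not the study's analysis code: the function name `pearson_r` and the variables `model_scores` and `human_ratings` are hypothetical, and the numbers are invented placeholders rather than data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented placeholder values, one entry per verb (NOT data from the study):
# a model's predicted preference for a construction and the mean human rating.
model_scores = [0.10, 0.40, 0.35, 0.80]
human_ratings = [1.0, 2.0, 3.0, 4.0]

r = pearson_r(model_scores, human_ratings)
print(f"model-human r = {r:.2f}")
```

In such a setup, one correlation would be computed per construction (for example, one for ba-active scores and one for bei-passive scores), with each verb contributing a single pair of values.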