Unsupervised context sensitive language acquisition from a large corpus

Shimon Edelman

Abstract

We describe a pattern acquisition algorithm that learns, in an unsupervised fashion, a streamlined representation of linguistic structures from a plain natural-language corpus. This paper addresses the issues of learning structured knowledge from a large-scale natural language data set, and of generalization to unseen text. The implemented algorithm represents sentences as paths on a graph whose vertices are words. Significant patterns, determined by recursive context-sensitive statistical inference, form new vertices. Linguistic constructions are represented by trees composed of significant patterns and their associated equivalence classes. An input module allows the algorithm to be subjected to a standard test of English as a Second Language proficiency. The results are encouraging: the model attains a level of performance considered to be “intermediate” for 9th-grade students, despite having been trained on a corpus containing transcribed speech of parents directed to small children.
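
The graph representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_graph`, the boundary markers, and the toy corpus are all assumptions made for illustration, and the sketch covers only the first step (sentences as paths over word vertices, with edge counts). The recursive, context-sensitive significance test that promotes patterns to new vertices is not attempted here.

```python
from collections import defaultdict

# Hypothetical sentence-boundary markers (an assumption, not from the paper).
BEGIN, END = "<begin>", "<end>"


def build_graph(sentences):
    """Represent each sentence as a path over word vertices.

    Returns (paths, edges), where paths is a list of vertex lists and
    edges maps each (source, target) pair to the number of times it is
    traversed across all paths.
    """
    paths = []
    edges = defaultdict(int)
    for sentence in sentences:
        # Naive whitespace tokenization, chosen only to keep the sketch short.
        path = [BEGIN] + sentence.lower().split() + [END]
        paths.append(path)
        for source, target in zip(path, path[1:]):
            edges[(source, target)] += 1
    return paths, dict(edges)


# Toy corpus, invented for illustration.
corpus = ["the dog barks", "the cat sleeps", "the dog sleeps"]
paths, edges = build_graph(corpus)
```

Edges shared by many paths, such as the one from `the` to `dog` in this toy corpus, are the kind of raw co-occurrence statistics from which a pattern detector could start.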

Author's Profile

Shimon Edelman

Cornell University

Analytics

Added to PP
2010-12-22

Citations of this work

Bridging computational, formal and psycholinguistic approaches to language. Shimon Edelman - unknown

Bridging language with the rest of cognition: computational, algorithmic and neurobiological issues and methods. Shimon Edelman - unknown

Learning Syntactic Constructions from Raw Corpora. Shimon Edelman - unknown

Some Tests of an Unsupervised Model of Language Acquisition. Shimon Edelman - unknown
