Unsupervised learning and grammar induction
Abstract
In this chapter we consider unsupervised learning from two perspectives. First, we briefly look at its advantages and disadvantages as an engineering technique applied to large corpora in natural language processing. While supervised learning generally achieves greater accuracy with less data, unsupervised learning offers significant savings in the intensive labour required for annotating text. Second, we discuss the possible relevance of unsupervised learning to debates on the cognitive basis of human language acquisition. In this context we explore the implications of recent work on grammar induction for poverty-of-stimulus arguments that purport to motivate a strong-bias model of language learning, commonly formulated as a theory of Universal Grammar (UG). We examine the second issue both as a problem in computational learning theory and with reference to empirical work on unsupervised machine learning (ML) of syntactic structure. We compare two models of learning theory and the place of unsupervised learning within each of them. Looking at recent work on part-of-speech tagging and the recognition of syntactic structure, we see how far unsupervised ML methods have come in acquiring different kinds of grammatical knowledge from raw text.