Natural Language Grammar Induction using a Constituent-Context Model

Abstract

This paper presents a novel approach to the unsupervised learning of syntactic analyses of natural language text. Most previous work has focused on maximizing likelihood according to generative PCFG models. In contrast, we employ a simpler probabilistic model over trees based directly on constituent identity and linear context, and use an EM-like iterative procedure to induce structure. This method produces much higher quality analyses, giving the best published results on the ATIS dataset.
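The abstract does not spell out the model, so the following is only a minimal sketch of what "constituent identity and linear context" could look like as data. It assumes the input is a sequence of part-of-speech tags, that a span's identity is the tag sequence it covers, and that its context is the pair of tags immediately to its left and right. The names `span_features`, `all_spans`, `label_spans`, and the `#` boundary marker are all hypothetical, and the EM-like re-estimation step is not shown.

```python
from typing import List, Tuple

BOUNDARY = "#"  # hypothetical sentence-boundary marker (an assumption)

Span = Tuple[int, int]  # half-open span [start, end) over tag positions


def span_features(tags: List[str], start: int, end: int):
    """Return (yield, context) for the span tags[start:end].

    yield   = the tag sequence covered by the span (assumed "constituent identity")
    context = the tags just outside the span (assumed "linear context")
    """
    left = tags[start - 1] if start > 0 else BOUNDARY
    right = tags[end] if end < len(tags) else BOUNDARY
    return tuple(tags[start:end]), (left, right)


def all_spans(n: int) -> List[Span]:
    """Enumerate every non-empty contiguous span over n positions."""
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


def label_spans(tags: List[str], brackets: List[Span]):
    """Pair each span's features with whether the bracketing marks it as a constituent."""
    chosen = set(brackets)
    return [
        (span_features(tags, i, j), (i, j) in chosen)
        for i, j in all_spans(len(tags))
    ]


if __name__ == "__main__":
    tags = ["DT", "NN", "VBD"]
    brackets = [(0, 2), (0, 3)]  # a candidate bracketing: [[DT NN] VBD]
    for (yield_, context), is_constituent in label_spans(tags, brackets):
        print(yield_, context, is_constituent)
```

Under these assumptions, an iterative procedure could alternate between scoring candidate bracketings from the current yield and context statistics and re-estimating those statistics from the labeled spans.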

Analytics

Added to PP
2010-12-22

Author's Profile

Daniel Klein
Harvard University

Similar books and articles

A Grammar Systems Approach to Natural Language Grammar. M. Dolores Jiménez López - 2006 - Linguistics and Philosophy 29 (4):419-454.
Talking About Trees and Truth-Conditions. Reinhard Muskens - 2001 - Journal of Logic, Language and Information 10 (4):417-455.