Abstract
We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.
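To illustrate the kind of state split the abstract refers to, the sketch below shows parent annotation (vertical markovization), one of the splits discussed in the paper: each nonterminal is marked with its parent's category so that, for example, an NP under S and an NP under VP receive distinct symbols and hence distinct rewrite distributions. The tree representation and function name here are assumptions chosen for illustration, not the paper's own code.

```python
# Minimal sketch of parent annotation, one linguistically motivated state split.
# Trees are (label, children) tuples; children is a list of subtrees, or a
# string when the node is a preterminal (POS tag over a word).

def parent_annotate(tree, parent_label="ROOT"):
    """Return a copy of `tree` with each phrasal label suffixed by its
    parent's label, e.g. NP becomes NP^S.  Preterminals are left as-is
    in this simplified sketch."""
    label, children = tree
    if isinstance(children, str):           # preterminal: keep the POS tag
        return (label, children)
    new_label = f"{label}^{parent_label}"
    return (new_label, [parent_annotate(child, label) for child in children])


if __name__ == "__main__":
    # (S (NP (DT the) (NN cat)) (VP (VBD sat)))
    t = ("S", [("NP", [("DT", "the"), ("NN", "cat")]),
               ("VP", [("VBD", "sat")])])
    print(parent_annotate(t))
    # ('S^ROOT', [('NP^S', [('DT', 'the'), ('NN', 'cat')]),
    #             ('VP^S', [('VBD', 'sat')])])
```

Reading a PCFG off trees annotated this way yields separate rule probabilities for subject and object NPs, which is exactly the sort of false independence assumption the vanilla treebank grammar makes.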
Citations of this work
Recurrent Neural Network-Based Models for Recognizing Requisite and Effectuation Parts in Legal Texts. Truong-Son Nguyen, Le-Minh Nguyen, Satoshi Tojo, Ken Satoh & Akira Shimazu - 2018 - Artificial Intelligence and Law 26 (2):169-199.
Similar books and articles
Fast Exact Inference with a Factored Model for Natural Language Parsing. Dan Klein & Christopher D. Manning - unknown
Fast Exact Inference with a Factored Model for Natural Language Parsing. Christopher Manning - manuscript
Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank. Dan Klein & Christopher D. Manning - unknown
Parse Selection on the Redwoods Corpus: 3rd Growth Results. Christopher D. Manning & Kristina Toutanova - unknown
An O(n³) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars. Christopher Manning - manuscript
An O(n³) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars. Dan Klein & Christopher D. Manning - unknown
A Generative Constituent-Context Model for Improved Grammar Induction. Dan Klein & Christopher D. Manning - unknown