Named Entity Recognition with Character-Level Models

Abstract

We discuss two named-entity recognition models that use characters and character n-grams either exclusively or as an important part of their data representation. The first model is a character-level HMM with minimal context information, and the second is a maximum-entropy conditional Markov model with substantially richer context features. Our best model achieves an overall F1 of 86.07% on the English test data (92.31% on the development data), a 25% error reduction over the same model without word-internal (substring) features.
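
The models themselves are standard sequence classifiers; what the abstract highlights is the word-internal substring representation. As a rough illustration (not the authors' code), the sketch below builds character n-gram and word-shape features of the kind a maximum-entropy conditional Markov model could condition on; the function names, n-gram range, and feature templates are assumptions for exposition only.

```python
# Illustrative sketch only: character n-gram ("substring") features for one
# token, as might feed a maximum-entropy conditional Markov model. The
# templates and names here are invented, not taken from the paper.

def char_ngram_features(token, n_min=2, n_max=4):
    """Substring features of a single token, with boundary markers."""
    padded = "<" + token + ">"  # '<' and '>' mark the word's start and end
    feats = [f"ngram={padded[i:i + n]}"
             for n in range(n_min, n_max + 1)
             for i in range(len(padded) - n + 1)]
    # Coarse word-shape cue: capitalization and digit pattern
    shape = "".join("X" if c.isupper() else "x" if c.islower()
                    else "d" if c.isdigit() else c for c in token)
    feats.append(f"shape={shape}")
    return feats

def context_features(tokens, i):
    """Features for position i: the token's substrings plus neighboring words."""
    feats = char_ngram_features(tokens[i])
    feats.append(f"prev_word={tokens[i - 1] if i > 0 else '<S>'}")
    feats.append(f"next_word={tokens[i + 1] if i + 1 < len(tokens) else '</S>'}")
    return feats

if __name__ == "__main__":
    sentence = ["Grace", "Hopper", "visited", "London", "."]
    for i, tok in enumerate(sentence):
        print(tok, context_features(sentence, i))
```

Read this way, the reported 25% error reduction is consistent with the quoted scores: if (100 - F1) is taken as the error, an F1 of 86.07% (error 13.93%) against a substring-free baseline of roughly 81.4% (error about 18.6%) gives a reduction of about 25%.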

Similar books and articles

Hesitations and Clarifications on a Model to Abandon Feedback. Louisa M. Slowiaczek - 2000 - Behavioral and Brain Sciences 23 (3): 347.
Merging Information Versus Speech Recognition. Irene Appelbaum - 2000 - Behavioral and Brain Sciences 23 (3): 325-326.