A Conditional Random Field Word Segmenter

Abstract

We present a Chinese word segmentation system submitted to the closed track of the SIGHAN Bakeoff 2005. Our segmenter was built using a conditional random field sequence model, which provides a framework for using a large number of linguistic features, such as character identity, morphological features, and character reduplication features. Because our morphological features were extracted from the training corpora automatically, our system was not biased toward any particular variety of Mandarin; in particular, it does not overfit the variety of Mandarin most familiar to the system's designers. Our final system achieved an F-score of..
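The abstract does not list the exact feature templates, so the following is only a minimal sketch of how character-level features of the kinds it names (character identity, reduplication, and automatically extracted morphological cues) might be prepared for a sequence labeler. The B/I tagging scheme, the function names, the feature names, and the `min_count` threshold are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Set


def words_to_labels(words: List[str]) -> List[str]:
    """Turn a segmented sentence into per-character labels.

    Assumed scheme: "B" marks the first character of a word, "I" any other.
    """
    labels: List[str] = []
    for w in words:
        labels.extend(["B"] + ["I"] * (len(w) - 1))
    return labels


def labels_to_words(sent: str, labels: List[str]) -> List[str]:
    """Rebuild words from per-character labels (inverse of words_to_labels)."""
    words: List[str] = []
    for ch, lab in zip(sent, labels):
        if lab == "B" or not words:
            words.append(ch)      # start a new word
        else:
            words[-1] += ch       # extend the current word
    return words


def collect_suffix_chars(training_words: List[str], min_count: int = 2) -> Set[str]:
    """Hypothetical stand-in for automatically extracted morphological cues:
    characters that end at least `min_count` multi-character training words."""
    counts = Counter(w[-1] for w in training_words if len(w) > 1)
    return {c for c, n in counts.items() if n >= min_count}


def char_features(sent: str, i: int, suffix_chars: Set[str]) -> Dict[str, object]:
    """Features for the character at position i (illustrative templates only)."""
    c = sent[i]
    prev_c = sent[i - 1] if i > 0 else "<BOS>"
    next_c = sent[i + 1] if i + 1 < len(sent) else "<EOS>"
    return {
        "char": c,                          # character identity
        "prev_char": prev_c,
        "next_char": next_c,
        "redup_prev": c == prev_c,          # reduplication with previous character
        "redup_next": c == next_c,          # reduplication with next character
        "is_suffix_char": c in suffix_chars,  # corpus-derived morphological cue
    }
```

Each sentence would then become a list of feature dictionaries, one per character, paired with the label sequence from `words_to_labels`; a conditional random field toolkit of the reader's choice could be trained on those pairs. Deriving `suffix_chars` from the training data rather than from a hand-built list mirrors the abstract's point about not favoring one variety of Mandarin.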


Analytics

Added to PP
2010-12-22

