Abstract
Linking constructions involving 的 (DE) are ubiquitous in Chinese, and can be translated into English in many different ways. This is a major source of machine translation error, even when syntax-sensitive translation models are used. This paper explores how more information about the syntactic, semantic, and discourse context of uses of 的 (DE) can help in choosing an appropriate English translation strategy. We describe a finer-grained classification of 的 (DE) constructions in Chinese NPs, construct a corpus of annotated examples, and then train a log-linear classifier with linguistically inspired features. We use the DE classifier to preprocess MT data by explicitly labeling 的 (DE) constructions and reordering phrases, and show that our approach provides significant BLEU point gains on MT02 (+1.24), MT03 (+0.88) and MT05 (+1.49) on a phrase-based system. The improvement persists when a hierarchical reordering model is applied.