AI Seminar
-------------------------------
Tuesday, September 23rd, 2003
4:00 pm - 5:30 pm
175 ATL (Large Conference Room)

"Exploiting sentence paraphrasing for machine translation
OR
machine translation using alternative phrasings of the same sentence"

Dragomir Radev
Department of Electrical Engineering and Computer Science
University of Michigan
----------------------------------

Recent work in statistical machine translation has moved from simple word-based alignment models to higher-order models involving syntax (e.g., alignment templates [Och,Tillmann&Ney99; Och02] or tree-to-string models [Yamada&Knight02]). Such models incorporate syntactic knowledge directly into the translation process.

A different technique, syntax-based reranking, was recently proposed for improving the quality of machine translation. Reranking (e.g., [Radev,Prager,Samn00; Collins00]) applies a discriminative model (e.g., maximum entropy) to a large list of candidates. A recent 13-person project, led by Franz Josef Och of ISI, used syntax-based reranking to incorporate a large number of (mainly) syntactic features in the reranking model for a large-scale Chinese-to-English machine translation system.

I will present an overview of the project and then turn to my individual contributions: two sets of features based on (1) alternative translation paraphrases and (2) lopsided (flipped) lexical dependencies. I will present evaluation results on the official NIST evaluation corpus that show the viability of these features and of the approach as a whole.