AI Seminar
-------------------------------
Tuesday, January 18th, 2005
4:00 pm - 5:30 pm
175 ATL (Large Conference Room)

"Machine Translation = Automata Theory + Probability Theory + Linguistics"

Kevin Knight
Information Sciences Institute
University of Southern California
===============================

Recently, machine translation (MT) systems have become much more accurate. A major reason is that machines now gather translation knowledge autonomously, combing through large amounts of human-translated material available on the web. What the systems learn are essentially finite-state Markov models -- target sub-strings are substituted for source sub-strings, followed by some re-ordering. A serious weakness, however, is that this kind of model can only support very weak linguistic transformations, and the trained models do not yet lead to reliably high-quality MT.

Over the past three years, many new probabilistic tree-based (versus string-based) models have been designed and tested on many natural language applications, including MT. Most of these models turn out to be instances of tree transducers, a formal automata model first described by W. Rounds and J. Thatcher in the 1960s and 70s. These automata open up new opportunities for us to marry deeper representations, automata theory, and machine learning. This talk will cover new learning algorithms for tree automata, together with experiments in machine translation.

Bio:
Kevin Knight is a Senior Research Scientist at USC's Information Sciences Institute, a Research Associate Professor in the Computer Science Department at USC, and a founder of Language Weaver, Inc. He received his Ph.D. from Carnegie Mellon University in 1991 and his BA from Harvard University in 1986. He is co-author (with Elaine Rich) of the textbook "Artificial Intelligence". His main research interests are statistical natural language processing, machine translation, natural language generation, and decipherment. Dr. Knight is serving as General Chair of the Association for Computational Linguistics conference to be held this year at the University of Michigan.