AI Seminar
-------------------------------
Tuesday, November 4th, 2003
4:00 pm - 5:30 pm
175 ATL (Large Conference Room)

"Predictive Models for Reinforcement Learning"

Satinder Singh
Department of Electrical Engineering and Computer Science
University of Michigan
----------------------------------

The use of Markov decision process (MDP) models to represent agent-environment interaction has been very fruitful for reinforcement learning and for artificial intelligence in general. After briefly reviewing some of these "fruits", I will discuss the limitations of MDP models and the need to go beyond them. The standard extension of MDPs to partially observable MDPs, or POMDPs, hasn't served us well, at least so far.

In this talk, I will present predictive state representations, or PSRs, a new class of predictive models for reinforcement learning. The key idea in PSRs is to represent the state of the environment by predictions of the observable outcomes of tests, or experiments, that the agent can perform in its environment. I will show that PSRs are more general than POMDPs and yet are at least as compact as POMDPs, and often more so. I will also present some results on learning PSR models from data, and conclude with some reasons for optimism about PSR models as well as directions for future work on PSRs.

* This talk describes joint work with a number of folks over three conference papers: Michael Littman and Rich Sutton a year ago, and more recently Matthew Rudary and Michael James.