[Ml-stat-talks] Colloquium Speaker Gerry Tesauro Wed Oct 26 4:30pm

Robert Schapire schapire at CS.Princeton.EDU
Thu Oct 20 14:54:43 EDT 2011


Gerry Tesauro (known for his work on TD-Gammon, a championship-level
backgammon program, and, more recently, on the Jeopardy!-playing program
Watson) will be visiting next Wednesday, 10/26, through the following
morning.  See the talk announcement below.

If you would like to meet with him, please contact Nicole Wagenblast 
<nwagenbl at CS.Princeton.EDU>, x8-4624.

Rob


How Watson Learns Superhuman Jeopardy! Strategies
Gerry Tesauro, IBM Research
Wednesday, October 26, 2011, 4:30 PM
Computer Science Small Auditorium (Room 105)

Major advances in Question Answering technology were needed for Watson 
to play Jeopardy! at championship level -- the show requires rapid-fire 
answers to challenging natural language questions, broad general 
knowledge, high precision, and accurate confidence estimates. In 
addition, Jeopardy! features four types of decision making carrying 
great strategic importance: (1) selecting the next clue when in control 
of the board; (2) deciding whether to attempt to buzz in; (3) wagering 
on Daily Doubles; (4) wagering in Final Jeopardy. This talk describes 
how Watson makes the above decisions using innovative quantitative 
methods that, in principle, maximize Watson's overall winning chances. 
We first describe our development of faithful simulation models of human 
contestants and the Jeopardy! game environment. We then present specific 
learning/optimization methods used in each strategy algorithm: these 
methods span a range of popular AI research topics, including Bayesian 
inference, game theory, dynamic programming, reinforcement learning, and
real-time "rollouts." Application of these methods yielded superhuman 
game strategies for Watson that significantly enhanced its overall 
competitive record.

Joint work with David Gondek, Jon Lenchner, James Fan and John Prager.

Gerald Tesauro is a Research Staff Member at IBM's TJ Watson Research 
Center. He is best known for developing TD-Gammon, a self-teaching 
neural network that learned to play backgammon at human world 
championship level. He has also worked on theoretical and applied 
machine learning in a wide variety of other settings, including 
multi-agent learning, dimensionality reduction, computer virus 
recognition, computer chess (Deep Blue), intelligent e-commerce agents 
and autonomic computing. Dr. Tesauro received BS and PhD degrees in
physics from the University of Maryland and Princeton University,
respectively.


