<html><body><p>Colloquium Speaker<br> Jeffrey Siskind, from Purdue University <br> Thursday, October 15, 12:30pm<br> Computer Science 105</p><p><br></p><p>

 Seeing, Saying, Doing, and Learning: Integrating Computer Vision, 

Natural Language Processing, Robotics, and Machine Learning Through 

Multidirectional Inference</p><p><br></p><p> The semantics of natural language can 

be grounded in perception and motor control with a unified cost function

 that supports multidirectional inference. I will present several 

instances of this approach.  The first is a cost function relating 

sentences, video, and a lexicon.  Performing inference from video and a 

lexicon to sentences allows the system to generate sentential descriptions of 
video.  Performing inference from sentences and a lexicon to video 
allows the system to search a video database for clips that match a sentential 
query.  Performing inference from sentences and video to a lexicon allows 
the system to learn a lexicon.  The second is the functional inverse of video 

captioning.  Instead of mapping video and object detections to 

sentences, one can map video and sentences to object detections.  This 

allows one to use sentential constraints on a video object codetection 

process to find objects without pretrained object detectors.  The third 

is a cost function relating sentences, robotic navigation paths, and a 

lexicon.  Performing inference from sentences and navigation paths to a 

lexicon allows the system to learn a lexicon.  Performing inference from 
navigation paths and a learned lexicon to sentences allows the system to 
generate sentential descriptions of paths driven by a mobile robot. 
Performing inference from sentences and a learned lexicon to navigation 
paths allows the system to plan and drive navigation paths that satisfy a 

sentential navigation request.  Finally, one can perform object 

codetection on the video stream from a robot-mounted camera during 

navigation to satisfy sentential requests and use the collection of 

constraints from vision, language, and robotics to detect, localize, and

 label objects in the environment without any pretrained object 

detectors.</p><p><br></p><p> Joint work with Andrei Barbu, Daniel Paul Barrett, Scott Alan Bronikowski, N. Siddharth, and Haonan Yu.</p><p><br></p><p>

 Jeffrey M. Siskind received the B.A. degree in computer science from 

the Technion, Israel Institute of Technology, Haifa, in 1979, the S.M. 

degree in computer science from the Massachusetts Institute of 

Technology (M.I.T.), Cambridge, in 1989, and the Ph.D. degree in 

computer science from M.I.T. in 1992. He held a postdoctoral fellowship 

at the University of Pennsylvania Institute for Research in Cognitive 

Science from 1992 to 1993. He was an assistant professor at the 

University of Toronto Department of Computer Science from 1993 to 1995, a

 senior lecturer at the Technion Department of Electrical Engineering in

 1996, a visiting assistant professor at the University of Vermont 

Department of Computer Science and Electrical Engineering from 1996 to 

1997, and a research scientist at NEC Research Institute, Inc. from 1997

 to 2001. He joined the Purdue University School of Electrical and 

Computer Engineering in 2002, where he is currently an associate 

professor. His research interests include computer vision, robotics, 

artificial intelligence, neuroscience, cognitive science, computational 

linguistics, child language acquisition, automatic differentiation, and 

programming languages and compilers.</p></body></html>