[talks] Colloquium Speaker: Andrej Karpathy Monday April 11, 12:30pm
Nicole E. Wagenblast
nwagenbl at CS.Princeton.EDU
Thu Apr 7 10:13:40 EDT 2016
Andrej Karpathy, Stanford University
Monday, April 11, 12:30pm
Computer Science 105
Connecting Images and Natural Language
Intelligent agents require the ability to perceive their environments, understand their high-level semantics, and communicate with humans. While computer vision has recently made great strides on visual recognition tasks, the predominant paradigm is to predict one or more fixed visual categories for each image. I will describe a line of work that significantly expands the vocabulary of our computer vision systems and allows them to express visual concepts in natural language, such as “a picture of a girl playing with a stack of legos”, or “a couple holding hands and walking on a beach”. In particular, the final model can take an image and both detect and describe in natural language all of its salient regions. My modeling techniques draw on recent advances in Deep Learning that allow us to construct and train neural networks with hundreds of millions of neurons that take raw images and map them directly to natural language sentences. I will show that the model generates qualitatively compelling results, and that quantitative evaluation and control experiments demonstrate the strength of this approach with respect to simpler baselines and previous methods.