Alejandro Newell will present his Pre-FPO "Learning to solve structured vision problems" on Tuesday, July 20, 2021 at 2pm via Zoom.


Zoom Link:



Examiners: Jia Deng (Adviser), Adam Finkelstein, Felix Heide

Readers: Olga Russakovsky, Szymon Rusinkiewicz


All are welcome to attend.


Title: Learning to solve structured vision problems

Abstract: We want computer vision models to understand the rich world captured in images and video. This means not just recognizing objects, but seeing how they relate and interact. In the first half of my talk, I will present my work developing a general system for capturing visual relationships. Combining our contributions in both neural architecture and loss design, we enable deep models to express arbitrary semantic graphs. This allows models to connect parts of the image into structured outputs, for example grouping body parts into poses. These models outperform prior work on various challenging vision tasks. In the second half of the talk, I will cover ongoing work that moves beyond single images into video. Connections across time are often captured at the pixel level with optical flow. We show how to improve existing flow systems with feedback mechanisms inspired by self-supervised losses. I will also discuss future work extending such systems to downstream video tasks. Together, the presented body of work lays the foundations for systems that understand the world and its interactions across space and time.