[talks] Yinda Zhang - PreFPO - April 10, 2018 at 2:00 pm CS402

Mon Apr 9 11:28:04 EDT 2018

Yinda Zhang will present PreFPO on Tuesday, April 10, 2018 at 2:00 pm in room CS402.

Committee Members: Thomas Funkhouser (Adviser)
Olga Russakovsky
Szymon Rusinkiewicz
Ryan Adams
James Hays (Georgia Institute of Technology)

All are welcome to attend. Title and abstract are below.

Title: “From Pixels to Scenes: Recovering 3D Geometry and Semantics for Indoor Environments”

Abstract:

Understanding the 3D geometry and semantics of the surrounding environment is in critically high demand for many applications, such as autonomous driving, robotics, and augmented reality. However, it is extremely challenging because of low-quality depth measurements caused by sensor failures and noise, limited access to ground truth data, and cluttered scenes with heavy occlusions and intervening objects. In this presentation, I will introduce a full spectrum of 3D scene understanding work that addresses these challenges. Starting from estimating a depth map, which is one of the most important immediate measurements of the 3D geometry of a scene, we introduce a learning-based active stereo system that learns in a self-supervised manner and reduces the disparity error to one tenth that of other canonical stereo systems. To further handle missing depth caused by sensor failures, we propose a method to effectively complete the depth map using information from an aligned color image. Beyond per-pixel depth, we then attempt to predict other high-level semantics at each pixel, such as surface normals and object boundaries. However, because large-scale supervision is lacking, we design a synthetic data generation framework that creates photo-realistic color renderings and a variety of accurate pixel-wise ground truths to facilitate the learning process and improve performance on real data. Finally, we pursue holistic scene understanding by estimating a 3D representation of the scene, in which objects and room layout are represented using 3D bounding boxes and planar surfaces, respectively. We propose methods to produce such a representation from either a single color panorama or a depth image by leveraging scene context. On the whole, these proposed methods produce an understanding of both 3D geometry and semantics from the most fine-grained pixel level to the holistic scene scale, which builds a foundation for, and could possibly inspire, future work on 3D scene understanding.

Barbara A. Mooring
Interim Graduate Coordinator
Computer Science Department
Princeton University