Yinda Zhang will present his FPO "From Pixels to Scenes: Recovering 3D Geometry and Semantics for Indoor Environments" on Thursday, 10/25/2018 at 3:00 pm in CS 302.

The members of his committee are as follows: Thomas Funkhouser (adviser); Examiners: Ryan Adams, Olga Russakovsky, and Thomas Funkhouser; Readers: Szymon Rusinkiewicz and James Hays (Georgia Institute of Technology).

A copy of his thesis is available upon request. Everyone is invited to attend his talk. The talk title and abstract follow below:

Understanding the 3D geometry and semantics of real environments is in critically high demand for many applications, such as autonomous driving, robotics, and augmented reality. However, it is extremely challenging due to imperfect and noisy measurements from real sensors, limited access to ground truth data, and cluttered scenes exhibiting heavy occlusions and intervening objects. To address these issues, this thesis introduces a series of works that produce a geometric and semantic understanding of the scene in both pixel-wise and holistic 3D representations. Starting from depth map estimation, a fundamental task in many approaches for reconstructing the 3D geometry of a scene, we introduce a learning-based active stereo system that is trained in a self-supervised fashion and reduces the disparity error to one tenth of that of other canonical stereo systems. To handle the more common case where only one input image is available for scene understanding, we create a high-quality synthetic dataset that facilitates pre-training of data-driven approaches, and we demonstrate that it improves both surface normal estimation and raw depth measurements from commodity RGBD sensors. Lastly, we pursue holistic 3D scene understanding by estimating a 3D representation of the scene, in which objects and room layout are represented by 3D bounding boxes and planar surfaces, respectively. We propose methods to produce such a representation from either a single color panorama or a depth image, leveraging scene context. On the whole, the proposed methods produce an understanding of both 3D geometry and semantics, from the most fine-grained pixel level to the holistic scene scale, building foundations that support future work in 3D scene understanding.