Kyle Genova will present his FPO "3D Representations for Learning to Reconstruct and Segment Shapes" on Friday, April 30, 2021 at 1PM via Zoom.
Zoom link: https://princeton.zoom.us/j/95639056934
The members of Kyle’s committee are as follows: Tom Funkhouser (adviser); Readers: Olga Russakovsky and Forrester Cole (Google); Examiners: Adam Finkelstein and Szymon Rusinkiewicz.
A copy of his thesis is available upon request; please email gradinfo@cs.princeton.edu if you would like one.
Everyone is invited to attend his talk.
Abstract follows below:
The focus of this dissertation is the novel use of shape representations to empower 3D reasoning for reconstruction and segmentation. It is organized into three sections based on application: domain-specific shape reconstruction (Chapter 2), general shape reconstruction (Chapter 3), and semantic segmentation (Chapter 4). In each chapter, we outline the setting and related work, and then introduce one or two approaches with a novel use of shape representation.
Our key contribution is to use shape representation to enable new types of supervision and improve generalization when learning 3D priors. Because current reconstruction and segmentation methods share the use of learned 3D encoder and decoder architectures, these contributions apply to both tasks.
In Chapters 2-4, we demonstrate experimentally that reconstruction and segmentation algorithms benefit from our choices of shape representation. A primary benefit of our approaches is enabling new types of supervision that require some property of the representation to be effective. A domain-specific representation enables supervising 3D face reconstruction with a face recognition network for the first time, resulting in provably more recognizable reconstructions (Chapter 2). Our SIF representation learns shape correspondence from only reconstruction supervision (Chapter 3). Large, diverse image collections are already semantically labeled, making it possible to train 3D semantic segmentation models for datasets without point cloud annotations (Chapter 4).
A secondary benefit is improved generalization by deriving better priors from existing supervision. We propose a new shape representation, LDIF, which is trained on existing 3D reconstruction data. LDIF learns robust local priors, improving generalization to unseen classes and shapes (Chapter 3). The addition of image-based supervision in segmentation algorithms improves generalization to cities with no 3D supervision (Chapter 4).
We conclude that our choices of representation enable new supervision, better generalization, and learning useful 3D priors from readily available labels (e.g., labeled and unlabeled images, or unlabeled shape collections). We hypothesize that effective future representations will build on this trend by deriving higher-level semantic priors from unannotated datasets and other inexpensive sources of supervision (Chapter 5).