Wei Hu will present his Pre FPO on Tuesday, April 13, 2021 at 4pm via Zoom.


Examiners: Sanjeev Arora (advisor), Mark Braverman, Chi Jin (ECE)

Readers: Elad Hazan, Jason D. Lee (ECE)


Title: Understanding Deep Learning via Analyzing Dynamics of Gradient Descent


The phenomenal successes of deep learning build upon the mysterious abilities of gradient-based optimization algorithms. Not only can these algorithms often successfully optimize complicated non-convex training objectives, but the solutions they find can also generalize remarkably well to unseen test data despite significant over-parameterization of the models. Classical approaches in optimization and learning theory that treat empirical risk minimization as a black box are insufficient to explain these mysteries in modern deep learning. In this talk, I will illustrate how we can make progress towards understanding optimization and generalization in deep learning through a more refined approach that opens the black box and analyzes the trajectory taken by the optimizer. In particular, I will present several results on the dynamics of the gradient descent algorithm, including: (i) solving low-rank matrix completion via deep linear neural networks, (ii) positive and negative convergence results that reveal the optimal depth-width tradeoff for efficiently training deep linear neural networks, and (iii) the connection between wide neural networks and neural tangent kernels, and its theoretical and practical implications.