Kaifeng Lyu will present his General Exam "Theoretical Analysis of Modern Neural Networks with Normalization Layers: Auto Rate-Tuning and Sharpness Reduction" on Tuesday, January 17, 2023 at 1:00 PM over Zoom.

Zoom link: https://princeton.zoom.us/j/97584410651

Committee Members: Sanjeev Arora (advisor), Jason D. Lee, Elad Hazan

Abstract:
 
Traditional machine learning techniques relied on convex optimization for training. However, the deep learning revolution of the last decade has led to significant progress in many machine learning tasks, even though the loss landscape is highly non-convex. My research focuses on the theoretical analysis of the training dynamics of deep neural nets, including convergence guarantees to low-loss solutions and implicit regularization effects of gradient-based methods. I will present our work on understanding the training dynamics in the presence of normalization layers (e.g., Batch Normalization, Layer Normalization), which are important building blocks of modern neural nets. Our work shows the following: (1) normalization layers make the training process adaptive to different loss landscapes, even when using non-adaptive methods such as SGD; (2) normalization layers prevent full-batch gradient descent from converging to sharp minima and introduce an implicit bias towards flatter minima with better generalization, following a continuous sharpness-reduction flow.
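
To give a concrete feel for point (1), below is a minimal sketch (assuming PyTorch; it is not part of the talk material) of the scale-invariance property that the auto rate-tuning analysis builds on: positively rescaling the weights that feed into a Batch Normalization layer leaves the network's output unchanged.

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 10)               # a batch of 32 examples with 10 features

linear = nn.Linear(10, 5, bias=False) # weights feeding into the normalization layer
bn = nn.BatchNorm1d(5)
bn.train()                            # normalize with batch statistics

out1 = bn(linear(x))

with torch.no_grad():
    linear.weight.mul_(10.0)          # rescale the incoming weights by a factor of 10

out2 = bn(linear(x))

# The two outputs agree up to numerical precision: the function computed by the
# network is invariant to positive rescaling of the weights before BatchNorm.
print(torch.allclose(out1, out2, atol=1e-4))

Because the loss is unchanged under such rescaling, the gradient with respect to these weights shrinks as their norm grows, so the effective step size of plain SGD adapts to the landscape automatically.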

Reading List:
 
https://docs.google.com/document/d/1BwImXphwwK5lk1NKDC0LApys0soGY7TVpptTZZzn1HQ/edit?usp=sharing

Everyone is invited to attend the talk, and faculty wishing to remain for the oral exam that follows are welcome to do so.