Abhishek Panigrahi will present his General Exam "Demystifying Gradient Descent in modern Deep Learning: Implicit training biases and Modular Generalization" on Wednesday, January 18, 2023 at 3:00 PM over Zoom. 

Zoom link: https://princeton.zoom.us/j/94052692845 

Committee Members: Sanjeev Arora (advisor), Elad Hazan, Danqi Chen

Abstract:
Modern deep learning involves training large-scale neural networks, which requires difficult choices about the best training recipe. Traditional machine learning theory fails to explain the hidden mechanisms of such models, owing to the high non-convexity of the loss landscape. My research focuses on the training-time interplay between the training algorithm and the architecture that drives the generalization of these models. In this talk, I will focus on Gradient Descent and two of its novel mechanisms: (a) the Edge of Stability in Deep Learning, where the interplay between the learning rate and the loss landscape leads to implicit regularization of Hessian flatness during training, and (b) modular skill acquisition that drives generalization during language-model fine-tuning.

Reading List:
https://docs.google.com/document/d/1ZWwBodrZi86Jlli3r2CUlsfELlGwRAL1KfsFmplbkYw/edit

Everyone is invited to attend the talk, and faculty wishing to remain for the oral exam that follows are welcome to do so.


Louis Riehl
Graduate Administrator
Computer Science Department, CS213
Princeton University
(609) 258-8014