Zhiyuan Li will present his Pre FPO "Toward Mathematical Understanding of Real-life Deep Learning" on February 18, 2022 at 2:30pm via Zoom
Zoom link: https://princeton.zoom.us/j/93110423178?pwd=K2RoRm9YTmhacE5pU3JnZE1JbmdQdz09

Committee: Sanjeev Arora (adviser), Danqi Chen (non-reader), Chi Jin (non-reader, ECE), Elad Hazan (reader), Jason Lee (reader, ECE).

Title: Toward Mathematical Understanding of Real-life Deep Learning

Abstract: There is great interest in developing a mathematical understanding of the tremendous success of deep learning. Most of this understanding has been developed in simplified settings (depth 2 or 3; NTK regime). This talk presents my recent work providing a mathematical understanding of real-life nets and losses, incorporating the effects of normalization, architectural features, stochasticity, and finite learning rate (LR). It leverages insights from continuous mathematics, including stochastic differential equations (SDEs), which I will use to show interesting new mechanisms for implicit regularization during training. I will finish by presenting a new practical advance from our theoretical insights: a robust variant of BERT (a language model at the heart of the ongoing revolution in Natural Language Processing) called SIBERT, which uses a new scale-invariant architecture and is trainable with vanilla SGD.