[talks] Yuanzhi Li will be presenting his general exam on Tuesday, May 10, 2016 at 1:30pm in CS 302.

Mon May 2 14:21:25 EDT 2016

Yuanzhi Li will be presenting his general exam on Tuesday, May 10, 2016	at 1:30pm in CS 302.

The members of his committee are Sanjeev Arora (adviser), Elad Hazan, and Barbara Engelhardt.

Everyone is invited to attend his talk, and those faculty wishing to remain for the oral exam following are welcome to do so.  His abstract and reading list follow below.

Title: Towards a better understanding of streaming PCA.

Abstract:
Principle Component Analysis (PCA) is one of the most fundamental tools in machine learning and data analysis. Traditionally, PCA was usually solved by putting the data all together and doing a joint singular value decomposition, which requires quadric size of memory and liner number of passes of the data. However, nowadays, it is often the case that the data set is giant and we can not afford more than one pass of the data, or the entire data set is too large to fit into the memory. To deal with with this issue, there is a on-going popular line of research which considers solving PCA in a streaming stochastic model, which picks one data point at a time and perform stochastic gradient descent according to it. In this talk, I will present the proof of convergence of this algorithm from random initialization. Previously, only the convergence from warm start was known.  

The proof is built upon the solution of Kardison-Singer by Adam Marcus, Dan Spielman and Nikhil Srivastava; as well as the recent on-going work of off-line gap-free PCA by Zeyuan Allen-Zhu and myself.

Reading List:

Books:
(1). Elad Hazan, Introduction to Online Convex Optimization
(2). Sanjeev Arora, Lecture Note (COS 521), http://www.cs.princeton.edu/courses/archive/fall14/cos521/.
(3). Shalev-Schwartz, Understanding Machine Learning: From Theory to Algorithms
(4). Yurii Nesterov and Arkadii Nemirovskii, Interior point polynomial algorithms in convex programming.

Papers:
(1). Prateek Jain Chi Jin Sham M. Kakade Praneeth Netrapalli Aaron Sidford, Streaming PCA: Matching Matrix Bernstein and Near-Optimal Finite Sample Guarantees for Oja’s Algorithm
(2). Ohad Shamir, Convergence of Stochastic Gradient Descent for PCA
(3). Ohad Shamir, Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity
(4). Dan Garber and Elad Hazan, Fast and Simple PCA via Convex Optimization
(5). Kenneth L. Clarkson, David P. Woodruff, Low Rank Approximation and Regression in Input Sparsity Time
(6). Srinadh Bhojanapalli, Prateek Jain, Sujay Sanghavi, Tighter Low-rank Approximation via Sampling the Leveraged Element
(7). Cameron Musco and Christopher Musco, Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition