Cliff Liu will present his Pre FPO "Sublinear Time Algorithms for Graph Problems" on Thursday, December 3, 2020 at 2pm via Zoom.

Zoom link: https://princeton.zoom.us/j/95875996152

The members of his committee are as follows:
Examiners: Robert Tarjan (advisor), Matt Weinberg, Sepehr Assadi (Rutgers) Readers: Bernard Chazelle, Omri Weinstein (Columbia)

Abstract: 
With the rapid growth of the internet, many optimization problems in recent big-data applications often exceed the capacity of traditional computing platforms. My goal is to lay the theoretical foundation of big-data analysis by designing algorithms on modern computing platforms to solve large-scale optimization problems under various resource constraints, with an emphasis on time efficiency and simplicity. 

 

The datasets in most applications are originated from or can be modeled as graphs, and this thesis focuses on solving graph problems. Usually, the graph is too large to store in a single memory, resulting in a very slow random access to the data. To capture this, one important computational model is streaming: we want to solve the problem using very limited space while minimizing the number of passes of scanning the entire data. Another framework to solve large-scale graph problems is to store the graph in a distributed manner and use the computational power of the distributed processors, which is called parallel computation. These two areas continue to be very active partly due to the success of the cloud computing platforms from Google, Microsoft, Amazon, and so on. 

 

We obtain faster and simpler algorithms in both streaming and parallel computation models for graph problems such as connectivity, bipartite matching, and maxflow. In particular, we solved three open problems. The first one is that given a graph of n vertices and m edges, does there exist a semi-streaming (using nearly n space) algorithm that computes the maximum weight bipartite matching in o(n) passes? We answer this question positively by giving an algorithm that runs in square root m passes. The algorithm relies on space-efficient versions of the interior point method, Laplacian system solving, and the generalized isolation lemma, all of which are not known before this work. The second problem is to break the square root n depth for computing maxflow in parallel. The square root n depth barrier exists for digraph reachability, matching, and other fundamental problems. Our solution is approximate, but only runs in n^{1/3} depth for unit-capacity graphs. The third problem is to give an o(log n) depth parallel algorithm for connectivity on a PRAM (parallel random access machine). All previous PRAM algorithms use at least log n depth. Our algorithm runs in O(log d + loglog n) depth where d is the diameter, which breaks the log n barrier for small-diameter graphs. Before our work, there is only an MPC (massively parallel computing, a model that is much stronger than PRAM) algorithm and it is very complicated. Our algorithm is simpler and more practical, albeit in a much weaker model. 

 

We also design several extremely simple algorithms for connectivity and bipartite matching. For the connectivity problem, we give several elegant algorithms (about five lines of code) in the MPC model, with a state-of-the-art running time of O(log n) for undirected graph. For the bipartite matching problem, we give a very simple streaming algorithm based on auctions that approximates the maximum matching to (1-eps) in O(eps^{-2}) passes and O(n) space, reducing the number of passes and space from previous work, which also has applications on faster MPC algorithms for matching.