Sid Sen will present his preFPO on Thursday March 8 at 11AM in Room 402. The members of his committee are: Bob Tarjan, advisor, reader; Michael Freedman, advisor, reader; Jen Rexford, reader; Vivek Pai and Bob Sedgewick, nonreaders. Everyone is invited to attend his talk. His abstract follows below. ---------------------------------- Title: New Data Structures and Algorithms for Highly-Available Distributed Systems Abstract: We design new data structures and algorithms to improve the availability of distributed systems. Our work spans several settings, from individual database servers to large-scale peer-to-peer systems. A unifying approach in our work is to synergize theory and systems until a solution that is both practical and provably good is attained. In database systems, availability suffers when transactions contend for locks on shared index structures. To increase transaction concurrency, many databases such as Berkeley DB avoid rebalancing B-tree indices after deletions, only rebalancing after insertions. But when one company tried this technique with red-black trees, the result was an extended downtime and litigation. Using a novel exponential-potential analysis, we theoretically justify why deletion without rebalancing works in B-trees but fails in red-black trees, and devise a new binary tree called Ravl trees that works. In peer-to-peer systems, the availability of a service depends on unreliable participant machines. These machines fail not just by crashing, but also in arbitrary ways, e.g. due to bugs or malice. To scale such a system securely, we must partition it into smaller groups that remain locally correct despite a dynamic, global adversary. We expose serious practical limitations in the current-best algorithm for this, and devise a new algorithm called Commensal Cuckoo that keeps the system available two orders of magnitude longer. In enterprise desktop and some server cluster settings, availability is often desired in conjunction with energy efficiency. Unfortunately, existing energy-saving solutions require special hardware or virtualization technology that IT departments are unwilling to deploy. To address this, we built GreenUp, a system that allows any machine to act as a proxy for other sleeping machines, ensuring their availability to clients seamlessly. Notably, while optimizing load distribution in GreenUp, we discovered and analyzed a new balls-in-bins process with connections to population genetics and high-dimensional geometry. GreenUp is deployed on over ~1,100 machines within Microsoft. Though often ignored, networking is as critical to the availability of a system as the servers themselves. Beyond connectivity, congestion in the network can render services unavailable by starving traffic, even when sufficient bandwidth exists. We address this problem in datacenter networks through LocalFlow, a flow routing algorithm based on splitting flows. Unlike prior solutions, LocalFlow is switch-local (and hence scalable) and routes optimally. Realizing LocalFlow required a novel theoretical model as well as novel mechanisms for practical deployment on programmable commodity switches.
participants (1)
-
Melissa M. Lawson