Sid Sen will present his preFPO on Thursday March 8 at 11AM in Room 402.
The members of his committee are: Bob Tarjan, advisor, reader; Michael Freedman,
advisor, reader; Jen Rexford, reader; Vivek Pai and Bob Sedgewick, nonreaders.
Everyone is invited to attend his talk. His abstract follows below.
----------------------------------

Title: New Data Structures and Algorithms for Highly-Available Distributed Systems

Abstract:

We design new data structures and algorithms to improve the availability of

distributed systems. Our work spans several settings, from individual database

servers to large-scale peer-to-peer systems. A unifying approach in our work is

to synergize theory and systems until a solution that is both practical and

provably good is attained.

In database systems, availability suffers when transactions contend for locks on

shared index structures. To increase transaction concurrency, many databases

such as Berkeley DB avoid rebalancing B-tree indices after deletions, only

rebalancing after insertions. But when one company tried this technique with

red-black trees, the result was an extended downtime and litigation. Using a

novel exponential-potential analysis, we theoretically justify why deletion

without rebalancing works in B-trees but fails in red-black trees, and devise

a new binary tree called Ravl trees that works.

In peer-to-peer systems, the availability of a service depends on unreliable

participant machines. These machines fail not just by crashing, but also in

arbitrary ways, e.g. due to bugs or malice. To scale such a system securely, we

must partition it into smaller groups that remain locally correct despite a

dynamic, global adversary. We expose serious practical limitations in the

current-best algorithm for this, and devise a new algorithm called Commensal

Cuckoo that keeps the system available two orders of magnitude longer.

In enterprise desktop and some server cluster settings, availability is often

desired in conjunction with energy efficiency. Unfortunately, existing

energy-saving solutions require special hardware or virtualization technology

that IT departments are unwilling to deploy. To address this, we built GreenUp,

a system that allows any machine to act as a proxy for other sleeping machines,

ensuring their availability to clients seamlessly. Notably, while optimizing

load distribution in GreenUp, we discovered and analyzed a new balls-in-bins

process with connections to population genetics and high-dimensional geometry.

GreenUp is deployed on over ~1,100 machines within Microsoft.

Though often ignored, networking is as critical to the availability of a system

as the servers themselves. Beyond connectivity, congestion in the network can

render services unavailable by starving traffic, even when sufficient bandwidth

exists. We address this problem in datacenter networks through LocalFlow, a flow

routing algorithm based on splitting flows. Unlike prior solutions, LocalFlow is

switch-local (and hence scalable) and routes optimally. Realizing LocalFlow

required a novel theoretical model as well as novel mechanisms for practical

deployment on programmable commodity switches.