David Shue will present his preFPO on Thursday, April 25 at 9 AM in Room 402.
The members of his committee are: Michael Freedman, advisor; Jennifer
Rexford and Anees Shaikh (IBM Research), readers; Vivek Pai and Margaret
Martonosi, nonreaders. Everyone is invited to attend his talk. His abstract
follows below.
--------------------------------------------

Title: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Abstract:

Shared storage services enjoy wide adoption in commercial clouds. But most systems today provide weak
performance isolation and fairness between tenants, if any. Misbehaving or high-demand tenants can overload
the shared service and disrupt other well-behaved tenants, leading to unpredictable performance and violating SLAs.

In this thesis, we present Pisces, a system for achieving datacenter-wide per-tenant performance isolation and
fairness in shared key-value storage. Today’s approaches for multi-tenant resource allocation are based either
on per-VM allocations or hard rate limits that assume uniform workloads to achieve high utilization. Pisces provides
per-tenant weighted fair shares (or minimal rates) of the aggregate resources over the entire service, even when
different tenants’ partitions are co-located and when demand for different partitions is skewed, time-varying, or
bottlenecked by different server resources. Our key insight is to decompose the system-wide fair sharing problem
into a combination of four complementary mechanisms—partition placement, weight allocation, replica selection,
and weighted fair queuing—that operate on different time-scales and combine to achieve per-tenant max-min fairness.

While Pisces provides fairness for key-value storage systems that leverage asynchronous writes for performance,
many storage systems require stronger durability guarantees and often provide rich data models that can support
arbitrary computation in the form of UDFs or map/reduce functionality. Achieving predictable performance in these
systems requires fine-grained per-tenant resource allocation over multiple resources (e.g., network, disk, and CPU).
In the second part of this thesis, we generalize and extend the Pisces per-node scheduling model with Libra, a
multi-resource allocation library for building predictable shared services.