Theano Stavrinos will present her FPO "Leopard: Unlocking Better Cache Performance at Lower Cost with Expiration Time-based Flash Caching" on Tuesday, May 9, 2023 at 1pm in CS 302.

The committee is as follows:

Examiners: Wyatt Lloyd (adviser), Kai Li, and Amit Levy

Readers: Ethan Katz-Bassett (Columbia/co-adviser) and Ravi Netravali

All are welcome to attend.

Title: "Leopard: Unlocking Better Cache Performance at Lower Cost with Expiration Time-based Flash Caching"

Abstract:

Caches are crucial building blocks of web services. They keep data close to users and other services, reducing request latencies, expensive network traversals, and requests to resource-constrained backend servers. Today’s web services need high-capacity, high-performance caches to hold their massive working sets and meet stringent performance requirements. Flash-based SSDs meet this need by providing excellent performance and high capacity at low cost. However, caching on flash involves a fundamental tradeoff. On the one hand, caches aim for low miss ratios by keeping useful objects in the cache. On the other hand, caches must protect SSDs from write-induced wear-out, which increases when useful objects are copied forward during garbage collection. Flash caches are often forced to choose between good cache performance (i.e., low cache miss ratios) and acceptable device lifespans.

This dissertation describes Leopard, a flash caching framework for static content that unlocks new positions along the Pareto frontier of the miss ratio/device lifespan tradeoff, enabling more effective caching at lower cost than existing frameworks. At the foundation of Leopard's design are expiration times, which specify an object's earliest eviction time. In particular, Leopard’s garbage collection procedure uses expiration times in a cost/benefit analysis to choose the best flash block to erase, balancing write amplification from copying forward useful objects and increased miss ratios from evictions. Leopard also uses a novel clustering algorithm to group objects together by their expiration times, increasing the likelihood of low write amplification during garbage collection. Our evaluation shows that, for a range of CDN traces, Leopard significantly improves the object miss ratio achievable at a given total write volume compared to the state-of-the-art flash caching framework. Leopard also achieves better byte miss ratios at lower write volumes at most points along the byte miss ratio/write volume Pareto frontier.
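
The abstract does not spell out Leopard's cost/benefit formula or its clustering algorithm, so the following Python sketch is only a simplified illustration of the general idea, not the dissertation's method. It assumes that an object whose expiration time has passed can be dropped for free, that the cost of erasing a block is the number of unexpired bytes that would have to be copied forward, and that objects are grouped into fixed-width expiration-time buckets. All names (`CachedObject`, `FlashBlock`, `live_bytes`, `pick_block_to_erase`, `bucket_by_expiration`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CachedObject:
    size: int          # object size in bytes
    expires_at: float  # earliest time at which the object may be evicted


@dataclass
class FlashBlock:
    objects: List[CachedObject] = field(default_factory=list)


def live_bytes(block: FlashBlock, now: float) -> int:
    """Bytes that would be copied forward if this block were erased now.

    Assumption: an object is still useful only while now < expires_at.
    """
    return sum(o.size for o in block.objects if o.expires_at > now)


def pick_block_to_erase(blocks: List[FlashBlock], now: float) -> int:
    """Return the index of the block with the fewest live bytes.

    This greedy rule stands in for a real cost/benefit analysis, which
    would also weigh the miss-ratio impact of evicting objects early.
    """
    return min(range(len(blocks)), key=lambda i: live_bytes(blocks[i], now))


def bucket_by_expiration(
    objects: List[CachedObject], bucket_seconds: float
) -> Dict[int, List[CachedObject]]:
    """Group objects into fixed-width expiration-time buckets.

    Writing each bucket to its own block makes it more likely that a
    whole block expires together, so erasing it copies little forward.
    """
    buckets: Dict[int, List[CachedObject]] = {}
    for obj in objects:
        key = int(obj.expires_at // bucket_seconds)
        buckets.setdefault(key, []).append(obj)
    return buckets
```

In this sketch, a block whose objects have all expired has zero live bytes and is always the cheapest to erase, which is why grouping objects with similar expiration times tends to reduce write amplification.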