Rob Harrison will present his FPO "Scalable, Network-wide Telemetry with Programmable Switches" on Tuesday, May 14, 2019 at 2pm in CS 402.

The members of his committee are as follows: Jennifer Rexford (adviser); Readers: Nick Feamster and Nate Foster (Cornell University); Nonreaders: David Walker and Kyle Jamieson.

A copy of his thesis is available upon request.

Everyone is invited to attend his talk. The abstract of the talk follows:

Managing modern networks requires collecting and analyzing network traffic from distributed switches in real time, i.e., performing network-wide telemetry. Telemetry systems must be flexible and fine-grained to support myriad queries about the security, performance, and reliability of networks. Yet, they must also scale as the number of queries, link speeds, and network sizes increase. Realizing these goals requires balancing the division of labor between high-speed, but resource-constrained, network switches and general-purpose CPUs to support flexible telemetry at scale.

First, we present Sonata, a flexible and scalable network telemetry system that uses the compute resources of both stream-processing servers and a single Protocol Independent Switch Architecture (PISA) switch. PISA switches offer both high-speed processing and limited programmability. We show how to execute Sonata’s high-level queries at line rate by first compiling them to PISA primitives. Next, we model the resource constraints of PISA switches to solve an optimization problem that minimizes the load on the stream processor by executing portions of queries directly in the switch. Sonata can support a wide range of monitoring queries and reduces the stream processor’s workload by orders of magnitude over existing telemetry systems.

Second, we present Herd, a system for implementing a subset of Sonata queries distributed across several switches. Herd determines network-wide heavy hitters, i.e., flows that consist of many more packets than most others, by counting flows at the switches, without maintaining per-flow state, and probabilistically reporting to a central coordinator. Using these reports, the coordinator adapts parameters at each switch according to the spatial locality of the flows. Simulations using packet traces show that our prototype can detect network-wide heavy hitters accurately with 17% savings in communication overhead and 38% savings in switch state compared to existing approaches. We then present an algorithm to tune system parameters in order to maximize detection accuracy under switch memory and bandwidth constraints.

Together, Sonata and Herd give network operators the ability to execute a set of network-wide telemetry queries from a single interface that combines the strengths of both programmable data planes and general-purpose CPUs.