Dmitry Paramonov will present his FPO "Handling Data Too Large To Handle: On Multi-pass Streaming and Interactive Coding" on May 7th, 2024 at 11am in Friend 006.

Zoom link: https://princeton.zoom.us/j/95983100753

His committee is as follows:

Examiners: Gillat Kol (adviser), Ran Raz, Mark Braverman

Readers: Huacheng Yu and Matt Weinberg

Title: Handling Data Too Large To Handle: On Multi-pass Streaming and Interactive Coding

Abstract:

Over the last decades, the world has become increasingly more information-centric and massive

amounts of data, potentially distributed between many different sources, are being processed all the

time. In this thesis, I consider two mechanisms for coping with big data and the distributed nature

of timely tasks.

In Part I, I showcase my work on the streaming setting, where the input to the algorithm is

given as a stream of elements. The algorithm’s goal is to compute a value that depends on the

stream while only utilizing memory that is much smaller than the entire stream. My work in this

field focuses on proving that various fundamental graph problems essentially require the streaming

algorithm to store the entire graph, even if it is allowed to make several passes through the given

stream of edges.

In Part II, I consider error-correcting codes for distributed, interactive settings. Classical error-

correcting codes assume that a sender who has all the information wishes to send it to a receiver over

a noisy channel. However, in many modern, big data applications, the information is distributed

amongst many parties that communicate back-and-forth to compute a value that depends on all

their inputs. My work examines the noise resilience of various such settings. For some models,

we can design error-correcting protocols that allow the encoding of every noiseless protocol by a

noise-resilient protocol with low overhead, whereas for other models, it can be shown that this task

is impossible.

While these two topics appear greatly unrelated, and almost orthogonal to one another, the tools

used to prove results in both turn out to be remarkably similar, with many standard problems and

information theory lemmas being a critical part of both.