Seshadhri Comandur will present his preFPO on Wednesday, March 26, at 3:30PM in Room 402. The members of his committee are: Bernard Chazelle, advisor; Moses Charikar and Sanjeev Arora, readers; Boaz Barak and Rob Schapire, nonreaders. Everyone is invited to attend his talk. The abstract follows below.

---------------------------------------

Title: Dealing with data: Online reconstruction and self-improving algorithms

Abstract:

With the ubiquity of large data sets, reading the whole input might itself be a luxury. Consider some large dataset that is supposed to (but does not) have some structural property, e.g., a set of numbers should be sorted, a set of points should be in convex position, or a graph should be an expander. Let the input be represented by a function f (over a suitable domain and range). Our aim is to output a function g that satisfies the property and does not differ too much from f. Furthermore, we want to generate g in an online, sublinear fashion: given a domain point x, we can output g(x) in sublinear time. This is the problem of online reconstruction, which has natural connections to property testing and other sublinear algorithms. We study the reconstruction problem in various settings: monotonicity of functions over lattices, convexity of surfaces, and bounded-degree expanders. The results obtained are also interesting in their own right, and provide sublinear algorithmic tools for studying these properties.

In an interesting aside, we use sublinear tools for online learning problems. We define stronger performance measures for learning problems, and provide very efficient algorithms that attain this performance. One of the key insights is how tools from the sublinear and streaming world allow us to construct efficient learning algorithms.

Input in the real world is not always worst-case, and often comes from some low-entropy distribution over the input space. It seems natural that algorithms should run faster given this constraint. Is it possible to learn enough about the input distribution to optimize running time? This is the problem of designing self-improving algorithms, which can tune themselves to the input distribution and cut down their running time. We prove a variety of results for the classic problem of sorting numbers, and then extend these results to problems in computational geometry.
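
As a rough, purely illustrative sketch of the online-reconstruction interface described in the abstract (and not the sublinear algorithms of the thesis), the following Python snippet answers queries g(x) on demand, using only oracle access to f, and guarantees that g is monotone. The prefix-maximum rule, the function names, and the example data are all invented for illustration; this rule is linear time per query and carries none of the distance or sublinearity guarantees of the actual results.

    # Toy illustration only: answer queries g(x) on demand from oracle
    # access to a possibly corrupted input f, guaranteeing g is monotone.
    # This naive prefix-maximum rule is NOT the thesis's algorithm; it is
    # linear time per query and has no distance guarantee.
    def make_monotone_filter(f):
        def g(x):
            # g(x) = max of f over the prefix [0..x]; monotone by construction.
            return max(f(i) for i in range(x + 1))
        return g

    # Example: a "nearly sorted" array with one corrupted entry.
    data = [1, 2, 0, 4, 5, 6]                 # data[2] violates sortedness
    g = make_monotone_filter(lambda i: data[i])
    print([g(i) for i in range(len(data))])   # [1, 2, 2, 4, 5, 6]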
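
Similarly, here is a minimal sketch of the self-improving idea for sorting, under the assumption that inputs are drawn repeatedly from a fixed but unknown distribution: a training phase learns approximate quantiles from a few sample inputs, and the operation phase sorts each new input by bucketing its elements against those learned boundaries. This is a simplified stand-in, not the self-improving sorter from the thesis, and the class and parameter names are made up for this example.

    import bisect
    import random

    class SelfImprovingSorter:
        """Toy self-improving sorter: learn bucket boundaries from sample
        inputs, then sort fresh inputs by distributing elements into the
        learned buckets. A simplified stand-in, not the thesis's algorithm."""

        def __init__(self, num_buckets=16):
            self.num_buckets = num_buckets
            self.boundaries = []

        def train(self, sample_inputs):
            # Training phase: pool the samples and take evenly spaced order
            # statistics as approximate quantiles of the input distribution.
            pooled = sorted(x for inp in sample_inputs for x in inp)
            step = max(1, len(pooled) // self.num_buckets)
            self.boundaries = pooled[step::step][: self.num_buckets - 1]

        def sort(self, inp):
            # Operation phase: route each element to its bucket by binary
            # search against the learned boundaries, then sort each bucket.
            buckets = [[] for _ in range(len(self.boundaries) + 1)]
            for x in inp:
                buckets[bisect.bisect_left(self.boundaries, x)].append(x)
            out = []
            for b in buckets:
                out.extend(sorted(b))
            return out

    # Usage: inputs whose entries come from one fixed distribution.
    random.seed(0)
    draw = lambda: [random.gauss(0, 1) for _ in range(1000)]
    sorter = SelfImprovingSorter()
    sorter.train([draw() for _ in range(10)])
    inp = draw()
    print(sorter.sort(inp) == sorted(inp))    # True

The point of the sketch is the two-phase structure (learn the distribution once, then exploit it on every subsequent input), not the precise expected-running-time guarantees of the actual self-improving sorter.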