[talks] S Comandur preFPO

Wed Mar 19 09:18:26 EDT 2008

 
Seshadhri Comandur will present his preFPO on Wednesday March 26 at 3:30PM 
in Room 402.  The members of his committee are: Bernard Chazelle, advisor; 
Moses Charikar and Sanjeev Arora, readers; Boaz Barak and Rob Schapire, 
nonreaders.  Everyone is invited to attend his talk.  The abstract follows below.
---------------------------------------
Title : Dealing with data : Online reconstruction and self-improving algorithms

Abstract : With the ubiquity of large data sets, reading the whole input might itself by a
luxury. Consider some large dataset that is supposed to (but does not) have some
structural property, e.g., a set of numbers should be sorted, a set of points should in
convex position, or a graph should be an expander. Let the input be represented by a
function f (over a suitable domain and range). Our aim is to output a function g that
satisfies the property, and does not differ too much from f. Furthermore, we want to
generate g in an online sublinear fashion : given a domain point x, we can output g(x) in
sublinear time. 
This is the problem of online reconstruction, which naturally has connections to property
testing and other sublinear slgorithms. We study the reconstruction problem in various
settings - monotonicity of functions over lattices, convexity of surfaces, and bounded
degree expanders. The results obtained are also interesting in their own right, and
provide sublinear algorithmic tools to study these properties.
     In an interesting aside, we use sublinear tools for online learning problems. We
define stronger performance measures for learning problems, and provde very efficient
algorithms that attain this performance. One of the key insights is how tools from the
sublinear and streaming world allow us to construct efficient learning algorithms.

     Input in the real world is not always worst-case, and probably comes from some low
entropy distribution over the input space. It seems natural that algorithms should run
faster given this constraint. Is it possible to learn enough about the input distribution
to optimize running time? This is the problem of designing self-improving algorithms,
which can tune themselves to the input distribution and cut down their running time. We
prove a variety of results for the classic problem of sorting numbers, and then extend
these to computational geometry world.