[Ml-stat-talks] Sayan Mukerjee (Duke). Friday Sep. 9. 12:30.
rigollet at princeton.edu
Wed Sep 7 09:10:31 EDT 2011
Sayan Mukerjee from Duke University will be giving a talk in the Statlab (Sherrerd Hall 213) on Friday September 9 at 12:30.
Sayan has established interesting bridges between geometry, topology and statistical inference. If you are interested in manifold or graph learning in high dimensions, you should not miss this talk. FYI: no knowledge of geometry or topology is assumed in the talk.
Departments of Statistical Science, Computer Science and Mathematics at Duke University
Sherrerd Hall (ORFE) 123.
Friday, September 9, 2011 at 12:30.
Title: Geometry/topology and statistical inference
In this talk I will illustrate two examples where geometric/topological ideas and statistical inference complement each other. In the first example, computational geometry is a central tool used to address a classic problem in statistics, inference of conditional dependence. In the second example, a classic object in topology and geometry, a Whitney stratified space, is stated as a mixture model, and an algorithm for inference of mixture elements is provided, along with finite-sample bounds that give theoretical guarantees for the algorithm.
The first part of the talk develops a parameterization of hypergraphs based on the geometry of points in d dimensions; the geometric tool here is the abstract simplicial complex. Informative prior distributions on hypergraphs are induced through this parameterization by priors on point configurations via spatial processes. The approach combines tools from computational geometry and topology with spatial processes and offers greater control over the distribution of graph features than Erdős-Rényi random graphs.
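As a rough illustration of how a point configuration can induce a hypergraph (a minimal sketch, not the speaker's construction): the abstract does not say which simplicial complex or spatial process is used, so the example below assumes a Vietoris-Rips complex over points drawn uniformly from the unit cube. The names sample_points and rips_complex, the radius, and the dimension cap are all hypothetical choices made for the example.

```python
import itertools
import math
import random


def sample_points(n, d, seed=0):
    """Draw n points uniformly from [0, 1]^d.

    Assumption: a uniform draw stands in for the spatial-process prior
    on point configurations; the talk's actual prior is not specified.
    """
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(d)) for _ in range(n)]


def rips_complex(points, radius, max_dim=2):
    """Return the simplices of a Vietoris-Rips complex as vertex tuples.

    A set of vertices forms a simplex when every pairwise distance is
    at most `radius`. Simplices above dimension `max_dim` are skipped.
    """
    n = len(points)
    close = [
        [math.dist(points[i], points[j]) <= radius for j in range(n)]
        for i in range(n)
    ]
    simplices = [(i,) for i in range(n)]  # every point is a 0-simplex
    for k in range(2, max_dim + 2):  # k vertices give a (k-1)-simplex
        for subset in itertools.combinations(range(n), k):
            if all(close[i][j] for i, j in itertools.combinations(subset, 2)):
                simplices.append(subset)
    return simplices


points = sample_points(n=8, d=2, seed=1)
complex_ = rips_complex(points, radius=0.4)

# Read each simplex with two or more vertices as a hyperedge.
hyperedges = [s for s in complex_ if len(s) >= 2]
print(len(points), "vertices,", len(hyperedges), "hyperedges")
```

In this sketch the hypergraph is a deterministic function of the points and the radius, so any distribution placed on the point configuration carries over to a distribution on hypergraphs.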
In the second part of the talk, I describe the problem of stratification learning. Strata correspond to unions and intersections of arbitrary manifolds of possibly different dimensions. We consider a mixture distribution on the strata and formulate the following learning problem: given n points sampled i.i.d. from the mixture model, which points belong to the same stratum? I will state a bound on the minimum number of sample points required to infer with high probability which points belong to the same stratum. I will show results of this clustering procedure on real data. The clustering procedure uses tools from computational topology, specifically persistent homology.